IES990276A2 - An inter-computer communications apparatus - Google Patents

An inter-computer communications apparatus

Info

Publication number
IES990276A2
Authority
IE
Ireland
Prior art keywords
code
datastore
domain name
target
server
Prior art date
Application number
Inventor
Michael Carlile Val Cassidy
Original Assignee
Iesearch Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Iesearch Ltd
Priority to IES990276
Publication of IES990276A2
Publication of IES81055B2

Abstract

An inter-computer communications apparatus for improving the efficiency and transparency of communications between computer systems and for managing memory to reduce overall network traffic. The method and apparatus described optimise use of system resources by retrieving only selected portions of a target group of information sources, which are stored locally and automatically updated. This allows users to obtain the required information without incurring the overhead associated with network communication traffic, while providing complete results in a truly real-time manner.

Description

An Inter-Computer Communications Apparatus
The present invention relates to an inter-computer communications apparatus and more particularly to a method and apparatus for improving the efficiency and transparency of communications between computer systems. The invention also relates to a method for optimising memory management to reduce communication delays and overall network traffic. For the purposes of this specification, the term "inter-computer communications" refers to communication between remote and local data processing entities.
In many data processing systems, it is common to transfer data from a number of disparate and often geographically remote sources to a local or target computer system. With the advent of the Internet and World Wide Web, occurrences of such transfers have increased at a rate that was impossible to predict. As the number of new network users is increasing exponentially, so too is the number of data requests, placing unprecedented demands on the bandwidth capability of such networks. This rate of increase shows no sign of abating and is in fact likely to rise, not least because many governments have set targets for per capita connection figures.
Bandwidth tolerances are further tested in that information sources for these networks frequently use different hardware and software platforms to the local target computer. These differences increase the complexity of data transfers, often necessitating additional bandwidth provision by network designers and operators. As the number of users increases, the number of different hardware and software platforms also increases, with inevitable problems. When there are a large number of variations it becomes virtually impossible to transfer data in a transparent manner, in that the data must be converted at each source into a format suitable for use by the local target computer. Even this is not a viable solution in all situations, for example when the source, having been developed over a long period of time, is not designed or configured for this type of operation. Such legacy systems contain large quantities of information, which may be required for the purposes outlined by the local system.
Obviously owners of such source systems wish to unlock the information stored to enable users to fully exploit the new technologies.
The data may be transferred for storage or for processing to provide a result, which may then be re-transferred to the source. The most common form of transfer for users of the World Wide Web is an information request. As the number of users increases, the number of such requests must similarly increase. Search engines or portals normally process these information requests. It has proven impossible to quantify request traffic through even a select few mainstream search engines because of the rate of increase; however, best estimates put this figure at approximately one hundred million requests per month in early 1998. As each of these requests will seek information from approximately one hundred information sources, the throughput demands placed on the communication system are enormous.
More efficient web browsers, such as that described in International Patent Application No. WO 98/06033, have undoubtedly improved the efficiency of information request processing; however, they have not addressed the fundamental problems associated with the bandwidth required for real-time processing of information requests.
There is therefore a need for an inter-computer communication method and apparatus, which will provide communications between disparate data sources and which will overcome the aforementioned problems.
Accordingly there is provided an inter-computer communications apparatus having link means for connecting the apparatus to a computer system for communicating with a plurality of geographically remote computers using an internet data communications protocol, the link means incorporating,
a server for processing information requests by retrieving data from one or more remote computers using the internet protocol,
means for retrieving data associated with the information request and identifying a data type and address for the retrieved data, and
a translation means for automatically identifying a sequential dataset for the retrieved data address,
the apparatus performing the sequential steps of:
initiating a domain name seek function using the server to retrieve an interrogation routine stored in local memory;
automatically identifying a target address for a target from a predefined array of target addresses, extracting and compiling a resource locator associated with the address and linking the server to the target;
retrieving and streaming un-interpreted source code before parsing the retrieved code to discard pre-selected code segments identified by code headers to generate residual code;
piping the residual code stream to a stack for sequential accessing to extract a domain name; and
checking a local datastore for a datastore content value corresponding to the extracted domain name and in response to a no match condition appending the extracted domain name to the datastore.
Preferably, the apparatus performs the further step of automatically generating a unique refreshable timestamp identifier for each datastore content value.
Ideally, the apparatus performs the further steps of:
accessing the datastore to define an access subset by reading the timestamp identifier for each value and comparing the timestamp identifier with a pre-set value;
formatting a resource locator for each access subset entry and addressing a resource associated with the formatted resource locator;
retrieving and streaming un-interpreted source code before parsing the retrieved source code to discard pre-selected code segments identified by code headers to generate residual code;
piping the residual code stream to the stack for sequential accessing to extract a domain name;
checking the local datastore for a value corresponding to the extracted domain name and in response to a no match condition appending the extracted domain name to the datastore; and
deleting white space from text extracted from the source code to create a condensed character stream and appending the stream to a database.
According to another aspect of the invention there is provided an inter-computer communications method performing the sequential steps of:
initiating a domain name seek function using the server to retrieve an interrogation routine stored in local memory;
automatically identifying a target address for a target from a predefined array of target addresses, extracting and compiling a resource locator associated with the address and linking the server to the target;
retrieving and streaming un-interpreted source code before parsing the retrieved code to discard pre-selected code segments identified by code headers to generate residual code;
piping the residual code stream to a stack for sequential accessing to extract a domain name;
checking a local datastore for a datastore content value corresponding to the extracted domain name and in response to a no match condition appending the extracted domain name to the datastore;
automatically generating a unique refreshable timestamp identifier for each datastore content value;
accessing the datastore to define an access subset by reading the timestamp identifier for each value and comparing the timestamp identifier with a pre-set value;
formatting a resource locator for each access subset entry and addressing a resource associated with the formatted resource locator;
retrieving and streaming un-interpreted source code before parsing the retrieved source code to discard pre-selected code segments identified by code headers to generate residual code;
piping the residual code stream to the stack for sequential accessing to extract a domain name;
checking the local datastore for a value corresponding to the extracted domain name and in response to a no match condition appending the extracted domain name to the datastore; and
deleting white space from text extracted from the source code to create a condensed character stream and appending the stream to a database.
The invention will be more clearly understood from the following description of an embodiment thereof with reference to the accompanying drawing, given by way of example only, in which:
Fig. 1 is a flow diagram illustrating operation of an inter-computer communications apparatus formed in accordance with the invention.
For the purposes of this description, specific system architectures, processors, memory devices, timing and performance details have been omitted in order not to unnecessarily obscure the present invention. Thus, the constituent components of the invention have been described in terms of functionality, as many ways of achieving the said functionality will be readily apparent to those skilled in the art.
An inter-computer communications apparatus according to the invention is connected to a computer system to allow communications with a large number of geographically remote computers using Transmission Control Protocol/Internet Protocol (TCP/IP).
The apparatus connects to the computer system using a server, which processes Hypertext Transfer Protocol (HTTP) requests. HTTP is the foundation of the World Wide Web (WWW), where browsers from the simplest to the most complex use HTTP to issue requests to WWW servers and to receive and display the responses to those requests. The server retrieves information using this protocol from local systems using the method described in detail below. Retrieved data may be one of a number of types depending on the request, for example static data, which might include text, graphics or other forms of binary data used to build images on the local system. This data is typically stored in a hierarchical file system in UNIX and is identified using a Uniform Resource Locator (URL). For data of this type the URL identifies the file which is to be transmitted. A translation mechanism of the apparatus is used to identify a sequential dataset to which the URL relates. For example, the URL /w/x/y/z will be taken to refer to the dataset w.x.y member z. Similarly the URL /a/b/c will be taken to refer to the sequential dataset a.b.c. Once translated, the apparatus locates the dataset or dataset/member combination and returns the appropriate information to the computer system in response to the request as described below. The contents of the located data are identified using either a logical member type or the last level of the dataset name identified by the URL.
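The URL-to-dataset translation described above can be sketched as follows. This is only an illustration under stated assumptions: the specification does not say how the apparatus chooses between the "dataset" and "dataset/member" readings, so the sketch assumes a hypothetical catalog of known dataset names and tries the sequential reading first. The names DatasetRef, candidate_datasets and translate, and the URL /a/b/c used for the dataset a.b.c, are illustrative and not taken from the specification.

```python
from typing import NamedTuple, Optional


class DatasetRef(NamedTuple):
    dataset: str            # dot-qualified dataset name, e.g. "w.x.y"
    member: Optional[str]   # member name, or None for a sequential dataset


def candidate_datasets(url_path: str) -> list:
    """Return the possible dataset readings of a URL path."""
    parts = [p for p in url_path.split("/") if p]
    if not parts:
        return []
    # Reading 1: every path level is a dataset qualifier (sequential dataset).
    candidates = [DatasetRef(".".join(parts), None)]
    # Reading 2: the last path level names a member of the dataset.
    if len(parts) > 1:
        candidates.append(DatasetRef(".".join(parts[:-1]), parts[-1]))
    return candidates


def translate(url_path: str, catalog: set) -> Optional[DatasetRef]:
    """Pick the first reading whose dataset exists in the (assumed) catalog."""
    for ref in candidate_datasets(url_path):
        if ref.dataset in catalog:
            return ref
    return None


if __name__ == "__main__":
    print(translate("/w/x/y/z", {"w.x.y"}))
    print(translate("/a/b/c", {"a.b.c"}))
```

With a catalog containing only w.x.y, the path /w/x/y/z resolves to dataset w.x.y with member z; with a catalog containing a.b.c, the path /a/b/c resolves to the sequential dataset a.b.c.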
The previously known method of processing information requests is now described to facilitate understanding of the current invention and to highlight the important technical advantages associated therewith.
When a user requests information relating to a search from a search engine, the HTTP client generating the search request attempts to connect to the machine address where an HTTP server is running, based on an address provided by the user and associated with a given search engine. The HTTP server of that search engine is normally listening for incoming requests on a TCP/IP port.
The search engine's HTTP server normally accepts the connection, at which time the client is free to send data. This data may include search criteria or relate to the selection of a preset information grouping on the server. The HTTP server may also elect not to accept the request, at which time the connection will be broken. This may occur where the HTTP server only wishes to service requests from certain TCP/IP addresses. Requests may also be refused due to heavy network traffic as interpreted by Call Accept Criteria (CAC) functionality incorporated into the engine server, which controls access against server performance in order to service accepted requests as efficiently as possible.
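The accept-or-refuse decision described above might be sketched as below. The specification gives no detail of the Call Accept Criteria, so a simple ceiling on concurrently serviced requests is assumed here purely for illustration. The function name accept_connection and its parameters are hypothetical.

```python
import ipaddress
from typing import Sequence, Union

Network = Union[ipaddress.IPv4Network, ipaddress.IPv6Network]


def accept_connection(client_ip: str,
                      allowed: Sequence[Network],
                      active_requests: int,
                      max_active: int) -> bool:
    """Decide whether a server would accept an incoming connection.

    allowed: networks the server is willing to service (empty = any address).
    max_active: assumed load ceiling standing in for the Call Accept Criteria.
    """
    addr = ipaddress.ip_address(client_ip)
    # Refuse clients outside the permitted TCP/IP address ranges.
    if allowed and not any(addr in net for net in allowed):
        return False
    # Refuse when the server is already servicing as many requests as it can.
    return active_requests < max_active


if __name__ == "__main__":
    nets = [ipaddress.ip_network("192.0.2.0/24")]
    print(accept_connection("192.0.2.10", nets, 3, 10))
    print(accept_connection("198.51.100.7", nets, 3, 10))
```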
The HTTP client sends the HTTP request to the HTTP server, encapsulating various levels of information in standard HTTP headers. Request content is also sent and will normally include length and type HTTP headers to enable the HTTP server to interpret the content correctly. The HTTP server receives the request over the TCP/IP connection and begins processing the request to identify resources. The server then links to appropriate storage and communications links to locate the URLs which identify the requested data. The result thus located by the server is then returned to the request source. Once the data is received, the client processes it, which in the case of a browser means displaying the output for the user, and closes its end of the TCP/IP connection. This process is repeated for each information request or for each URL page of results to which the information request refers.
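The exchange described above can be illustrated by building a request message with length and type headers and splitting a response into its parts. This is a minimal sketch that works on bytes only and opens no network connection. The host engine.example, the path /search and the helper names build_request and split_response are assumptions made for the example.

```python
def build_request(method: str, path: str, host: str, body: bytes = b"",
                  content_type: str = "application/x-www-form-urlencoded") -> bytes:
    """Assemble an HTTP/1.0 request; length and type headers describe the body."""
    lines = [f"{method} {path} HTTP/1.0", f"Host: {host}"]
    if body:
        lines.append(f"Content-Type: {content_type}")
        lines.append(f"Content-Length: {len(body)}")
    head = "\r\n".join(lines) + "\r\n\r\n"   # a blank line ends the headers
    return head.encode("ascii") + body


def split_response(raw: bytes):
    """Split a raw response into (status code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    # Assumes a well-formed status line such as "HTTP/1.0 200 OK".
    _version, status, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(status), headers, body


if __name__ == "__main__":
    print(build_request("POST", "/search", "engine.example", b"q=patents"))
    print(split_response(b"HTTP/1.0 200 OK\r\nContent-Type: text/html\r\n\r\n<html></html>"))
```

In the known method this round trip is repeated for every request and every page of results, which is the traffic the invention aims to reduce.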
The corresponding functions of the current invention are now described with reference to Fig. 1. In step 1 the server initiates a domain name seek function. This function in turn retrieves an interrogation routine stored in local memory in step 2. A target address is identified by this routine in step 3 from a predefined array of target addresses A. In step 4, a URL associated with the address identified in step 3 is compiled before linking the server to the target in step 5.
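Steps 1 to 4 can be sketched as below. The specification does not state how the interrogation routine chooses an address from the array A, so this illustration simply assumes it takes the first address not yet interrogated. The addresses in TARGET_ADDRESSES and all function names are hypothetical, and the link of step 5 is left to whatever HTTP client the server uses.

```python
from typing import Optional
from urllib.parse import urlunsplit

# Hypothetical stand-in for the predefined array of target addresses A.
TARGET_ADDRESSES = ["directory.example", "listing.example:8080"]


def interrogation_routine(targets, visited) -> Optional[str]:
    """Step 3: identify the next target address (assumed: first one not yet used)."""
    for address in targets:
        if address not in visited:
            return address
    return None


def compile_url(address: str, path: str = "/") -> str:
    """Step 4: compile a resource locator for the target address."""
    return urlunsplit(("http", address, path, "", ""))


def domain_name_seek(targets, visited: set) -> Optional[str]:
    """Steps 1 to 4: run the routine and return the URL to link to, if any."""
    address = interrogation_routine(targets, visited)   # steps 2 and 3
    if address is None:
        return None
    visited.add(address)
    return compile_url(address)                          # step 4


if __name__ == "__main__":
    seen = set()
    print(domain_name_seek(TARGET_ADDRESSES, seen))
    print(domain_name_seek(TARGET_ADDRESSES, seen))
    print(domain_name_seek(TARGET_ADDRESSES, seen))
```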
Once the link of step 5 has been established the source HTML code is retrieved in step 6. It is important to note that the source HTML is not interpreted by a browser but streamed for processing. This processing is initiated in step 10, where the streamed HTML code is parsed to automatically remove pre-selected code segments. These segments are identified by headers and represent a significant portion of the overall code length. Once all static elements have been removed in step 10, a check is performed in step 11 to identify and remove code portions relating to embedded programs. These programs, which form the active components of the source HTML, again represent a significant portion of the remaining code. Once the removal from the streamed code has been completed in steps 10 and 11, the residual code stream is piped to a stack in step 12. This stack contains a list of domain names to be accessed and is sequentially accessed in step 15. As each name is accessed from the stack in step 15, a check is performed in step 16 to determine whether the address exists in a local datastore. If the address is not found in the datastore, the address is appended to the datastore in step 17. If the address is found in the datastore, the sequential accessing is continued. This process continues until the entire stack created in step 12 has been accessed.
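A rough sketch of steps 10 to 17 follows, using Python's standard html.parser module. The specification does not list which code segments are pre-selected or how embedded programs are recognised, so the tag sets STATIC_TAGS and PROGRAM_TAGS are assumptions, as is the choice to read domain names from the href attributes of anchor tags. The datastore is modelled as a plain dictionary with a None placeholder value, and the sketch assumes the skipped tags are properly closed.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit

STATIC_TAGS = {"head", "style"}                  # assumed pre-selected segments (step 10)
PROGRAM_TAGS = {"script", "applet", "object"}    # assumed embedded programs (step 11)
SKIPPED = STATIC_TAGS | PROGRAM_TAGS


class ResidualLinkParser(HTMLParser):
    """Streams HTML, ignores skipped segments, stacks domain names (step 12)."""

    def __init__(self):
        super().__init__()
        self.skip_depth = 0   # > 0 while inside a discarded segment
        self.stack = []

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED:
            self.skip_depth += 1
            return
        if self.skip_depth == 0 and tag == "a":
            href = dict(attrs).get("href")
            host = urlsplit(href).hostname if href else None
            if host:                      # relative links carry no domain name
                self.stack.append(host)

    def handle_endtag(self, tag):
        if tag in SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1


def update_datastore(html: str, datastore: dict) -> list:
    """Steps 15 to 17: pop each name and append it when no match is found."""
    parser = ResidualLinkParser()
    parser.feed(html)
    parser.close()
    appended = []
    while parser.stack:                   # step 15: sequential access
        name = parser.stack.pop()
        if name not in datastore:         # step 16: check the local datastore
            datastore[name] = None        # step 17: append with no value yet
            appended.append(name)
    return appended


if __name__ == "__main__":
    page = ('<html><head><title>t</title><style>a{}</style></head><body>'
            '<script>document.write("x");</script>'
            '<a href="http://one.example/x">1</a><a href="/local">2</a>'
            '<a href="http://known.example/">3</a>'
            '<a href="http://one.example/y">4</a></body></html>')
    store = {"known.example": "1999-01-01"}
    print(update_datastore(page, store), sorted(store))
```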
When this process is completed, the datastore represents a list of sites to be accessed. The datastore also includes a timestamp identifier for each entry. This identifier designates the last access date for a given site or HTML reference. Entries appended to the datastore in step 17 will have a null identifier automatically added, indicating that this address has not previously been known to the server.
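One way to picture such a datastore is a mapping from domain name to a refreshable last-access timestamp, with None standing for the null identifier. The class below is a minimal in-memory sketch. The specification does not describe the storage format, and the names Datastore, append_if_missing, refresh and last_access are hypothetical.

```python
from datetime import datetime, timezone
from typing import Dict, Optional


class Datastore:
    """In-memory sketch: domain name -> last access time (None = never accessed)."""

    def __init__(self) -> None:
        self._entries: Dict[str, Optional[datetime]] = {}

    def append_if_missing(self, domain: str) -> bool:
        """Append a newly found domain with a null identifier; report if added."""
        if domain in self._entries:
            return False
        self._entries[domain] = None
        return True

    def refresh(self, domain: str, when: Optional[datetime] = None) -> None:
        """Refresh the timestamp identifier after the site has been accessed."""
        if when is None:
            when = datetime.now(timezone.utc)
        self._entries[domain] = when

    def last_access(self, domain: str) -> Optional[datetime]:
        return self._entries[domain]


if __name__ == "__main__":
    store = Datastore()
    store.append_if_missing("one.example")
    print(store.last_access("one.example"))
    store.refresh("one.example", datetime(1999, 4, 6, tzinfo=timezone.utc))
    print(store.last_access("one.example"))
```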
At a desired time the server datastore is accessed and a subset of the site details is defined in step 50. This subset may be defined as a list of entries having a null identifier or may be specified by sites not visited in a given period. The subset defined in step 50 is sequentially accessed in step 51 and a URL for each entry in turn is formatted in step 52. The server receives the formatted URL in step 53 and attempts to link to the site in step 55. If a communications error is detected in step 56, the next entry of the subset is processed as described in steps 50 to 55. If no communications error is detected in step 56, the home page text is extracted from the HTML code in step 57, as described for steps 1 to 12. All white space is deleted from the text in step 58 to leave a condensed character stream. The character stream is then appended to a database entry in step 60. The database entry has appended thereto the URL formatted in step 52, the domain name and the character stream, together with the date. Other domain names located on the page are accessed as described above to define a tree structure until all sites have been accessed.
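Steps 50 to 60 can be sketched as follows, again only as an illustration. The datastore is assumed to be a dictionary mapping each domain name to its last access date (None for the null identifier), the period is expressed as a hypothetical max_age_days parameter, and fetch_text is a caller-supplied function standing in for the link and text extraction of steps 53 to 57. It is assumed to raise OSError on a communications error.

```python
from datetime import date, timedelta


def define_subset(datastore: dict, today: date, max_age_days: int) -> list:
    """Step 50: entries with a null identifier or not visited in the given period."""
    cutoff = today - timedelta(days=max_age_days)
    return [name for name, last in datastore.items()
            if last is None or last < cutoff]


def format_url(domain: str) -> str:
    """Step 52: format a URL for a subset entry (home page assumed)."""
    return f"http://{domain}/"


def condense(text: str) -> str:
    """Step 58: delete all white space to leave a condensed character stream."""
    return "".join(text.split())


def make_entry(domain: str, home_page_text: str, today: date) -> dict:
    """Step 60: database entry with URL, domain name, character stream and date."""
    return {"url": format_url(domain), "domain": domain,
            "stream": condense(home_page_text), "date": today.isoformat()}


def process_subset(subset, fetch_text, today: date) -> list:
    """Steps 51 to 60 for each subset entry, skipping entries that fail to link."""
    entries = []
    for domain in subset:                               # step 51
        try:
            text = fetch_text(format_url(domain))       # steps 53 to 57
        except OSError:                                 # step 56: move to next entry
            continue
        entries.append(make_entry(domain, text, today))
    return entries


if __name__ == "__main__":
    store = {"new.example": None, "old.example": date(1999, 1, 1),
             "fresh.example": date(1999, 4, 1)}
    subset = define_subset(store, date(1999, 4, 6), 30)
    print(subset)
    print(process_subset(subset, lambda url: "Home  page\n text", date(1999, 4, 6)))
```

Injecting fetch_text keeps the sketch free of network access; a real implementation would perform the HTTP retrieval and the text extraction described earlier at that point.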
It will of course be understood that the invention is not limited to the specific details described herein, which are given by way of example only, and that various modifications and alterations are possible within the scope of the invention.

Claims (5)

CLAIMS:
1. An inter-computer communications apparatus having link means for connecting the apparatus to a computer system for communicating with a plurality of geographically remote computers using an internet data communications protocol, the link means incorporating,
a server for processing information requests by retrieving data from one or more remote computers using the internet protocol,
means for retrieving data associated with the information request and identifying a data type and address for the retrieved data, and
a translation means for automatically identifying a sequential dataset for the retrieved data address,
the apparatus performing the sequential steps of:
initiating a domain name seek function using the server to retrieve an interrogation routine stored in local memory;
automatically identifying a target address for a target from a predefined array of target addresses, extracting and compiling a resource locator associated with the address and linking the server to the target;
retrieving and streaming un-interpreted source code before parsing the retrieved code to discard pre-selected code segments identified by code headers to generate residual code;
piping the residual code stream to a stack for sequential accessing to extract a domain name; and
checking a local datastore for a datastore content value corresponding to the extracted domain name and in response to a no match condition appending the extracted domain name to the datastore.
2. An inter-computer communications apparatus as claimed in claim 1 wherein the apparatus performs the further step of automatically generating a unique refreshable timestamp identifier for each datastore content value.
3. An inter-computer communications apparatus as claimed in claim 1 and claim 2 wherein the apparatus performs the further steps of:
accessing the datastore to define an access subset by reading the timestamp identifier for each value and comparing the timestamp identifier with a pre-set value;
formatting a resource locator for each access subset entry and addressing a resource associated with the formatted resource locator;
retrieving and streaming un-interpreted source code before parsing the retrieved source code to discard pre-selected code segments identified by code headers to generate residual code;
piping the residual code stream to the stack for sequential accessing to extract a domain name;
checking the local datastore for a value corresponding to the extracted domain name and in response to a no match condition appending the extracted domain name to the datastore; and
deleting white space from text extracted from the source code to create a condensed character stream and appending the stream to a database.
4. An inter-computer communications method performing the sequential steps of:
initiating a domain name seek function using the server to retrieve an interrogation routine stored in local memory;
automatically identifying a target address for a target from a predefined array of target addresses, extracting and compiling a resource locator associated with the address and linking the server to the target;
retrieving and streaming un-interpreted source code before parsing the retrieved code to discard pre-selected code segments identified by code headers to generate residual code;
piping the residual code stream to a stack for sequential accessing to extract a domain name;
checking a local datastore for a datastore content value corresponding to the extracted domain name and in response to a no match condition appending the extracted domain name to the datastore;
automatically generating a unique refreshable timestamp identifier for each datastore content value;
accessing the datastore to define an access subset by reading the timestamp identifier for each value and comparing the timestamp identifier with a pre-set value;
formatting a resource locator for each access subset entry and addressing a resource associated with the formatted resource locator;
retrieving and streaming un-interpreted source code before parsing the retrieved source code to discard pre-selected code segments identified by code headers to generate residual code;
piping the residual code stream to the stack for sequential accessing to extract a domain name;
checking the local datastore for a value corresponding to the extracted domain name and in response to a no match condition appending the extracted domain name to the datastore; and
deleting white space from text extracted from the source code to create a condensed character stream and appending the stream to a database.
5. A method and apparatus substantially as hereinbefore described, with reference to and as illustrated in the accompanying drawing.
IES990276 1999-04-06 1999-04-06 An inter-computer communications apparatus IES81055B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
IES990276 IES81055B2 (en) 1999-04-06 1999-04-06 An inter-computer communications apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
IES990276 IES81055B2 (en) 1999-04-06 1999-04-06 An inter-computer communications apparatus

Publications (2)

Publication Number Publication Date
IES990276A2 IES990276A2 (en) 1999-12-29
IES81055B2 IES81055B2 (en) 1999-12-29

Family

ID=11042035

Family Applications (1)

Application Number Title Priority Date Filing Date
IES990276 IES81055B2 (en) 1999-04-06 1999-04-06 An inter-computer communications apparatus

Country Status (1)

Country Link
IE (1) IES81055B2 (en)

Also Published As

Publication number Publication date
IES81055B2 (en) 1999-12-29

Similar Documents

Publication Publication Date Title
US8312172B2 (en) Method and system for delta compression
CN1221898C (en) System and method for updating network proxy cache server object
US6397253B1 (en) Method and system for providing high performance Web browser and server communications
US5935207A (en) Method and apparatus for providing remote site administrators with user hits on mirrored web sites
US7657595B2 (en) Method and system for generating auxiliary-server cache identifiers
US7603483B2 (en) Method and system for class-based management of dynamic content in a networked environment
CN1351729A (en) Handling a request for information provided by a networks site
CN1352775A (en) Selecting a cache
RU2689439C2 (en) Improved performance of web access
US20030097429A1 (en) Method of forming a website server cluster and structure thereof
US20080222244A1 (en) Method and apparatus for acceleration by prefetching associated objects
US11729249B2 (en) Network address resolution
WO2002005126A2 (en) Dynamic web page caching system and method
EP1277118A4 (en) A system and method to accelerate client/server interactions using predictive requests
US7069297B2 (en) Data transfer scheme using re-direct response message for reducing network load
US6236661B1 (en) Accelerating access to wide area network information
CN106959975B (en) Transcoding resource cache processing method, device and equipment
JP2000285052A (en) Url conversion method and device
IES990276A2 (en) An inter-computer communications apparatus
NO20013308L (en) Device for searching the Internet
IE990277A1 (en) An inter-computer communications apparatus
GB2339516A (en) An inter-computer communications method and apparatus
JPH10301954A (en) Method and system for retrieving image
JP2005063192A (en) Web caching device, web caching method, and web caching program
CA2415641A1 (en) Dynamic web page caching system and method

Legal Events

Date Code Title Description
MM4A Patent lapsed