CA2328082A1 - Computer system for managing links and method using said system - Google Patents

Info

Publication number
CA2328082A1
Authority
CA
Canada
Prior art keywords
link
pages
links
web
server
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
CA002328082A
Other languages
French (fr)
Inventor
Franck Jeannin
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
LINKGUARD Ltd
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/30 Managing network names, e.g. use of aliases or nicknames
    • H04L61/3005 Mechanisms for avoiding name conflicts
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9558 Details of hyperlinks; Management of linked annotations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/958 Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L61/00 Network arrangements, protocols or services for addressing or naming
    • H04L61/30 Managing network names, e.g. use of aliases or nicknames

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention mainly concerns a computer system for managing links, in particular hypertext links, and a method using such a system, characterised in that it comprises a link change server (9) which collects information on pages (1.1), in particular HTML pages, comprising links, preferably external links, and on page address modifications and page deletions. When a modification or deletion occurs, the link change server (9) informs the web servers (1) concerned, indicating the former page addresses. The invention is mainly applicable to pages comprising links on the World Wide Web.

Description

A COMPUTER SYSTEM FOR MANAGING LINKS AND A METHOD IMPLEMENTING SAID SYSTEM
The present application claims priority from French application No. 98 04660 of April 15, 1998 which is incorporated herein by reference.
The present invention relates mainly to a computer system for managing links, in particular hypertext links, and to a method implementing such a system.
The network of networks, the INTERNET, provides continuous interconnection between computer networks, and is experiencing ever greater success due, in part, to the ease with which the network can be accessed via a temporary connection, in particular a telephone link to a computer connected to the INTERNET and belonging to a supplier of access or "provider", and in part to the ease with which it is possible to search for information described in the page description language HTML within a subset of the INTERNET known as the World Wide Web or "WWW". Pages described in the HTML language are interpreted and displayed by browser software, in particular by NAVIGATOR® from NETSCAPE or INTERNET EXPLORER® from MICROSOFT. Each INTERNET computer has a permanent or temporary IP address that is made up of a run of digits separated by dots. Nevertheless, to make a connection via the INTERNET to a "web server" computer that has pages for display in the HTML language, all a user needs to know is its domain name, which is normally made up of a string of characters of the type:
http://www.xxx.com/
It is the INTERNET that transcodes domain names into the corresponding IP addresses. Similarly, each document on the web is identified by a character string known as a universal resource locator or "URL".
For example:
http://www.xxx.com/abc/other/mypage.html
corresponds to the URL of a page entitled mypage, which is described in HTML, and which is situated in the subdirectory other of the directory abc of the web server www.xxx.com.
The web makes browsing easy and user-friendly by the presence of links which, on being selected, in particular by clicking on them in a page described in the HTML language, enable various actions to be performed:
- go to a page (to a bookmark);
- go to another page of the same document (same server, internal links);
- send messages to a box for receiving e-mail; or
- go to a page of a different web server (external link).
An external link to the default page of the subdirectory other of the directory abc of the server www.xxx.com is written as follows in the HTML language:
<a href = "http://www.xxx.com/abc/other/">
By default, links appear on an HTML page as underlined blue text. This text incorporated in the code for the page and following the link is terminated by the character string:
</a>
Other presentations can be declared (such as other colors, images, etc.). When passing over a link, the cursor changes into a drawing representing a hand pointing upwards.
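The four kinds of link listed above can be told apart mechanically from the address of the page and the value of the href attribute. The following is a minimal sketch in Python, not part of the original disclosure; the function name classify_link and the category labels are assumptions chosen for illustration.

```python
from urllib.parse import urljoin, urlparse


def classify_link(page_url: str, href: str) -> str:
    """Return 'bookmark', 'internal', 'mail' or 'external' (hypothetical labels)."""
    if href.startswith("#"):
        # A fragment-only reference stays on the current page.
        return "bookmark"
    if href.lower().startswith("mailto:"):
        return "mail"
    # Resolve relative references against the page that contains the link.
    target = urlparse(urljoin(page_url, href))
    source = urlparse(page_url)
    if target.netloc.lower() == source.netloc.lower():
        if target.path == source.path and target.fragment:
            return "bookmark"
        return "internal"
    return "external"
```

Only links classified as external in this sense are the ones that the webmaster of the hosting server cannot easily keep consistent.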
This browsing technique, in which the user need not know and a fortiori need not key in URLs, makes the web remarkably user-friendly. Nevertheless, if a link points to an erroneous address, then an undesired page is displayed, while if the URL of the link is not valid, then error 404 of the HTTP protocol occurs and a message of the following type is displayed:
FILE NOT FOUND
The requested URL /xyz.htm was not found on this server.
Exceptionally, such an error comes from an error in the URL as input for the link, but more commonly, the error arises because the URL points to a page that has subsequently been moved or deleted. The link is said to be "broken". While browsing, the appearance of such a message is particularly unfriendly. This is particularly true for external links where the person responsible for the web site or "webmaster" does not have a tool for verifying that links are consistent and is not necessarily informed when the addresses of pages pointed to by the links are changed. To remedy this problem, proposals have been made to replace links by invariant symbolic names known as universal resource names or URNs.
One or more name servers would translate invariant URNs into URLs. Thus, only the name server(s) would know the physical location of the documents, thereby avoiding such information being stored redundantly by all servers making use of them. The name server would be easy to update. Such a solution has never been adopted on the INTERNET because it suffers from numerous drawbacks.
Firstly, to connect to a site, it would be necessary initially to connect to the name server, which would double the number of connections, and consequently double the time required to obtain the desired information. In addition, the name server would be interrogated by very many web servers and would give rise to a very severe bottleneck penalty in the transmission of information.
PITKOW: "Supporting the web: a distributed hyperlink database system", Computer Networks and ISDN Systems, Vol. 28, No. 11, May 199, pages 981-991, describes incorporating an "Atlas server" in a web server for communicating with other "Atlas servers" incorporated in other web servers to inform them about page changes.
Thus, each web server must have an "Atlas server". In contrast, the system of the present invention can have a single link server operating with a very large number of web servers.
Consequently, an object of the present invention is to provide a computer system making it possible to avoid broken links appearing, particularly on the World Wide Web.
Another object of the present invention is to provide such a system while generating only a small amount of traffic on the network.
Another object of the present invention is to provide a system that operates very reliably.
Another object of the present invention is to provide a system for updating the bookmarks that specify favorite sites in the browser software of individual consultation stations.
These objects are achieved by a computer system of the present invention having a link change server which collects information about pages, in particular in the HTML language and containing links, preferably external links, and about changes to the addresses of pages and about the deletion of pages. When a page is modified or deleted, the link change server informs the web servers concerned, specifying the old addresses of the pages.
Advantageously, information about pages containing links is collected in co-operation with the web server hosting the pages.
Advantageously, the notified change is executed automatically on the server hosting the page containing the link to be modified.
The present invention will be better understood from the following description and the accompanying figures given as non-limiting examples, and in which:
- Figure 1 is a diagram of a few computers connected to the INTERNET and including valid links;
- Figure 2 is a similar diagram of the computers after some of the links have been modified;
- Figure 3 is a flow chart showing how an "emitter" module serves to keep track of the links under surveillance;
- Figure 4 is a flow chart of a process for receiving a link change notification;
- Figure 5 is a flow chart of the process for receiving notification of URL modifications; and
- Figure 6 is a flow chart of a module for receiving notification of a change of URL.
In Figures 1 to 6, the same references are used to designate the same elements.
In Figures 1 and 2, there can be seen three web servers 1, 3, and 5 connected by means of the INTERNET 7 to a link server 9, a conventional type of consultation station 11, and a consultation station 13 in accordance with the present invention.
The pages usually published on the web are described in the HTML language and are defined firstly by the name and the location of the files stored on the server, and secondly, optionally, by a title which is incorporated in the code of the page with the tag <title>. The server 1 gives access to pages 1.1, 1.2, 1.3, 1.4, and 1.5 which are described in the HTML language, for example.
The server 3 gives access to pages 3.1, 3.2, and 3.3, e.g. described in the HTML language. The server 5 gives access to pages 5.1, 5.2, and 5.3, e.g. described in the HTML language.
Nevertheless, it should be understood that the present invention is not limited in any way to using the HTML language, but applies to any description of content that enables links to be created, in particular hypertext links, e.g. by using the following page description languages: SGML, XML, DHTML, ASP, the HYPERCARD® software from APPLE, or software for managing documentation.
The page 1.1 has a first link 15.1 for going to the top of the page without using scroll bars. A link 15.2 points to the page 1.3. The links 15.1 and 15.2 are internal links which are relatively easy to manage.
Firstly, site-creation software can incorporate a tool for managing the consistency of internal links. Secondly, all of the pages 1.1 to 1.5 are normally under the responsibility of a single person, the webmaster of web server 1.
A link 15.3 points from page 1.1 to the bottom of page 3.2 of server 3. A link 15.4 points from page 1.1 to page 3.1 of server 3. A link 15.5 points from page 1.1 to page 5.3 of server 5. The links 15.3 to 15.5 are external links insofar as they point to pages stored on servers other than the server 1 hosting the page 1.1. A user using a consultation station 11, typically a microcomputer with browser software for consulting pages, such as INTERNET EXPLORER, for example, can connect to the server 1 and cause the page 1.1 to be displayed on the station. The links 15.1 to 15.5 make it easier to browse through the information.
Other links, not shown, can point to images, to JAVA applets, etc. A link pointing to the image me.gif in directory abc of server www.xxx.com is written in HTML as follows:
<img src = "http://www.xxx.com/abc/me.gif">
By clicking on one of the links, the user changes page or position on a page and does so without knowing, and without having to key in, the URLs of the various pages designated by the links 15.2 to 15.5, and without knowing the bookmark pointed to by the link 15.1. This situation can be disrupted, as shown in Figure 2, by certain pages as designated by the links, in particular the external links, either disappearing or else moving. Indeed, the locations of certain pages on web servers are kept unchanged solely to avoid breaking links that designate those pages. The server where these pages are hosted can evolve but the webmaster freezes its structure to avoid breaking links.
In the example of Figure 2, the link 15.3 is broken because page 3.2 has been deleted. The content of page 5.3 has been moved to page 5.1 of the web server 5. The content of page 3.1 has been moved to page 5.5 of web server 5.

Thus, a user of consultation station 11 connected to the INTERNET will suffer disturbance to a consultation unless the links 15.3, 15.4, and 15.5 are updated. Such updating is not easy, insofar as the webmasters of computers 3 and 5 cannot, without lengthy searching on the web, find out which links point to pages hosted on their own servers.
In the present invention, a list of links is generated and is kept up to date so that in the event of a page being modified or deleted (servers 3 and 5) an alert is issued to the server (1) that has links pointing to the pages that have been moved or that no longer exist.
After correction, the link 15.3 in Figure 2 is referenced 15.3'. After correction, the link 15.4 in Figure 2 is referenced 15.4'. After correction, the link 15.5 in Figure 2 is referenced 15.5'. After correction, the link 15.6 in Figure 2 is referenced 15.6'.
In the preferred example of the present invention, a link server 9 stores and keeps up to date the list of external links on the INTERNET, and in particular on the World Wide Web. If there is a change in a pointed-to page, the server alerts the servers concerned that have links pointing to said page. Advantageously, the list is generated and/or updated in co-operative manner with web servers declaring pages that have been created, modified, destroyed, or moved, and also pages pointed to by the links in the pages they host. This co-operation is particularly important for servers with restricted access, in particular with INTRANET servers or servers requiring an access password that makes it impossible for a computer robot to explore the pages and consequently extract the links of the pages hosted. In a variant embodiment, the browser software in consultation stations 13 also makes declarations to the link server 9 about favorite sites or bookmarks for consultation on the web, possibly together with their own e-mail addresses. In the event of a change in the address of a site or of pages at a site, the server 9 informs the consultation stations 13 having browser software in accordance with the present invention of the updates they need to make, or in the event of a direct connection to the server 9, it makes those changes.
For example, consultation station 13 has a shortcut 15.6 to page 3.1 of web server 3. After being notified by the link server 9, the shortcut 15.6' of consultation station 13 points to page 5.5 of web server 5.
However, the same link 15.6 to page 3.1 in a consultation station 11 of conventional type is not automatically modified and consequently points to a page that is no longer relevant.
In a variant, on detecting a broken link (HTTP error 404), the browser software does not display the associated messages, but connects to the link server 9 to read the new address of the pointed-to page. In this way, the link server 9 is consulted only for broken links, thereby limiting traffic on the INTERNET 7.
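The variant in which the browser consults the link server only after an HTTP 404 error can be sketched as follows. This is an illustrative Python sketch under stated assumptions: the callables fetch and lookup_new_address are hypothetical stand-ins for the browser's HTTP request and for the query to the link server 9, neither of which is specified in the text.

```python
def resolve(url, fetch, lookup_new_address):
    """Fetch url; on error 404, ask the link server for the new address.

    fetch(url) -> (status, body) and lookup_new_address(url) -> str or None
    are hypothetical callables supplied by the caller.
    """
    status, body = fetch(url)
    if status != 404:
        # Valid link: the link server is never contacted.
        return url, status, body
    new_url = lookup_new_address(url)
    if new_url is None:
        # The link server knows no replacement; report the original error.
        return url, status, body
    status, body = fetch(new_url)
    return new_url, status, body
```

Because the lookup happens only on failure, valid links generate no additional traffic, which is the property the variant aims for.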
Advantageously, the computer system of the present invention includes: a module for sending out information about the installed links on the various web servers, as shown in Figure 3; a link server 9 provided with a module for receiving information about links, as shown in Figure 4; a module for acquiring or receiving information about modifications to pages on the various web servers, as shown in Figure 5; and receiver modules as shown in Figure 6, advantageously distributed over the various web servers for receiving information concerning modified pages pointed to by the links of the receiver server.
The sender module of Figure 3 has a step 1.6 of storing a difference file or Δ file showing the history of modifications to page addresses. In a variant, any modification to a page (deletion, moving) is immediately notified to the link server 9 without waiting for the Δ file to be generated by scanning all of the pages hosted in a search for links to be processed. The scan is preferably limited to external links. Go to 17.
Scanning step 17 consists, for example, in searching through the code of the pages for a character string of the following type:
<a href = protocol://server/directory/file>
where:
- protocol designates the protocol used, e.g. HTTP;
- server designates the address or the designation of the hosting server;
- directory designates the directory and any subdirectories in which the code file is stored; and
- file designates the name of the code file forming the page.
Go to 19.
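Scanning step 17 can be illustrated with the HTML parser of the Python standard library. This is a minimal sketch rather than the implementation described by the applicant; the names LinkScanner and scan_page are assumptions, and the scan is limited to external http and https links, as the text prefers.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkScanner(HTMLParser):
    """Collects the external links found in the <a href=...> tags of one page."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.external_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return  # e.g. <a name="top"> carries no link
        target = urljoin(self.page_url, href)
        parts = urlparse(target)
        # Keep only links that leave the hosting server.
        if parts.scheme in ("http", "https") and parts.netloc != urlparse(self.page_url).netloc:
            self.external_links.append(target)


def scan_page(page_url, html):
    scanner = LinkScanner(page_url)
    scanner.feed(html)
    return scanner.external_links
```

Running scan_page over every hosted page and pairing each result with the page address gives the raw material for the list drawn up in the next step.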
At 19, the list at instant i is drawn up of all of the links to be processed (in particular external links).
The list i also contains logical locations corresponding to the various HTML pages.
Go to 21.
At 21, a check is made to see whether an earlier list exists.
If so, go to 23.
If not, go to 25.
At 23, the present list (list i) is compared with the preceding list (list i-1) and the difference is stored in a difference or Δ file.
Go to 27.
At 25, the present list (list i) is stored in the difference or Δ file.
Go to 27.
At 27, the Δ list is sent to the link server 9. In a variant, the Δ list, optionally together with a more or less complete history of changes to the various pages hosted by the server, is made available for consultation via the INTERNET.
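Steps 21 to 25 amount to a set difference between two successive lists. The sketch below, in Python, assumes that each list is a set of (page location, link) pairs; the function name build_delta and the keys "added" and "removed" are illustrative choices, not terms from the text.

```python
def build_delta(current, previous=None):
    """Compute the content of the difference (delta) file.

    current and previous are sets of (page_location, link_url) pairs.
    """
    if previous is None:
        # Step 25: no earlier list exists, so the whole list is the difference.
        return {"added": sorted(current), "removed": []}
    # Step 23: compare list i with list i-1.
    return {
        "added": sorted(current - previous),
        "removed": sorted(previous - current),
    }
```

Sending only the difference, rather than the full list, keeps the traffic generated at step 27 small.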

By way of example, the Δ list has: page-addition messages concerning pages that have been newly added; page-modification messages concerning pages that have been renamed or moved; page-deletion messages concerning pages that have been deleted; link-addition messages concerning newly created external links; link-modification messages about links that have been modified; and link-deletion messages concerning links that have been deleted. For example, the server 3 alerts the link server that the page 3.2 has been deleted and that the page 3.1 has been modified, while web server 5 indicates that pages 5.1 and 5.3 have been modified and that pages 5.4 and 5.5 have been added.
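The six message types of the Δ list lend themselves to a small record structure. The following Python sketch is one possible encoding and is not taken from the text: the JSON wire format, the field names, and the example.com addresses are all assumptions.

```python
import json
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Kind(Enum):
    PAGE_ADDITION = "page-addition"
    PAGE_MODIFICATION = "page-modification"
    PAGE_DELETION = "page-deletion"
    LINK_ADDITION = "link-addition"
    LINK_MODIFICATION = "link-modification"
    LINK_DELETION = "link-deletion"


@dataclass
class DeltaMessage:
    kind: Kind
    url: str                       # page or link concerned
    new_url: Optional[str] = None  # only meaningful for modifications (moves)


def encode(messages: List[DeltaMessage]) -> str:
    """Serialise a delta list; JSON is an assumed format, not one named in the text."""
    return json.dumps(
        [{"kind": m.kind.value, "url": m.url, "new_url": m.new_url} for m in messages]
    )


# What server 3 of the example might send (hypothetical addresses).
server3_delta = [
    DeltaMessage(Kind.PAGE_DELETION, "http://server3.example.com/3.2.html"),
    DeltaMessage(Kind.PAGE_MODIFICATION, "http://server3.example.com/3.1.html",
                 "http://server5.example.com/5.5.html"),
]
```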
Transmission 27 can take place by e-mail, and the procedure can take place automatically or after the webmaster has validated the sending, to verify that the notification to the link server 9 is consistent, correct, and does not contain confidential information.
Transmission can also be performed using the network transmission protocol TCP/IP, in particular on interrogation of the web server by the link server 9.
Transmission 27 can also be performed by a high level protocol of the HTTP type. For example, the link server 9 connects to the web server and executes a standard script, e.g. using the standard known as common gateway interface or "CGI", and advantageously executes a script which is preferably written in the PERL language, which is particularly optimised for manipulating arbitrary character strings. The script displays the list of the Δ file, which is recovered by the link server 9. The link server 9 visits all of the web servers that declare they include a sender module. The Δ file is advantageously deleted on the web server.
More generally, transmission 27 can be performed using any protocol that can be understood by the destination, e.g. voice synthesis, fax, a message on a pager or a Short Message Service (SMS) message sent to the webmaster of the server concerned.
Figures 4 and 5 show how the link server 9 operates.
At 29, the server receives Δ files by e-mail, by TCP/IP mode transmission, or by HTTP mode transmission.
Go to 31.
At 31, the link server 9 updates the link database 32, in particular concerning external links on the World Wide Web. In a variant, it also receives declarations of bookmarks or favorites from browser software in stations 13 of the present invention.
At 33 (Figure 5) the link server receives notifications of modifications to the URLs of web pages. The information about these modifications can be included in the Δ files, or it can be stored and transmitted separately.
In a variant, in non-co-operative mode, the link server 9 scans the various web servers to draw up the list of web pages, their locations, and the links they contain. The list can also be drawn up from the INTERNET indexing databases established by search engines and including an index of links. The link server 9 advantageously allocates a compact signature to the page.
This signature includes the tag <title> in HTML pages, preferably together with pertinent data for identifying the page and based on occurrences of words, images, page layout used, and/or on semantic analysis of the text or a checksum, i.e. the possibly weighted value of the sum of the values of the characters making up the page, so as to make it easier to identify pages that have been moved.
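A compact signature of the kind described, combining the content of the <title> tag with a checksum, might look as follows. This Python sketch makes several assumptions beyond the text: the checksum is the unweighted sum of character codes reduced modulo 2**32, and the title is extracted with a simple regular expression rather than a full parser.

```python
import re


def page_signature(html: str) -> tuple:
    """Return (title, checksum) as a compact, comparable page signature."""
    match = re.search(r"<title[^>]*>(.*?)</title>", html, re.IGNORECASE | re.DOTALL)
    title = match.group(1).strip() if match else ""
    # Unweighted sum of the character values; a weighted sum (e.g. by position)
    # would also detect reordered content, which this simple sum does not.
    checksum = sum(ord(ch) for ch in html) % 2**32
    return title, checksum


def probably_same_page(sig_a: tuple, sig_b: tuple) -> bool:
    """Two pages at different addresses with equal signatures are move candidates."""
    return sig_a == sig_b
```

A page that disappears from one address while a page with the same signature appears at another can then be flagged as moved rather than deleted.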
In a variant, the server 1 hosting pages that have external links 15.3 to 15.5 pointing to pages 3.1, 3.2, and 5.3 hosted by other web servers 3 and 5 notifies them of the existence of said links and of their content. In return, a server 3, 5 on moving a page 3.1 or 5.1 or on deleting a page 3.2 notifies these changes to the servers that have informed them that they are hosting pages with links pointing to the modified pages.
Go to 35.
At 35, the link server 9 scans the link base to establish a list of any links that are affected by pages that have been changed or deleted, i.e. the list of pages that include links that are now broken.
Go to 37.
At 37, the link server 9 notifies page modifications, moves, or deletions to the web servers that have broken links.
This notification can likewise be performed by e-mail, notification in TCP/IP transmission mode, in HTTP type transmission mode, etc.
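Steps 35 and 37 can be pictured as a lookup in the link base followed by a grouping of the results per web server to be notified. The Python sketch below assumes a link base held as a list of (source server, source page, target URL) triples and a change table mapping each old URL to its new URL, or to None for a deletion; these structures and the example.com addresses are illustrative only.

```python
from collections import defaultdict


def affected_links(link_base, changes):
    """Group the links broken by `changes` by the server that hosts them.

    link_base: iterable of (source_server, source_page, target_url).
    changes:   dict mapping old target_url -> new_url, or None if deleted.
    """
    notifications = defaultdict(list)
    for server, page, target in link_base:
        if target in changes:
            # Step 35: this link now points to a moved or deleted page.
            notifications[server].append((page, target, changes[target]))
    # Step 37 would send each server its own list of entries.
    return dict(notifications)
```

With the example of Figure 2, server 1 would receive three entries, one each for the links 15.3, 15.4 and 15.5.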
The operation of a notified web server 1 is shown in Figure 6.
At 39, the server 1 receives notification concerning changes to the URLs of pages designated by the links 15.3 to 15.5.
Go to 41.
At 41, the webmaster advantageously validates the proposed modifications. If validation fails, go to 43.
Validation may fail because of uncertainty about the origin of a message received at 39, or about its relevance.
If validation succeeds, go to 45.
At 45, the URLs in the HTML pages 1.1 concerned are modified. The program ends at 43.
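Steps 41 and 45 can be sketched as a guarded search-and-replace over the code of the page. In this Python sketch the validate callable stands in for the webmaster's decision, and links are assumed to be written exactly as href="URL"; both are simplifying assumptions, and links to deleted pages are left untouched for the webmaster to handle.

```python
def apply_changes(html, changes, validate=lambda old, new: True):
    """Rewrite the links of one page according to a notification.

    changes maps old URL -> new URL, or None when the target was deleted.
    validate(old, new) -> bool models the webmaster's approval (step 41).
    """
    for old, new in changes.items():
        if new is None or not validate(old, new):
            continue  # validation failed or nothing to point to: leave as is
        # Step 45: modify the URL in the code of the page.
        html = html.replace(f'href="{old}"', f'href="{new}"')
    return html
```

Passing a validate function that always returns True corresponds to the fully automatic execution of the notified change mentioned earlier.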
The sender and receiver modules on the various servers 1, 3, or 5 can be implemented periodically by being triggered manually by the webmaster, particularly after pages have been modified, or else they can operate as a background task, being activated in particular in the event of HTML pages being modified. For example, with a server operating under the UNIX® operating system, these modules can be constituted by programs known as DAEMONS, whereas on servers operating under the WINDOWS NT® operating system, they can be modules known as SERVICES.
Advantageously, in combined co-operative and non-co-operative mode, the robot that crawls over the World Wide Web to build a database concerning the URLs of HTML pages and the links they contain avoids exploring co-operative servers that have notified a Δ list.
In a variant, where a server 3 or 5 modifies or deletes a page, it makes a connection to the link server 9 to inquire about the list of servers 1 including pages 1.1 having links 15.3, 15.4, and 15.5 pointing to a page that has been modified, and it itself makes the notifications to the server 1.
It can be extremely advantageous to provide the system of the present invention with security devices preventing wrong notifications, in particular malicious notifications and/or attempts to create undesired links.
Any security system of conventional type can be used, in particular for authenticating the sender of a message and the integrity of its contents. By way of example, messages can be encrypted, e.g. with so-called "public key" encrypting algorithms such as RSA or DSA, PGP or PGP/MIME or S/MIME protocols. Cryptographic systems based on public keys are described in particular in US-A-4 204 770, US-A-4 218 582, US-A-4 405 829, US-A-4 424 414, and US-A-4 995 092, and also in the book "Applied Cryptography", second edition, by Bruce Schneier.
In a variant, it is also possible to use a callback mechanism. The server calls back the sender of the message with an incorporated authentication random number. The reply includes the random number or a number that is derived from the authentication random number.
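The callback mechanism can be illustrated as a challenge and response exchange. The text only requires that the reply contain the random number or a number derived from it; the Python sketch below goes further and assumes a secret key shared in advance between the server and the sender, with the derived number computed as an HMAC-SHA256 of the challenge. That key and that derivation are assumptions, not part of the description.

```python
import hashlib
import hmac
import secrets


def make_challenge() -> str:
    """Random number incorporated in the callback to the sender."""
    return secrets.token_hex(16)


def answer_challenge(shared_key: bytes, challenge: str) -> str:
    """Sender side: derive the reply from the random number (assumed HMAC scheme)."""
    return hmac.new(shared_key, challenge.encode(), hashlib.sha256).hexdigest()


def verify_answer(shared_key: bytes, challenge: str, answer: str) -> bool:
    """Server side: accept the notification only if the derived number matches."""
    expected = answer_challenge(shared_key, challenge)
    return hmac.compare_digest(expected, answer)
```

Echoing the random number itself, the simpler option the text also allows, proves only that the callback reached the claimed address; deriving the reply from a secret additionally ties it to the holder of the key.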
The preferred embodiment of the present invention takes account of the fact that the webmaster of a web site, in particular of a small web site, often has a local copy of the site in a computer or workstation that is not permanently connected to the INTERNET. In contrast, the server of the web site is permanently connected thereto. Modifications to pages in the local copy are uploaded to the web server proper, e.g. using the FTP protocol. Under such circumstances, the webmaster loads client software on the computer that has the local copy of the site for the purpose of co-operating with the link server 9.
The webmaster registers with the link server 9 by giving the address of the web site (e.g. http://www.myserver.com or possibly a subdirectory if the site is shared, http://www.aserver.com/mydirectory/) and its e-mail address (e.g. myname@myserver.com).
This takes place either directly by filling in a form on the web site of the link server 9, or by configuring the client software.
In a first variant embodiment, the list of pairs (link location, link) is drawn up by the client software which looks through all of the files contained in the local copy of the site and extracts the tags of the language used, in particular HTML, that correspond to links. For each link found, the client software creates a new entry in the list, including the address which the page that includes the link has on the web server proper, and associated with the value of the link. Once the list has been drawn up, the client software connects to the link server 9, e.g. via an HTTP protocol, and transmits the list of pairs (location of link, link).
In the preferred embodiment, the client software establishes solely a list of pages on its own site and registers it in the link server 9. To do this, the client software looks through all of the files contained in the local copy of the site and connects to the link server 9 via, for example, the HTTP protocol, passing the address of the pages looked through in parameters.

If the connection to the link server 9 is working, the file is copied into a "Delta" (Δ) directory, for subsequent use for comparison purposes.
The link server connects via HTTP to the pages registered in this way and looks through them searching for external links. Any external links found are added to the link base under the reference of the web site of user 3.
The link server continuously monitors, by making connections via HTTP, all of the registered links belonging to non-registered servers.
Once a link has been detected as being broken or moved, and regardless of whether this is by the link server 9 performing direct detection or by notification of another user via the link server 9, an electronic message is sent to the user 3 requesting synchronization with the link server.
The webmaster 3 who receives a notification message requesting synchronization uses the client software to connect in HTTP to the link server. The date of the most recent update of the "client" is passed as a parameter during the connection, and all modifications later than said date and relating to the current site are transmitted to the "client" in HTML format.
The client software interprets the HTML code returned in the preceding step and applies the corresponding modifications to the local copy of the files of the site. It does this either by deleting the links (for deleted pages) or by replacing the links (for pages that have moved), and it does so after validation by the webmaster.
The webmaster works on the local copy of the site deleting, adding, moving, or modifying the content of its pages.
The webmaster updates the web site proper which is permanently connected to the INTERNET by copying the local copy of the files to the site proper (e.g. via FTP).
The webmaster uses the client software to specify all of the changes that apply to the site.
The software detects pages that have been added, deleted, moved, or modified by making a comparison between the "Delta" (Δ) directory and the local copy of the site.
This information can be verified and modified by the webmaster.
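The comparison between the "Delta" directory and the local copy of the site can be sketched as below. The text does not say how a moved page is distinguished from a deletion followed by an addition; this sketch assumes that a file which disappears from one path and reappears with identical content under another path has been moved. The names `snapshot` and `detect_changes` are hypothetical.

```python
import hashlib
from pathlib import Path


def snapshot(root):
    """Map each file path (relative to root) to a hash of the file's content."""
    root = Path(root)
    return {
        p.relative_to(root).as_posix(): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in root.rglob("*")
        if p.is_file()
    }


def detect_changes(delta_dir, local_dir):
    """Classify pages as added, deleted, moved, or modified since the last snapshot."""
    old, new = snapshot(delta_dir), snapshot(local_dir)
    removed = {path: h for path, h in old.items() if path not in new}
    added = {path: h for path, h in new.items() if path not in old}
    modified = sorted(p for p in old if p in new and old[p] != new[p])

    # Assumption: identical content under a new path means the page was moved.
    moved = []
    for old_path, content_hash in sorted(removed.items()):
        match = next((p for p in sorted(added) if added[p] == content_hash), None)
        if match is not None:
            moved.append((old_path, match))
            del added[match]

    moved_from = {old_path for old_path, _ in moved}
    deleted = sorted(p for p in removed if p not in moved_from)
    return {
        "added": sorted(added),
        "deleted": deleted,
        "moved": moved,
        "modified": modified,
    }
```

The result is only a proposal: as stated above, the webmaster can verify and modify it before it is sent to the link server.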
The client software connects to the link server 9 in HTTP to inform it about the modifications.
Advantageously, the link server verifies the validity of the information transmitted in the preceding step by connecting in HTTP via the INTERNET to the webmaster's site. The pages that have been modified or added are looked through to determine their external links. For pages that have been moved, it is verified that the old page no longer exists and that the new page does indeed exist. For pages that have been deleted, it is verified that the old page no longer exists.
The purpose of this step is to avoid information being falsified and to avoid the risks of ill-intentioned false notifications.
Once the information has been validated, the link base is looked through and the sites concerned 1, 11, 13 are notified by electronic mail.
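The verification of moved and deleted pages described above can be sketched as a simple check. To keep the example self-contained, the HTTP lookup is passed in as a `page_exists` callable, which is an assumption of this sketch; the name `validate_notification` and the shape of the `changes` dictionary are likewise hypothetical.

```python
def validate_notification(changes, page_exists):
    """Return a list of (kind, url, reason) for every claim that fails verification.

    changes: {"moved": [(old_url, new_url), ...], "deleted": [old_url, ...]}
    page_exists: callable taking a URL and returning True if the page answers.
    """
    problems = []
    for old_url, new_url in changes.get("moved", []):
        # A moved page must be gone from its old address and present at the new one.
        if page_exists(old_url):
            problems.append(("moved", old_url, "old page still exists"))
        elif not page_exists(new_url):
            problems.append(("moved", old_url, "new page not found"))
    for old_url in changes.get("deleted", []):
        # A deleted page must no longer exist.
        if page_exists(old_url):
            problems.append(("deleted", old_url, "page still exists"))
    return problems
```

An empty result means that every claim was confirmed, at which point the sites concerned could be notified.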
In a variant, the webmaster can register directly from the web site proper which is permanently connected to the INTERNET and can receive notifications thereat.
Nevertheless, it should be understood that the server 9 can also detect when pages are moved and deleted on servers that are not registered with the link server 9. Under such circumstances, it is assumed that the internal coherence of each web site is properly managed, i.e. that internal links are updated in the event of a page being moved within the web site.

The web site is looked through to find an internal link to the page pointed to by the external link which it might be desired to correct subsequently. The address of the page containing said internal link is stored, as is the content of the link.
When the page that is pointed to no longer exists (HTTP error 404), a connection is made to the page containing the internal link and the new address pointed to by said link is determined. Since the internal link will logically already have been updated by the webmaster of the site, the new address of the page is thus obtained, thereby resolving the external link. The address pointed to by the corrected internal link is notified as a corrected external link to the pages of other web servers that include an external link equal to the old internal link.
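The resolution of a broken external link through a stored internal link can be sketched as follows. This sketch assumes that the stored "content of the link" is the visible text of the link and that it is unchanged after the move, which the original text does not state explicitly. The helpers `page_exists` and `fetch_links` stand in for HTTP requests and are hypothetical, as is the name `resolve_moved_page`.

```python
def resolve_moved_page(old_target, internal_page, link_content, page_exists, fetch_links):
    """Return the new address of old_target, or None if it cannot be determined.

    internal_page: stored address of a page on the same site that linked to old_target.
    link_content: stored content of that internal link (assumed to be its visible text).
    page_exists: callable(url) -> bool, False when the server answers with error 404.
    fetch_links: callable(url) -> list of (content, href) pairs found in that page.
    """
    if page_exists(old_target):
        return None  # the target is still in place; nothing to resolve
    for content, href in fetch_links(internal_page):
        # The webmaster is assumed to have updated the internal link already,
        # so the same link content now points to the page's new address.
        if content == link_content and href != old_target and page_exists(href):
            return href
    return None
```

The address returned is the one that would be notified as the corrected external link to the other sites pointing to the old address.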
The present invention applies to any computer system that has links, and in particular external links, such as document management systems, local networks, bulletin boards.
The present invention applies mainly to pages that include links on the World Wide Web.

Claims (14)

1/ A computer system comprising data storage means that store links and/or shortcuts to display pages, automatic means for identifying stored links or shortcuts, means for generating and automatically storing a list of pairs [location of link (1.1), link (15.3, 15.4, 15.5)], and means for making the list available so that, in the event of a change in the address of a page (3.1, 3.2, 5.3) pointed to by a link (15.3, 15.4, 15.5), it is possible to notify the change for the purpose of correcting the corresponding link (15.3', 15.4', 15.5'), and further including means for transmitting the list of pairs [location of link (1.1), link (15.3, 15.4, 15.5)] to a link server (9) receiving lists of pairs from a plurality of said computer systems.
2/ A system according to claim 1, characterized in that said system is a server on a network, in particular an INTERNET server, preferably a web server (1), and in that the means for identifying links comprise means for reading the code of display pages and means for extracting external links from display pages (1.1).
3/ A system according to claim 2, characterized in that the display pages are described in the HTML or the XML language.
4/ A computer system according to claim 1, characterized in that said system is a station (13) for online consultation of web sites (1, 3, 5), and in that it includes means for storing shortcuts (15.6) to favorite sites or favorite pages.
5/ A link server, characterized in that it includes means for drawing up a list of links and/or of shortcuts pointing to World Wide Web pages hosted on a plurality of computer systems hosting pages, means for determining changes in the addresses of pointed-to pages (3.1, 3.2, 5.3), and means for notifying to a computer system (1, 13) hosting the medium containing the link or the shortcut of changes in the addresses of pointed-to pages.
6/ A link server, characterized in that it includes means for receiving notification of lists of pairs [location of link (1.1), link (15.3, 15.4, 15.5)] transmitted by a computer system according to claim 4 or 5, and means for notifying a computer system (1, 13) hosting the medium containing the link or the shortcut of changes in the addresses of pointed-to pages.
7/ A server according to claim 5 or 6, characterized in that it includes means for consulting display pages published on the web (1.1), means for extracting external links incorporated in the code of said pages, means for generating and storing a list of pairs [original page (1.1), link (15.3, 3.2; 15.4, 3.1; 15.5, 5.3)], and means for generating a list of pages including links to each processed page.
8/ A web server according to claim 5, 6, or 7, characterized in that it further includes means for notifying changes in the addresses of web pages (3.1, 3.2, 5.3).
9/ A server according to claim 6, 7, or 8, characterized in that it includes means for consulting web pages, means for identifying web pages, means for storing pairs [web page identity, address on the web] and means for comparing prior web page addresses with present web page addresses to deduce therefrom a list that identifies web pages that have changed address accompanied by their address, and also for identifying web pages that have disappeared.
10/ A method of repairing broken links on the INTERNET, in particular on the World Wide Web, the method being characterized in that it comprises the steps consisting in:
- receiving notifications of links or shortcuts pointing to web pages;
- receiving notifications of modifications to web page addresses;
- generating a list of web pages pointing to pages that have changed address; and
- notifying the changes of address of pointed-to web pages to the computer system hosting links pointing to web pages whose addresses have been modified.
11/ A method of repairing broken links or shortcuts, characterized in that it comprises the steps consisting in:
- consulting web pages accessible on the World Wide Web and extracting external links therefrom pointing to web pages hosted on other sites;
- storing the addresses of the various web pages pointed to by links or shortcuts;
- drawing up a list of web pages that point to pages whose addresses have changed; and
- notifying the changes of address of the pointed-to web pages to the computer system hosting the links that point to web pages whose addresses have been modified.
12/ A method according to claim 11, characterized in that it further includes a step of modifying links that point to web pages whose addresses have been modified so that they become the corresponding notified addresses.
13/ A method according to claim 10 or 12, characterized in that it further includes a step of notifying a server that hosts links pointing to pages whose addresses have been modified, which server is isolated by means for restricting and authorizing access, in particular an INTRANET server.
14/ A method according to claim 10, 11, 12, or 13, characterized in that it includes the steps consisting in:
- storing an external link to be protected;
- searching the web server hosting the page pointed to by the link for pages that include an internal link to the pointed-to page;
- storing at least one internal link location associated with said link;
- in the event of the pointed-to page having disappeared, connecting to the web server and reading the new link that replaces the link to the pointed-to page; and
- using or notifying the new internal link as the correct new link pointing to the page which was pointed to by the old link that is broken.
CA002328082A 1998-04-15 1999-04-13 Computer system for managing links and method using said system Abandoned CA2328082A1 (en)

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
FR9804660A FR2777725B1 (en) 1998-04-15 1998-04-15 COMPUTER SYSTEM FOR MANAGING LINKS AND METHOD FOR IMPLEMENTING SUCH SYSTEM
FR98/04660 1998-04-15
PCT/FR1999/000861 WO1999053669A1 (en) 1998-04-15 1999-04-13 Computer system for managing links and method using said system

Publications (1)

Publication Number Publication Date
CA2328082A1 true CA2328082A1 (en) 1999-10-21

Family

ID=9525238

Family Applications (1)

Application Number Title Priority Date Filing Date
CA002328082A Abandoned CA2328082A1 (en) 1998-04-15 1999-04-13 Computer system for managing links and method using said system

Country Status (9)

Country Link
EP (1) EP1072141A1 (en)
JP (1) JP2002511627A (en)
AU (1) AU3153599A (en)
CA (1) CA2328082A1 (en)
FR (1) FR2777725B1 (en)
IL (1) IL138945A0 (en)
RU (1) RU2000128642A (en)
WO (1) WO1999053669A1 (en)
ZA (1) ZA200005364B (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6591260B1 (en) * 2000-01-28 2003-07-08 Commerce One Operations, Inc. Method of retrieving schemas for interpreting documents in an electronic commerce system
IL143288A0 (en) * 2001-05-22 2002-04-21 Parity Bit Ltd A method for organizing an internet search according to user purposeful activities
US20030084095A1 (en) * 2001-10-26 2003-05-01 Hayden Douglas Todd Method to preserve web page links using registration and notification
JP2006236084A (en) * 2005-02-25 2006-09-07 Ricoh Co Ltd Database system
US8176166B2 (en) 2007-04-19 2012-05-08 International Business Machines Corporation Autonomic management of uniform resource identifiers in uniform resource identifier bookmark lists
US10719568B2 (en) 2017-11-28 2020-07-21 International Business Machines Corporation Fixing embedded richtext links in copied related assets

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5805824A (en) * 1996-02-28 1998-09-08 Hyper-G Software Forchungs-Und Entwicklungsgesellschaft M.B.H. Method of propagating data through a distributed information system

Also Published As

Publication number Publication date
FR2777725B1 (en) 2003-02-21
EP1072141A1 (en) 2001-01-31
FR2777725A1 (en) 1999-10-22
IL138945A0 (en) 2001-11-25
JP2002511627A (en) 2002-04-16
AU3153599A (en) 1999-11-01
RU2000128642A (en) 2002-10-27
WO1999053669A1 (en) 1999-10-21
ZA200005364B (en) 2002-02-25

Similar Documents

Publication Publication Date Title
AU2005263962B2 (en) Improved user interface
US20030018707A1 (en) Server-side filter for corrupt web-browser cookies
US6947991B1 (en) Method and apparatus for exposing network administration stored in a directory using HTTP/WebDAV protocol
US20040260680A1 (en) Personalized indexing and searching for information in a distributed data processing system
CA2365368A1 (en) Information collection server, information collection method, and recording medium
US6952723B1 (en) Method and system for correcting invalid hyperlink address within a public network
CN101243464A (en) Enhanced e-mail folder security
JP2004005500A (en) Information processor and information processing program
US20070277091A1 (en) Electronic document update notification device and electronic document update notifying method
US6480887B1 (en) Method of retaining and managing currently displayed content information in web server
JPH10107840A (en) Electronic mail system and mail server
US20040122916A1 (en) Establishment of network connections
JP5286946B2 (en) Information processing apparatus, input information restoration method and restoration program
CA2328082A1 (en) Computer system for managing links and method using said system
JP4445243B2 (en) Spam blocking method
JP4998302B2 (en) Mail misdelivery prevention system, mail misdelivery prevention method, and mail misdelivery prevention program
JP2002351730A (en) Method and device for filing electronic document
AU2009202441A1 (en) Computer readable medium, information processing apparatus, image reading apparatus, and information processing system
JP2005276042A (en) System for monitoring job-supporting system and support program
JP4546072B2 (en) Information processing method and computer system
JP5026130B2 (en) Mail management method, mail management system, and mail management program
KR20000072758A (en) clientprogram have user native interface of authentication / security support client / server application for implemented method
CN100483383C (en) Remote proxy server agent
JP2002183002A (en) Server device reporting domain name as candidate to be corrected, client computer using domain name as candidate to be corrected reported by the same server device, recording medium with recorded program running on the same client computer, and mail server reporting mail address as candidate to be corrected
JPH10222414A (en) Document processing method

Legal Events

Date Code Title Description
EEER Examination request
FZDE Discontinued