AU4757899A - Document building using interactive browsing - Google Patents

Document building using interactive browsing

Info

Publication number
AU4757899A
AU4757899A AU47578/99A AU4757899A AU4757899A AU 4757899 A AU4757899 A AU 4757899A AU 47578/99 A AU47578/99 A AU 47578/99A AU 4757899 A AU4757899 A AU 4757899A AU 4757899 A AU4757899 A AU 4757899A
Authority
AU
Australia
Prior art keywords
document
anchors
hyper
text
documents
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
AU47578/99A
Other versions
AU743115B2 (en)
Inventor
Julian Benjamin Kelsey
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Inc
Original Assignee
Canon Inc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AUPP5954A (AUPP595498A0)
Application filed by Canon Inc
Priority to AU47578/99A (AU743115B2)
Publication of AU4757899A
Application granted
Publication of AU743115B2
Anticipated expiration
Ceased (current legal status)

Landscapes

  • Information Transfer Between Computers (AREA)

Description

S F Ref: 472554
AUSTRALIA
PATENTS ACT 1990 COMPLETE SPECIFICATION FOR A STANDARD PATENT
ORIGINAL
Name and Address of Applicant: Canon Kabushiki Kaisha, 30-2, Shimomaruko 3-chome, Ohta-ku, Tokyo 146, JAPAN
Actual Inventor(s): Julian Benjamin Kelsey
Address for Service: Spruson & Ferguson, Patent Attorneys, Level 33 St Martins Tower, 31 Market Street, Sydney, New South Wales, 2000, Australia
Invention Title: Document Building Using Interactive Browsing
ASSOCIATED PROVISIONAL APPLICATION DETAILS
[31] Application No(s): PP5954
[33] Country: AU
[32] Application Date: 15 September 1998
The following statement is a full description of this invention, including the best method of performing it known to me/us:
DOCUMENT BUILDING USING INTERACTIVE BROWSING
Field of the Invention
The present invention relates to computer-based document browsing and, in particular, to a technique for advanced and convenient selection of source material for a document to be created.
Background
Document creation has evolved significantly over the past ten years through the development of advanced word processing and desktop publishing computer software applications. Such applications today are quite complex products incorporating capacities to handle information sourced from a variety of different data types and the like.
However, one significant disadvantage of conventional arrangements is that document creation may only be performed using one source document at a time. If it is desired to combine two source documents, it is necessary to open one document with the word processing package, for example, separately open the second document, and then copy or cut and paste the contents of the second document into the first document to create a single document having both sources of information. Other methods of combining documents can be achieved by pre-programming a master document to import various individual documents into the master upon the master being suitably processed. However, this necessitates carefully establishing the master to ensure that the word processing package or the like correctly implements the import procedure.
The issues noted above for word processing and desktop publishing are further exacerbated when it is desired to create a document from information sourced over a computer network such as the Internet or World Wide Web. Typically, it is also necessary to identify individual components of the sources which are then individually combined into the desired document.
In order to access computer networks, the Internet, and traverse the World Wide Web, use is often made of special browsing software such as Microsoft Internet Explorer (Microsoft Corporation), Netscape Navigator (Netscape Corporation) or Macintosh Finder (Apple Computer Corporation). On entering a web site or some other location, various computer facilities become available to the user in order to manipulate data, programs and the like. Such facilities include the printing of data, copying, running software, listening to audio and receiving video data, amongst others.
472554 CFP1440AU Page23 [I:\ELEC\CISRA\PAGEPLUS\PAGE23472554.DOC:LDP
When traversing the Web with any one of the above browsing software packages, documents and/or data may include one or more hyperlink anchors which are displayed and are selectable by the user to traverse the Web or document to identify further information. At any time, the browsing software provides a facility to print the document currently being displayed, whether that document is a search engine or a specific location or information sourced on the Web. The hyperlink anchors as displayed typically incorporate a uniform resource locator (URL) or uniform resource identifier (URI) which is used as a network address to source the document being referred to by the hyperlink anchor. As a result of a user following a series of hyperlink anchors and the URLs sourced therefrom, the user creates a virtual linear path through the various documents, the path being conceptually linked to the user's interest in those documents.
One significant deficiency with following such links is that where a number of links are presented simultaneously, the user has the opportunity only to select one link and follow a path from that link. If the other links presented simultaneously are also to be reviewed, the user must backtrack and then follow the path provided by those links. If the user forgets or is unable to locate the previously identified link, that path may then become unavailable for traversal and for the sourcing of information.
It is an object of the present invention to substantially overcome, or at least ameliorate, one or more of the problems mentioned above.
Summary of the Invention
In accordance with one aspect of the present invention there is disclosed a method of constructing a document from a plurality of sources accessible via a computer network, said method comprising the steps of:
examining at least one source document incorporating a plurality of hyperlink anchors;
simultaneously selecting a (first) set of said anchors, said set including at least two of said anchors; and
creating a (first) contiguous document incorporating matter sourced from each of the (first) set of anchors selected simultaneously.
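The three steps of the method can be illustrated with a minimal Python sketch. It is an illustration under stated assumptions, not the implementation of the specification: the names AnchorCollector, collect_anchors and build_contiguous_document, the FAKE_WEB dictionary standing in for the network, and the "[Source: ...]" trailer are all hypothetical.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collects the href of every <a> element in a source document."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.hrefs.append(href)


def collect_anchors(html):
    """Step 1: examine a source document and list its hyperlink anchors."""
    collector = AnchorCollector()
    collector.feed(html)
    return collector.hrefs


def build_contiguous_document(selected_urls, fetch):
    """Step 3: join matter sourced from every selected anchor into one document.

    `fetch` is an injected callable (hypothetical) that returns the text for a URL.
    """
    if len(selected_urls) < 2:
        raise ValueError("the selected set must include at least two anchors")
    sections = [f"{fetch(url)}\n[Source: {url}]" for url in selected_urls]
    return "\n\n".join(sections)


# A stand-in for the network, so the sketch runs without any connection.
FAKE_WEB = {
    "http://example.com/one": "First page text",
    "http://example.com/two": "Second page text",
    "http://example.com/three": "Third page text",
}

source = (
    '<p><a href="http://example.com/one">One</a> '
    '<a href="http://example.com/two">Two</a> '
    '<a href="http://example.com/three">Three</a></p>'
)

anchors = collect_anchors(source)
# Step 2: the user selects a set of at least two anchors at once.
document = build_contiguous_document(anchors[:2], FAKE_WEB.__getitem__)
print(document)
```

Passing the fetch function in as a parameter keeps the sketch independent of any particular network library.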
Preferably the contiguous document includes a plurality of printable pages each with matter sourced from at least one anchor of the set.
Advantageously, the document includes a plurality of further anchors from which a (second) set of said further anchors are selectable and a further contiguous document is creatable incorporating matter sourced from each anchor of said second set. In such an embodiment, preferably the second set of anchors is selectable from a plurality of the printable pages of the first document. Typically, the further document is one of an independent document, a document appended to the first document or is merged into the first document.
Preferably the selection is made by scribing a computer pointing device such as a mouse about an area in which the anchors are present. In some preferred situations, the selection is made by selecting all those anchors represented within a displayed window of a document.
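Selection by scribing about an area can be sketched as a hit test of anchor positions against the scribed region. The sketch below assumes, for simplicity, that the scribed path is approximated by its bounding rectangle and that each anchor has a known on-screen box; the Anchor and Rect types and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Anchor:
    url: str
    x: float       # left edge of the anchor as displayed
    y: float       # top edge of the anchor as displayed
    width: float
    height: float


@dataclass(frozen=True)
class Rect:
    left: float
    top: float
    right: float
    bottom: float


def bounding_rect(points):
    """Approximates a scribed mouse path by its bounding rectangle."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return Rect(min(xs), min(ys), max(xs), max(ys))


def anchors_in_region(anchors, region):
    """Returns the anchors whose displayed box lies wholly inside the region."""
    selected = []
    for a in anchors:
        inside = (
            a.x >= region.left
            and a.y >= region.top
            and a.x + a.width <= region.right
            and a.y + a.height <= region.bottom
        )
        if inside:
            selected.append(a)
    return selected


anchors = [
    Anchor("http://example.com/a", 10, 10, 80, 12),
    Anchor("http://example.com/b", 10, 30, 80, 12),
    Anchor("http://example.com/c", 10, 200, 80, 12),
]
scribed_path = [(5, 5), (120, 8), (118, 50), (4, 48)]
selected = anchors_in_region(anchors, bounding_rect(scribed_path))
print([a.url for a in selected])
```

Selecting every anchor within a displayed window is the same test with the window's visible rectangle used as the region.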
Apparatus and computer program products for performing the above are also disclosed.
Brief Description of the Drawings
A preferred embodiment of the present invention will now be described with reference to the accompanying drawings in which:
Fig. 1 is a schematic block diagram representation of a network based document creation system and depicting the operating environment of the preferred embodiment of the present invention;
Fig. 2 shows the visual appearance of a user interface to a Hypertext Document Creation Tool;
Fig. 3 is a block diagram of an internal structure of the Hypertext Document Creation Tool;
Fig. 4 is a block diagram of a general purpose computer upon which the Hypertext Document Creation Tool and the preferred embodiment of the present invention can be practiced;
Fig. 5 is an example of the display screen during hyper-text document preparation;
Fig. 6 is a flowchart depicting operation of hyper-text document formatting;
Fig. 7 is an exemplary illustration of a display screen of the document creation system of Fig. 1;
Fig. 8 provides an exemplary representation of documents sourced via the network seen in Fig. 1;
Fig. 9 illustrates one example of a document creation arrangement of the preferred embodiment;
Fig. 10 is a schematic representation of a document created using the arrangement in Fig. 4; and
Fig. 11 is a flowchart for creating a document from a plurality of network sources.
Detailed Description of the Best and Other Modes
To assist users in being able to track and trace their traversal of the Web, Canon Information Systems Research Australia Pty Ltd has developed a Hypertext Document Collating Tool which is currently the subject of United States Patent Application No. 08/903,743 filed 31 July 1997 (Attorney Ref: 378728US CFP0568US Page+20), the salient disclosure of which is included below under the heading of the same title. The Hypertext Document Collating Tool operates in a background mode behind the browsing software to automatically and transparently create a printable document that includes the various Web sites and documents encountered by a user during a traversal of the Web.
The preferred embodiment of the present invention is implemented as an additional feature in the Hypertext Document Collating Tool. However, the present invention is not limited to use with the Hypertext Document Collating Tool or other similar products, but has wider application and may for example be implemented in generic browsing or viewing software, as will be appreciated by those skilled in the art having read and understood this specification.
Hypertext Document Collating Tool
Many computer based document mark-up languages have been developed in order to allow computer-aided document preparation. Examples of such languages include TROFF, TeX, RTF, as well as many proprietary formats associated with computer hosted word processing applications. These mark-up languages are designed to allow the computer assisted preparation of a document destined for printing. As a consequence of these developments, the prevalence and active nature of digital computers has encouraged the introduction of hyper-links in documents.
A hyper-link is a pointer, typically embedded in a document, that provides a direct link to another portion of the same document, another document, or another resource available on the current network node or another network node. Hyper-links are often used on the Internet, and in particular the World Wide Web, to link a document at one Web site with a document at another Web site.
Hyper-links are only operational when a document is viewed on-line, and not when the document is in printed form. The increased value of these on-line hyper-text documents has caused a weakening of the previous focus on printing. New generation languages used to interpret hyper-text linked documents, such as SGML and HTML (Hyper-Text Mark-up Language), have few features to support the description of their printed form. More importantly, because the principal value of hyper-text documents is for on-line viewing, these documents are formatted by their authors in a manner which is appropriate for screen viewing, and not necessarily for viewing in printed form.
As a result it is now the case that very large quantities of information are recorded in network accessed on-line services in formats which are appropriate for screen based viewing, but not as appropriate for viewing in printed form. Further, because printing is not a focus of applications which access these hyper-text documents (that is, hyper-text browser applications), their printing facilities are generally poor.
Common problems encountered when printing hyper-text documents include:
  • information is broken up into small hyper-text documents, and many documents need to be collated to form a desired body of information;
  • text is formatted with fewer words per line than is common for printed pages, and in general the density of information is less than is typical for printed pages;
  • hyper-text document viewing programs are document-centric, that is they operate on a single hyper-text document at a time, which results in this being the unit of printing, resulting in much repetitive work by the user to print a set of linked hyper-text documents, and typically no more than one hyper-text document on each printed page;
  • hyper-text document viewing programs generally do not print all the features of hyper-text pages which are displayed on-screen (a display device), in particular the target of hyper-links is often not included in printouts.
It is possible for the provider of a hyper-text document designed for screen viewing to also provide substantially the same document in a different form designed for printing, but this requires double handling by the document provider. It also often results in significant differences between the screen version of the document and the printed form.
The problem of no more than one hyper-text document per printed page can sometimes be addressed by the reduction and rotation of the image of each basic page and printing each reduced page image on, say, one half of a printed page. However this method does not save paper at a given scale. For example, if a large number of small hyper-text documents are printed, each of which only occupies 25% of a printed (physical) page, even though the documents are photo-reduced and printed two per physical page, each physical page still has 75% blank space. Further, this method does not provide continuous page-length columns. Continuous column printing provides improved readability and space utilization.
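The 75% figure in the example above can be checked with a short calculation. The sketch assumes that each reduced page image receives an equal share of the physical page and that a document fills the same fraction of its share as it did of a full page; the function name blank_fraction is hypothetical.

```python
def blank_fraction(occupancy, docs_per_page):
    """Fraction of a physical page left blank under photo-reduction.

    occupancy: fraction of its own (unreduced) page that each document fills.
    docs_per_page: number of reduced page images placed on one physical page.
    """
    share_per_document = 1.0 / docs_per_page          # area given to each reduced image
    used = docs_per_page * occupancy * share_per_document
    return 1.0 - used


# Documents filling 25% of a page, printed two per physical page.
print(blank_fraction(0.25, 2))
```

Under these assumptions the blank fraction does not depend on how many reduced images share the page, which is why reduction alone saves no paper at a given scale.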
The Hypertext Document Collating Tool is described as a computer application program hosted on the Windows™ operating system developed by Microsoft Corporation. However, those skilled in the art will recognise that the described arrangement can be implemented on computer systems hosted by other operating systems. For example, the Hypertext Document Collating Tool can be performed on computer systems running UNIX™, OS/2™ or DOS™. The application program has a user interface which includes menu items and controls that respond to mouse and keyboard operations. The application program has the ability to transmit data to one or more printers either directly connected to a host computer or accessed over a network. The application program also has the ability to transmit and receive data to and from a connected digital communications network (for example the "Internet").
Fig. 1 provides an overview of the environment in which the Hypertext Document Collating Tool may operate. A hyper-text browser 10 is provided to output to a display device 11 for viewing hyper-text documents. Typically, the hyper-text browser is of the form of application software implemented on a general purpose computer system (eg. IBM PC or compatible, Apple Macintosh, Sun-Workstation etc.) and hyper-text documents include images, linked documents and simple text documents. Current examples of the hyper-text browser include Microsoft Explorer and NETSCAPE. The computer system (not shown in Fig. 1) usually forms an interface which connects a network system 12 of computers to the display device 11 and to a print output device 13.
A hyper-text document formatter 14, preferably implemented as a software module on the general purpose computer, is operable to format a hyper-text document and controlled in part by instructions derived 15 from the hyper-text browser 10 responding to a user's request. Further, the hyper-text document formatter 14 communicates with the network system 12 to perform a multitude of functions including gathering, formatting, and collating documents with direct instructions from the hyper-text browser 10 or the user.
Referring to Fig. 2, there is shown a user interface layout of the hyper-text document formatter 14 as displayed on the display device 11 and which comprises a menu and control area 21, a print list display 22, and a print preview display 23. The print list display 22 includes a list of print items 22A, 22B, 22C, each of which includes a print item mark box 24, a hyper-text document title text field 25, a fetch status text field 26 and a location text field 27. The print list display 22 and the print preview display 23 are scrollable by means of scroll bar controls 28 and 29.
The print preview display 23 displays (shows) representations of the printed pages which are to be produced on the printer output device 13 using current selected print options, for example in a WYSIWYG ("what you see is what you get") format. The user is free to select from the menu and controls 21 a print option other than the current print option. Such print options can include print settings for the print output device 13, portrait or landscape orientation of pages, print resolution and scaling. Upon user selection of an option, the current print preview display 23 is appropriately updated.
However the display in the print preview display 23 is regenerated automatically as a current application state changes without intervention required by the user. Application states which can affect the print preview display 23 include, but are not limited to, the currently selected printer, the currently selected paper type, formatting options which can be set by the operator, the set of marked items in a print list (ie. those selected by a mark in the print item mark box 24) and the order of marked items associated with the print list.
The hyper-text document formatter 14 can be practised using a general-purpose computer system 40 shown in Fig. 4 connectable to the communication network 12 which provides links 19 to web sites 16 and 18. The computer system 40 includes a computer module 41, input devices such as a keyboard 51 and mouse 52, and output devices including a printer 13 and a video display device 11. A modulator-demodulator (modem) transceiver device 54 is used by the computer module 41 for communicating to and from computer systems at other locations via the communications network 12, those computer systems for example including the web sites 16 and 18.
The computer module 41 has a number of components typically including at least one processor unit 42, a memory unit 43, for example formed from semiconductor random access memory (RAM) and read only memory (ROM), input/output (I/O) interfaces including a video interface 50 for the display 11, an I/O interface 44 for the keyboard 51 and mouse 52 and a communications interface 49 for the modem 54 and printer 13. A storage device 46 is provided and typically includes a hard disk drive 47 and a floppy disk drive 48. A CD-ROM drive 45 is typically provided as a non-volatile source of data. The components 42-50 of the computer module 41 typically communicate via an interconnected bus 55 and in a manner which results in a conventional mode of operation of the computer system 40 known to those in the relevant art. Examples of such computer systems 40 include IBM PC/AT and similar machines, Sun SPARCstations and Apple Macintosh. Further, the web sites 16 and 18 may be implemented on such computer systems.
Typically, the application program of the hyper-text document formatter 14 is resident on a hard disk drive 47 and read and controlled using the processor 42.
Intermediate storage of the program and the print list and any data fetched from the network may be accomplished using the semiconductor memory 43, possibly in concert with the hard disk drive 47. In some instances, the application program may be supplied to the user encoded on a CD-ROM or floppy disk, or alternatively could be read by the user from the network via the modem device 54.
Fig. 3 shows a block diagram representation of an internal structure of the hyper-text document formatter 14, which comprises a user interface task 30, a monitoring task 31, a data fetching task 32, a formatting task 33, an internal print list storage 34, the print list display 22 (also shown in Fig. 2), the print preview display 23, a temporary file storage 35, a network and file system interface 36, and a printer interface 37.
The internal print list storage 34 is structured as a list of records in the memory 46 of the general purpose computer system 40, each record being referred to hereinafter as a "print item". Each print item represents at least one hyper-text document, and comprises a Uniform Resource Locator (URL) by which the associated hyper-text document can be retrieved as well as a further list of records, each of which is referred to herein as a sub-item. Each sub-item represents a distinct file-like unit of data which is required to complete the formatting and displaying of the hyper-text document associated with the print item. These units of data (or sub-items) are most commonly hyper-text documents in HTML format and images in GIF or JPEG format. Each sub-item records a file name within the temporary file storage where the unit of data will be, or is, stored.
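The print list structure described above maps naturally onto a pair of record types. The following Python sketch is only one possible rendering: the class names PrintItem and SubItem, the field names, the example URLs and the temporary file names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class SubItem:
    """A file-like unit of data (HTML page, GIF or JPEG image) needed by a print item."""
    url: str
    temp_file_name: str     # where the data will be, or is, stored
    fetched: bool = False


@dataclass
class PrintItem:
    """One record of the print list, representing at least one hyper-text document."""
    url: str
    title: str = ""
    marked: bool = True     # mirrors the print item mark box of the print list display
    sub_items: List[SubItem] = field(default_factory=list)

    def fetch_status(self):
        """Reports "fetched" only when every sub-item has been stored."""
        if self.sub_items and all(s.fetched for s in self.sub_items):
            return "fetched"
        return "fetching"


print_list = [
    PrintItem(
        url="http://example.com/fred.html",
        title="FRED",
        sub_items=[
            SubItem("http://example.com/fred.html", "tmp0001.html", fetched=True),
            SubItem("http://example.com/logo.gif", "tmp0002.gif", fetched=True),
        ],
    ),
    PrintItem(
        url="http://example.com/search",
        title="Search engine",
        marked=False,       # de-selected by the user
        sub_items=[SubItem("http://example.com/search", "tmp0003.html", fetched=True)],
    ),
    PrintItem(
        url="http://example.com/patent.html",
        title="Patent text",
        sub_items=[SubItem("http://example.com/patent.html", "tmp0004.html")],
    ),
]

to_print = [item.title for item in print_list if item.marked]
print(to_print)
```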
In Fig. 3, the four tasks 30, 31, 32, 33 are shown, each of which is implemented as a separate thread within a single application process. The internal print list storage 34 is shared by the tasks 30-33 in a manner to avoid conflicts. Each task 30-33 gains access to the print list on the internal storage 34 by first obtaining a "mutex" lock (mutually exclusive lock). Once the lock is obtained, the task reads and possibly modifies the print list and then releases the lock. Upon release of the lock, if changes were made to the print list, messages are forwarded to the user interface task 30, the formatting task 33 and the data fetching task 32 to inform them that changes have been made.
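The locking discipline described above (obtain the mutex, read or modify the list, release, then notify) can be sketched with Python's standard threading and queue modules. The class SharedPrintList, the message text and the use of one queue per task as a mailbox are assumptions made for illustration.

```python
import queue
import threading


class SharedPrintList:
    """A print list guarded by a mutex, with change messages sent after release."""

    def __init__(self, listeners):
        self._lock = threading.Lock()
        self._urls = []
        self._listeners = listeners   # one message queue per task to be informed

    def add(self, url):
        # Obtain the mutex, then read and possibly modify the list.
        with self._lock:
            changed = url not in self._urls
            if changed:
                self._urls.append(url)
        # The lock has been released here; forward messages only if a change was made.
        if changed:
            for mailbox in self._listeners:
                mailbox.put("print-list-changed")
        return changed

    def snapshot(self):
        with self._lock:
            return list(self._urls)


ui_queue, format_queue, fetch_queue = queue.Queue(), queue.Queue(), queue.Queue()
shared = SharedPrintList([ui_queue, format_queue, fetch_queue])

# Several threads add entries concurrently, as the monitoring and fetching tasks might.
threads = [
    threading.Thread(target=shared.add, args=(f"http://example.com/{i}",))
    for i in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(shared.snapshot()))
```

Sending the messages after the lock is released means a receiving task is never woken while the sender still holds the list.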
The user interface task 30 performs user interface operations: it rests in a waiting state 30A and, on accepting user interface events such as clicks and movements of the mouse 52, responds in a process 30B as appropriate to each event. Operation of the task is achieved by a message loop structure processing each operating system generated event in turn and is linked to the print list display 22.
The monitoring task 31 performs monitoring 31A of user initiated access to documents including hyper-text documents using the hyper-text browser 10, and entering 31B each such document accessed by the user into the print list. In particular, the browser includes an application program interface (API) which allows viewing of information being cached by the browser 10. In this manner, the monitoring task 31 is able to take and maintain a record of the operation, typically sequential, of the browser 10. From the record, the print list 34 is automatically created using the URLs of the items located. The user is then able to edit the print list 34 by de-selecting those items not required to be printed.
The fetching task 32 performs fetching of all documents which are listed in the print list along with associated data necessary for producing a visually pleasing (desired) or viewable formatted version of the documents in print form. Typically, the associated data includes print settings for the print device to which the documents are to be directed. Operation of the fetching task 32 is preferably achieved through use of Internet protocols and/or network access techniques provided by the host operating system and includes a wait stage 32A for detecting any change in the print list, and a fetching stage 32B for fetching the required data and storing the data in a temporary file storage 35 typically formed within the memory 43. The fetching task 32 is also responsible for initiating further fetches and amending the print list accordingly. Amending the print list, or adding to the print list hyper-text pages which are hyper-linked from one of the pages previously fetched by the fetching task 32, is typically performed as a background task to the hyper-text browser 10. Hyper-links previously visited by the fetching task 32 are preferably not re-visited to avoid repetition. The user may elect, as part of optional settings, that the fetching task 32 visits a predetermined number of hyper-linked pages for augmenting the print list accordingly.
Preferably, the fetching task 32 provides a cross-referencing feature, should the user select or desire such option, which maintains a cross referencing to URL or hyperlinks of hyper-text documents to be printed (formatted version) with an indexing of cross references and a corresponding page (number) in the document to be printed.
In this connection, the formatted version includes a table of contents listing each hyper-text document represented in the document to be printed. Each entry in the table of contents is labelled with the position (page number) at which the associated hyper-text document occurs within the said formatted version.
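A table of contents of this kind can be derived from the order and length of the collated documents. The sketch below assumes a single continuous column, document lengths measured in formatted lines, and a fixed number of lines per page; the function name build_table_of_contents and the sample titles and lengths are hypothetical.

```python
def build_table_of_contents(documents, lines_per_page):
    """Labels each document with the page on which it starts.

    documents: list of (title, formatted_line_count) pairs in print order.
    Documents flow continuously, so one may start part-way down a page.
    """
    toc = []
    next_line = 0   # index of the first free line in the collated output
    for title, line_count in documents:
        start_page = next_line // lines_per_page + 1
        toc.append((title, start_page))
        next_line += line_count
    return toc


documents = [
    ("FRED", 30),
    ("Vehicle picture", 40),
    ("Patent text", 90),
    ("Appendix", 10),
]
toc = build_table_of_contents(documents, lines_per_page=50)
for title, page in toc:
    print(f"{title} .... {page}")
```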
The formatting task 33 performs formatting of all documents which are listed in the print list in a manner suitable for printed output, and also optionally showing a preview of the printed output which would be produced in the print preview area. Its operation is achieved by a recursive descent HTML parser and formatter, and results from waiting 33A for a change in the print list, and a format stage 33B which formats the documents and forwards them to a printer interface 37 for hard copy reproduction.
Notwithstanding that the updating of the print preview display 23 appears, under some circumstances, to depend on an availability of a hyper-text document through the network, a substantial portion of the tasks described with reference to Fig. 3 are performed substantially instantaneously in background mode unbeknown or at least not immediately apparent to the user. Typically, the tasks 30-33 can be performed synchronously or asynchronously with a user's access pattern. Usually, a user accesses or visits, with the aid of the browser application, root hyper-text documents. Described in an alternative way, hyper-text documents visited by a user are referred to herein as root hyper-text documents, and any further hyper-links and their associated documents are visited and fetched by the fetching task 32 respectively. The depth to which hyper-links are followed in fetching hyper-text documents is user defined. Preferably, all hyper-links of a root hyper-text document having predetermined characteristics are visited by the fetching task 32 and the associated (hyper-text) documents are retrieved. For example, a user may mark hyper-links to be followed to a predetermined depth or the user may specify characteristics of hyper-links, and their associated documents, to be all documents descendent from a current root hyper-text document containing a predetermined keyword.
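Following hyper-links from a root document to a user-defined depth, without revisiting links and optionally keeping only documents that contain a keyword, can be sketched as a breadth-first traversal. Everything named here is hypothetical: the function fetch_linked, the injected get_links and get_text callables, and the toy LINKS and TEXT tables that stand in for the network. The sketch also assumes that links are followed through documents that do not match the keyword.

```python
from collections import deque


def fetch_linked(root_url, get_links, get_text, max_depth, keyword=None):
    """Returns the documents to fetch, in the order they are reached from the root.

    The root itself was visited by the user, so it is not re-fetched here.
    """
    visited = {root_url}            # hyper-links already seen are not re-visited
    fetched = []
    frontier = deque([(root_url, 0)])
    while frontier:
        url, depth = frontier.popleft()
        if depth > 0 and (keyword is None or keyword in get_text(url)):
            fetched.append(url)
        if depth == max_depth:
            continue                # the user-defined depth limit has been reached
        for link in get_links(url):
            if link not in visited:
                visited.add(link)
                frontier.append((link, depth + 1))
    return fetched


# A toy link graph and document store, so the sketch runs offline.
LINKS = {"root": ["a", "b"], "a": ["c", "root"], "b": ["c"], "c": ["d"], "d": []}
TEXT = {
    "root": "index",
    "a": "canon printers",
    "b": "weather",
    "c": "canon cameras",
    "d": "canon lenses",
}

all_to_depth_2 = fetch_linked("root", LINKS.__getitem__, TEXT.__getitem__, max_depth=2)
canon_only = fetch_linked(
    "root", LINKS.__getitem__, TEXT.__getitem__, max_depth=2, keyword="canon"
)
print(all_to_depth_2, canon_only)
```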
Fig. 5 provides an illustrative representation of the hyper-text document formatter 14 in use. Fig. 5 shows a display screen 60 of the display 11 which has two windows clearly displayed. A window 70 is a web-browser application window that displays a text document 67 (corresponding to a few of the introductory paragraphs of this patent description). This forms a background window and is representative of the hyper-text browser application 10 covering the entire screen area. Superimposed on top of the window 70 is a window 63 corresponding to a working display of the application program of the hyper-text document formatter 14, described earlier with reference to Fig. 2. The user in this case is preparing a document formed from three sources, each mentioned in the print display list 61. A first source 68, called FRED, is a simple text source previously encountered during a Web review, and occupies a first position in the document being formed. A second source 69, being a picture of a vehicle, occupies a second position, whilst a third source, corresponding to the background text document 67, occupies the third position. It is seen from the print display list that a Search engine, used to locate the text document 67, has been de-selected (N-No) from display, and hence does not appear in the WYSIWYG print preview 65. The display list indicates that each source has been fetched from its corresponding URL, and is selected (Y-Yes) for display. In each case the location identifier provides the Web site address for the source material.
As seen in Fig. 5, the second column 64 of the print preview 65 has a blank section 66. As seen from the print display list 61, the text document 67 remains in a "fetching" state, where the text is being retrieved and formatted for WYSIWYG display.
Once this is completed, the section 66 displays the text that has since been fetched and the print display list 61 is updated to indicate a "fetched" status for that document.
In compiling the print document, the application program, and in particular, the document formatter 33B, recognises that the width of FRED and the picture are narrower than the page, and therefore establishes a column corresponding to their width. Because of its length, the text document 67 is formatted, firstly into a narrower, left hand column 62 related to the width of FRED 68 and the picture 69, and then to flow into the right hand column 64 which is adjusted to a width to substantially fill the page. Importantly, the application program is configured to automatically detect the selected content of a source, and to incorporate that content into the print preview display 23 (65) in an economical manner so that as many hyper-text documents as can reasonably be fitted to a page can be displayed. This reduces paper consumption.
The hyper-text document formatter 14 is configured to operate in background mode whilst the user is traversing the World Wide Web to automatically create and format a printable document representing a chronological history of the user's traversal of the World Wide Web. Typically, the hyper-text document formatter 14 operates in a background mode as a window operating behind a web browser window. As seen in Fig. 6, a flowchart of procedures 100 of the hyper-text document formatting portion of the hyper-text document formatter 14 commences at a starting point 102. This entry point leads to a step 104 where the application attempts to read an HTML element from a Web document currently being viewed using a Web browser program. At step 106, which follows step 104, an assessment of data availability is made and if none is available, step 108 assesses whether or not another document can be opened. If so, control is returned to step 104 for handling the new document. If not, document formatting is completed at step 110.
If data is available at step 106, control is passed to step 112 where the HTML element of the current Web site location is formatted into a standard form able to be printed using the application program. At step 114, an assessment is undertaken as to whether or not the formatted element is able to fit on to the page to be printed. If so, control is transferred to step 118 where the formatted HTML document is emitted as an output document. If the formatted element does not fit on to the page as determined by step 114, control is passed to step 116 which splits off, or culls, the non-fitting remainder of the formatted element. This enables control to be passed to step 118 for emitting of the remaining formatted HTML document. After step 118, control is passed to step 120 which assesses whether or not there is a remainder, for example left over from step 116. If so, control is returned to step 112 so that the remainder can be formatted and processed in the manner described above. If there is no remainder, control is returned to step 104 in order to read the next HTML element.
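By way of illustration only, the loop of Fig. 6 may be sketched in Python as follows. The sketch is an assumption-laden simplification and not the formatter 14 itself: each HTML element is taken to be a plain text string, the "standard form" of step 112 is taken to be lines wrapped to a fixed width, and a page is taken to hold a fixed number of lines. The names `format_element` and `paginate` and the parameters `page_capacity` and `width` are hypothetical.

```python
import textwrap
from collections import deque


def format_element(text, width):
    # Step 112 (assumed form): wrap the element's text to the column width.
    return textwrap.wrap(text, width)


def paginate(documents, page_capacity, width=40):
    """Emit pages of formatted lines, splitting elements that do not fit."""
    pages = [[]]
    free = page_capacity                  # lines still free on the current page
    pending = deque(documents)
    while pending:                        # step 108: can another document be opened?
        elements = deque(pending.popleft())
        while elements:                   # steps 104/106: read element, data available?
            remainder = format_element(elements.popleft(), width)
            while remainder:              # step 120: is there a remainder to process?
                if free == 0:             # current page is full, start a new one
                    pages.append([])
                    free = page_capacity
                # Steps 114/116: keep what fits, split off the non-fitting rest.
                fitting, remainder = remainder[:free], remainder[free:]
                pages[-1].extend(fitting)  # step 118: emit the fitting part
                free -= len(fitting)
    return pages                          # step 110: formatting completed
```

For instance, `paginate([["aaaa bbbb cccc dddd"], ["eeee"]], page_capacity=2, width=9)` places the two wrapped lines of the first document on one page and starts a second page for the next document.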
With the arrangement described in Fig. 6, whilst the user browses the World Wide Web, the application program continually assesses the data being viewed in the browser window and automatically formats that data into a continuous printable document displayed in the window, for example as shown in Figs. 2 and 5. When the user has completed browsing, the window of the application program (ie. window 63 of Fig.) can be selected. Using the print display list 61, the user can either select or de-select certain documents located during the Web browsing session for printing. During the course of a browsing session, all documents seen are automatically enabled in the print document window. Accordingly, prior to printing all that is necessary is for the user to cull out or de-select those components not desired for printing. For example, if the user had made use of a search engine during the Web browsing session, there may be little point in printing out the text associated with that search engine. All that would be necessary to print could be the actual document or Web site location found as a result of the search, such as shown in the example of Fig.
A further advantage of the present invention is that, in the printed document, at the completion of each section relating to an individual Web location, the actual Web location is printed onto the printed document so that the user has a permanent hard copy record of not only the information sourced but of the location of that source.
The foregoing only describes one configuration of the Hypertext Document Collating Tool, however, modifications and/or changes can be made thereto without departing from the scope of the concept. The Hypertext Document Collating Tool described above may be conceptualised by the following numbered paragraphs:
1. A method of collating hyper-text documents, said method comprising the steps of: monitoring a user's access patterns to said hyper-text documents; accessing said hyper-text documents including structure information of the accessed hyper-text documents; and creating a formatted version of the accessed hyper-text documents for said user.
2. A method according to paragraph 1, wherein steps and are conducted while the user accesses hyper-text documents.
3. A method according to paragraph 1, wherein said formatted version of the accessed hyper-text document is updated upon new hyper-text pages being accessed.
4. A method according to paragraph 1, wherein said steps are performed in background mode.
5. A method according to paragraph 1, wherein steps and are performed asynchronously with a user's access to said hyper-text documents.
6. A method according to paragraph 1, wherein said steps are performed substantially in synchronism with a user's access to said hyper-text documents.
7. A method according to paragraph 1, wherein said formatted version is formatted to be suitable for single or multiple column page printing on a printer output device.
8. A method according to paragraph 7, wherein said formatted version suitable for single or multiple column page printing comprises as many hyper-text documents on each printed page as can reasonably fit in a space available on said each printed page.
9. A method according to paragraph 1, wherein said formatted version includes a table of contents listing each hyper-text document represented in said formatted version wherein each entry in the said table of contents is labelled with the position at which the associated hyper-text document occurs within the said formatted version.
10. A method according to paragraph 1, wherein said formatted version includes a hyper-link index of all the hyper-link references in all the said hyper-text documents represented in said formatted version.
11. A method according to paragraph 10, wherein each hyper-link reference in the said formatted version is tagged with a cross-reference to its entry in said hyper-link index.
12. A method according to paragraph 10, wherein said hyper-link index excludes hyper-link references of hyper-text documents represented in said formatted version.
13. A method according to paragraph 1, wherein the said hyper-text documents are HTML documents.
14. A method according to paragraph 1, wherein the said hyper-text documents are accessed using Internet protocols.
15. A method according to paragraph 1, wherein said formatted version is displayed in preview form continuously while said user accesses said hyper-text documents.
16. A method of collating hyper-text documents, said method comprising steps of: accessing said hyper-text documents including structure information; creating a formatted version of said accessed hyper-text documents wherein said formatted version is characterised by a single or multiple column printing such that each printed page contains as many of said hyper-text documents as can reasonably fit in an available space on a printed page.
17. A method according to paragraph 16, wherein said hyper-text documents are determined by accepting a specification from a user of one or more root hyper-text documents and adding to said root hyper-text documents all derived hyper-text documents which are hyper-linked from said root hyper-text documents and have certain specified characteristics defined by said user.
18. A method according to paragraph 16, wherein said formatted version includes a table of contents listing each hyper-text document represented in said formatted version wherein each entry in the said table of contents is labelled with the position at which the associated hyper-text document occurs within the said formatted version.
19. A method according to paragraph 16, wherein said formatted version includes a hyper-link index of all the hyper-link references in all the said hyper-text documents represented in said formatted version.
20. A method according to paragraph 16, wherein each hyper-link reference in the said formatted version is tagged with a cross-reference to its entry in said hyper-link index.
21. A method according to paragraph 16, wherein said hyper-link index excludes hyper-link references of hyper-text documents represented in said formatted version.
22. A method according to paragraph 16, wherein the said hyper-text documents are HTML documents.
23. A method according to paragraph 16, wherein the said hyper-text documents are accessed using Internet protocols.
24. A method according to paragraph 16, wherein said formatted version is displayed in preview form continuously while said user accesses said hyper-text documents.
25. Apparatus configured to implement the method of paragraph 1.
26. Apparatus configured to implement the method of paragraph 16.
27. A computer implemented method for collating a plurality of documents obtained from a plurality of sources, said method comprising the steps of: monitoring accesses to documents in sequence; recording the contents of a plurality of selected documents including structure information relating to each said selected document; and collating said selected documents according to a predetermined order of collation, said collating comprising arranging none or more display pages according to the sizes of each said selected document based upon said corresponding structure information, wherein said collating forms a single document reproducible at least by printing.
28. A computer system comprising: a network comprising a source of a plurality of documents each individually accessible via a resource locater, wherein ones of said documents include therein links that give access to others of said documents; means for monitoring said resource locater and compiling a display list of said documents, said list including the corresponding links and structure information pertaining to each document; and means for collating the display list into a selected order and for formatting said documents within said display list into a single printable document having corresponding components arranged in said selected order.
30. A computer readable medium including instruction modules arranged to collate for printing as a single document a plurality of documents derived from a plurality of sources in a network, said modules comprising: a monitoring module for monitoring browsing operations throughout said network; a compiling module for compiling a display list of selected documents encountered during said browsing operations; and a collating module for collating the selected documents into a single printable document in which each said selected document is formatted according to structure information derived during said monitoring module whereby said single printable document is collated to be substantially seamless in printing reproduction and to minimize vacant or wasted space on any and each printed page.
31. A medium according to paragraph 30 wherein said medium is one of a computer network, a hard disk, a floppy disk and an optical disk.
32. A computer program product having a computer readable medium having a computer program recorded thereon for collating hyper-text documents, said computer program product comprising: means for monitoring a user's access patterns to said hyper-text documents; means for accessing said hyper-text documents including structure information of the accessed hyper-text documents; and means for creating a formatted version of the accessed hyper-text documents for said user.
Preferred Embodiments
Like the Hypertext Document Collating Tool described above, the preferred embodiment may also be performed within the arrangement of Fig. 1 and also as a software application operating within the computer system of Fig. 4, again preferably stored in the hard disk drive 47. The Hypertext Document Collating Tool operates through links to the network 12 and the browser 10 to create a document containing selected components of the various documents identified over the network using the browser 10. The application package 14 can provide an output directly to a printer 13 or an interactive output 16 via the video display device 11.
During an Internet or Web browsing session, a user of the computer system enables operation of the browsing software which is typically stored in the hard disk drive 47 and which facilitates communications via the modem 54 to provide a connection to a web-site.
Locations accessible via the communications network 12 are individually addressable using the URLs and URIs. These may be entered by the user of the computer system 40 to directly access a particular web-site. Alternatively, web-site documents and the like (including search engines) may include hyper-text which, when selected, provides direct links to locations identified by URLs associated with the hyper-text.
Fig. 8 shows a block diagram 76 representation of the two exemplary Internet web-sites 16,18 and their associated URLs.
As seen in Fig. 8, the Web sites 16 and 18 incorporate a number of links (LINK n) 78 to various URL locations (URL#n) which contain corresponding information (info n) and are accessed via the links. The actual arrangement of the links 78 shown in Fig. 3 will be appreciated by those skilled in the art as being only exemplary of one arrangement that may be encountered. Many other arrangements may also be encountered.
When browsing either one of the Web sites 16 and 18 using a known browsing software application, a number of hypertext anchors corresponding to the various links shown will be reproduced on the display device 11 before the user. By accessing those hypertext anchors, typically by clicking the mouse 52 for example, the user is then transported via the link to the corresponding URL and the information referenced thereby.
Fig. 7 expands upon the arrangement shown in Fig. 2 and to which the same reference numbers take the same meaning as described above. However, for the purpose of review of the Hypertext Document Collating Tool application, the display window includes a first area 21 including menus and control symbols such as those known in the art. A further section 22 provides a list of document locations which are selectable to form a print display list of a document being produced. The document being produced is shown in a print preview display 23. Each of the areas 22 and 23 is able to be viewed in its entirety using scroll bars 28 and 29 respectively. The print display list includes a number of items 22A, 22B and 22C for which in this example only item 22A includes entries. The entries include firstly an enabling entry 24 that identifies that the user wishes the particular information to form part of the printable document 23. The next entry includes the title of the document, whereas the entry 26 represents a fetching status of the document and the entry 27 the specific URL of that document.
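Purely as an illustrative sketch, an item of the print display list described above might be modelled in Python as a small record holding the four entries named in the text. The class name `DisplayListItem`, the helper functions and the status strings are hypothetical; the default of enabling every newly seen document follows the behaviour described for the collating tool.

```python
from dataclasses import dataclass


@dataclass
class DisplayListItem:
    enabled: bool   # enabling entry 24: include this document when printing
    title: str      # title of the document
    status: str     # fetching status 26, here "fetching" or "fetched"
    url: str        # URL 27 of the document


def record_visit(display_list, title, url):
    """Add a newly seen document; documents are enabled by default."""
    item = DisplayListItem(enabled=True, title=title, status="fetching", url=url)
    display_list.append(item)
    return item


def documents_to_print(display_list):
    """Return the enabled items in the order in which they were recorded."""
    return [item for item in display_list if item.enabled]
```

A user who de-selects a search engine page would simply set `enabled` to `False` on that item before printing.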
As seen in the print preview display 23, the document includes a number of hypertext anchors 130-136 which are selectable by the user, for example by clicking on the mouse 52. The hypertext anchors 130-136 are so-called because they are formed using text, and form a subset of various types of hyperlink anchors corresponding, for example, to those typically seen in known browsing software packages.
According to arrangements such as the Hypertext Document Collating Tool, the document creation system may be configured to automatically access each of the URLs referenced by hyperlink anchors formed within the document level currently being reviewed. For example, if the document level currently being reviewed incorporates a home page, the Hypertext Document Collating Tool may be enabled to formulate a document illustrating only the home page itself and no subsidiary pages. Alternatively all subsidiary pages appended to the home page (ie having a common URL component) may be automatically selected. This feature, whilst desirable in accessing all information, is often undesirable because too much information may be accessed and in particular information not desired to be accessed. The only alternative is to disable the automatic accessing of all lower level documents and for the user to individually select those links desired to be added to the document being created.
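The automatic selection of subsidiary pages may be illustrated by the following Python sketch. It rests on an assumption not stated in the specification, namely that "a common URL component" means the same host and the same directory prefix as the home page. The function names `is_subsidiary` and `subsidiary_links` are hypothetical.

```python
from urllib.parse import urlsplit


def is_subsidiary(home_url, link_url):
    """True if link_url shares the host and directory prefix of home_url."""
    home, link = urlsplit(home_url), urlsplit(link_url)
    # Directory of the home page, e.g. "/site/index.html" -> "/site/".
    base = home.path.rsplit("/", 1)[0] + "/"
    return link.netloc == home.netloc and link.path.startswith(base)


def subsidiary_links(home_url, link_urls):
    """Keep, in their original order, only the links treated as subsidiary."""
    return [url for url in link_urls if is_subsidiary(home_url, url)]
```

Such a filter selects every lower level page at once, which is the behaviour the passage identifies as frequently returning more information than the user wants.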
Turning now to Figs. 9, 10 and 11, according to the preferred embodiment of the present invention, a mechanism is provided by which a limited plurality of the links may be simultaneously selected for incorporation into the printable document.
Fig. 9 shows a further print preview 60 similar to that of Fig. 2, but where only the individual pages to be printed are illustrated (and thus in many respects similar to the print preview function in many modern word processing packages). As seen, the print preview 60 includes two printable pages 61 and 62, the first page 61 incorporating each of the hyperlink anchors 130-136 shown in Fig. 2. As seen, further hyperlink anchors, including the anchor 138, are displayed on the second page 62.
According to the preferred embodiment, and according to the method 150 of Fig. 11, a user may choose to construct a document at step 152 from a number of sources. The user, at step 154, examines a primary source document, such as the print preview 60, to identify whether or not it contains multiple anchors. This determination, at step 156, permits those cases where only a single anchor is selected to create a simple linear document, in the fashion described above in relation to the Hypertext Document Collating Tool. Where multiple anchors exist, the selection of multiple ones of the hyperlink anchors is performed at step 158 by the user simultaneously selecting a number of the links, for example by creating a bounding box surrounding those links desired to be selected. In particular, and according to the preferred embodiment, the user of the computer system 40 down clicks the mouse 52 at a position 64 adjacent, but as seen, just outside, the hyperlink 130, and drags the mouse in a direction 66 to create a bounding box 65 which encloses at least two of the other hyperlink anchors (in this case the anchors 130 and 134). In this fashion, the bounding box 65 may then be enabled by up-clicking the mouse 52 to select each of those hypertext anchors within the bounding box 65 and then to incorporate each of those selected anchors into a contiguous document to be formed by the hyper text document creation software. In particular, following steps 160 and 162 are performed automatically in the preferred embodiment by the anchors selected by the bounding box 65 being automatically input to the print display list 22 of the Hypertext Document Collating Tool (see Fig.) which then automatically sources the material from each of the selected anchors and formats an appropriate contiguous reproducible document.
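The bounding box selection of step 158 may be sketched, by way of example only, as a rectangle test over the on-screen positions of the anchors. The `Anchor` record, its coordinate fields and the function `select_anchors` are hypothetical, and the rule that an anchor must lie wholly inside the box is an assumption; the specification says only that the box encloses the selected anchors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Anchor:
    label: str    # visible text of the hypertext anchor
    url: str      # location referenced by the anchor
    x: float      # left edge of the anchor on the previewed page
    y: float      # top edge of the anchor
    w: float      # width of the anchor
    h: float      # height of the anchor


def select_anchors(anchors, press, release):
    """Return the anchors wholly enclosed by the box dragged from press to release.

    press and release are (x, y) positions of the mouse down-click and
    up-click; the drag may run in any direction.
    """
    left, right = sorted((press[0], release[0]))
    top, bottom = sorted((press[1], release[1]))
    return [
        a for a in anchors   # document order is preserved
        if left <= a.x and a.x + a.w <= right
        and top <= a.y and a.y + a.h <= bottom
    ]
```

Because the list comprehension iterates over the anchors in document order, the selected set is returned in the order in which the anchors are encountered in the preview, which is the order used when the contiguous document is laid out.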
As seen in Fig. 9, the clicking and dragging of the mouse in the directions 68 and 70 may extend the bounding box 65 to enclose further hypertext anchors (eg. LINK03 and LINK07) and also to omit from selection certain other hypertext anchors (LINK04, LINK05, LINK06) using bounding boxes 67 and 69.
Fig. 10 illustrates a layout of a reproducible document 80 formed using the selection methods illustrated in Fig. 9 and based upon each of the hypertext anchors 130, 132, 134 and 138 enclosed by the bounding box 69. As seen, the reproducible document includes four pages 81, 82, 83 and 84 with the information contained in those pages being comprised of the information sourced by each of the hypertext anchors 130, 132, 134 and 138 in the order in which they are encountered in the print preview 60. As seen, the document 80 has each of its pages formatted into two columns and the first page 81 commences with information 85 sourced via the hypertext anchor 130 which follows onto the second column of the first page 81 having information 86. On conclusion of that information 86, further information 87 sourced from hypertext anchor 132 completes that page and follows on to the second page 82 where it occupies both columns 88 and 89, concluding in a first column 90 of the third page 83. The third information sourced from the hypertext anchor 134 then completes columns 91 and 92 of page 3 as well as the initial portion of a column 93 on the fourth page 84. The information sourced from the last hypertext anchor 138 completes a first column 94 of the fourth page 84.
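The flow of sourced information through successive columns, as described for Fig. 10, may be illustrated by the following Python sketch. It assumes, for simplicity, that each source is measured in lines and that every column holds the same number of lines; the function `flow_into_columns` and its parameters are hypothetical.

```python
def flow_into_columns(documents, column_height, columns_per_page=2):
    """Place documents one after another into fixed-height columns.

    documents is a list of (name, length_in_lines) pairs in selection order.
    Returns a list of (page, column, name, lines_placed) tuples, with pages
    and columns numbered from 1.
    """
    placements = []
    column_index = 0          # running count of columns across all pages
    free = column_height      # lines still free in the current column
    for name, length in documents:
        while length > 0:
            if free == 0:     # column full: continue in the next column
                column_index += 1
                free = column_height
            take = min(length, free)
            page, column = divmod(column_index, columns_per_page)
            placements.append((page + 1, column + 1, name, take))
            length -= take
            free -= take
    return placements
```

With a column height of 10 lines, a 15 line source followed by a 30 line source fills the first column of page 1, shares the second column of page 1 between the two sources, occupies both columns of page 2 and ends part way down the first column of page 3, so that no space is left vacant between sources.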
As a consequence of using the preferred embodiment of the present invention, a contiguous document is created using information obtained from selected sources simultaneously.
Further, where the contiguous document 80 includes further hypertext anchors, further documents may be created from the contiguous document 80 in a corresponding manner. The further document created may then be utilised independently or appended to the first document or alternatively merged into the first document.
In an alternative to selecting using the mouse pointing device 52, selection may be made using a predetermined window of the browsing or document creation software.
For example, referring to Fig. 7, whilst the document being created may include a large number of hypertext anchors, some of which are not temporally visible in the window, that portion of which is seen in the print preview section 23 includes only four hypertext anchors 130-136. According to a further embodiment of the present invention, the window 23 may be used to identify those hypertext anchors to be selected for the inclusion in a further reproducible document.
The present invention provides a means by which a single contiguous document may be created through simultaneously selecting a number of source documents from a larger number of available source documents. The simultaneous selection provides for the documents according to the preferred embodiment to be formatted into a single document in the order in which they appear in the initial selection. In alternative implementations, the selection area transcribed by movement of the mouse 52 need not be a simple rectangular bounding box but may be modifiable in shape, in a manner corresponding to drawing polygons in computer graphics packages, to create a non-uniform bounding area. Embodiments of the present invention find application in automated document creation from networked sources such as the Internet and LANs, using for example Web Browser, word processing or desk-top publishing applications.
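For the alternative in which the selection area is a polygon rather than a rectangle, one possible test, offered only as a sketch, is the standard ray casting (even-odd) point-in-polygon rule. The specification does not name any particular technique; the function `point_in_polygon` is hypothetical, and testing a single representative point of each anchor (for example its centre) is likewise an assumption.

```python
def point_in_polygon(point, polygon):
    """Even-odd ray casting test for a point against a simple polygon.

    polygon is a list of (x, y) vertices in drawing order. Points lying
    exactly on an edge are not given any special treatment.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate at which the edge crosses that horizontal line.
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside   # one more crossing to the right
    return inside
```

An L-shaped area such as `[(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]` would then admit an anchor centred at (1, 3) while omitting one centred at (3, 3), which a single rectangle covering both arms could not do.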
The foregoing describes only one embodiment of the present invention, and modifications can be made thereto without departing from the scope of the present invention.
In the context of this specification, the word "comprising" means "including principally but not necessarily solely" or "having" or "including" and not "consisting only of". Variations of the word comprising, such as "comprise" and "comprises" have corresponding meanings.

Claims (22)

  2. A method according to claim 1, wherein said document includes a plurality of printable pages each with matter sourced from at least one anchor of the set.
  3. A method according to claim 2, wherein said document includes a plurality of further anchors from which a (second) set of said further anchors are selectable and a further contiguous document is creatable incorporating matter sourced from each anchor of said second set.
  4. A method according to claim 3, wherein the second set of anchors is selectable from a plurality of said printable pages of said first document.
  5. A method according to claim 3 or 4, wherein said further document is one of: an independent document; appended to said first document; or merged into said first document.
  6. A method according to any one of the preceding claims, wherein the selection is made by scribing a computer pointing device about an area in which the anchors are present.
  7. A method according to any one of the preceding claims, wherein selection is made by selecting all those anchors represented within a displayed window of a document.
  8. Apparatus for constructing a document from a plurality of sources accessible via a computer network, said apparatus comprising: means for simultaneously selecting from at least one source document incorporating a plurality of hyperlink anchors a (first) set of said anchors, said set including at least two of said anchors; and means for creating a (first) contiguous document incorporating matter sourced from each of the (first) set of anchors selected simultaneously.
  9. Apparatus according to claim 8, wherein said document comprises a plurality of printable pages each with matter sourced from at least one anchor of the set.
  10. Apparatus according to claim 9, wherein said document comprises a plurality of further anchors from which a (second) set of said further anchors are selectable and means for forming a further contiguous document incorporating matter sourced from each anchor of said second set.
  11. Apparatus according to claim 10, wherein the second set of anchors is selectable from a plurality of said printable pages of said first document.
  12. Apparatus according to claim 10 or 11, wherein said further document is one of: an independent document; appended to said first document; or merged into said first document.
  13. Apparatus according to any one of claims 8 to 12, wherein the selection is made by scribing a computer pointing device about an area in which the anchors are present.
  14. Apparatus according to any one of claims 8 to 13, wherein selection is made by selecting all those anchors represented within a displayed window of a document.
  15. A computer readable medium incorporating a computer program product for constructing a document from a plurality of sources accessible via a computer network, said computer program product comprising: means for simultaneously selecting from at least one source document incorporating a plurality of hyperlink anchors a (first) set of said anchors, said set including at least two of said anchors; and means for creating a (first) contiguous document incorporating matter sourced from each of the (first) set of anchors selected simultaneously.
  16. A computer readable medium according to claim 15, wherein said document comprises a plurality of printable pages each with matter sourced from at least one anchor of the set.
  17. A computer readable medium according to claim 16, wherein said document comprises a plurality of further anchors from which a (second) set of said further anchors are selectable and means for forming a further contiguous document incorporating matter sourced from each anchor of said second set.
  18. A computer readable medium according to claim 17, wherein the second set of anchors is selectable from a plurality of said printable pages of said first document.
  19. A computer readable medium according to claim 17 or 18, wherein said further document is one of: an independent document; appended to said first document; or merged into said first document.
  20. A computer readable medium according to any one of claims 15 to 19, wherein the selection is made by scribing a computer pointing device about an area in which the anchors are present.
  21. A computer readable medium according to any one of claims 15 to 19, wherein selection is made by selecting all those anchors represented within a displayed window of a document.
  22. A method of constructing a document from a plurality of sources accessible via a computer network substantially as described herein with reference to the drawings.
  23. A computer readable medium incorporating a computer program product for performing the method of claim 22.
  24. A computerised system for constructing a document from a plurality of sources accessible via a computer network substantially as described herein with reference to the drawings.

DATED this FOURTEENTH day of SEPTEMBER 1999
Canon Kabushiki Kaisha
Patent Attorneys for the Applicant/Nominated Person
SPRUSON & FERGUSON
AU47578/99A 1998-09-15 1999-09-14 Document building using interactive browsing Ceased AU743115B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU47578/99A AU743115B2 (en) 1998-09-15 1999-09-14 Document building using interactive browsing

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
AUPP5954 1998-09-15
AUPP5954A AUPP595498A0 (en) 1998-09-15 1998-09-15 Document building using interactive browsing
AU47578/99A AU743115B2 (en) 1998-09-15 1999-09-14 Document building using interactive browsing

Publications (2)

Publication Number Publication Date
AU4757899A AU4757899A (en) 2000-03-23
AU743115B2 AU743115B2 (en) 2002-01-17

Family

ID=25627952

Family Applications (1)

Application Number Title Priority Date Filing Date
AU47578/99A Ceased AU743115B2 (en) 1998-09-15 1999-09-14 Document building using interactive browsing

Country Status (1)

Country Link
AU (1) AU743115B2 (en)

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5761436A (en) * 1996-07-01 1998-06-02 Sun Microsystems, Inc. Method and apparatus for combining truncated hyperlinks to form a hyperlink aggregate

Also Published As

Publication number Publication date
AU743115B2 (en) 2002-01-17

Similar Documents

Publication Publication Date Title
EP1597680B1 (en) Markup language cut-and-paste
US7240294B2 (en) Method of constructing a composite image
US7685426B2 (en) Managing and indexing content on a network with image bookmarks and digital watermarks
US6421070B1 (en) Smart images and image bookmarking for an internet browser
US6360236B1 (en) Computer product for integrated document development
US6332150B1 (en) Integrated document development method
JP3588337B2 (en) Method and system for capturing graphical printing techniques in a web browser
EP0834822A2 (en) World wide web news retrieval system
US20080065982A1 (en) User Driven Computerized Selection, Categorization, and Layout of Live Content Components
US20020196272A1 (en) Smart images and image bookmarks for an internet browser
US20040215719A1 (en) Method and system for designing, editing and publishing web page content in a live internet session
WO1999008210A1 (en) Creating and saving multi-frame web pages
JP4109807B2 (en) Document processing method and apparatus
TWI317487B (en) System, method, and computer readable medium for annotating a displayed received document without changing the received document content
WO2008092079A2 (en) System, method and apparatus for selecting content from web sources and posting content to web logs
AU5353896A (en) An integrated development platform for distributed publishing and management of hypermedia over wide area networks
KR19990044880A (en) Asynchronous Printing Method of Web Documents and Its System
US20040117732A1 (en) Method of and apparatus for creating a computer document
AU760816B2 (en) System for capturing, annotating and transmitting images of internet web pages
JP2001084212A (en) Method for preparing homepage
WO2002037939A9 (en) Method of constructing a composite image within an image space of a webpage
US20020010720A1 (en) Hyper-text document formatting collating and printing
US8151197B1 (en) On-line system for creating a printable product
TW201337605A (en) Multipurpose network editing page automatic conversion mechanism
AU743115B2 (en) Document building using interactive browsing

Legal Events

Date Code Title Description
FGA Letters patent sealed or granted (standard patent)