AU2004201773A1 - Method of Printing a Selected Element within a Web Page - Google Patents


Info

Publication number
AU2004201773A1
Authority
AU
Australia
Prior art keywords
file
hierarchical model
code
web page
html
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
AU2004201773A
Inventor
Richard Cudd
Matthew William Gallagher
John Stewart Reeves
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Canon Information Systems Research Australia Pty Ltd
Original Assignee
Canon Information Systems Research Australia Pty Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Canon Information Systems Research Australia Pty Ltd
Priority to AU2004201773A
Publication of AU2004201773A1
Legal status: Abandoned

Landscapes

  • Information Transfer Between Computers (AREA)

Description

S&F Ref: 676212
AUSTRALIA
PATENTS ACT 1990
COMPLETE SPECIFICATION FOR A STANDARD PATENT
Name and Address of Applicant: Canon Information Systems Research Australia Pty Ltd, an Australian Company, ACN 003 943 780, of 1 Thomas Holt Drive, North Ryde, New South Wales, 2113, Australia
Actual Inventor(s): Matthew William Gallagher, Richard Cudd, John Stewart Reeves
Address for Service: Spruson & Ferguson, St Martins Tower, Level 31 Market Street, Sydney NSW 2000 (CCN 3710000177)
Invention Title: Method of Printing a Selected Element within a Web Page
The following statement is a full description of this invention, including the best method of performing it known to me/us:
METHOD OF PRINTING A SELECTED ELEMENT WITHIN A WEB PAGE
Field of the Invention
The present invention relates to HTML web page documents and, in particular, to the printing of user selected elements within these documents.
Background
There are many known techniques for printing structured documents and, in particular, for printing HTML documents. HTML is an abbreviation of HyperText Markup Language, which is the authoring language used to create documents for the World Wide Web. HTML defines the structure and layout of a web document by using a variety of tags and attributes arranged in a hierarchical structure. The basic structure for an HTML document begins with the <HTML> tag followed by a <HEAD> tag that contains information as well as scripts that can run based on interaction or events within the document. Tags are closed using a "/" symbol before the tag name. For example, the <HTML> tag is closed by the </HTML> tag, which will normally be the last tag in the document. An HTML tag is a marker for an element, which may be thought of as a node in the hierarchy.
There are many other tags that are used to format the content of web pages including <TABLE> that lays out data in rows of data items and <FRAMESET> that allows multiple pages to be viewed as frames contained within a single page. A web browser is a computer operable application program that enables reading of HTML and other file types and the reproduction of documents, images and other media to a user. Examples of web browsers include Internet Explorer™ manufactured by Microsoft Corp. and Netscape Navigator™ manufactured by Netscape Corp.
Current methods of printing HTML files typically involve using web browser printing functions that include the ability to print selected regions of web pages and individual frames of a page. This can be achieved using the "Print Selection" option in Internet Explorer™ 6.0. The present inventors are not aware of an approach to print an individual element on a page that a user has selected. All that is possible is to print a region that the user has selected by dragging the mouse over the document. This can be restrictive and also inaccurate since the area that is selected can sometimes be difficult to control.
There is also currently no method for previewing this selected region. It is also possible to print a selected frame when viewing a web page containing a <FRAMESET> tag but this does not allow the user to print a particular element within that frame.
A number of other applications that use structured document formatting have printing capabilities. Microsoft Excel™ has the ability to print selected regions including printing individual cells, but is organised in a rigid grid structure that does not involve a tag/element hierarchy in the same way as HTML. This structure makes it very clear to the user which cell or cells are selected for printing. This is not the case in HTML pages.
PostScript™ and PDF page description languages, both developed by Adobe Systems, describe pages as collections of geometric objects and operate essentially as object oriented programming languages. They are not organised in a hierarchical structure and it is only possible to print selected pages or regions rather than specific elements in the document.
Summary of the Invention
It is an object of the present invention to substantially overcome or at least ameliorate one or more problems associated with existing arrangements.
In accordance with one aspect of the present invention, there is disclosed a method of printing a selected element of a web page described by an HTML file, the file being interpretable by a browser to determine a hierarchical model of the file, the method comprising the steps of:
(a) determining a selected element of the web page;
(b) accessing the hierarchical model of said web page;
(c) traversing the hierarchical model from the selected element upwards within the hierarchy thereof to determine a further element to be printed based upon a predetermined criterion;
(d) using the hierarchical model to extract HTML data for the further element from said HTML file, the extracted data being retained in a temporary file; and
(e) printing an interpretation by the browser of the temporary file.
In a preferred implementation, the method further comprises, before step (e), the step of: previewing an interpretation by the browser of the temporary file.
Advantageously, step (c) further comprises the steps of:
(ca) comparing a size of the HTML data for the selected element against a predetermined threshold size; and
(cb) if the size is smaller, then selecting the next element up in the hierarchical model.
Step (c) may further include the step of:
(cc) repeating steps (ca) and (cb) until the size of the extracted HTML data is above the threshold size.
Preferably step follows step and step (cc) is user selectable to thereby preview the temporary file to give a user the option of selecting or not selecting the next element up in the hierarchical model.
Numerous other aspects of the present invention are also disclosed.
Brief Description of the Drawings
At least one embodiment of the present invention will now be described with reference to the drawings and appendices, in which:
Fig. 1 shows a modified toolband used in the Internet Explorer™ web browser application incorporating the custom print option of the present disclosure;
Fig. 2 is a schematic block diagram of a general purpose computer upon which arrangements described can be practiced;
Fig. 3 depicts a custom printing approach for Internet Explorer™;
Fig. 4 shows a selected TD element on an example page containing a TABLE element;
Fig. 5 is a flow diagram showing the selection of and printing of an element;
Fig. 6 depicts a prior art printing of an example web page seen in Fig.
Fig. 7 shows a source image used in the web page of Fig.
Fig. 8 shows a printed page derived from a first selection of the web page of Fig.
Fig. 9 shows a printed page derived from thresholded selection of the web page of Fig.
Fig. 10 shows a printed page derived from another thresholded third selection of the web page of Fig.
Fig. 11 shows a printed page derived from another selection of the web page of Fig.
Fig. 12 shows a printed page derived from a further selection of the web page of Fig.
Fig. 13 shows a printed page derived from another selection of the web page of Fig.
Fig. 14 shows a printed page derived from another thresholded selection of the web page of Fig.
Fig. 15 shows an example web page represented using a prior art browser GUI;
Fig. 16 shows an example frames test page represented using a generic browser GUI;
Fig. 17 shows a printed page derived from the web page of Fig. 16;
Fig. 18 shows another printed page derived from the web page of Fig. 16;
Appendix 1 shows the HTML source code for the web page shown in Fig.
Appendix 2 shows the HTML source code for the web page shown in Fig. 16; and
Appendix 3 provides example code demonstrating the activeElement property.
Detailed Description including Best Mode
The present disclosure proposes a printing application, termed herein "custom print", which is preferably implemented as an add-on toolband to a generic browser application, such as Internet Explorer™ 5.5 or 6.0. An example of this is seen in Fig. 1 where part of the Internet Explorer™ 6.0 graphical user interface (GUI) 100 is shown. The GUI 100 includes a known first toolband 102 corresponding to that which permits access to the Google™ search engine and associated facilities. A further toolband 104, developed for customised printing as part of the present disclosure, is provided to enable customised printing of web pages. It is observed that the custom print toolband 104 is distinct from a generic print icon 106 contained in the toolband 108 of the GUI 100. In this connection, the selection of the print icon 106 will result in traditional printing of the web page displayed by the GUI 100, whereas selection of a print icon 110 or a print preview icon 112 will enable printing in the fashion now to be described with reference to the remaining drawings for the printing of individual elements within the HTML page.
The method of customised printing is preferably practiced using a general-purpose computer system 200, such as that shown in Fig. 2 wherein the processes of Figs. 3 to may be implemented as software, such as an application program executing within the computer system 200. In particular, the steps of the method of customised printing are effected by instructions in the software that are carried out by the computer. The instructions may be formed as one or more software code modules, each for performing one or more particular tasks. The software may also be divided into three separate parts, in which a first part performs the browsing methods, a second part performs the customised printing methods and a third part manages a user interface between the first and second parts and the user. The third part incorporates the GUI 100 including the toolband 102 and the icons 110 and 112. The software may be stored in a computer readable medium, including the storage devices described below, for example. The software is loaded into the computer from the computer readable medium, and then executed by the computer. A computer readable medium having such software or computer program recorded on it is a computer program product. The use of the computer program product in the computer preferably effects an advantageous apparatus for customised printing of individual elements of structured documents such as web pages. The application program for customised printing may operate in concert with a web browser application also operating within the computer system 200.
The computer system 200 is formed by a computer module 201, input devices such as a keyboard 202 and mouse 203, output devices including a printer 215, a display device 214 and loudspeakers 217. A Modulator-Demodulator (Modem) transceiver device 216 is used by the computer module 201 for communicating to and from a communications network 220, and is connectable via a telephone line 221, for example, or other functional medium. The modem 216 can be used to obtain access to a server 222, the Internet, World Wide Web and other network systems, such as a Local Area Network (LAN) or a Wide Area Network (WAN), and may be incorporated into the computer module 201 in some implementations.
The computer module 201 typically includes at least one processor unit 205, and a memory unit 206, for example formed from semiconductor random access memory (RAM) and read only memory (ROM). The module 201 also includes a number of input/output (I/O) interfaces including an audio-video interface 207 that couples to the video display 214 and loudspeakers 217, an I/O interface 213 for the keyboard 202 and mouse 203 and optionally a joystick (not illustrated), and an interface 208 for the modem 216 and printer 215. In some implementations, the modem 216 may be incorporated within the computer module 201, for example within the interface 208. A storage device 209 is provided and typically includes a hard disk drive 210 and a floppy disk drive 211. A magnetic tape drive (not illustrated) may also be used. A CD-ROM drive 212 is typically provided as a non-volatile source of data. The components 205 to 213 of the computer module 201 typically communicate via an interconnected bus 204 and in a manner which results in a conventional mode of operation of the computer system 200 known to those in the relevant art. Examples of computers on which the described arrangements can be practised include IBM-PCs and compatibles, Sun Sparcstations or alike computer systems evolved therefrom.
Typically, the browser application program, by which a user of the computer 200 accesses the Web, is resident on the hard disk drive 210 and read and controlled in its execution by the processor 205. Intermediate storage of the program and any data fetched from the network 220 may be accomplished using the semiconductor memory 206, possibly in concert with the hard disk drive 210. In some instances, the application program may be supplied to the user encoded on a CD-ROM or floppy disk and read via the corresponding drive 212 or 211, or alternatively may be read by the user from the network 220 via the modem device 216. Still further, the software can also be loaded into the computer system 200 from other computer readable media. The term "computer readable medium" as used herein refers to any storage or transmission medium that participates in providing instructions and/or data to the computer system 200 for execution and/or processing.
Examples of storage media include floppy disks, magnetic tape, CD-ROM, a hard disk drive, a ROM or integrated circuit, a magneto-optical disk, or a computer readable card such as a PCMCIA card and the like, whether or not such devices are internal or external of the computer module 201. Examples of transmission media include radio or infra-red transmission channels as well as a network connection to another computer or networked device, and the Internet or Intranets including e-mail transmissions and information recorded on Websites and the like. Actuation of the print icons 106 or 110 causes the respective print functions to effect printing using the printer 215, for example.
The modem 216 enables the user of the computer 200 to access a web page via the network 220. The web page may be resident on the server computer 222 and accessible via a Web address defined by a Uniform Resource Locator (URL) to thereby reproduce content to the user, typically via the display 214. Typically the browser application activates the GUI 100 upon the display 214 by which the content and other information associated with the web page is presented.
From Internet Explorer™ version 5.5 onwards, it has been possible to customize how that web browser application prints and previews documents. The mechanism for printing and previewing is controlled by print templates, which are HTML files that developers can create to control the layout and look of a print job. The HTML files contain JScript code (a Microsoft version of JavaScript™ developed by Sun Microsystems and Netscape Inc.) that is used for manipulating content and accessing objects within the print template. While print templates are HTML files, they can only be applied by making calls from within code written in the C++ programming language. This code is loaded as a dynamically linked library (.dll) file and embedded in the web browser, thereby enabling the web browser to issue IDM_PRINT or IDM_PRINTPREVIEW commands. These are the commands that are used for normal printing, such as via the icon 106, and print previewing by Internet Explorer™, and they provide the path to the new print template. Since the code is embedded in the web browser, the code can also be used to access the Document Object Model (DOM) of the currently loaded page in the browser. Browsers traditionally interpret the source HTML file by parsing the HTML code to create the corresponding DOM which, of itself, models the hierarchical structure of the web page. The DOM can be used to both manipulate and process the currently loaded document. It is a combination of this C++ code, which is used to analyse the DOM, and the print template that is used to implement the custom print application.
Fig. 3 demonstrates how custom print templates are called from C++ code. A method 300 of processing printing commands is shown which operates as part of the GUI 100. In a first step 302, the browser application detects and determines a user selection of print or print preview from the GUI 100. Where default Internet Explorer™ printing is selected, for example via the icon 106, step 304 allows Internet Explorer™ to call its own default print template so as to print in a traditional fashion. Where step 302 detects selection of custom printing via the icons 110 or 112, step 306 follows where the .dll code embedded in Internet Explorer™ issues a corresponding IDM_PRINT or IDM_PRINTPREVIEW command. Step 308 then follows where the embedded code triggers the custom print template for HTML pages.
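The branching of method 300 can be sketched as below. This is an illustration only, written in JavaScript rather than the embedded code of the add-on. The function name `dispatchPrint`, the `source` labels and the template names are hypothetical; only the command names IDM_PRINT and IDM_PRINTPREVIEW are taken from the description.

```javascript
// Hypothetical sketch of method 300 (steps 302 to 308).
// "source" names the control the user activated; template names are placeholders.
function dispatchPrint(source) {
  // Step 304: the generic print icon 106 uses the browser's default template.
  if (source === "icon106") {
    return { command: "IDM_PRINT", template: "default" };
  }
  // Step 306: the custom print icon 110 or preview icon 112 issues the
  // corresponding command.
  if (source === "icon110" || source === "icon112") {
    const command = source === "icon112" ? "IDM_PRINTPREVIEW" : "IDM_PRINT";
    // Step 308: the embedded code points the command at the custom template.
    return { command, template: "custom-print-template.html" };
  }
  throw new Error("Unknown print source: " + source);
}
```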
By creating print templates and toolbands it is possible to control:
(i) the layout of pages when printed/previewed, and the content that is printed/previewed on them;
(ii) how print jobs are handled, for instance which pages are printed, and in what order; and
(iii) the look of the print preview window and controls available on the print preview user interface.
The process of element printing can now be described with reference to the method 500 shown in the flowchart of Fig. 5 and an example web page depicted in Fig. 4.
The method 500 may be implemented as software as an add-on application to the browser application operating within the computer system 200 as described above. A web page 400, for example sourced from the server 222, may be browsed using the browser application, this being represented in Fig. 5 as step 502 and illustrated via the GUI 100 in Fig. 4. Whilst browsing the web page 400, a user can highlight using the keyboard 202, or click upon using the mouse 203, part of a web page in the normal browser window. Detection of the user selection is performed at step 504. This can be achieved using the DOM to identify a point in the hierarchical model corresponding to the selected element of the page 400. As seen in Fig. 4, the web page 400 includes a table 402 formed of a number of cells. A particular cell 404 has a portion highlighted.
If the user then clicks on Print 110 or Print Preview 112 from the browser custom print toolband 104, the DOM of the page is examined and an attribute called activeElement is returned in step 506. Before activeElement can return the element on the page that was highlighted or clicked upon, it is necessary to "switch on" the activeElement property. This can be achieved by setting the tabIndex attribute of each element in the page to be zero.
Some sample code is provided in Appendix 3 to demonstrate how this can be done. For example, in Fig. 4, a TD element (ie. a table data element) is active. This attribute returns the element containing the region on the page that was highlighted by the user. This element is referred to as the "selected element". If no element was selected, then the activeElement is set to BODY in the case of a non-frames page, and FRAMESET for a frames page. These two tags are the first tags in the HTML hierarchy and correspond to the whole page being selected.
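The "switching on" of activeElement by setting tabIndex to zero on every element may be sketched as follows. This is not the code of Appendix 3; it is a minimal illustration in which the page is modelled with plain objects so that it runs outside a browser. The function name `enableActiveElement` and the toy page are assumptions.

```javascript
// Sets tabIndex to 0 on an element and all of its descendants so that any of
// them can receive focus and be reported as the active element.
// Works on any object shaped like { tabIndex, children }.
function enableActiveElement(root) {
  let count = 0;
  const stack = [root];
  while (stack.length > 0) {
    const el = stack.pop();
    el.tabIndex = 0;
    count += 1;
    for (const child of el.children) stack.push(child);
  }
  return count; // number of elements updated
}

// Toy model of a page: BODY > TABLE > TR > two TD cells.
const page = {
  tagName: "BODY", tabIndex: -1, children: [
    { tagName: "TABLE", tabIndex: -1, children: [
      { tagName: "TR", tabIndex: -1, children: [
        { tagName: "TD", tabIndex: -1, children: [] },
        { tagName: "TD", tabIndex: -1, children: [] },
      ] },
    ] },
  ],
};
enableActiveElement(page);
```

In a current browser the same loop could be applied to the real document body, after which the `document.activeElement` property reports the element that was clicked or highlighted.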
If the activeElement property is set to a TD element, as in the case for the example of Fig. 4, then "element only" printing is enabled. This test is performed at step 508. In this example, if any other type of element is currently the activeElement then "element only" printing is ignored and normal printing of the entire page is performed according to step 510 after which the method 500 concludes at step 512. In Fig. 4, the selected TD element is the entire cell 404, even though only four letters (ie. "ther") of the text in that cell had been selected.
If a TD element is detected as the activeElement in step 508 then this indicates that part of a table has been selected. The user is then prompted in step 514 and asked if they wish to print normally or print using only the selected element. If the user decides to print normally, step 510 follows and normal printing operates as before. If the user selects to print using the selected element, then the method 500 commences processing the HTML.
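The decision made across steps 508 to 514 in this example can be summarised in a few lines. The function name `choosePrintMode` and its return values are hypothetical; the logic follows the description, in which only a TD element enables "element only" printing and the user may still opt for normal printing.

```javascript
// activeTagName: tag name of the element returned by activeElement.
// userWantsElementOnly: the user's answer to the prompt of step 514.
function choosePrintMode(activeTagName, userWantsElementOnly) {
  // Step 508: any element other than TD leads to normal printing (step 510).
  if (activeTagName !== "TD") return "normal";
  // Step 514: the user chooses between normal and element-only printing.
  return userWantsElementOnly ? "element" : "normal";
}
```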
Initially, at step 516, the size of the current element to be printed is examined. In the present example, the content of the TD element will be examined. The innerText attribute of the TD element in the DOM will return all the text that is contained within the element. If the amount of text in the element is over a certain threshold (for example a size represented by a length of 200 characters for a text element) then step 522 follows and just the TD element alone will be printed. Methods for changing and selecting this threshold are discussed later in this specification.
If the amount of text is below this threshold then the next element up in the HTML hierarchy will be extracted and considered for printing, as shown in step 520. The parent element of the TD element (which is always a TR, or table row element) can be extracted using the parentElement attribute in the DOM. Once this element has been extracted it can also be examined in step 518 using the innerText attribute to see how much data it contains.
A TR element may include a number of TD elements embedded within it and all the characters from these will be included when querying the innerText of the TR element. If the size of the innerText is still not above the threshold then the next element up in the HTML hierarchy will be extracted (again using the parentElement) which will be the entire table. Again this will be examined using innerText to determine the size of the entire table content. If the whole table is still below the threshold then the next element up in the HTML hierarchy is again extracted using the parentElement attribute and the process continues until either the threshold is passed or the top-level of the document is reached.
This last test is performed in step 521 of Fig. 5, where if the next level is not the top level, step 516 follows to examine the relevant size. Where the next level is the top level, step 510 prints the page in the usual manner. As such, steps 516, 518 and 520 provide that, using the selected element as a commencement point, the DOM is traversed according to a predetermined criterion until the criterion is met.
When the current element exceeds the threshold, the entire HTML for the current element is copied to a temporary file in step 522 so that it can be printed or previewed according to step 524. The method 500 concludes at step 526. Such printing or preview is, as noted above, performed by interpretation by the browser.
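The upward traversal of steps 516 to 521 can be sketched as below. Elements are modelled as plain objects carrying the innerText and parentElement properties named in the description; the function names, the toy page and the treatment of "top level" as the element with no parent are assumptions made for the illustration.

```javascript
// Walks upward from the selected element. Returns the first element whose
// text length exceeds the threshold, or null when the top level is reached,
// which corresponds to printing the whole page (step 510).
function findElementToPrint(selected, threshold = 200) {
  let current = selected;
  while (current.parentElement !== null) {
    if (current.innerText.length > threshold) return current; // go to step 522
    current = current.parentElement; // step 520: next element up
  }
  return null; // step 521: top level reached
}

// Toy element whose innerText has the given length.
function makeElement(tagName, textLength, parentElement) {
  return { tagName, innerText: "x".repeat(textLength), parentElement };
}

const body = makeElement("BODY", 900, null);
const table = makeElement("TABLE", 260, body);
const row = makeElement("TR", 120, table);
const cell = makeElement("TD", 40, row);

const chosen = findElementToPrint(cell); // the TABLE element in this toy page
```

With the 200 character threshold, the cell (40 characters) and the row (120 characters) are both too small, so the table (260 characters) is chosen. Raising the threshold above 260 makes the traversal reach the top level and fall back to normal printing.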
The threshold can be set in a number of ways. The user may simply select the size (in characters, image area or some combination of the two) for the threshold, and print if the element is above this size, or select the next element up in the hierarchy if not. The method described in the previous paragraph only considers setting the threshold based on the size of the text within a table element. Tables frequently include images and other objects embedded within them that may be quite large but not contain any text. It is possible to include these in the threshold value by querying each element's height and width attributes in order to determine the area taken up by these objects. This can then be factored into the threshold calculation. Alternatively, the user may be presented with a preview at each stage and having viewed the content could decide whether to print or get the next element up in the hierarchy. In such an implementation, steps 518 and 520 could be moved to follow the preview of step 524.
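One way the area of embedded images might be factored into the size is sketched below. The description does not specify a weighting, so the conversion factor `PIXELS_PER_CHARACTER`, the function name `combinedSize` and the shape of the element object are all assumptions.

```javascript
// Hypothetical conversion factor: how many pixels of image area count as
// one character of text. The description does not give a weighting.
const PIXELS_PER_CHARACTER = 100;

// element: { innerText, images: [{ width, height }] }, where images lists the
// objects embedded in the element and its descendants.
function combinedSize(element, pixelsPerCharacter = PIXELS_PER_CHARACTER) {
  const area = element.images.reduce(
    (sum, img) => sum + img.width * img.height,
    0
  );
  return element.innerText.length + area / pixelsPerCharacter;
}

// A cell holding 20 characters of text and one 50 x 50 pixel image.
const cellWithImage = {
  innerText: "x".repeat(20),
  images: [{ width: 50, height: 50 }],
};
const size = combinedSize(cellWithImage); // 20 + 2500 / 100 = 45
```

The combined value can then be compared against the threshold in place of the text length alone.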
The threshold value may alternatively or additionally be based on a value that the user usually selects, or upon the user's previous history of selections. A threshold may then be suggested automatically after a user has carried out a number of prints.
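A threshold suggestion based on the user's history might look like the sketch below. The description does not say how the suggestion is computed; taking the most frequently chosen value once a minimum number of prints has been made is an assumption, as are the names `suggestThreshold` and `minPrints`.

```javascript
// history: thresholds (in characters) the user chose for earlier prints.
// Returns the most frequently chosen value once enough prints have been
// made, otherwise null. minPrints = 3 is an arbitrary illustrative value.
function suggestThreshold(history, minPrints = 3) {
  if (history.length < minPrints) return null;
  const counts = new Map();
  let best = null;
  for (const value of history) {
    const n = (counts.get(value) || 0) + 1;
    counts.set(value, n);
    // Ties keep the value that reached the count first.
    if (best === null || n > counts.get(best)) best = value;
  }
  return best;
}
```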
Once a threshold has been met or the user has selected the element that is desired for printing, according to step 522, the correct HTML must be extracted for printing. The outerHTML attribute of the selected element is used to construct a temporary web page for printing. This page must also have the opening and closing HTML and BODY tags added so that it is correctly formatted as a separate page. A BASE tag must also be added so that images can be correctly referenced in the temporary local copy that is being created. This tag provides the path (in the form of a URL) where the images and other objects on the page can be found. The tag is required because the page will be stored locally so all the paths that were in the page whilst it was on the server will no longer be valid. Once the temporary file has been created, it can be used for printing or previewing according to step 524. If the method of selecting the tag to be printed involves previewing at each stage before printing, then it is necessary to create the temporary file at each stage before previewing the page.
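The construction of the temporary page can be sketched as follows. The function name `buildTemporaryPage` and the example URL are hypothetical, and placing the BASE tag inside a HEAD element is an assumption; the description only states that HTML, BODY and BASE tags are added around the outerHTML of the chosen element.

```javascript
// Wraps the outerHTML of the chosen element in a minimal page.
// baseUrl is the location the original page was loaded from, so that
// relative image paths still resolve once the copy is stored locally.
function buildTemporaryPage(outerHTML, baseUrl) {
  // Escape characters that would break the quoted attribute value.
  const safeUrl = baseUrl.replace(/&/g, "&amp;").replace(/"/g, "&quot;");
  return [
    "<HTML>",
    "<HEAD>",
    `<BASE HREF="${safeUrl}">`,
    "</HEAD>",
    "<BODY>",
    outerHTML,
    "</BODY>",
    "</HTML>",
  ].join("\n");
}

// Example with a placeholder URL and a small ordered list.
const tempPage = buildTemporaryPage(
  "<OL><LI>First item</LI></OL>",
  "http://www.example.com/pages/"
);
```

The resulting string would be written to the temporary file that the browser then interprets for printing or previewing in step 524.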
The add-on program which affords the custom print functionality described above may be delivered to the computer 200 and user via the Web and accessed via the browser application. Upon downloading the add-on program, the browser application is automatically supplemented by the functionality of the add-on program to offer customised printing to the user.
Whilst the description above is focussed on the printing of a TABLE element or one of its child elements in the HTML hierarchy, the principles disclosed herein can be readily applied to any other HTML element that may be highlighted by the user, as will be apparent from the Examples that now follow.
Examples
The custom printing approach described above is now illustrated in a number of forms using an example (non-frames) web page 1500 seen in Fig. 15 depicted within an Internet Explorer™ GUI. The source code, EXAMPLE.HTML, for the web page 1500 is seen in Appendix 1 which structures the page 1500 with a heading 1502, an ordered list 1504, an unordered list 1506, a simple text paragraph 1508, a paragraph 1510 containing text and an image, a paragraph 1512 containing text and a link, and a table 1514 having a first cell 1516 containing a text paragraph, and a second cell 1518 containing a first paragraph 1520, an image 1522 and another paragraph 1524. In the second cell 1518, the image 1522 forms its own paragraph, in view of the paragraph 1520 having a closing paragraph tag and the paragraph 1524 having an opening paragraph tag.
Fig. 7 shows the source image used at both 1510 and 1522, and which is seen to be rectangular, whereas the HTML source of Appendix 1 sets the images at 1510 and 1522 to be displayed at the same size of 50 x 50 pixels, as illustrated in Fig.
The hierarchical structure of HTML source code includes, as a minimum, a root node, and one or more leaf nodes. Often, one or more intermediate nodes are provided. In Fig. 15, and as better seen from Appendix 1, the paragraphs 1508, 1510, 1512, 1516, 1520, 1524 and the image 1522 are examples of leaf nodes of the HTML hierarchy. The ordered list 1504, unordered list 1506 and the table 1514 are examples of intermediate nodes in the hierarchy.
It will be appreciated that, for example, Appendix 1 represents a mere printing of the HTML source file. In contrast, printing of the HTML source file, as that file has been interpreted by the relevant browser application (ie. the web page), results in a rendering to a physical page of the formatting and data defined by the HTML source file.
Traditional printing of the page 1500, using Internet Explorer™ 6.0, operates as follows: right clicking on any text will enable printing the whole page; right clicking on any image will print only the image; right clicking on any link will print the whole linked page; and selection of an area with the mouse will enable printing the selected area. An example of this mode is seen in Fig. 6 where everything on the page 1500 has been selected except the table 1514.
Printing of the page 1500 can also be achieved using the custom print application described above, in the following ways.
Clicking on an element in either the ordered or the unordered list will print only the element if its size is greater than the threshold. This is seen in Fig. 8. If the selected element is not greater than the threshold, then the next element up in the hierarchy will be considered which, in this case, is an <OL> element as seen from Appendix 1. If that next element is greater than the threshold, the entire content of the element will be printed (ie. the entire ordered list 1504, as seen in Fig.). If this element is still not greater than the threshold, then the next element up in the hierarchy will be selected, in this case a element. If this element is greater than the threshold then the entire contents of the element (ie. the ordered list 1504 and the unordered list 1506) will be printed, as seen in Fig.
Selecting a portion of text in the line 1510 containing the text and the image will print the entire element, which in this case is a element, provided the size of the element is greater than the threshold. The print will contain both the text and the image as seen in Fig. 11. Here multiple data types, being within the same (leaf) node of the hierarchical HTML structure, are printed even though only part of one data type may have been selected. This can apply to other combinations of data types, such as text and tables, for example if text is selected and is in an element within the table.
Selecting any part of the text 1512 containing the link will again print the entire element as shown in Fig. 12.
Clicking on the cell 1516 of the table 1514 will print the cell as long as the size of the cell is greater than the threshold, as seen in Fig. 13. If its size is not greater than the threshold, then the next element up in the hierarchy will be selected. In this case, as seen from Appendix 1, this is a <TR> (table row element). If this element is greater than the threshold, then that element will be printed, as shown in Fig. 14.
A further example of a frames web page 1600 is depicted in Fig. 16. The frames page 1600 is represented within a generic web browser GUI 1610 and includes a top frame 1602 and, arranged beneath, a left frame 1604 and a right frame 1606. The source code, MAINFRAME.HTML, for the page 1600 is provided in Appendix 2. The top frame 1602 and left frame 1604 each comprise only a single line of text. It will be seen that the right frame 1606 is the web page 1500 of Fig. 15 and Appendix 1. In view of the page 1500 being larger than the frame 1606, only part of the page 1500 may be viewed on the GUI 1610. A scroll bar 1612 is provided, associated with the right frame 1606, to enable the user to scroll the contents thereof. It will be appreciated that in Fig. 16 some scrolling has taken place, in view of the heading 1502 of the page 1500 being shown truncated.
Traditional printing of the page 1600, using Internet Explorer™ 6.0, operates as follows: selecting print from the File menu in the toolbar will enable printing the entire page; right-clicking on any frame will enable printing the entire frame; right-clicking on any image will enable printing the image; right-clicking on any link will enable printing the whole linked page; and selecting an area with the mouse will enable printing of the selected area.
Printing of the page 1600 can also be achieved using the custom print application described above. Printing for content on a frames page will be the same as for a non-frames page except that each entire frame will be included in the hierarchy, and can be printed if the threshold is set high enough. Figs. 17 and 18 show the results of selecting for print the top frame 1602 and left frame 1604, respectively.
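One way to picture each entire frame being included in the hierarchy is to let the root of a framed document point at the frame element that hosts it, so that an upward walk continues into the frames page. The sketch below assumes plain node objects and a hypothetical `hostFrame` link; neither is taken from the custom print application.

```javascript
// Hypothetical node: `parent` links nodes within one document, while
// `hostFrame` (an assumed name) links the root of a framed document to the
// FRAME element that displays it in the enclosing frames page.
function frameNode(tag, parent, hostFrame) {
  return { tag: tag, parent: parent || null, hostFrame: hostFrame || null };
}

// List the tags met when walking upwards from a node, crossing from a
// framed document into the frames page where necessary.
function ancestry(start) {
  const tags = [];
  let current = start;
  while (current !== null) {
    tags.push(current.tag);
    current = current.parent !== null ? current.parent : current.hostFrame;
  }
  return tags;
}

// Simplified model of MAINFRAME.HTML (Appendix 2) hosting FRAME1.HTML.
const outerFrameset = frameNode("FRAMESET", null, null);
const topFrame = frameNode("FRAME", outerFrameset, null);
const frameDocRoot = frameNode("HTML", null, topFrame); // root of FRAME1.HTML
const frameDocBody = frameNode("BODY", frameDocRoot, null);

console.log(ancestry(frameDocBody).join(" > "));
```

With such a chain, a sufficiently high threshold carries a selection made inside a frame up to the frame as a whole, which is the behaviour shown in Figs. 17 and 18.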
It will be apparent from the Examples that the custom print application has the ability to traverse the hierarchy of the HTML source to identify and extract, where the threshold size is set appropriately, leaf nodes of the hierarchy. This is seen from Fig. 8 and Appendix 1 where the printed matter comprises a single item (a leaf node) from the ordered list. The content typically referenced by intermediate nodes is printed when the threshold size is larger (eg. Fig. 9 and Fig. 10). Where the root node is selected (ie. the whole web page), custom printing provides a result akin to that of the traditional web browser except that the result is arrived at in a different fashion.
Industrial Applicability
The arrangements described are applicable to the computer and data processing industries and particularly those involving the printing of data sourced from computer networks. The disclosure is directly applicable to web browsing applications and the printing of data sourced using such applications.
The foregoing describes only some embodiments of the present invention, and modifications and/or changes can be made thereto without departing from the scope and spirit of the invention, the embodiments being illustrative and not restrictive.
In the context of this specification, the word "comprising" means "including principally but not necessarily solely" or "having" or "including", and not "consisting only of". Variations of the word "comprising", such as "comprise" and "comprises", have correspondingly varied meanings.
Appendix 1
This Appendix provides the HTML source code for the web page seen in Fig.
EXAMPLE.HTML
<HTML>
<HEAD>
<TITLE>An example web page</TITLE>
</HEAD>
<BODY>
<H1> A sample web page </H1>
<P>
<OL>
<LI>This is the first item in an ordered list
<LI>This is the second item in an ordered list
<LI>This is the third item in an ordered list
<LI>This is the fourth item in an ordered list
</OL>
<UL>
<LI>This is the first item in an unordered list
<LI>This is the second item in an unordered list
<LI>This is the third item in an unordered list
<LI>This is the fourth item in an unordered list
</UL>
<P>
This is a paragraph containing some text. This is a paragraph containing some text. This is a paragraph containing some text. This is a paragraph containing some text. This is a paragraph containing some text. This is a paragraph containing some text. This is a paragraph containing some text. This is a paragraph containing some text. This is a paragraph containing some text. This is a paragraph containing some text.
<P>
This is a paragraph containing some text and an image.
<IMG SRC="image.jpg" HEIGHT="50">
<P>
This is a paragraph containing a link to <A HREF="http://www.bbc.co.uk">www.bbc.co.uk</A>.
<TABLE BORDER=1>
<TR>
<TD>
The United States has said the United Nations must act to prevent Iraq's "active and systematic efforts" to hide its efforts to produce weapons of mass destruction. Secretary of State Colin Powell presented tape recordings, satellite photographs and intelligence data showing Baghdad's "evasion and deception" in the face of UN weapons inspections.
</TD>
<TD>
<P>
A stunning Julian Gray strike and a Stephane Henchoz own-goal gave 10-man Crystal Palace a sensational win over Liverpool to provide the shock of the FA Cup fourth round.
Gray crashed his shot past Jerzy Dudek on 55 minutes to set up Palace's win and then provided the cross which Henchoz deflected into his own net 11 minutes from time.
<IMG SRC="image.jpg" HEIGHT="50">
<P>
Maybe this defeat will prove a blessing in disguise. In between the goals, Palace substitute Dougie Freedman was dismissed for elbowing Sami Hyypia in the face. But Liverpool were unable to make use of their numerical advantage and Gerard Houllier's misfiring team have given their manager another hugely disappointing performance to ponder.
</TD>
</TR>
</TABLE>
</BODY>
</HTML>
Appendix 2
This Appendix provides the HTML source code for the frames web page seen in Fig. 16.
MAINFRAME.HTML
<HTML>
<HEAD>
<TITLE> Frame Test Page</TITLE>
</HEAD>
<FRAMESET ROWS=100,* FRAMEBORDER=1>
<FRAME FRAMEBORDER=0 FRAMESPACING=0 MARGINHEIGHT=0 MARGINWIDTH=0 NAME=BOTTOM SRC="frame1.html">
<FRAMESET COLS=200,* FRAMEBORDER=1 FRAMESPACING=1>
<FRAME FRAMEBORDER=0 FRAMESPACING=0 MARGINHEIGHT=0 MARGINWIDTH=0 NAME=LEFT SRC="frame2.html">
<FRAME FRAMEBORDER=0 FRAMESPACING=0 NAME=RIGHT SRC="example.html">
</FRAMESET>
</FRAMESET>
<NOFRAMES>
<CENTER>
Your browser does not support frames.
</CENTER>
</NOFRAMES>
</HTML>
FRAME1.HTML
<HTML>
<BODY>
This is the top frame
</BODY>
</HTML>
FRAME2.HTML
<HTML>
<BODY>
This is the left frame
</BODY>
</HTML>
Appendix 3
This Appendix provides example code demonstrating the activeElement property.
function TurnOnActiveElement()
{
    // Give every element a tab index so that it can become the active element
    var allElements = document.all;
    for (var i = 0; i < allElements.length; i++)
    {
        allElements[i].tabIndex = 0;
    }
}
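A standards-based variant of the Appendix 3 routine might look like the sketch below. It rests on several assumptions: `getElementsByTagName("*")` is used in place of the Internet Explorer `document.all` collection, the `selectedElement` helper and the stand-in `fakeDoc` object are hypothetical, and the premise that a focusable element becomes the `activeElement` when clicked is the presumed purpose of the routine rather than something stated in this Appendix.

```javascript
// Make every element focusable so that clicking it updates activeElement.
// `doc` is expected to be a Document; getElementsByTagName("*") returns
// all of its elements.
function turnOnActiveElement(doc) {
  const allElements = doc.getElementsByTagName("*");
  for (let i = 0; i < allElements.length; i++) {
    allElements[i].tabIndex = 0;
  }
}

// After a click, the focused element can be read back as the selection.
function selectedElement(doc) {
  return doc.activeElement;
}

// Stand-in document (an assumption) so the sketch runs outside a browser.
const fakeCell = { tagName: "TD", tabIndex: -1 };
const fakeDoc = {
  activeElement: fakeCell,
  getElementsByTagName: function () { return [fakeCell]; }
};

turnOnActiveElement(fakeDoc);
console.log(fakeCell.tabIndex, selectedElement(fakeDoc).tagName);
```

In a browser the same functions would be called with the page's own `document` object instead of `fakeDoc`.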

Claims (5)

1. A method of printing a selected element of a web page described by an HTML file, the file being interpretable by a browser to determine a hierarchical model of the file, said method comprising the steps of: (a) determining a selected element of the web page; (b) accessing the hierarchical model of said web page; (c) traversing the hierarchical model from the selected element upwards within the hierarchy thereof to determine a further element to be printed based upon a predetermined criterion; (d) using the hierarchical model to extract HTML data for the further element from said HTML file, said extracted data being retained in a temporary file; and (e) printing an interpretation by the browser of the temporary file.
2. A method as claimed in claim 1, further comprising, before step (e), the step of: (f) previewing an interpretation by the browser of the temporary file.
3. A method as claimed in claim 1 or 2, wherein step (c) further comprises the steps of: (ca) comparing a size of the HTML data for the selected element against a predetermined threshold size; and (cb) if the size is smaller, then selecting the next element up in the hierarchical model.
4. A method as claimed in claim 3 wherein step (c) further comprises the step of: (cc) repeating steps (ca) and (cb) until the size of the extracted HTML data is above said threshold size.
5. A method as claimed in claim 4 when dependent on claim 2, wherein step (f) follows step and step (cc) is user selectable to thereby preview said temporary file to give a user the option of selecting or not selecting the next element up in the hierarchical model.
6. A method as claimed in claim 3, 4 or 5, wherein said threshold size is determined based on previous user behaviour.
7. A method as claimed in claim 3, 4 or 5 wherein said threshold size is determined using a predetermined value for the length of the extracted HTML data.
8. A method as claimed in claim 3, 4 or 5 wherein said threshold size is determined using a predetermined value for the amount of space the extracted HTML data occupies on a page interpreted by the browser when printed.
9. A method as claimed in claim 3, 4 or 5 wherein said threshold size is determined by allowing the user to change the threshold size and view a result of the change in a preview of said temporary file.
10. A method as claimed in claim 3, 4 or 5 wherein said threshold size is determined by allowing the user to preview the print of the extracted HTML data and decide whether to select the next element up in the hierarchical model.
11. A method according to any one of the preceding claims wherein said element comprises a leaf node of said hierarchical model.
12. A method according to claim 1 wherein said element comprises an intermediate node and associated leaf nodes of said hierarchical model.
13. A method according to claim 1 wherein said element comprises matter within a cell of a table.
14. A method according to claim 1 wherein said element comprises matter within a frame.
15. A method according to claim 1 wherein said element comprises a list.
16. A method according to claim 1 wherein said element comprises a table.
17. A method according to claim 11 wherein said leaf node comprises matter having multiple data types.
18. A method according to claim 17 wherein said data types are selected from the group consisting of: (i) text and images; (ii) text and tables; and (iii) text and another data type.
19. A method according to any one of the preceding claims wherein said hierarchical model comprises the Document Object Model of the web page.
20. A computer readable medium, having a program recorded thereon, where the program is configured to make a computer execute a procedure to print a selected element of a web page described by an HTML file, the file being interpretable by a browser to determine a hierarchical model of the file, said program comprising: code for determining a selected element of the web page; code for accessing the hierarchical model of said web page; code for traversing the hierarchical model from the selected element upwards within the hierarchy thereof to determine a further element to be printed based upon a predetermined criterion; code for using the hierarchical model to extract HTML data for the further element from said HTML file, said extracted data being retained in a temporary file; and code for printing an interpretation by the browser of the temporary file.
21. A computer readable medium as claimed in claim 20, further comprising code for previewing an interpretation by the browser of the temporary file.
22. A computer readable medium as claimed in claim 20 or 21 wherein said code for traversing further comprises: code for comparing a size of the HTML data for the selected element against a predetermined threshold size; and code for selecting the next element up in the hierarchical model if the size is smaller.
23. A computer readable medium as claimed in claim 22 wherein said code for traversing further comprises: code for executing said code for comparing and said code for selecting until the size of the extracted HTML data is above said threshold size.
24. A computer readable medium as claimed in claim 23 when dependent on claim 21, wherein operation of said code for executing is user selectable to thereby preview said temporary file to give a user the option of selecting or not selecting the next element up in the hierarchical model.
25. A computer readable medium as claimed in claim 22, 23 or 24, wherein said threshold size is determined based on previous user behaviour.
26. A computer readable medium as claimed in claim 22, 23 or 24 wherein said threshold size is determined using a predetermined value for the length of the extracted HTML data.
27. A computer readable medium as claimed in claim 22, 23 or 24 wherein said threshold size is determined using a predetermined value for the amount of space the extracted HTML data occupies on a page interpreted by the browser when printed.
28. A computer readable medium as claimed in claim 22, 23 or 24 wherein said threshold size is determined by allowing the user to change the threshold size and view a result of the change in a preview of said temporary file.
29. A computer readable medium as claimed in claim 22, 23 or 25 wherein said threshold size is determined by allowing the user to preview the print of the extracted HTML data and decide whether to select the next element up in the hierarchical model.
30. A computer readable medium according to any one of the preceding claims wherein said element comprises at least one of: (i) a leaf node of said hierarchical model; (ii) an intermediate node and associated leaf nodes of said hierarchical model; (iii) matter within a cell of a table; (iv) matter within a frame; (v) a list; and (vi) a table.
31. A computer readable medium according to claim 30 wherein said leaf node comprises matter having multiple data types, said data types being selected from the group consisting of: (i) text and images; (ii) text and tables; and (iii) text and another data type.
32. A computer readable medium according to any one of claims 20 to 31 wherein said hierarchical model comprises the Document Object Model of the web page.
33. A web browser application program arranged for printing a selected element of a web page described by an HTML file, the file being interpretable by a browser to determine a hierarchical model of the file, said program comprising: code for determining a selected element of the web page; code for accessing the hierarchical model of said web page; code for traversing the hierarchical model from the selected element upwards within the hierarchy thereof to determine a further element to be printed based upon a predetermined criterion; code for using the hierarchical model to extract HTML data for the further element from said HTML file, said extracted data being retained in a temporary file; and code for printing an interpretation by the browser of the temporary file.
34. An application program arranged for printing a selected element of a web page described by an HTML file, the file being interpretable by a browser to determine a hierarchical model of the file, said program comprising: code for incorporating said program into operation of said browser; code for determining a selected element of the web page; code for accessing the hierarchical model of said web page; code for traversing the hierarchical model from the selected element upwards within the hierarchy thereof to determine a further element to be printed based upon a predetermined criterion; code for using the hierarchical model to extract HTML data for the further element from said HTML file, said extracted data being retained in a temporary file; and code for printing an interpretation by the browser of the temporary file.
35. Apparatus for printing a selected element of a web page described by an HTML file, the file being interpretable by a browser to determine a hierarchical model of the file, said apparatus comprising: means for determining a selected element of the web page; means for accessing the hierarchical model of said web page; means for traversing the hierarchical model from the selected element upwards within the hierarchy thereof to determine a further element to be printed based upon a predetermined criterion; means for using the hierarchical model to extract HTML data for the further element from said HTML file, said extracted data being retained in a temporary file; and means for printing an interpretation by the browser of the temporary file.
36. A method of printing a selected element of a web page described by an HTML file, said method being substantially as described herein with reference to Fig. 5 of the drawings.
37. A computer readable medium, having a program recorded thereon, where the program is configured to make a computer execute a procedure to print a selected element of a web page described by an HTML file, said program being substantially as described herein with reference to Fig. 5 of the drawings.
38. A browser application program configured to make a computer execute a procedure to print a selected element of a web page described by an HTML file, said program being substantially as described herein with reference to Fig. 5 of the drawings.
39. A browser application program operable to perform printing substantially as described herein with reference to the drawings or the Examples.
40. A computer system incorporating a browser application program according to claim 38 or 39.
DATED this TWENTY-EIGHTH Day of APRIL 2004
CANON INFORMATION SYSTEMS RESEARCH AUSTRALIA PTY LTD
Patent Attorneys for the Applicant
SPRUSON & FERGUSON
AU2004201773A 2004-04-28 2004-04-28 Method of Printing a Selected Element within a Web Page Abandoned AU2004201773A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
AU2004201773A AU2004201773A1 (en) 2004-04-28 2004-04-28 Method of Printing a Selected Element within a Web Page

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
AU2004201773A AU2004201773A1 (en) 2004-04-28 2004-04-28 Method of Printing a Selected Element within a Web Page

Publications (1)

Publication Number Publication Date
AU2004201773A1 true AU2004201773A1 (en) 2004-06-03

Family

ID=34280634

Family Applications (1)

Application Number Title Priority Date Filing Date
AU2004201773A Abandoned AU2004201773A1 (en) 2004-04-28 2004-04-28 Method of Printing a Selected Element within a Web Page

Country Status (1)

Country Link
AU (1) AU2004201773A1 (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110035657A1 (en) * 2009-06-09 2011-02-10 Canon Kabushiki Kaisha Image processing apparatus, image processing method, and storage medium
US20110096361A1 (en) * 2008-05-19 2011-04-28 Canon Kabushiki Kaisha Print control method and print control apparatus for controlling printing of structured document
US20120096341A1 (en) * 2010-10-15 2012-04-19 Canon Kabushiki Kaisha Information processing apparatus, information processing method and non-transitory computer-readable storage medium
CN111597010A (en) * 2020-05-27 2020-08-28 北京智美智学科技有限公司 Method and device for generating pictures of Web pages, printing equipment and recording medium

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110096361A1 (en) * 2008-05-19 2011-04-28 Canon Kabushiki Kaisha Print control method and print control apparatus for controlling printing of structured document
US9141587B2 (en) * 2008-05-19 2015-09-22 Canon Kabushiki Kaisha Print control method and print control apparatus for controlling printing of structured document
US20110035657A1 (en) * 2009-06-09 2011-02-10 Canon Kabushiki Kaisha Image processing apparatus, image processing method, and storage medium
US9141324B2 (en) * 2009-06-09 2015-09-22 Canon Kabushiki Kaisha Outputting selective elements of a structured document
US20120096341A1 (en) * 2010-10-15 2012-04-19 Canon Kabushiki Kaisha Information processing apparatus, information processing method and non-transitory computer-readable storage medium
US9170759B2 (en) * 2010-10-15 2015-10-27 Canon Kabushiki Kaisha Information processing apparatus, information processing method and non-transitory computer-readable storage medium
CN111597010A (en) * 2020-05-27 2020-08-28 北京智美智学科技有限公司 Method and device for generating pictures of Web pages, printing equipment and recording medium

Similar Documents

Publication Publication Date Title
US7802182B2 (en) System and method for performing visual property updates
US7487447B1 (en) Web page zoom feature
US10387535B2 (en) System and method for selectively displaying web page elements
US7428699B1 (en) Configurable representation of structured data
CA2773152C (en) A method for users to create and edit web page layouts
US7712016B2 (en) Method and apparatus for utilizing an object model for managing content regions in an electronic document
US9015144B2 (en) Configuring web crawler to extract web page information
US7536641B2 (en) Web page authoring tool for structured documents
US6826553B1 (en) System for providing database functions for multiple internet sources
US9830309B2 (en) Method for creating page components for a page wherein the display of a specific form of the requested page component is determined by the access of a particular URL
US20080065982A1 (en) User Driven Computerized Selection, Categorization, and Layout of Live Content Components
US20100251143A1 (en) Method, system and computer program for creating and editing a website
US20150193386A1 (en) System and Method of Facilitating Font Selection and Manipulation of Fonts
US7490290B2 (en) System and method for a look and feel designer with a skin editor
US20110191671A1 (en) Website Font Previewing
WO2009142813A2 (en) Target-alignment-and-drop control for editing electronic documents
US20060271840A1 (en) Layout-based page capture
US8812551B2 (en) Client-side manipulation of tables
US20060174187A1 (en) System and method for a look and feel designer with a page-view interface
WO2002017162A2 (en) Capture, storage and retrieval of markup elements
AU2004201773A1 (en) Method of Printing a Selected Element within a Web Page
KR20060042095A (en) System and method for a tool pane within a markup language document
CA2631105A1 (en) System and method for creating and editing content on a webpage
GB2373698A (en) Storage of a portion of a web-page containing a link
AU2002100469A4 (en) A thin-client web authoring system, web authoring method

Legal Events

Date Code Title Description
MK1 Application lapsed section 142(2)(a) - no request for examination in relevant period