WO2002027520A1 - Segmenting electronic documents for use on a device of limited capability - Google Patents

Segmenting electronic documents for use on a device of limited capability

Info

Publication number
WO2002027520A1
WO2002027520A1 PCT/US2001/030465 US0130465W WO0227520A1 WO 2002027520 A1 WO2002027520 A1 WO 2002027520A1 US 0130465 W US0130465 W US 0130465W WO 0227520 A1 WO0227520 A1 WO 0227520A1
Authority
WO
WIPO (PCT)
Prior art keywords
subdocuments
document
client
subdocument
client device
Prior art date
Application number
PCT/US2001/030465
Other languages
English (en)
Other versions
WO2002027520A9 (fr
Inventor
Richard D. Romero
Adam L. Berger
Original Assignee
Eizel Technologies, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US09/745,289 external-priority patent/US7613810B2/en
Priority claimed from US09/745,290 external-priority patent/US7210100B2/en
Application filed by Eizel Technologies, Inc.
Priority to JP2002531030A priority Critical patent/JP2004510253A/ja
Priority to KR10-2003-7004386A priority patent/KR20030045086A/ko
Priority to CA002423695A priority patent/CA2423695A1/fr
Priority to EP01975565A priority patent/EP1320806A4/fr
Priority to AU2001294881A priority patent/AU2001294881A1/en
Publication of WO2002027520A1 publication Critical patent/WO2002027520A1/fr
Publication of WO2002027520A9 publication Critical patent/WO2002027520A9/fr

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/14 Session management
    • H04L67/142 Managing session states for stateless protocols; Signalling session states; State transitions; Keeping-state mechanisms
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G06F16/9577 Optimising the visualization of content, e.g. distillation of HTML documents
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/04 Protocols specially adapted for terminals or networks with limited capabilities; specially adapted for terminal portability
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/06 Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/2866 Architectures; Arrangements
    • H04L67/30 Profiles
    • H04L67/303 Terminal profiles
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/565 Conversion or adaptation of application format or content
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/50 Network services
    • H04L67/56 Provisioning of proxy services
    • H04L67/565 Conversion or adaptation of application format or content
    • H04L67/5651 Reducing the amount or size of exchanged application data
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/30 Definitions, standards or architectural aspects of layered protocol stacks
    • H04L69/32 Architecture of open systems interconnection [OSI] 7-layer type protocol stacks, e.g. the interfaces between the data link level and the physical level
    • H04L69/322 Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions
    • H04L69/329 Intralayer communication protocols among peer entities or protocol data unit [PDU] definitions in the application layer [OSI layer 7]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W28/00 Network traffic management; Network resource management
    • H04W28/02 Traffic management, e.g. flow control or congestion control
    • H04W28/06 Optimizing the usage of the radio link, e.g. header compression, information sizing, discarding information
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04W WIRELESS COMMUNICATION NETWORKS
    • H04W28/00 Network traffic management; Network resource management
    • H04W28/16 Central resource management; Negotiation of resources or communication parameters, e.g. negotiating bandwidth or QoS [Quality of Service]
    • H04W28/18 Negotiating wireless communication parameters

Definitions

  • This invention relates to segmenting, transforming, and viewing electronic documents.
  • Electronic documents such as web pages, text files, email, and enterprise (proprietary corporate) data are typically accessed using desktop or laptop computers that have display screens larger than 10 inches diagonally and Internet connections with a communication rate of at least 28.8 kbps.
  • Electronic documents are typically designed for transmission to and rendering on such devices.
  • Internet-enabled devices like mobile phones, hand-held devices (PDAs), pagers, set-top boxes, and dashboard-mounted microbrowsers often have smaller screens (e.g., as little as two or three inches diagonally), relatively low communication rates on wireless networks, and small memories. Some of these devices cannot render any part of a document whose size exceeds a fixed limit, while others may truncate a document after a prescribed length. Accessing electronic documents (which often contain many paragraphs of text, complex images, and even rich media content) can be unwieldy or impossible using these devices.
  • Automatic content transformation systems convert electronic documents originally designed for transmission to and rendering on large-screen devices into versions suitable for transmission to and rendering on small-display, less powerful devices such as mobile phones. See, for example, Wei-Ying Ma, Ilja Bedner, Grace Chang, Allan Kuchinsky, and HongJiang Zhang, "A Framework for Adaptive Content Delivery in Heterogeneous Network Environments," Proc. of SPIE Multimedia Computing and Networking 2000, San Jose, CA, January 2000.
  • SUMMARY
  • The invention features a method that includes receiving a machine-readable file containing a document that is to be served to a client for display on a client device, the organization of each of the documents in the file being expressed as a hierarchy of information, and deriving subdocuments from the hierarchy of information, each of the subdocuments being expressed in a format that permits it to be served separately to the client using a hypertext transmission protocol, at least one of the subdocuments containing information that enables it to be linked to another one of the subdocuments.
  • Implementations of the invention may include one or more of the following features.
  • The language in which the hierarchy is expressed is Extensible Markup Language (XML).
  • The deriving includes traversing the hierarchy and assembling the subdocuments from segments, at least some of the subdocuments each being assembled from more than one of the segments.
  • The assembling conforms to an algorithm that tends to balance the respective sizes of the subdocuments, that tends to favor assembling each subdocument from segments that have common parents in the hierarchy, or that tends to favor assembling each subdocument from segments for which replication of nodes in the hierarchy is not required.
  • The file is received from an origin server associated with the file.
  • The file is expressed in a language that does not organize segments of the document in a hierarchy, and the deriving of subdocuments includes first converting the file to a language that organizes segments of the document in a hierarchy.
  • The subdocuments are served to the client individually as requested by the client.
  • The subdocuments are served to the client using a hypertext transmission protocol.
  • The subdocuments are requested by the client based on the contained information that enables a subdocument to be linked to another of the subdocuments.
  • A portion of the document is identified that is to be displayed separately from the rest of the document.
  • A graphical device is embedded that can be invoked by the user to retrieve the subdocument that includes the portion of the document that is to be displayed separately.
  • The invention features a machine-readable document held on a storage medium for serving to a client, the document being organized as a set of subdocuments, each of the subdocuments containing information that enables the subdocument to be linked to another of the subdocuments, each of the subdocuments comprising an assembly of segments of the document that are part of a hierarchical expression of the document, the subdocuments being of approximately the same size.
  • Implementations of the invention may include one or more of the following features.
  • The information that enables the subdocument to be linked includes a URL.
  • The hierarchical expression includes Extensible Markup Language (XML).
  • The invention features receiving from a client a request for a document to be displayed on a client device, serving separately to the client a subdocument that represents less than all of the requested document, each subdocument containing information that links it to at least one other subdocument, receiving from the client an invocation of the link to the other subdocument, and serving separately to the client device the other subdocument.
  • Implementations of the invention may include one or more of the following features.
  • The subdocuments are served to the client using a hypertext transmission protocol.
  • The subdocuments are of essentially the same length.
  • The subdocuments are of a length that can be displayed on the client device without further truncation.
  • The invention features a method that includes receiving from a server at a client device a subdocument of a larger document for display on the client device, displaying the subdocument on the client device, receiving at the client device a request of a user to have displayed another subdocument of the larger document, receiving separately from the server at the client device the other subdocument, and displaying the other subdocument on the client device, the subdocuments being of substantially the same length.
  • Implementations of the invention may include one or more of the following features. All of each of the subdocuments is displayed at one time on the client device, or less than all of each of the subdocuments is displayed on the client device at one time.
  • The invention features a method that includes displaying a subdocument of a document on a client device, displaying an icon with the subdocument, and in response to invocation of the icon, fetching another subdocument of the document from a server and displaying the other subdocument on the client device, each of the subdocuments being less than the entire document, the subdocuments being of approximately the same size.
  • Implementations of the invention may include one or more of the following features. An indication is given of the position of the currently displayed subdocument in a series of subdocuments that make up the document. The indication includes the total number of subdocuments in the series and the position of the currently displayed document in the sequence. The subdocuments are derived from the document at the time of a request from the client device for the document.
  • The subdocuments are derived in a manner that is based on characteristics of the client device.
  • The characteristics of the client device are provided by the client in connection with the request. The characteristics include the display capabilities and memory constraints of the client device.
  • The subdocuments are derived from the document before the client requests the document from the server.
  • The subdocuments are derived for different documents from different origin servers.
  • The subdocuments are derived from the document at a wireless communication gateway.
  • The invention features apparatus that includes a network server configured to receive a machine-readable file containing a document that is to be served to a client for display on a client device, and to derive subdocuments from the file, each of the subdocuments being expressed in a format that permits it to be served separately to the client using a hypertext transmission protocol, at least one of the subdocuments containing information that enables it to be linked to another one of the subdocuments.
  • The invention features apparatus comprising means for receiving a machine-readable file containing a document that is to be served to a client for display on a client device, and means for deriving subdocuments from the file, each of the subdocuments being expressed in a format that permits it to be served separately to the client using a hypertext transmission protocol, at least one of the subdocuments containing information that enables it to be linked to another one of the subdocuments.
  • The invention features a machine-readable program stored on a machine-readable medium and capable of configuring a machine to receive a machine-readable file containing a document that is to be served to a client for display on a client device, and derive subdocuments from the file, each of the subdocuments being expressed in a format that permits it to be served separately to the client using a hypertext transmission protocol, at least one of the subdocuments containing information that enables it to be linked to another one of the subdocuments.
  • Figure 1 shows a document transforming and serving system.
  • Figure 2 shows a document.
  • Figure 3 shows a flow diagram.
  • Figures 4 and 5 show document hierarchies.
  • Figure 6 shows a process for document transformation.
  • Figure 7 shows a database.
  • Figure 8 shows a document transformation system.
  • Figure 9 shows a process for expressing preferences.
  • Figures 10 and 11 show preference forms.
  • Figure 12 shows a wireless/wired communication system.
  • Figure 13 shows a document transformation system.
  • Figure 14 shows a web page.
  • Figures 15 and 16 show small-screen displays of portions of a web page.
  • Figure 17 shows isolating subdocuments for separate use.
  • Electronic documents are segmented and transformed before being served through low-bandwidth communication channels for viewing on user devices that have small displays and/or small memories.
  • The segmentation feature is described first, followed by the transformation feature.
  • A user of an Internet-enabled device 10 (a WAP-enabled mobile phone, for example) requests an electronic document 12 (e.g., a web page, an email, a text file, or a document in a proprietary format or markup language) by sending its URL to a proxy server.
  • The proxy server requests the document from an origin server 16 using the URL.
  • The origin server is a computer on the Internet responsible for the document.
  • The proxy server breaks (segments) the document into subdocuments.
  • The proxy server transmits the first of these subdocuments 1 to the client as a web page.
  • The segmenting of the document need not be done in the proxy server but can be done in other places in the network, as described later.
  • Each of the subdocuments 20 delivered by the proxy server to the client contains hyperlinks 22, 24 to the next and previous subdocuments in the series (each where applicable).
  • The hyperlinks are displayed to the user. If the user selects a forward-pointing (or backward-pointing) hyperlink from a subdocument, that request is transmitted to the proxy server, which responds with the next (or previous) subdocument.
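  • The linking of subdocuments described above can be sketched as follows. This is a minimal illustration, not the method of the application: the function name `add_navigation`, the `url_for` callback, and the HTML used for the links are all assumptions. It also adds the "part i of n" position indicator mentioned in the summary.

```python
def add_navigation(subdocs, url_for):
    """Append previous/next hyperlinks and a position indicator to each subdocument.

    `subdocs` is a list of markup strings; `url_for(i)` is a hypothetical
    callback returning the URL at which subdocument i can be requested.
    """
    pages = []
    total = len(subdocs)
    for i, body in enumerate(subdocs):
        links = []
        if i > 0:  # a backward-pointing link exists for all but the first
            links.append('<a href="%s">Previous</a>' % url_for(i - 1))
        if i < total - 1:  # a forward-pointing link exists for all but the last
            links.append('<a href="%s">Next</a>' % url_for(i + 1))
        footer = "<p>Part %d of %d %s</p>" % (i + 1, total, " ".join(links))
        pages.append(body + footer)
    return pages


pages = add_navigation(["<p>a</p>", "<p>b</p>"], lambda i: "/doc/%d" % i)
```

  • The first page carries only a forward link and the last page only a backward link, matching the "each where applicable" condition.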
  • The first step of the segmentation process is to determine (30) the maximum document size permitted by the client device. If the client-server communication adheres to the HTTP protocol standards as described in RFC 2616 (R. Fielding et al., RFC 2616: Hypertext Transfer Protocol - HTTP/1.1. June, 1999. **http://www.
  • The client advertises information about itself to the proxy server within the header information sent in the HTTP request.
  • The server can use, for instance, the value of the USER-AGENT field to determine the type of microbrowser installed on the client device and, from this information, determine the maximum document size by consulting a table that lists the maximum document size for each known device.
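  • A minimal sketch of the table lookup, assuming the request headers are available as a dictionary. The browser names, byte limits, and fallback value below are illustrative placeholders, not real device data.

```python
# Hypothetical table: User-Agent substring -> maximum document size in bytes.
# The entries are illustrative placeholders, not real device limits.
MAX_DOC_BYTES = {
    "ExampleWapBrowser/1.0": 1400,
    "ExamplePdaBrowser/2.0": 8000,
}
DEFAULT_MAX_BYTES = 2000  # assumed fallback for unknown devices


def max_document_size(headers):
    """Return the assumed size limit M for the client that sent `headers`."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    user_agent = normalized.get("user-agent", "")
    for marker, limit in MAX_DOC_BYTES.items():
        if marker in user_agent:
            return limit
    return DEFAULT_MAX_BYTES
```

  • A conservative default for unrecognized devices is one possible design choice; a real deployment would need a maintained device table.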
  • The next step of the segmentation process is to convert the input document into XML (32), a markup language whose tags impose a hierarchical tree structure on the document.
  • An example of such a tree structure is shown in figure 4. Conversion to XML from many different source formats, including HTML, can be done using existing software packages.
  • The third step is to apply a procedure to divide (34) the XML tree 40 into segments, each of which is no longer than the maximum document size M.
  • The leaves 42 of the tree represent elements of the original document: text blocks, images, and so on.
  • Internal nodes 44 of the tree represent structural and markup information: markers denoting paragraphs, tables, hyperlinked text, regions of bold text, and so on.
  • One strategy for accomplishing the segmentation task is to use an agglomerative, bottom-up leaf-clustering algorithm.
  • The leaf-clustering approach begins by placing each leaf in its own segment (as shown in Figure 4) and then iteratively merging segments until there exists no adjacent pair of segments that should be merged.
  • Figure 5 shows the same tree after two merges have occurred, leaving merged segments 50, 52.
  • Each merging operation generates a new, modified tree, with one fewer segment.
  • Each step considers all adjacent pairs of segments, and merges the pair that is optimal according to a scoring function defined on candidate merges. An example scoring function is described below.
  • The final segments represent partitions of the original XML tree.
  • A lower score represents a more desirable merge.
  • The score of merging segments x and y is related to the following quantities:
  • The size of the segments. The scoring function should favor merging smaller segments rather than larger ones. Let |x| denote the number of bytes in segment x. All else being equal, if |x| = 100, |y| = 150, and |z| = 25, then a good scoring function causes score(x,z) < score(y,z) < score(x,y). The effect of this criterion, in practice, is to balance the sizes of the resulting partitions.
  • The familial proximity of the segments. If segments x and y have a common parent, then they comprise a more desirable merge than if they are related only through a grandparent (or more remote ancestor) node. That two segments are related only through a distant ancestor is less compelling evidence that the segments belong together than if they are related through a less distant ancestor.
  • The node replication required by the merge. Internal nodes may have to be replicated when converting segments into well-formed documents. In partitioning an original document into subdocuments, one would like to minimize redundancy in the resulting subdocuments. Let d(x,y) denote the least number of nodes one must travel through the tree from segment x to segment y, and let r(x,y) denote the amount of node replication required by merging segments x and y.
  • Algorithm 1: Agglomerative segmentation of an XML document. Input: D, an XML document; M, the maximum permissible subdocument length.
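  • The body of Algorithm 1 is not reproduced in this text, so the following is only a sketch of a bottom-up leaf-clustering procedure consistent with the description. The `Leaf` representation, the linear form of `score`, and its weights are assumptions; the replication term r(x,y) is omitted for brevity, so only segment size and tree distance d(x,y) are scored.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    path: tuple  # ids of the leaf's ancestor nodes, root first
    size: int    # bytes of content in this leaf


def tree_distance(a, b):
    """d(x, y): number of internal nodes on the tree path between two leaves."""
    common = 0
    for pa, pb in zip(a.path, b.path):
        if pa != pb:
            break
        common += 1
    # Nodes below the deepest shared ancestor on each side, plus that ancestor.
    return (len(a.path) - common) + (len(b.path) - common) + 1


def score(x, y, size_weight=1.0, distance_weight=50.0):
    """Assumed scoring function: lower is better; weights are arbitrary."""
    size = sum(leaf.size for leaf in x) + sum(leaf.size for leaf in y)
    return size_weight * size + distance_weight * tree_distance(x[-1], y[0])


def segment(leaves, max_bytes):
    """Merge adjacent segments greedily until no merge fits within max_bytes."""
    segments = [[leaf] for leaf in leaves]  # start with one leaf per segment
    while True:
        best = None
        for i in range(len(segments) - 1):
            x, y = segments[i], segments[i + 1]
            if sum(leaf.size for leaf in x + y) > max_bytes:
                continue  # this merge would exceed the device limit M
            s = score(x, y)
            if best is None or s < best[0]:
                best = (s, i)
        if best is None:
            return segments
        i = best[1]
        segments[i:i + 2] = [segments[i] + segments[i + 1]]


leaves = [Leaf(("root", "P"), 40), Leaf(("root", "P"), 40),
          Leaf(("root", "Q"), 60), Leaf(("root", "Q"), 30)]
result = segment(leaves, max_bytes=100)
```

  • With these inputs the two leaves under P are merged first, then the two under Q; the resulting 80-byte and 90-byte segments cannot be merged without exceeding 100 bytes. A single leaf larger than `max_bytes` is left as its own segment in this sketch.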
  • TextTiling is an algorithm designed to find optimal locations to place dividers within text sources.
  • The next step is to convert the segments of the final tree into individual, well-formed XML documents (36). Doing so may require replication of nodes. For instance, in Figure 5, merging leaves B and F has the effect of separating the siblings F and G. This means that when converting the first and second segments of the tree on the right into well-formed documents, each document must contain an instance of node C. In other words, node C is duplicated in the set of resulting subdocuments. The duplication disadvantage would have been more severe if nodes F and G were related not by a common parent, but by a common grandparent, because then both the parent and grandparent nodes would have to be replicated in both segments.
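  • Node replication can be illustrated with Python's standard `xml.etree.ElementTree` module. The tree below is a hypothetical one loosely modeled on the B, C, F, G example (the actual tree of Figure 5 is not available here), and `extract_segment` is an assumed helper name.

```python
import copy
import xml.etree.ElementTree as ET


def extract_segment(root, keep):
    """Return a pruned copy of `root` holding only the `keep` leaves and their ancestors."""
    keep_ids = {id(elem) for elem in keep}

    def prune(elem):
        if id(elem) in keep_ids:
            return copy.deepcopy(elem)
        kept_children = [c for c in map(prune, elem) if c is not None]
        if not kept_children:
            return None  # nothing below this node belongs to the segment
        clone = ET.Element(elem.tag, dict(elem.attrib))  # replicated ancestor
        clone.extend(kept_children)
        return clone

    return prune(root)


doc = ET.fromstring("<A><B>b</B><C><F>f</F><G>g</G></C></A>")
B, C = list(doc)
F, G = list(C)
first = ET.tostring(extract_segment(doc, [B, F]), encoding="unicode")
second = ET.tostring(extract_segment(doc, [G]), encoding="unicode")
```

  • Here `first` is `<A><B>b</B><C><F>f</F></C></A>` and `second` is `<A><C><G>g</G></C></A>`: both are well formed, and node C (as well as the root A) appears in each.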
  • After having computed a segmentation for the source document, the proxy server stores the individual subdocuments in a cache or database (38) to expedite future interaction with the user.
  • When the user selects a hyperlink, the request is forwarded to the proxy server, which responds (39) with the appropriate subdocument, now stored in its cache. If the proxy server is responsible for handling requests from many different clients, the proxy server maintains state (41) for each client to track which document the client is traversing and the constituent subdocuments of that document.
  • The proxy server can again use the HTTP header information, this time to determine a unique identification (an IP address, for example, or a phone number for a mobile phone) for the client device, and use this code as a key in its internal database, which associates a state with each user.
  • The agglomerative segmentation algorithm (Algorithm 1, above) is performed only once per source document, at the time the user first requests the document. As the user traverses the subdocuments comprising the source document, the computational burden for the proxy server is minimal; all that is required is to deliver the appropriate, already-stored subdocument.
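  • The caching and per-client state described above might be organized as in this sketch. The class name `SubdocumentCache`, its methods, and the use of in-memory dictionaries are assumptions; `segmenter` stands in for whatever segmentation routine the proxy uses.

```python
class SubdocumentCache:
    """Segments each source document once and tracks each client's position."""

    def __init__(self, segmenter):
        self._segmenter = segmenter  # callable: source document -> list of subdocuments
        self._subdocs = {}           # document URL -> cached subdocuments
        self._state = {}             # client id -> (document URL, current index)
        self.segmentations = 0       # how many times segmentation actually ran

    def open(self, client_id, url, source):
        """Handle a first request for a document; return its first subdocument."""
        if url not in self._subdocs:
            self._subdocs[url] = self._segmenter(source)
            self.segmentations += 1
        self._state[client_id] = (url, 0)
        return self._subdocs[url][0]

    def step(self, client_id, delta):
        """Handle a next (+1) or previous (-1) request from a known client."""
        url, index = self._state[client_id]
        subdocs = self._subdocs[url]
        index = max(0, min(len(subdocs) - 1, index + delta))
        self._state[client_id] = (url, index)
        return subdocs[index]


cache = SubdocumentCache(lambda text: text.split("|"))
```

  • In practice the cache key would probably also need to include the device's size limit, since different devices may require different segmentations of the same document.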
  • An original HTML document 100 may contain a form 102.
  • The document can be segmented into subdocuments 104, 106, and 108 that represent parts of the main body of the document and subdocuments 110, 112 that represent portions of the form 102.
  • One of the subdocuments 106 contains an icon 114 that represents a link 116 to the form.
  • Other links 118, 120, and 122 permit navigation among the subdocuments as described earlier.
  • The content of the subdocuments that are served to the user devices can be automatically transformed in ways that reduce the amount of data that must be communicated and displayed without rendering the information represented by the data unusable.
  • Users can customize this automatic transformation of electronic documents by expressing their preferences about desired results of the transformation. Their preferences are stored for later use in automatic customized transformation of requested documents. For example, a user may wish to have words in original documents abbreviated when viewing the documents on a size-constrained display. Other users may find the abbreviation of words distracting and may be willing to accept the longer documents that result when abbreviations are not used. These preferences can be expressed and stored and then used to control the later transformation of actual documents.
  • The proxy server receives the request (18) and fetches (20) the document from the origin server.
  • The proxy computer consults (24) a database 26 of client preferences to determine the appropriate parameters for the transformation process for the device 8 of the user who is making the request.
  • The proxy computer then applies (28) the transformations to the document to tailor it for transmission to (30) and rendering (32) on the client device.
  • The HTTP header in which the client device advertises information to the proxy server about itself can include two relevant pieces of information:
  • A unique identifier for the device. For example, for wireless Internet devices equipped with a microbrowser distributed by Phone.com, the HTTP header variable X-UP-SUBNO is bound to a unique identifier for the device.
  • The device type. For example, the HTTP header variable USER-AGENT is bound to a string that describes the type of browser software installed on the device.
  • Figure 7 shows an example of rows in a fictitious database 26.
  • Each row 40 identifies a device by the device's telephone number.
  • The row associates user preferences (four different ones in the case of Figure 7) with the identified device.
  • The telephone number (e.g., of a mobile phone) is the unique ID that serves as the key for the records in the database.
  • The proxy computer can use these values to guide its transformation process.
  • The inputs to the transformation process are a source document (in HTML, for instance) and a set of user preference values (one row in the database of Figure 7).
  • Document transformation includes a sequence of operations, such as date compression 52, word abbreviation 54, and image suppression 55, that convert an original document to a form more suitable for rendering on a small-display device.
  • The preferences for the target device are used to configure the transformation operations. For instance, the client-specific preferences could indicate that word abbreviation should be suppressed, or that image suppression 55 should only be applied to images exceeding a specified size.
  • Images can be subjected to other kinds of transformations to reduce their size.
  • Images may be compressed, downsampled, or converted from color to black and white.
  • Words may be abbreviated.
  • There are many strategies for compressing words, such as truncating long words, abbreviating common suffixes ("national" becomes "nat'l"), removing vowels, or using a somewhat more sophisticated procedure like the Soundex algorithm (Margaret K. Odell and Robert C. Russell, United States Patents 1,261,167 (1918) and 1,435,663 (1922)).
  • The corresponding user-configurable parameter would be a Boolean value indicating whether the user wishes to enable or disable abbreviations. Enabling abbreviations reduces the length of the resulting document, but may also obfuscate the meaning of the document.
  • Suppression of images
  • Bitmapped images are likely to degrade in quality when rendered on low-resolution screens. For these reasons, users may control whether and which kinds of bitmapped images are rendered on their devices.
  • The corresponding user-configurable parameter in this case could be, for instance, a Boolean value (render or do not render) or a maximum acceptable size in pixels for the source image.
  • A transformation system can employ a natural language parser to detect and rewrite certain classes of strings into shorter forms. For instance, a parser could detect and rewrite dates into a shorter form, so that "December 12, 1984" becomes "12/12/84", "February 4" becomes "2/4", and "The seventh of August" becomes "8/7".
  • The corresponding user-selectable parameter value could be a Boolean value (compress or do not compress), or it could take on one of three values: do not compress, compress into month/day/year format, or compress into day/month/year format.
  • A transformation system could parse and compress numeric quantities, so that (for instance) "seventeen" becomes "17" and "ten gigabytes" becomes "10GB."
  • A wide variety of other transformations could be devised for a wide variety of types of documents.
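  • A preference-driven transformation pipeline along these lines could look like the sketch below. It is an illustration only: the preference keys, the abbreviation table, and the regular expressions are assumptions, and the date pattern handles only the "Month day[, year]" form rather than the natural language parsing described above.

```python
import re

MONTHS = {name: number + 1 for number, name in enumerate(
    ["January", "February", "March", "April", "May", "June", "July",
     "August", "September", "October", "November", "December"])}

# Matches "December 12, 1984" or "February 4"; the year is optional.
DATE_RE = re.compile(r"\b(%s) (\d{1,2})\b(?:, (\d{4}))?" % "|".join(MONTHS))
IMG_RE = re.compile(r"<img\b[^>]*>", re.IGNORECASE)

# Illustrative abbreviation table, not taken from the source.
ABBREVIATIONS = {"national": "nat'l", "international": "int'l"}
ABBREV_RE = re.compile(r"\b(%s)\b" % "|".join(ABBREVIATIONS), re.IGNORECASE)


def compress_dates(text, day_first=False):
    """Rewrite long-form dates as month/day/year or day/month/year."""
    def shorten(match):
        month, day, year = MONTHS[match.group(1)], int(match.group(2)), match.group(3)
        parts = [day, month] if day_first else [month, day]
        short = "/".join(str(part) for part in parts)
        return short + ("/" + year[-2:] if year else "")
    return DATE_RE.sub(shorten, text)


def transform(document, prefs):
    """Apply each transformation only if the device's preferences enable it."""
    if prefs.get("suppress_images", False):
        document = IMG_RE.sub("", document)
    if prefs.get("compress_dates", False):
        document = compress_dates(document, prefs.get("day_first", False))
    if prefs.get("abbreviate", False):
        document = ABBREV_RE.sub(lambda m: ABBREVIATIONS[m.group(1).lower()], document)
    return document


page = '<img src="a.gif">Meeting on December 12, 1984 at the national office'
small = transform(page, {"suppress_images": True, "compress_dates": True, "abbreviate": True})
```

  • With all three preferences enabled, `small` is `Meeting on 12/12/84 at the nat'l office`; with an empty preference dictionary the page is returned unchanged.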
  • A user can enter and maintain preferences by visiting the proxy computer using the same small-display device he uses for Internet access.
  • The proxy computer could store a hypertext form 60 that users of small-display devices retrieve and fill in according to their preferences.
  • Upon receiving an HTTP request 62 from a client device, the proxy computer will automatically (using the HTTP protocol) obtain the unique identifier for the client device.
  • The proxy computer then transmits to the user a form 64 that contains a set of preferences. If the client device already has an associated entry in the database, the current value for each parameter can be displayed in the form; otherwise, a default value will be displayed.
  • The user may change parameters on this form as he sees fit and then submit the form back 66 to the proxy computer, which stores the updated values in the database in the record associated with that client device.
  • Alternatively, the user can visit the same URL using a conventional web browser on a desktop or laptop computer.
  • In that case, the proxy computer will be unable to determine automatically from the HTTP header information which device to associate the preferences with.
  • The user must explicitly specify the unique identifier (a phone number, for instance) of the device for which the user wishes to set the preferences.
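  • The preference form round trip can be sketched as below. The parameter names, default values, and the `device_id` form field are hypothetical; only the X-UP-SUBNO header name comes from the text.

```python
# Hypothetical defaults; the source does not specify default values.
DEFAULT_PREFS = {
    "abbreviate_words": False,
    "suppress_images": True,
    "compress_dates": True,
}


def device_id(headers, form=None):
    """Prefer the X-UP-SUBNO header; fall back to an id typed into the form."""
    normalized = {name.lower(): value for name, value in headers.items()}
    if "x-up-subno" in normalized:
        return normalized["x-up-subno"]
    if form and form.get("device_id"):
        return form["device_id"]  # desktop browser: user supplied the id explicitly
    raise ValueError("no device identifier supplied")


class PreferenceStore:
    def __init__(self):
        self._rows = {}  # device id -> dict of preference values

    def form_values(self, dev):
        """Values to pre-fill in the form: stored ones, else the defaults."""
        return {**DEFAULT_PREFS, **self._rows.get(dev, {})}

    def submit(self, dev, values):
        """Store only recognized parameters from a submitted form."""
        row = self.form_values(dev)
        row.update({k: bool(v) for k, v in values.items() if k in DEFAULT_PREFS})
        self._rows[dev] = row


store = PreferenceStore()
phone = device_id({"X-UP-SUBNO": "abc123"})
store.submit(phone, {"abbreviate_words": True})
```

  • Unknown form fields are ignored rather than stored, which keeps each row limited to the parameters the transformation process understands.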
  • Figure 10 shows an example of the form appearing on a conventional HTML-based desktop web browser.
  • Figure 11 shows the first screen of the corresponding page appearing on a four-line mobile phone display. (A user must scroll down to see the rest of the options.)
  • In the setting described above, the user is a person accessing a remotely stored document using a small-screen device, and a proxy computer (which performs the transformations) mediates between the user's device and the Internet as a whole.
  • Another setting in which configurable transformations are useful is for an individual or institution to exercise control over the appearance on small-display devices of documents that it generates.
  • the origin server responsible for storing and transmitting the data can be equipped with automatic content transformation software (using a module or "plug-in" for the web server software). The origin server host can then configure and control the transformation software as desired.
  • the origin server may also offer to an author of content an ability to configure transformations once for any user retrieving documents from that server for a particular type of client device.
  • Instead of offering the end user the ability to customize the transformations, one can offer this ability to the person or institution that authored the content.
  • This scenario is relevant when the content provider desires strict control over the appearance of their content on small-display devices. Rather than storing a database of user (individual device) preferences, then, the origin server stores only a single set of parameter values for the transformation for each type of device. The information flow from user to origin server is thus:
  • Origin server receives the request and information on the type of client device making the request.
  • Origin server consults the transformation parameters appropriate for that device in processing the requested document.
  • Origin server delivers the transformed document to the client device.
  • the previous section described a method for end users to specify and store preferences, to be associated with a single device.
  • This section described a method for content creators to configure the transformation of documents delivered from their origin server. These two scenarios are not incompatible. Imagine that an end user requests a document X from an origin server Y. Imagine further that the end user has registered a set of preferences for his transformations, and that there exists on the origin server a separate set of preferences for documents delivered from that origin server. The document will be transformed first according to the preferences in the origin server, and then according to the end user's preferences. In this scenario, the end user's preferences sometimes cannot be honored.
  • the preference information is not stored on a database remote from the client device, but rather on the device itself.
  • the information flow of per-device preference information in this setting is as follows:
  • a user of a small-display device submits a request to the proxy computer for the preferences form document.
  • the form document is transmitted from the proxy computer to the device.
  • the user fills in his preferences and submits the filled-in form back to the proxy computer.
  • the proxy computer responds with a confirmation document and also transmits, in the HTTP header information to the client device, a cookie containing that user's preferences. For example, the cookie might look like
  • the client device stores this cookie as persistent state.
  • When a user of the client device subsequently requests a document from the proxy computer, the device also transmits to the proxy computer the cookie containing the stored preferences: Cookie: PREFS="abbrevs:yes images:no dates:yes";
  • the proxy computer applies these preferences in transforming the requested document. If the client device did not transmit a cookie, either because the cookie expired or was erased, the proxy computer applies a default transformation.
  • Communications between wireless devices 50 and the "wired" Internet 53 typically occur through a gateway 52, which mediates between the wired and wireless worlds. For instance, a request for a document by a user of a WAP-capable device is transmitted to the wireless gateway, which forwards the request to the origin server 54 (on the Internet) responsible (according to the DNS protocol) for the requested document.
  • If the requested document has been designed specifically for the client device and written in the markup language accepted by the device (sometimes HTML, but more often another markup language such as WML, HDML, or a proprietary language), content transformation isn't necessary. Because different wireless data devices have different capabilities, a content creator would have to create a separate version not only for each target markup language but also for every possible target device. The content provider also needs to understand how to detect the type of client device and create a document optimally formatted for that client.
  • an automatic content transformation system 70 can automatically compress and reformat documents 72 into formats that are optimal for display on specific target devices. This leaves content creators free to concentrate on writing content rather than on retargeting content for a variety of target devices.
  • the content transformation system intercepts requests from non-traditional client devices, customizes the requested documents for display on the target device 78, and transmits the transformed documents 74 to the client.
  • the content transformation system employs user preferences 76 and device specifications 64 to guide the document transformation process.
  • With a system 70 that automatically compresses and reformats a document 72 for optimal display on a specific target device, content creators are free to concentrate on their core competency (writing content) and not on retargeting content for a variety of target devices.
  • Content transformation systems can use automatic document segmentation to stage the delivery of large documents to devices incapable of processing large documents in their entirety.
  • the core content transformation component 81 can include the segmentation process described earlier.
  • the XML cache object 84 is where the per-user subdocuments are stored for the segmentation process.
  • Content transformation is a server-side technology and can naturally be deployed at various locations in the client-origin server channel, anywhere from the wireless gateway to the origin server that holds the original content.
  • Figure 14 shows an example input document (a full-size web page) that was divided into five subdocuments.
  • Figure 15 shows the bottom of the fourth subdocument 72, corresponding to the middle of the "Bronx-Whitestone Bridge" section of the original page.
  • the hyperlinks (icons) labeled "prev" 74 and "next" 76 bring a user to the third and fifth subdocuments, respectively, when invoked.
  • Figure 16 shows the beginning of the fifth subdocument 78, which begins where the fourth leaves off. The user can scroll through the subdocument as needed.
  • the icons 74, 76 are only displayed when the user has scrolled to the beginning or end of the subdocument. In other examples, the icons could be displayed at all times.
  • the numbers and words in the original have been abbreviated.
  • each subdocument also includes a display of the heading 79 of the original document. That heading is included in the subdocument when the subdocument is created from the original document.
  • the display also includes an indication of the total number of subdocuments 87 and the position 89 of the current subdocument in the series of subdocuments that make up the original document.
  • each subdocument rendered on the target device can contain a graphical status bar showing where the subdocument lies in the set of subdocuments comprising the original document. For instance, ooxoooo could mean "this is the third of seven subdocuments". Moreover, each of the o's in this status bar could be hyperlinked to that subdocument, enabling the user to randomly access different subdocuments in the document. This can be more efficient than proceeding subdocument by subdocument in order.
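
The status bar described in the last item can be sketched as follows. This is a minimal illustration, not the implementation used by the system described here: the function names, the HTML anchor output, and the `url_for` callback (which maps a 1-based subdocument index to a URL) are all assumptions introduced for the example. A real deployment might emit WML or HDML instead of HTML.

```python
from html import escape
from typing import Callable


def _check(position: int, total: int) -> None:
    # Positions are 1-based, matching "the third of seven subdocuments".
    if total < 1 or not 1 <= position <= total:
        raise ValueError("position must satisfy 1 <= position <= total")


def status_bar(position: int, total: int) -> str:
    """Plain-text bar: 'x' marks the current subdocument, 'o' the others."""
    _check(position, total)
    return "".join("x" if i == position else "o" for i in range(1, total + 1))


def linked_status_bar(position: int, total: int,
                      url_for: Callable[[int], str]) -> str:
    """Same bar, with every 'o' hyperlinked to its subdocument.

    url_for is a hypothetical callback; the URL scheme is not specified
    by the source text.
    """
    _check(position, total)
    parts = []
    for i in range(1, total + 1):
        if i == position:
            parts.append("x")  # current subdocument: no link needed
        else:
            href = escape(url_for(i), quote=True)
            parts.append(f'<a href="{href}">o</a>')
    return "".join(parts)


if __name__ == "__main__":
    print(status_bar(3, 7))
    print(linked_status_bar(2, 3, lambda i: f"/doc/part/{i}"))
```

Linking each marker separately is what gives the user random access to any subdocument, while the unlinked `x` shows the current position in the series.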

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention relates to a method that includes receiving a machine-readable file containing a document (12) to be served to a client (10) for display on a device (10). Each document (12) in the file is organized according to an information hierarchy. Subdocuments (1) derived from this information hierarchy are organized in a format that allows them to be served separately to the client using a hypertext transmission protocol. At least one of the subdocuments (1) contains information that enables it to be linked to another subdocument (1).
PCT/US2001/030465 2000-09-27 2001-09-27 Segmenting electronic documents for use on a device of limited capability WO2002027520A1 (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
JP2002531030A JP2004510253A (en) 2000-09-27 2001-09-27 Segmenting electronic documents for use on a device of limited capability
KR10-2003-7004386A KR20030045086A (en) 2000-09-27 2001-09-27 Segmenting electronic documents for use on a device of limited capability
CA002423695A CA2423695A1 (en) 2000-09-27 2001-09-27 Segmenting electronic documents for use on a device of limited capability
EP01975565A EP1320806A4 (en) 2000-09-27 2001-09-27 Segmenting electronic documents for use on a device of limited capability
AU2001294881A AU2001294881A1 (en) 2000-09-27 2001-09-27 Segmenting electronic documents for use on a device of limited capability

Applications Claiming Priority (8)

Application Number Priority Date Filing Date Title
US23555100P 2000-09-27 2000-09-27
US60/235,551 2000-09-27
US23842400P 2000-10-10 2000-10-10
US60/238,424 2000-10-10
US09/745,289 2000-12-20
US09/745,290 2000-12-20
US09/745,289 US7613810B2 (en) 2000-09-27 2000-12-20 Segmenting electronic documents for use on a device of limited capability
US09/745,290 US7210100B2 (en) 2000-09-27 2000-12-20 Configurable transformation of electronic documents

Publications (2)

Publication Number Publication Date
WO2002027520A1 true WO2002027520A1 (fr) 2002-04-04
WO2002027520A9 WO2002027520A9 (fr) 2002-06-06

Family

ID=27499799

Family Applications (2)

Application Number Title Priority Date Filing Date
PCT/US2001/030465 WO2002027520A1 (en) 2000-09-27 2001-09-27 Segmenting electronic documents for use on a device of limited capability
PCT/US2001/030476 WO2002027516A1 (en) 2000-09-27 2001-09-27 Configurable transformation of electronic documents

Family Applications After (1)

Application Number Title Priority Date Filing Date
PCT/US2001/030476 WO2002027516A1 (en) 2000-09-27 2001-09-27 Configurable transformation of electronic documents

Country Status (6)

Country Link
EP (2) EP1330723A4 (fr)
JP (2) JP2004510251A (fr)
KR (3) KR100903528B1 (fr)
AU (2) AU2001294881A1 (fr)
CA (2) CA2423695A1 (fr)
WO (2) WO2002027520A1 (fr)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2378287A (en) * 2001-03-28 2003-02-05 Hewlett Packard Co Data delivery
WO2004045209A1 (en) * 2002-11-14 2004-05-27 Lg Electronics,Inc. Electronic document versioning method and updated document supply method using version number based on XML
WO2006075872A1 (en) * 2005-01-12 2006-07-20 Widerthan Co., Ltd. System and method for providing and managing executable web content
US7496834B2 (en) 2002-08-23 2009-02-24 Lg Electronics, Inc. Electronic document request/supply method based on XML
US8862777B2 (en) 2011-04-01 2014-10-14 Verisign, Inc Systems, apparatus, and methods for mobile device detection

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100531842B1 (en) * 2002-11-20 2005-12-02 엘지전자 주식회사 Message correction method and system
WO2005106684A1 (en) * 2004-04-30 2005-11-10 Access Co., Ltd. Method for dynamic image enlargement/reduction in browsing, terminal device, and program
US7603426B2 (en) 2004-06-18 2009-10-13 Microsoft Corporation Flexible context management for enumeration sessions using context exchange
JP2007257365A (en) * 2006-03-23 2007-10-04 Microsoft Corp Data transmission management device, system, method, and program
KR100817582B1 (en) * 2006-11-29 2008-03-31 에스케이 텔레콤주식회사 Mobile web service method, and proxy server and mobile terminal therefor
JP5090828B2 (en) * 2007-09-04 2012-12-05 京セラドキュメントソリューションズ株式会社 Information processing apparatus
KR100905413B1 (en) * 2007-11-06 2009-07-02 주식회사 케이티프리텔 Method and apparatus for adjusting the screen display area of a web page in a full browser of a mobile terminal
JP4739369B2 (en) * 2008-05-15 2011-08-03 ソフトバンクモバイル株式会社 Web content conversion and editing system
KR101012206B1 (en) * 2008-05-27 2011-02-08 주식회사 엘지유플러스 System and method for managing the image transmission volume of a web viewer
KR100873415B1 (en) * 2008-07-15 2008-12-11 팬터로그인터액티브 주식회사 Internet access apparatus and method for providing a full browsing service on a mobile communication terminal
KR100994607B1 (en) * 2008-09-24 2010-11-15 주식회사 엘지유플러스 Markup page relay server and control method thereof
US8010089B2 (en) * 2009-01-19 2011-08-30 Telefonaktiebolaget L M Ericsson (Publ) System and method of providing identity correlation for an over the top service in a telecommunications network
CN101996162A (en) * 2009-08-26 2011-03-30 华为技术有限公司 E-book chapter processing method, apparatus, and system
KR102140648B1 (en) * 2018-12-07 2020-08-04 유병섭 System for converting Hangul word processor files to the web

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000039666A1 (en) 1998-12-28 2000-07-06 Spyglass, Inc. Method and system for transforming the content of electronic data for wireless devices
WO2000056033A1 (en) 1999-03-17 2000-09-21 Oracle Corporation Providing clients with services that retrieve data from data sources that do not necessarily support the format required by the clients
US6154738A (en) * 1998-03-27 2000-11-28 Call; Charles Gainor Methods and apparatus for disseminating product information via the internet using universal product codes
US6226675B1 (en) * 1998-10-16 2001-05-01 Commerce One, Inc. Participant server which process documents for commerce in trading partner networks
US6317781B1 (en) * 1998-04-08 2001-11-13 Geoworks Corporation Wireless communication device with markup language based man-machine interface

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05324427A (en) * 1992-05-27 1993-12-07 Hitachi Ltd Document information compression device
US6128663A (en) 1997-02-11 2000-10-03 Invention Depot, Inc. Method and apparatus for customization of information content provided to a requestor over a network using demographic information yet the user remains anonymous to the server
US6857102B1 (en) * 1998-04-07 2005-02-15 Fuji Xerox Co., Ltd. Document re-authoring systems and methods for providing device-independent access to the world wide web
US6278449B1 (en) * 1998-09-03 2001-08-21 Sony Corporation Apparatus and method for designating information to be retrieved over a computer network
US6336124B1 (en) * 1998-10-01 2002-01-01 Bcl Computers, Inc. Conversion data representing a document to other formats for manipulation and display

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6154738A (en) * 1998-03-27 2000-11-28 Call; Charles Gainor Methods and apparatus for disseminating product information via the internet using universal product codes
US6317781B1 (en) * 1998-04-08 2001-11-13 Geoworks Corporation Wireless communication device with markup language based man-machine interface
US6226675B1 (en) * 1998-10-16 2001-05-01 Commerce One, Inc. Participant server which process documents for commerce in trading partner networks
WO2000039666A1 (en) 1998-12-28 2000-07-06 Spyglass, Inc. Method and system for transforming the content of electronic data for wireless devices
WO2000056033A1 (en) 1999-03-17 2000-09-21 Oracle Corporation Providing clients with services that retrieve data from data sources that do not necessarily support the format required by the clients

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP1320806A4

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2378287B (en) * 2001-03-28 2003-12-24 Hewlett Packard Co Improvements relating to data delivery
GB2378287A (en) * 2001-03-28 2003-02-05 Hewlett Packard Co Data delivery
US7496834B2 (en) 2002-08-23 2009-02-24 Lg Electronics, Inc. Electronic document request/supply method based on XML
US8677231B2 (en) 2002-08-23 2014-03-18 Lg Electronics, Inc. Electronic document request/supply method based on XML
US7584421B2 (en) 2002-08-23 2009-09-01 Lg Electronics, Inc. Electronic document request/supply method based on XML
AU2003276738B9 (en) * 2002-11-14 2004-06-03 Lg Electronics, Inc. Electronic document versioning method and updated document supply method using version number based on XML
AU2003276738B2 (en) * 2002-11-14 2007-07-26 Lg Electronics, Inc. Electronic document versioning method and updated document supply method using version number based on XML
US7398466B2 (en) 2002-11-14 2008-07-08 Lg Electronics, Inc. Electronic document versioning method and updated document supply method using version number based on XML
US7484171B2 (en) 2002-11-14 2009-01-27 Lg Electronics, Inc. Electronic document versioning method and updated document supply method using version number based on XML
GB2411031A (en) * 2002-11-14 2005-08-17 Lg Electronics Inc Electronic document versioning method and updated document supply method using version number based on XML
US8631318B2 (en) 2002-11-14 2014-01-14 Lg Electronics, Inc. Electronic document versioning method and updated document supply method using version number based on XML
WO2004045209A1 (en) * 2002-11-14 2004-05-27 Lg Electronics,Inc. Electronic document versioning method and updated document supply method using version number based on XML
WO2006075872A1 (en) * 2005-01-12 2006-07-20 Widerthan Co., Ltd. System and method for providing and managing executable web content
US8862777B2 (en) 2011-04-01 2014-10-14 Verisign, Inc Systems, apparatus, and methods for mobile device detection

Also Published As

Publication number Publication date
KR20080067022A (ko) 2008-07-17
KR20030045086A (ko) 2003-06-09
JP2004510253A (ja) 2004-04-02
CA2423695A1 (fr) 2002-04-04
EP1330723A4 (fr) 2009-04-01
CA2423611C (fr) 2011-03-08
JP2004510251A (ja) 2004-04-02
KR100855997B1 (ko) 2008-09-03
AU2001294881A1 (en) 2002-04-08
WO2002027516A1 (fr) 2002-04-04
AU2001294884A1 (en) 2002-04-08
WO2002027520A9 (fr) 2002-06-06
EP1330723A1 (fr) 2003-07-30
EP1320806A4 (fr) 2007-08-15
EP1320806A1 (fr) 2003-06-25
KR100903528B1 (ko) 2009-06-19
KR20030060899A (ko) 2003-07-16
WO2002027516A9 (fr) 2003-02-20
CA2423611A1 (fr) 2002-04-04

Similar Documents

Publication Publication Date Title
US7613810B2 (en) Segmenting electronic documents for use on a device of limited capability
US7210100B2 (en) Configurable transformation of electronic documents
CA2423611C (en) Configurable transformation of electronic documents
US6925595B1 (en) Method and system for content conversion of hypertext data using data mining
EP1412867B1 (en) Apparatus and method for transforming an attachment sent in an e-mail in order to deliver the attachment to a device with limited rendering capability
US6338096B1 (en) System uses kernals of micro web server for supporting HTML web browser in providing HTML data format and HTTP protocol from variety of data sources
US5987466A (en) Presenting web pages with discrete, browser-controlled complexity levels
US6272484B1 (en) Electronic document manager
US9100861B2 (en) System and method for abbreviating information sent to a viewing device
US6523062B1 (en) Facilitating memory constrained client devices by employing deck reduction techniques
US7249197B1 (en) System, apparatus and method for personalising web content
GB2344197A (en) Content conversion of electronic documents
GB2347329A (en) Converting electronic documents into a format suitable for a wireless device
JP2003512666A (en) Intelligent harvesting and navigation system and method
US20070067495A1 (en) Web server
US7987420B1 (en) System, method, and computer program product for a scalable, configurable, client/server, cross-platform browser for mobile devices
US20010039578A1 (en) Content distribution system
EP1630689B1 (en) Method for representing formatted content on a mobile device
KR100517809B1 (en) Method for transmitting web content by applying a user preference profile
US20020038343A1 (en) Process for supplying a web site designer or web site host type customer with a tool for transforming an image from a first format into a second format
CA2563488C (en) System and method for abbreviating information sent to a viewing device
Dey et al. Bringing internet services to wireless devices

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US US US US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

AK Designated states

Kind code of ref document: C2

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG US US US US UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: C2

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

COP Corrected version of pamphlet

Free format text: PAGES 1/13-13/13, DRAWINGS, REPLACED BY NEW PAGES 1/7-7/7; DUE TO LATE TRANSMITTAL BY THE RECEIVING OFFICE

DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 2423695

Country of ref document: CA

WWE Wipo information: entry into national phase

Ref document number: 1020037004386

Country of ref document: KR

WWE Wipo information: entry into national phase

Ref document number: 2002531030

Country of ref document: JP

WWE Wipo information: entry into national phase

Ref document number: 2001975565

Country of ref document: EP

WWP Wipo information: published in national office

Ref document number: 1020037004386

Country of ref document: KR

WWP Wipo information: published in national office

Ref document number: 2001975565

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642