WO2008132706A1 - Method and system for internet browsing

Publication number
WO2008132706A1
Authority
WO
WIPO (PCT)
Prior art keywords
segment
content
primarily
original document
navigation
Prior art date
Application number
PCT/IE2008/000046
Other languages
English (en)
Inventor
Pavel Sykora
Original Assignee
Markport Limited
Priority date
2007-04-26
Filing date
2008-04-23
Publication date
2008-11-06
Application filed by Markport Limited
Publication of WO2008132706A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G06F16/9577 Optimising the visualization of content, e.g. distillation of HTML documents

Definitions

  • the invention relates to display of content by browsers, particularly those executing on devices having limited screen sizes.
  • a proxy server performs a mark-up simplification (e.g. from XHTML 1.0 Transitional to XHTML Basic, CHTML or to a subset of another generally available mark-up language) and page segmentation.
  • page segmentation of the original web page results in an ordered collection of simplified page segments. In most instances, the collection begins with page segments containing only navigational information. If the first segment is displayed by default, the user has to find the first relevant page segment manually, by skipping the segments with irrelevant content. This degrades the user experience considerably and may also decrease network efficiency.
  • US698331 (Microsoft) describes a method in which additional content (such as advertising) is only downloaded and displayed if it can be accommodated.
  • US2006/0277478 (Seraji et al.) describes a process in which obstructive user interface elements generated by a browser may be hidden to maximize space for content display.
  • cHTML: also known as compact HTML; a subset of HTML 2.0, HTML 3.2 and HTML 4.0 for small information appliances.
  • small device: a device with a limited screen size and resolution (e.g. a cell phone or PDA).
  • XHTML Basic: a subset of XHTML 1.1 for small information appliances.
  • XHTML Mobile Profile: a superset of XHTML Basic defined by the Open Mobile Alliance.
  • XML-DOM: a platform- and language-independent object representation of an XML document.
  • a method for downloading content from a server to a user device comprising the steps of:
  • analysing an original document to dynamically segment the original document, at least one of said segments having primarily information content and at least one other segment having primarily navigation content, and downloading at least one segment having primarily information content.
  • the analysis step analyses characters of the original document to determine if original document content is to be part of a primarily-information segment.
  • the analysis step parses the original document to identify content indicators.
  • the analysis step comprises:
  • step (c) if the content indicator is not found, the segment is most likely primarily navigational, and choosing the next segment and going to step (b), and
  • the segment is most likely primarily informational and therefore it is preferentially downloaded to the user for rendering on the device.
  • the analysis step parses the original document content to determine a ratio of textual characters to anchor elements, and selects content having a ratio above a threshold for a primarily-information segment.
  • the threshold ratio is at least 6.
  • the analysis step comprises:
  • step (d) if the result is less than a configurable value, the segment is most likely primarily navigational and choosing the next segment and going to step (b), and
  • the segment is most likely primarily informational and therefore it is preferentially downloaded to the user device.
  • the method comprises the step of initially simplifying structure of the original document.
  • the simplification comprises providing a simplified document object model representation.
  • the simplified representation is segmented.
  • the method comprises the step of inserting inter-segment navigation content into a segment, and subsequently downloading another segment upon receipt of a request from the user using the inter-segment navigation content.
  • an item of inter-segment navigation content points to another segment.
  • an item of inter-segment navigation content points to a particular part of another segment.
  • said item of inter-segment navigation content is derived from a link in the original document.
  • At least one segment is provided including primarily original document navigation content, but download priority is given to the segment or segments having primarily information content.
  • the segment with primarily original document navigation content is downloaded upon request from the user, and user navigation inputs in said segments are used for further navigation.
  • the invention provides a computer readable medium comprising software code for performing operations of any method as defined above when executing on a digital processor.
  • the invention provides a network server comprising a digital processor and communication interfaces, wherein the processor is adapted to access content and to perform the steps of any method defined above.
  • the server may be a proxy server or a content provider.
  • the invention provides a method performed by a mobile device for processing a received original document, the method comprising analysing the original document to dynamically determine segments of the original document, at least one of said segments having primarily information content and at least one other segment having primarily navigation content, and displaying the segment having primarily information content.
  • the analysis step analyses characters of the original document to determine if original document content is to be part of a primarily-information segment.
  • the analysis step parses the original document to identify content indicators.
  • the analysis step parses the original document content to determine a ratio of textual characters to anchor elements, and selects content having a ratio above a threshold for a primarily-information segment.
  • At least one segment is provided including primarily original document navigation content, but download priority is given to the segment or segments having primarily information content, and the segment with primarily original document navigation content is downloaded upon request from the user, and user navigation inputs in said segments are used for further navigation.
  • Fig. 1 is a block diagram showing download and display of content
  • Fig. 2 is a diagrammatic representation of breakdown of an original web page
  • Fig. 3 is a message sequence diagram illustrating an embodiment in detail
  • Figs. 4 to 7 are illustrations of content processed by the method.
  • a content web page 2 is downloaded from server 1 to a mobile device 3 via a proxy server 4.
  • the proxy server 4 executes a program that parses an original content web page 2 (shown in Fig. 4) into an XML-DOM internal representation.
  • the internal representation is examined for a document (i.e. web page) structure and the structure is simplified, if necessary.
  • the result is then saved into the proxy server's operational memory (RAM).
  • the proxy server 4 dynamically and intelligently performs segmentation of the simplified page 2 to provide segments 5 (Figs. 5 to 7).
  • the proxy analyses the segments to determine if each one has primarily information content or primarily original document navigation content. It preferentially downloads to the device those having primarily information content.
  • the segments may be processed before download, as described below.
  • Fig. 1 shows initial routing of a particular segment 6 from the set 5
  • Fig. 2 shows diagrammatically the original page 2 and the set of segments 5.
  • this may involve transformation from XHTML 1.0 Transitional to table-less XHTML Basic 1.0.
  • the size of a serialized page segment should be reasonable, i.e. suitable for small devices, e.g. from 2 to 15 Kb (may depend on character set used). If the mark-up is not simplified and the source code of the content page is not well-formed XML, the program creates an alternative tree internal representation of the content page. The alternative tree internal representation is saved in the operational memory of the proxy server.
  • a program running on the proxy server 4 augments the internal representation of the segments 5 with inter-segment navigational hypertext links that allow the user to navigate across segments which are downloaded. Examples are the "Prev", and "Next" links of Figs. 5 and 6.
  • the program also checks original hypertext links within the segments 5 and, if they point to elements in other segments, changes them to navigational hypertext links pointing to the appropriate segment. Thereafter, the segments 5 are serialized into an XHTML mark-up representation and saved in the RAM of the proxy server 4, so if the mobile device 3 requests another segment, no repeated access from the proxy server 4 to the content provider 1 is necessary.
  • the proxy creates a segment for the original document navigation content, and this can be used for navigation if this segment is downloaded.
  • segment analysis involves computing a ratio between the amount of plain textual information and the number of anchor ("<a>") elements used in a segment's body. If this computation result is less than a threshold, then the first segment is chosen by default.
  • A. Receive the original web page on behalf of the mobile device.
  • the segment is most likely primarily informational and therefore it should be preferentially sent to the device and rendered.
  • a particular content indicator may be introduced to define relevant content to allow a content-indicator method usage.
  • the following is an example of a method implemented by the proxy server.
  • A. Receive the original web page on behalf of the mobile device.
  • the segment is most likely primarily navigational, so choose the next segment and go to step G.
  • the content indicator may be configured as a particular mark-up element or an attribute with a particular value (e.g. the "class" attribute with the value of "main-contents").
  • the location or the form that the content indicator takes may be configured in a file on the server.
  • the segment is most likely primarily informational and therefore should be sent to the device and rendered.
  • a major benefit of the Example 1 approach is that the proxy server does not depend on content indicators, which may or may not be fully representative of the extent of navigation content in a segment. It is a generic approach which does not rely on content indicators inserted by authors.
  • a user wants to browse the web page http://www.acision.com/about.aspx on her mobile device.
  • the user sends a request from her mobile device (1).
  • the request is captured by the (transparent) proxy server and the proxy server re-sends the request to the content provider (2).
  • the content provider returns the requested web page (3).
  • the web page is then pre-processed on the proxy server as follows:
  • the page (X)HTML code is parsed and the proxy server builds an internal tree representation of the document in its operational memory.
  • the proxy server starts a program which parses the XHTML code of the original document (example shown in Fig. 4) in the operational memory. Because the code is too complex for further manipulation, the program creates a new structure corresponding to a simplified representation of the document structure (XHTML-MP in this case), while preserving as much information of the original web page as possible (4).
  • the new structure is saved into the proxy server's operational memory (5).
  • the proxy server starts a program which traverses the simplified representation of the original document in the proxy server's operational memory. As there is too much information for rendering on a mobile device with a small screen, the program divides the information into segments of approximately 1 kB each. Three segments are shown in Figs. 5 to 7. The program checks hyperlinks and translates each hypertext link pointing to another part of the page into a link pointing to the appropriate segment. Then the program adds inter-segment navigational hyperlinks at the bottom of each segment (6). The links allow navigation from one segment to another. The segments are then converted from the internal tree representation to the final code (XHTML-MP in this case) and saved into the proxy server's operational memory (7). Fig. 5 shows a first segment of useful content, Fig. 6 shows a second segment of useful content, and Fig. 7 shows a segment of navigation content.
  • a program classifies segments into categories of being primarily informational or primarily navigational.
  • because the first segment (Fig. 7) contains only a logo (i.e. an image), a simple form and hypertext links from the navigational part at the top of the original web page, the program classifies the segment as primarily navigational and proceeds with the second segment.
  • the second segment (Fig. 5) is classified as primarily informational.
  • the proxy server sends the second segment to the user (9).
  • the mobile device renders the second segment (10) and the user can read its content.
  • the user sees that the second segment has been displayed first and, after reading it, wants to see the first segment with primarily navigational content (because she might want either to navigate to another page of the website or to use the form for searching information within the website). Because the method described here preserves all of the content of the original web page 2, the user can navigate to the first segment using the hypertext links at the bottom of the segment.
  • the mobile device sends a request for the first segment (Fig. 7) to the proxy server (11) and the proxy server finds the first segment in its operational memory (12) and returns the first segment to the user's mobile device for rendering (13, 14).
  • the invention achieves a much improved utilization of screen space, providing a better user experience while preserving the original web page information.
  • This is achieved (by the proxy server) independently of the user device and browser characteristics.
  • the invention is not limited to the embodiments described but may be varied in construction and detail.
  • the analysis may be performed by a content server and, if so, the content server may communicate directly with the user device or via a proxy server.
  • the analysis may be performed by a user device, in which case there is preferential display rather than preferential download.
  • the device receives the full original document and performs the processing to preferentially render the segments.
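
The ratio-based segment analysis described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation disclosed in the application: the names `text_to_anchor_ratio` and `choose_first_segment` are hypothetical, text inside anchor elements is assumed to count as plain text, a segment without anchors is assumed to be treated as having one, and the default threshold of 6 is taken from the "at least 6" figure given above.

```python
from html.parser import HTMLParser


class _SegmentStats(HTMLParser):
    """Counts plain-text characters and <a> elements in one serialized segment."""

    def __init__(self):
        super().__init__()
        self.text_chars = 0
        self.anchors = 0

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.anchors += 1

    def handle_data(self, data):
        # Leading and trailing whitespace is ignored so indentation is not counted as text.
        self.text_chars += len(data.strip())


def text_to_anchor_ratio(segment_html):
    """Ratio of text characters to anchor elements (hypothetical helper)."""
    stats = _SegmentStats()
    stats.feed(segment_html)
    stats.close()
    # Assumption: a segment with no anchors is treated as having one anchor,
    # which avoids division by zero.
    return stats.text_chars / max(stats.anchors, 1)


def choose_first_segment(segments, threshold=6.0):
    """Index of the first segment that looks primarily informational.

    A ratio below the threshold marks a segment as most likely navigational,
    so the next segment is examined. If no segment qualifies, the first
    segment is chosen by default.
    """
    for index, segment in enumerate(segments):
        if text_to_anchor_ratio(segment) >= threshold:
            return index
    return 0
```

With a link-only menu as the first segment and a paragraph of text as the second, `choose_first_segment` returns the index of the second segment, which mirrors the worked example in which the navigation segment is skipped and the informational segment is sent first.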

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

A content web page (2) is downloaded from a server (1) to a mobile device (3) via a proxy server (4). The proxy server (4) executes a program that parses an original content web page (2, Fig. 4) into an XML-DOM internal representation. The internal representation is examined for the document (i.e. web page) structure and the structure is simplified, if necessary. The result is then saved into the proxy server's operational memory (RAM). The proxy server (4) dynamically and intelligently performs segmentation of the simplified page (2) to provide segments (5, Figs. 5 to 7). Once the original document has been segmented, the proxy server analyses the segments to determine whether each one has primarily information content or primarily original document navigation content. It preferentially downloads the segments having primarily information content. However, it can also download segments having primarily original document navigation content, and the user's responses to those segments are used for navigation. The segments may be processed before download to include inter-segment navigation content, so that the user can select a segment from within another segment. In some cases, the inter-segment navigation content of a segment may be derived from a link in the original document so as to point to a particular part of another segment and, when present in a downloaded segment, may be used for navigation where appropriate.
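
The inter-segment navigation step summarised in the abstract can be sketched in Python as follows. This is an illustration under stated assumptions, not the disclosed implementation: the function names, the `?seg=N` URL scheme and the regular-expression matching are all hypothetical, and a real proxy would more likely rewrite links on the internal tree representation rather than on serialized mark-up.

```python
import re

# Crude patterns for illustration only; they assume double-quoted attributes.
ID_RE = re.compile(r'\bid="([^"]+)"')
FRAGMENT_HREF_RE = re.compile(r'href="#([^"]+)"')


def segment_url(index):
    """Hypothetical URL scheme for requesting segment `index` from the proxy."""
    return f"?seg={index}"


def retarget_fragment_links(segments):
    """Point in-page links at the segment that now holds their target."""
    owner = {}
    for index, body in enumerate(segments):
        for element_id in ID_RE.findall(body):
            owner.setdefault(element_id, index)

    def rewrite(index, body):
        def replace(match):
            target = owner.get(match.group(1))
            if target is None or target == index:
                return match.group(0)  # unknown or same-segment target: keep as is
            return f'href="{segment_url(target)}#{match.group(1)}"'

        return FRAGMENT_HREF_RE.sub(replace, body)

    return [rewrite(index, body) for index, body in enumerate(segments)]


def add_prev_next_links(segments):
    """Append "Prev" and "Next" links at the bottom of each segment."""
    result = []
    last = len(segments) - 1
    for index, body in enumerate(segments):
        links = []
        if index > 0:
            links.append(f'<a href="{segment_url(index - 1)}">Prev</a>')
        if index < last:
            links.append(f'<a href="{segment_url(index + 1)}">Next</a>')
        footer = "<p>" + " ".join(links) + "</p>" if links else ""
        result.append(body + footer)
    return result


def prepare_segments(segments):
    """Rewrite cross-segment links, then add inter-segment navigation."""
    return add_prev_next_links(retarget_fragment_links(segments))
```

Rewriting the original links before appending the "Prev" and "Next" links keeps the two concerns separate: the first pass only touches links that came from the original document, and the second only adds new ones.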
PCT/IE2008/000046 2007-04-26 2008-04-23 Method and system for internet browsing WO2008132706A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US92401007P 2007-04-26 2007-04-26
US60/924,010 2007-04-26

Publications (1)

Publication Number Publication Date
WO2008132706A1 (fr) 2008-11-06

Family

ID=39712564

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/IE2008/000046 WO2008132706A1 (fr) 2007-04-26 2008-04-23 Method and system for internet browsing

Country Status (1)

Country Link
WO (1) WO2008132706A1 (fr)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040103371A1 (en) * 2002-11-27 2004-05-27 Yu Chen Small form factor web browsing

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
CHEN Y ET AL: "Adapting Web Pages for Small-Screen Devices", IEEE INTERNET COMPUTING, vol. 9, no. 1, January 2005 (2005-01-01), pages 2 - 8, XP002494208, ISSN: 1089-7801, Retrieved from the Internet <URL:http://research.microsoft.com/~xingx/tic1.pdf> [retrieved on 20080902] *

Cited By (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2011143814A1 (fr) * 2010-05-19 2011-11-24 Hewlett-Packard Development Company, L.P. System and method for web page segmentation by computing an adaptive threshold
WO2012028559A1 (fr) * 2010-09-01 2012-03-08 Axel Springer Digital Tv Guide Gmbh Content transformation for lean-back entertainment
EP2431889A1 (fr) * 2010-09-01 2012-03-21 Axel Springer Digital TV Guide GmbH Content transformation for lean-back entertainment
CN102411625A (zh) * 2011-03-21 2012-04-11 苏州阔地网络科技有限公司 Progressive output display method and device
CN103034731A (zh) * 2012-12-20 2013-04-10 北京思特奇信息技术股份有限公司 Method for generating Web front-end interactive pages
CN103034731B (zh) * 2012-12-20 2016-12-28 北京思特奇信息技术股份有限公司 Method for generating Web front-end interactive pages
US10296566B2 (en) 2015-01-02 2019-05-21 Sk Planet Co., Ltd. Apparatus and method for outputting web content that is rendered based on device information
WO2016108677A1 (fr) * 2015-01-02 2016-07-07 에스케이플래닛 주식회사 Apparatus and method for outputting video content
CN105893014A (zh) * 2015-12-08 2016-08-24 乐视云计算有限公司 Project development method and system for the front end
US20210081602A1 (en) * 2019-09-16 2021-03-18 Docugami, Inc. Automatically Identifying Chunks in Sets of Documents
US11816428B2 (en) * 2019-09-16 2023-11-14 Docugami, Inc. Automatically identifying chunks in sets of documents
US11822880B2 (en) 2019-09-16 2023-11-21 Docugami, Inc. Enabling flexible processing of semantically-annotated documents
US11960832B2 (en) 2019-09-16 2024-04-16 Docugami, Inc. Cross-document intelligent authoring and processing, with arbitration for semantically-annotated documents

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 08738137

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 08738137

Country of ref document: EP

Kind code of ref document: A1