WO2012122934A1 - Method for re-layout of a web page - Google Patents

Info

Publication number
WO2012122934A1
Authority
WO
WIPO (PCT)
Prior art keywords
webpage
rule
web page
browser client
content
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/CN2012/072285
Other languages
English (en)
French (fr)
Chinese (zh)
Inventor
汪轩然
范典
屈恒
洪锋
黄江吉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xiaomi Technology Co Ltd
Original Assignee
Beijing Xiaomi Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xiaomi Technology Co Ltd
Priority to EP12757856.5A (EP2687997A4, en)
Priority to KR1020137024111A (KR20140012664A, ko)
Priority to JP2013556961A (JP2014514629A, ja)
Priority to US14/004,410 (US20140215314A9, en)
Publication of WO2012122934A1 (zh)
Legal status: Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/10 Text processing
    • G06F40/12 Use of codes for handling textual entities
    • G06F40/14 Tree-structured documents
    • G06F40/143 Markup, e.g. Standard Generalized Markup Language [SGML] or Document Type Definition [DTD]
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G06F16/9577 Optimising the visualization of content, e.g. distillation of HTML documents
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions

Definitions

  • The present invention relates to the field of mobile internet technologies, and in particular to a method for re-layout of a web page.
  • Background
  • Microsoft's IE Mobile series of browsers, in its early Windows Mobile system, used a strategy of arranging all the elements of a web page vertically to make the page easier to read.
  • Google adopted automatic text line wrapping: when the web page is zoomed, the text paragraphs in the page are re-typeset so that the text wraps automatically according to the current zoom ratio and the screen, ensuring that the user does not need to scroll left and right while reading.
  • Apple's browser in the iPhone series and Microsoft's browser in its Windows Phone 7 system adopted Text Scaling technology, which adjusts the font sizes in the different containers of the web page when the page is rendered for the first time. The adjustment ensures that when a container is zoomed to the center of the screen, the text in the container is a suitable size for reading and does not require scrolling left and right. This technique avoids re-typesetting the web page every time it is zoomed.
  • UTV Dynamics (UCWEB Browser) invented a server cache acceleration technology. It re-typesets the web page on the server so that the fonts and width of the page suit the lower resolution of mobile devices, and it caches the re-typeset pages to reduce the number of connections to the web server.
  • Server-side re-typesetting requires a large amount of server resources and is costly.
  • The object of the present invention is to provide a method for re-layout of a web page that fully adapts to the screen resolution of the device and gives the user a very good browsing experience, while retaining the information and interaction of the original web page to the greatest extent, effectively filtering out irrelevant elements in the page, improving page loading speed, and saving network bandwidth.
  • A method for re-layout of a web page, comprising the following steps:
  • A. The mobile browser client obtains the web page address;
  • B. The mobile browser client determines whether the web page corresponding to the address satisfies the identification rule; if yes, go to step C; if not, load the web page and display its content;
  • C. The mobile browser client obtains the HTML code of the web page;
  • D. The mobile browser client extracts an element containing valid information from the HTML code of the web page according to the content extraction rule, and extracts the valid information from the element;
  • E. The mobile browser client inserts the valid information of the web page into a predefined frame page to generate a new web page;
  • F. The mobile browser client loads the new web page and displays its content.
  • The identification rule includes a URL rule, a special element rule, and a web page format rule. The URL rule is implemented by a regular expression; the special element rule determines whether the current web page meets the requirements by searching for a qualifying element in the page; and the web page format rule determines whether the page meets the requirements according to the overall hierarchical structure of its elements.
  • The special element rule includes determining whether the id of the body element in the web page is a special character; the web page format rule includes determining whether the body of the web page contains two div elements.
  • The content extraction rule is implemented with XPath.
  • The content extraction rules include a content extraction rule for news content pages, one for novel reading pages, and one for forum post list pages.
  • The valid information includes internal HTML code and hyperlink information.
  • Step E further includes the following:
  • The mobile browser client inserts the valid information of the web page into the predefined frame page; in the frame page, predefined cascading style sheets, combined with the characteristics of the mobile browser client, present the typesetting style, and a new web page is generated.
  • FIG. 1 is a flow chart of web page re-typesetting in a specific embodiment of the present invention.
  • Detailed description
  • As shown in FIG. 1, the process of re-typesetting the web page includes the following steps:
  • Step 101: The mobile browser client obtains the address of the web page to be accessed.
  • Step 102: The mobile browser client determines whether the web page corresponding to the address satisfies the identification rule. If yes, the process goes to step 104; if not, it goes to step 103.
  • The identification rules are stored in the mobile browser client and include URL rules, special element rules, and web page format rules.
  • The URL rule is implemented by a regular expression.
  • The special element rule determines whether the current web page meets the requirements by finding a qualifying element in the page, for example by determining whether the id of the body element is a special character.
  • The web page format rule determines whether the page meets the requirements according to the overall hierarchical structure of its elements, for example by determining whether the body of the page contains two div elements.
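  • The three kinds of identification rule can be sketched as follows. This is only an illustration under stated assumptions: the URL pattern, the body id value, and all function and class names are hypothetical, "contains two div elements" is read here as "exactly two div elements directly under body", and requiring all three rules to match is an assumption, since the text does not say how the rules are combined. The sketch also assumes well-nested markup.

```python
import re
from html.parser import HTMLParser

# Hypothetical rule values; the patent gives no concrete pattern or id.
URL_RULE = re.compile(r"^https?://news\.example\.com/article/\d+\.html$")
SPECIAL_BODY_ID = "article-page"

VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class BodyInspector(HTMLParser):
    """Records the body id and counts div elements directly under body."""

    def __init__(self):
        super().__init__()
        self.body_id = None
        self.top_level_divs = 0
        self._depth = None  # nesting depth inside body; None outside body

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return  # void elements have no end tag, so they add no depth
        if tag == "body":
            self.body_id = dict(attrs).get("id")
            self._depth = 0
            return
        if self._depth is None:
            return
        if tag == "div" and self._depth == 0:
            self.top_level_divs += 1
        self._depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS or self._depth is None:
            return
        if tag == "body":
            self._depth = None
        elif self._depth > 0:
            self._depth -= 1


def inspect(html):
    inspector = BodyInspector()
    inspector.feed(html)
    return inspector


def matches_url_rule(url):
    # URL rule: a regular expression applied to the page address.
    return URL_RULE.search(url) is not None


def matches_special_element_rule(html):
    # Special element rule: look for a qualifying element (here, the body id).
    return inspect(html).body_id == SPECIAL_BODY_ID


def matches_format_rule(html):
    # Format rule: check the overall element hierarchy (two divs under body).
    return inspect(html).top_level_divs == 2


def satisfies_identification_rule(url, html):
    # Assumption: all three rules must hold.
    return (matches_url_rule(url)
            and matches_special_element_rule(html)
            and matches_format_rule(html))
```

  • Because the element and format rules inspect markup, this sketch takes the HTML as an argument even though the flow above obtains the HTML only in step 104.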
  • Step 103: Load the web page and display its content.
  • Step 104: The mobile browser client obtains the HTML code of the web page.
  • Step 105: The mobile browser client extracts an element containing valid information from the HTML code of the web page according to the content extraction rule, and extracts the valid information from the element. The valid information includes internal HTML code and hyperlink information.
  • The content extraction rules are stored in the mobile browser client and include a content extraction rule for news content pages, one for novel reading pages, and one for forum post list pages. Different rules need to be defined for different types of web pages.
  • A content extraction rule describes an HTML element or a set of HTML elements and is generally implemented with XPath.
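  • A minimal sketch of XPath-based extraction, using the limited XPath subset in Python's standard `xml.etree.ElementTree`. The rule table, the XPath expressions, and the function names are hypothetical, and the input is assumed to be well-formed XHTML; real-world HTML would need a tolerant parser first.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-type rules: page type -> XPath of the element(s) holding
# the valid information. The patent does not give concrete expressions.
CONTENT_EXTRACTION_RULES = {
    "news": ".//div[@id='article']",
    "novel": ".//div[@class='chapter']",
    "forum_list": ".//ul[@id='threads']",
}


def inner_html(element):
    """Serializes the children of an element (its internal HTML code)."""
    parts = [element.text or ""]
    for child in element:
        # tostring() also emits the text that follows the child (its tail).
        parts.append(ET.tostring(child, encoding="unicode"))
    return "".join(parts)


def extract_valid_info(xhtml, page_type):
    """Returns the internal HTML and hyperlinks of each matched element."""
    root = ET.fromstring(xhtml)
    results = []
    for element in root.findall(CONTENT_EXTRACTION_RULES[page_type]):
        links = [a.get("href") for a in element.iter("a") if a.get("href")]
        results.append({"html": inner_html(element), "links": links})
    return results
```

  • Keeping one expression per page type mirrors the statement above that different types of web pages need different rules.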
  • Step 106: The mobile browser client inserts the valid information of the web page into the predefined frame page. In the frame page, predefined cascading style sheets, combined with the characteristics of the mobile browser client, present the typesetting style, and a new web page is generated.
  • The characteristics of the mobile browser client include its resolution and display features.
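  • The frame page step might look like the sketch below. The template, the style values, the 480-pixel threshold, and the function name are all assumptions chosen for illustration; the patent only says that a predefined frame page and style sheet are combined with client characteristics such as resolution.

```python
import html
from string import Template

# Hypothetical predefined frame page with an embedded style sheet.
FRAME_PAGE = Template("""<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8">
<title>$title</title>
<style>
body { margin: 0; padding: ${padding}px; width: ${content_width}px; font-size: ${font_size}px; line-height: 1.5; }
img { max-width: 100%; height: auto; }
</style>
</head>
<body>
$content
</body>
</html>
""")


def build_new_page(title, valid_html, screen_width_px):
    """Inserts the extracted content into the frame page for a given screen width."""
    padding = 8
    # Content width fills the screen minus the left and right padding.
    content_width = screen_width_px - 2 * padding
    # Assumed rule of thumb: slightly larger text on wider screens.
    font_size = 18 if screen_width_px >= 480 else 16
    return FRAME_PAGE.substitute(
        title=html.escape(title),
        content=valid_html,
        padding=padding,
        content_width=content_width,
        font_size=font_size,
    )
```

  • For example, `build_new_page("Story", "<p>text</p>", 320)` yields a page whose body is 304 pixels wide, so the content fits the screen without horizontal scrolling.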
  • Step 107: The mobile browser client loads the new web page and displays its content.
  • For different types of web pages, the frame page definitions and the styles they contain are different.
  • For web pages of the same type, the same frame page and style are used, so the re-typeset pages look the same.
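  • Steps 101 to 107 can be tied together roughly as below. This is a sketch, not the patented implementation: `PageTypeRule`, `relayout`, the `fetch` callable, and the `<!--CONTENT-->` placeholder are hypothetical names, and the HTML is fetched once up front for simplicity rather than after identification.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class PageTypeRule:
    name: str                              # e.g. "news", "novel", "forum_list"
    identify: Callable[[str, str], bool]   # (url, html) -> identification rule result
    extract: Callable[[str], str]          # html -> valid information
    frame: str                             # frame page containing <!--CONTENT-->


def relayout(url: str, fetch: Callable[[str], str], rules: List[PageTypeRule]) -> str:
    """Returns the re-typeset page, or the original page if no rule matches."""
    html = fetch(url)
    for rule in rules:
        if rule.identify(url, html):
            # Same page type -> same frame page and style.
            return rule.frame.replace("<!--CONTENT-->", rule.extract(html))
    return html  # identification failed: display the page unchanged
```

  • Each page type carries its own frame page, so pages of the same type are rendered with the same frame and style.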

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Information Transfer Between Computers (AREA)
  • Document Processing Apparatus (AREA)
PCT/CN2012/072285 2011-03-14 2012-03-13 Method for re-layout of a web page Ceased WO2012122934A1 (zh)

Priority Applications (4)

Application Number Priority Date Filing Date Title
EP12757856.5A EP2687997A4 (en) 2011-03-14 2012-03-13 METHOD FOR REORGANIZING A WEBSITE
KR1020137024111A KR20140012664A (ko) 2011-03-14 2012-03-13 Web page rearrangement method
JP2013556961A JP2014514629A (ja) 2011-03-14 2012-03-13 Method for re-typesetting a web page
US14/004,410 US20140215314A9 (en) 2011-03-14 2012-03-13 Method for rearranging web page

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201110060342.9 2011-03-14
CN2011100603429A CN102622382A (zh) 2011-03-14 2011-03-14 Method for re-layout of a web page

Publications (1)

Publication Number Publication Date
WO2012122934A1 (zh) 2012-09-20

Family

ID=46562305

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2012/072285 Ceased WO2012122934A1 (zh) Method for re-layout of a web page 2011-03-14 2012-03-13

Country Status (6)

Country Link
US (1) US20140215314A9
EP (1) EP2687997A4
JP (1) JP2014514629A
KR (1) KR20140012664A
CN (1) CN102622382A
WO (1) WO2012122934A1

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014134927A1 (en) * 2013-03-08 2014-09-12 Tencent Technology (Shenzhen) Company Limited Methods and systems for loading data into terminal devices
WO2015062514A1 (en) * 2013-10-31 2015-05-07 Tencent Technology (Shenzhen) Company Limited Web content extracting method, device, and system
US9473563B2 (en) 2013-03-08 2016-10-18 Tencent Technology (Shenzhen) Company Limited Methods and systems for loading data into terminal devices
WO2024051439A1 (zh) * 2022-09-08 2024-03-14 北京有竹居网络技术有限公司 Web page generation method, apparatus, electronic device, and storage medium

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9342618B2 (en) * 2012-06-04 2016-05-17 Sap Se Web application compositon and modification editor
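CN103729370A (zh) * 2012-10-15 2014-04-16 腾讯科技(深圳)有限公司 Method and device for extracting the introduction page of an online novel
CN102955852A (zh) * 2012-11-01 2013-03-06 北京小米科技有限责任公司 Web page resource processing method, apparatus, and device
CN102968474B (zh) * 2012-11-15 2016-02-24 广东欧珀移动通信有限公司 Display method for web browsing on a mobile communication device
CN102999595B (zh) * 2012-11-16 2016-06-08 北京百度网讯科技有限公司 Method and device for providing an access page corresponding to page information
CN103020129B (zh) * 2012-11-20 2015-11-18 中兴通讯股份有限公司 Text content extraction method and device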
US10152459B2 (en) * 2013-02-20 2018-12-11 Google Llc Intelligent window placement with multiple windows using high DPI screens
US9710440B2 (en) * 2013-08-21 2017-07-18 Microsoft Technology Licensing, Llc Presenting fixed format documents in reflowed format
US20150058710A1 (en) * 2013-08-21 2015-02-26 Microsoft Corporation Navigating fixed format document in e-reader application
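CN103761257B (zh) * 2013-12-30 2017-09-22 优视科技有限公司 Web page processing method and system based on a mobile browser
CN105468629B (zh) * 2014-09-04 2019-06-14 北大方正集团有限公司 Implementation method, device, and system for a digital newspaper system on mobile devices
CN105512126A (zh) * 2014-09-24 2016-04-20 腾讯科技(深圳)有限公司 Method and device for filtering and hiding web page advertisements and for delivering filter-and-hide rules
WO2016129765A1 (ko) * 2015-02-13 2016-08-18 김효환 Web page construction apparatus and method
CN104750793A (zh) * 2015-03-12 2015-07-01 小米科技有限责任公司 Method and device for generating a page
CN105760527B (zh) * 2016-03-02 2022-09-27 百度在线网络技术(北京)有限公司 Third-party page display method and device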
US10437927B2 (en) 2017-02-09 2019-10-08 Zumobi, Inc. Systems and methods for delivering compiled-content presentations
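KR200488306Y1 (ko) 2017-03-10 2019-01-11 안홍길 Injector for gas vehicles with replaceable parts
WO2019090735A1 (zh) * 2017-11-10 2019-05-16 深圳市华阅文化传媒有限公司 Method and device for reading third-party web pages
CN107943869A (zh) * 2017-11-10 2018-04-20 深圳市华阅文化传媒有限公司 Method and device for reading third-party web pages
JP7351226B2 (ja) * 2020-01-08 2023-09-27 富士フイルムビジネスイノベーション株式会社 Display control device and display control program
CN112149021B (zh) * 2020-09-23 2021-07-30 四川天邑康和通信股份有限公司 Compatibility method using adaptive layout units in CSS Sprites technology for routers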

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
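CN101146128A (zh) * 2007-10-30 2008-03-19 杨金钰 Method for allowing small-screen mobile terminals to access and browse WWW websites
CN100392641C (zh) * 2006-08-16 2008-06-04 北京北大方正电子有限公司 Method for automatic typesetting based on clone blocks
CN101202748A (zh) * 2007-11-27 2008-06-18 优视动景(北京)技术服务有限公司 Method for browsing web pages with a micro-browser, and micro-browser
CN101859322A (zh) * 2010-05-26 2010-10-13 卓望数码技术(深圳)有限公司 Web page display method for a mobile terminal
CN101894168A (zh) * 2010-06-30 2010-11-24 优视科技有限公司 Typesetting and display method and system for web pages on a mobile terminal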

Family Cites Families (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6938073B1 (en) * 1997-11-14 2005-08-30 Yahoo! Inc. Method and apparatus for re-formatting web pages
US6857102B1 (en) * 1998-04-07 2005-02-15 Fuji Xerox Co., Ltd. Document re-authoring systems and methods for providing device-independent access to the world wide web
US20040049737A1 (en) * 2000-04-26 2004-03-11 Novarra, Inc. System and method for displaying information content with selective horizontal scrolling
JP2003122770A (ja) * 2001-10-09 2003-04-25 Mitsubishi Electric Corp Web browsing device
JP2007509402A (ja) * 2003-10-22 2007-04-12 オペラ ソフトウェア エイエスエイ Display of HTML content on a screen terminal display
JP4115375B2 (ja) * 2003-11-20 2008-07-09 キヤノン株式会社 Data processing device and data processing method
CN101071426A (zh) * 2006-05-10 2007-11-14 北京锐科天智科技有限责任公司 Personalized web page generation method and device
US20080301545A1 (en) * 2007-06-01 2008-12-04 Jia Zhang Method and system for the intelligent adaption of web content for mobile and handheld access
WO2008157322A1 (en) * 2007-06-13 2008-12-24 Quattro Wireless, Inc. Displaying content on a mobile device
US7895598B2 (en) * 2007-06-15 2011-02-22 Microsoft Corporation Page and device-optimized cascading style sheets
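CN101583072B (zh) * 2008-05-15 2011-09-21 北京凯思昊鹏软件工程技术有限公司 Middleware product for implementing Mobile Internet and method thereof
CN101286120A (zh) * 2008-05-28 2008-10-15 北京中企开源信息技术有限公司 Method and system for producing website pages
JP2010134780A (ja) * 2008-12-05 2010-06-17 Casio Computer Co Ltd Information processing device and control program therefor
CN101815093A (zh) * 2010-03-11 2010-08-25 深圳市嘉讯软件有限公司 Method for adapting web pages to mobile terminals, and mobile terminal page adaptation device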

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP2687997A4 *

Also Published As

Publication number Publication date
US20140006934A1 (en) 2014-01-02
EP2687997A4 (en) 2015-05-06
KR20140012664A (ko) 2014-02-03
US20140215314A9 (en) 2014-07-31
CN102622382A (zh) 2012-08-01
JP2014514629A (ja) 2014-06-19
EP2687997A1 (en) 2014-01-22

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 12757856; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2013556961; Country of ref document: JP; Kind code of ref document: A)
WWE WIPO information: entry into national phase (Ref document number: 14004410; Country of ref document: US)
ENP Entry into the national phase (Ref document number: 20137024111; Country of ref document: KR; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
WWE WIPO information: entry into national phase (Ref document number: 2012757856; Country of ref document: EP)