CN103544178B - Method and device for providing a reconstructed page corresponding to a target page - Google Patents

Info

Publication number
CN103544178B
Authority
CN
China
Prior art keywords
page
target pages
reconfigurable
factors
main
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210244986.8A
Other languages
Chinese (zh)
Other versions
CN103544178A (en
Inventor
张世沂
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201210244986.8A priority Critical patent/CN103544178B/en
Publication of CN103544178A publication Critical patent/CN103544178A/en
Application granted granted Critical
Publication of CN103544178B publication Critical patent/CN103544178B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G06F16/9577 Optimising the visualization of content, e.g. distillation of HTML documents

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The object of the present invention is to provide a method and device for providing a reconstructed page corresponding to a target page. Specifically, a target page to be provided to a mobile terminal is obtained; the page type information of the target page is determined; page reconstruction elements corresponding to the target page are determined according to the page type information; a reconstructed page corresponding to the target page is generated by extracting, from the target page, the page reconstruction content corresponding to the page reconstruction elements; and the reconstructed page is provided to the mobile terminal. Compared with the prior art, the invention determines the page reconstruction elements from the page type information of the target page and then generates the reconstructed page by extracting the corresponding content from the target page. This makes it possible to reconstruct pages of different types, improves the stability of page reconstruction templates, reduces communication traffic, and improves the user's browsing experience.

Description

A method and device for providing a reconstructed page corresponding to a target page
Technical field
The present invention relates to the field of mobile internet technology, and in particular to a technique for providing a mobile terminal with a reconstructed page corresponding to a target page.
Background
With the development of the mobile internet, browsing web pages and accessing information through mobile terminals has become one of the main ways in which people study and obtain information resources.
However, the content that a web page can display is limited. Besides the body content, a page is often cluttered with information that many users do not need, such as large numbers of images, navigation links and advertising links, and the display is further constrained by the small screens of mobile terminals. In addition, mobile internet access is relatively expensive and its cost depends on the volume of data transmitted, which degrades the user's reading experience. In existing approaches to converting internet pages into pages suitable for browsing on a mobile terminal, page types are usually distinguished manually, and a suitable page reconstruction template is configured separately for each type of page, such as news, novel, forum and question-and-answer pages. When the original layout of a web page changes, the page reconstruction template for that page must be configured again, which not only wastes a great deal of manpower and resources but also impairs the user's browsing experience.
Summary of the invention
The object of the present invention is to provide a method and device for providing a reconstructed page corresponding to a target page.
According to one aspect of the invention, there is provided a method for providing a mobile terminal with a reconstructed page corresponding to a target page, the method comprising the following steps:
a. obtaining a target page to be provided to the mobile terminal;
b. determining the page type information of the target page;
c. determining, according to the page type information, one or more page reconstruction elements corresponding to the target page;
d. generating, according to the one or more page reconstruction elements, a reconstructed page corresponding to the target page by extracting from the target page the page reconstruction content corresponding to the page reconstruction elements;
e. providing the reconstructed page to the mobile terminal.
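Steps a to e can be read as a simple pipeline. The following is a minimal Python sketch under that reading, not the patented implementation: the function `reconstruct`, the `ReconstructedPage` type and the four injected callables are hypothetical names introduced here for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ReconstructedPage:
    """Hypothetical container for the result of steps a to d."""
    page_type: str
    content: Dict[str, str]


def reconstruct(
    url: str,
    fetch: Callable[[str], str],
    classify: Callable[[str, str], str],
    elements_for: Callable[[str], List[str]],
    extract: Callable[[str, str], str],
) -> ReconstructedPage:
    html = fetch(url)                    # step a: obtain the target page
    page_type = classify(url, html)      # step b: determine the page type
    elements = elements_for(page_type)   # step c: look up reconstruction elements
    # step d: pull out the content that corresponds to each element
    content = {name: extract(html, name) for name in elements}
    # step e (sending the result to the mobile terminal) is left to the caller
    return ReconstructedPage(page_type, content)
```

Passing the four stages in as callables keeps the sketch independent of any particular fetching, classification or extraction technique, each of which the description leaves open.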
According to another aspect of the invention, there is also provided a page reconstruction device for providing a reconstructed page corresponding to a target page, the page reconstruction device comprising:
page acquisition means, for obtaining a target page to be provided to a mobile terminal;
type determination means, for determining the page type information of the target page;
element determination means, for determining, according to the page type information, one or more page reconstruction elements corresponding to the target page;
page generation means, for generating, according to the one or more page reconstruction elements, a reconstructed page corresponding to the target page by extracting from the target page the page reconstruction content corresponding to the page reconstruction elements;
providing means, for providing the reconstructed page to the mobile terminal.
According to a further aspect of the invention, there is also provided a browser comprising the page reconstruction device for providing a reconstructed page corresponding to a target page according to the aforementioned other aspect of the invention.
According to a further aspect of the invention, there is also provided a browser plug-in comprising the page reconstruction device for providing a reconstructed page corresponding to a target page according to the aforementioned other aspect of the invention.
Compared with the prior art, the invention determines the page reconstruction elements corresponding to a target page from the page type information of the target page, and then generates a reconstructed page corresponding to the target page by extracting from the target page the page reconstruction content corresponding to those elements. This makes it possible to reconstruct pages of different types, improves the stability of page reconstruction templates, reduces communication traffic, and improves the user's browsing experience. Further, the invention may also take the page blocks of the target page into account and provide the mobile terminal with the reconstructed block corresponding to a page block, which further shortens the time the user spends accessing a page, reduces the user's traffic, improves the efficiency with which the user accesses web pages, and improves the user's browsing experience. In addition, the invention may also take the terminal-related attributes of the mobile terminal into account and generate a reconstructed page suited to that mobile terminal, thereby further improving the user's browsing experience.
Brief description of the drawings
Other features, objects and advantages of the invention will become more apparent from the following detailed description of non-limiting embodiments, read with reference to the accompanying drawings:
Fig. 1 shows a schematic diagram of a device for providing a reconstructed page corresponding to a target page according to one aspect of the invention;
Fig. 2 shows a schematic diagram of a device for providing a reconstructed page corresponding to a target page according to a preferred embodiment of the invention;
Fig. 3 shows a flow chart of a method for providing a reconstructed page corresponding to a target page according to another aspect of the invention;
Fig. 4 shows a flow chart of a method for providing a reconstructed page corresponding to a target page according to a preferred embodiment of the invention.
The same or similar reference numerals in the drawings denote the same or similar components.
Detailed description
The invention is described in further detail below with reference to the accompanying drawings.
Fig. 1 shows a page reconstruction device 1 for providing a reconstructed page corresponding to a target page according to one aspect of the invention. The page reconstruction device 1 comprises page acquisition means 11, type determination means 12, element determination means 13, page generation means 14 and providing means 15. Specifically, the page acquisition means 11 obtains a target page to be provided to a mobile terminal; the type determination means 12 determines the page type information of the target page; the element determination means 13 determines, according to the page type information, one or more page reconstruction elements corresponding to the target page; the page generation means 14 generates, according to the one or more page reconstruction elements, a reconstructed page corresponding to the target page by extracting from the target page the page reconstruction content corresponding to the page reconstruction elements; and the providing means 15 provides the reconstructed page to the mobile terminal. Here, the mobile terminal is any electronic product capable of human-computer interaction with a user through a keyboard, touch pad, handwriting device or similar means, such as a smartphone, portable game console, PDA, Pocket PC (PPC) or tablet computer. The page reconstruction device 1 includes, but is not limited to, a mobile terminal, a network host, a single network server, a set of multiple network servers, or a cloud made up of multiple servers. Here, the cloud consists of a large number of hosts or network servers based on cloud computing, where cloud computing is a form of distributed computing: a super virtual computer consisting of a set of loosely coupled computers. Those skilled in the art will understand that the above page reconstruction device 1 is only an example; other network devices or mobile terminals, whether existing now or appearing in the future, that are applicable to the invention should also fall within the scope of protection of the invention and are incorporated herein by reference.
Specifically, the page acquisition means 11 obtains the target page to be provided to the mobile terminal through an application programming interface (API) provided by a third-party device such as a news site, novel site, question-and-answer site or forum site; or it obtains, through dynamic web page technologies such as JSP or ASP, the query sequence entered by the user on the mobile terminal, submits that query sequence to a search engine, and receives the search results that the search engine returns for the query sequence, which then serve as the target pages to be provided to the mobile terminal; or it obtains the target page to be provided to the mobile terminal through an agreed communication protocol such as HTTP or HTTPS. The target page includes, but is not limited to, at least one of the following: 1) a news page; 2) a novel page; 3) a question-and-answer page; 4) a forum page. Those skilled in the art will understand that the above target pages are only examples; other target pages, whether existing now or appearing in the future, that are applicable to the invention should also fall within the scope of protection of the invention and are incorporated herein by reference.
For example, the user enters the address http://news.sina.com.cn/ in the browser address bar and presses the Enter key, and the page acquisition means 11 obtains the web page corresponding to http://news.sina.com.cn/ through the application programming interface (API) provided by a third-party device such as the news site. As another example, the user enters the keyword "Water Margin novel" in the search box of the mobile terminal and clicks the search button. The page acquisition means 11 obtains the query sequence entered by the user from the mobile terminal through a dynamic web page technology such as JSP or ASP, submits a search request to the search engine based on that query sequence, and obtains, through the application programming interface (API) provided by the search engine, one or more search results that the search engine has matched against the keyword "Water Margin novel", such as "Water Margin txt download, Water Margin full-text reading - 'Novel Reading Net'" and "Water Margin novel online reading", as the target pages to be provided to the mobile terminal.
Those skilled in the art will understand that the above ways of obtaining the target page to be provided to the mobile terminal are only examples; other ways of obtaining the target page, whether existing now or appearing in the future, that are applicable to the invention should also fall within the scope of protection of the invention and are incorporated herein by reference.
The type determination means 12 determines the page type information of the target page from, for example, URL-related feature information of the target page's URL, such as the specific content of the URL; or from the third-party site-building tool used to build the target page, such as discuz or phpwind; or from the forum page features contained in the source code of the target page. Here, the page type information includes, but is not limited to, at least one of the following: 1) news type; 2) novel type; 3) question-and-answer type; 4) forum type. The forum page features include, but are not limited to, at least one of the following: 1) forum home page: forum name, section area name, board name, number of posts today, login, registration, search; 2) forum list page: board name, sub-section name, topic name, board topic/reply counts, topic categories, topic titles, author/time; 3) forum post page: post author, posting time, post body, floor number, pagination links. For example, suppose the URL of the target page obtained by the page acquisition means 11 is http://news.sina.com.cn/; the type determination means 12 then determines that the target page is a news page from URL-related feature information contained in the specific content of http://news.sina.com.cn/, such as "news". As another example, suppose the target page obtained by the page acquisition means 11 is the New Oriental study-abroad site http://www.66xue.com/, and suppose the page is built as an SNS+BBS interactive platform using Discuz!; the type determination means 12 then determines that the page type information of the page is forum type from the site-building tool Discuz! used to build http://www.66xue.com/. As a further example, suppose the target page obtained by the page acquisition means 11 is http://bbs.sina.com.cn/; the type determination means 12 then determines that the page type information of the page is forum type from the forum page features contained in the source code of the target page, such as forum lists and forum posts. Those skilled in the art will understand that the above page type information and forum page features are only examples; other page type information or forum page features, whether existing now or appearing in the future, that are applicable to the invention should also fall within the scope of protection of the invention and are incorporated herein by reference.
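The three signals described above (a keyword in the URL, a site-building tool, forum features in the source code) can be combined in a small classifier. This is a sketch only: the keyword table, the builder markers and the English feature strings are hypothetical stand-ins chosen to mirror the examples in the text, and the two-feature threshold is an arbitrary assumption.

```python
from typing import Optional
from urllib.parse import urlparse

# Hypothetical keyword table, mirroring the URL examples in the description.
URL_KEYWORDS = {"news": "news", "tianyabook": "novel"}
# Hypothetical markers that a forum site-building tool might leave in the HTML.
BUILDER_MARKERS = ("discuz", "phpwind")
# Hypothetical textual stand-ins for forum page features.
FORUM_FEATURES = ("post author", "post time", "floor", "reply")


def classify(url: str, html: str) -> Optional[str]:
    """Return a page type, or None when no signal applies."""
    host = (urlparse(url).hostname or "").lower()
    for keyword, page_type in URL_KEYWORDS.items():
        if keyword in host:                      # signal 1: URL content
            return page_type
    lowered = html.lower()
    if any(marker in lowered for marker in BUILDER_MARKERS):
        return "forum"                           # signal 2: site-building tool
    hits = sum(1 for feature in FORUM_FEATURES if feature in lowered)
    if hits >= 2:                                # signal 3: forum page features
        return "forum"
    return None
```

A real system would need far more robust feature detection; the point here is only the order in which the signals are consulted.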
Preferably, the type determination means 12 may also determine the page type information of the target page according to whether the target page obtained by the page acquisition means 11 satisfies predetermined type judgment rules;
wherein the predetermined type judgment rules include at least one of the following:
when the target page is a page built with a forum site-building tool, or the source code of the target page contains forum page features, the page type information of the target page is determined to be forum page;
when the URL of the target page belongs to a page type database, the page type information of the target page is determined according to the page type database;
when there is a reference page similar to the URL of the target page, the page type information of the target page is determined according to the page type information of the reference page;
when the URL of the target page contains URL-related feature information, the page type information of the target page is determined according to the URL-related feature information;
when the URL of the target page matches a predetermined web page template, the page type information of the target page is determined according to the predetermined web page template.
For example, consider the case where the predetermined type judgment rule is that the target page is a page built with a forum site-building tool or that the source code of the target page contains forum page features. Here, forum site-building tools include, for example, discuz and phpwind. Suppose the target page obtained by the page acquisition means 11 is the New Oriental study-abroad site http://www.66xue.com/, and suppose the page is built as an SNS+BBS interactive platform using Discuz!; the type determination means 12 then determines that the page type information of the page is forum type from the site-building tool Discuz! used to build http://www.66xue.com/. Suppose instead that the target page obtained by the page acquisition means 11 is http://bbs.sina.com.cn/; the type determination means 12 then determines that the page type information of the page is forum type from the forum page features contained in the source code of the target page, such as forum lists and forum posts.
As another example, consider the case where the predetermined type judgment rule is that the URL of the target page belongs to a page type database. Suppose the URL of the target page obtained by the page acquisition means 11 is http://news.163.com/12/0604/02/834D02M300014AED.html. The type determination means 12 computes the URL pattern of this URL and obtains http://news\.163\.com/[0-9]+/[0-9]+/[0-9]+/[0-9a-zA-Z]+\.html. Based on this URL pattern, it runs a matching query against a page type database such as a news database and finds that the news database contains an entry with the value http://news\.163\.com/[0-9]+/[0-9]+/[0-9]+/[0-9a-zA-Z]+\.html; the type determination means 12 therefore judges that the page type information of http://news.163.com/12/0604/02/834D02M300014AED.html is news type. As another example, consider the case where the predetermined type judgment rule is that there is a reference page similar to the URL of the target page. Suppose the target page obtained by the page acquisition means 11 is http://news.sina.com.cn/china/; the type determination means 12 then judges that the page type information of http://news.sina.com.cn/china/ is news type from the page type information (news type) of a reference page similar to the target page, such as http://news.sina.com.cn/.
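The description does not say how the URL pattern is computed, so the following Python sketch is one plausible reading: each path segment made up only of digits is generalized to `[0-9]+`, each alphanumeric segment containing a digit to `[0-9a-zA-Z]+`, and everything else is kept literally. The names `url_pattern`, `lookup_type` and `TYPE_DATABASE` are hypothetical, and the one-entry database is illustrative.

```python
import re
from typing import Dict, Optional
from urllib.parse import urlparse


def _generalize(name: str) -> str:
    """Replace an ID-like path segment with a character-class pattern."""
    if re.fullmatch(r"[0-9]+", name):
        return "[0-9]+"
    if re.fullmatch(r"[0-9a-zA-Z]+", name) and re.search(r"[0-9]", name):
        return "[0-9a-zA-Z]+"
    return re.escape(name)


def url_pattern(url: str) -> str:
    """Generalize a URL into a regular-expression-style URL pattern."""
    parts = urlparse(url)
    segments = []
    for seg in parts.path.split("/"):
        if not seg:
            continue
        stem, dot, ext = seg.rpartition(".")
        if dot:  # keep the file extension literal, generalize the stem
            segments.append(_generalize(stem) + re.escape(dot + ext))
        else:
            segments.append(_generalize(seg))
    return f"{parts.scheme}://{re.escape(parts.netloc)}/" + "/".join(segments)


# Hypothetical page type database keyed by URL pattern.
TYPE_DATABASE: Dict[str, str] = {
    r"http://news\.163\.com/[0-9]+/[0-9]+/[0-9]+/[0-9a-zA-Z]+\.html": "news",
}


def lookup_type(url: str, database: Dict[str, str] = TYPE_DATABASE) -> Optional[str]:
    return database.get(url_pattern(url))
```

Under these assumptions, any article URL on the same site with the same shape maps to the same pattern, so one database entry covers the whole family of pages.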
As a further example, consider the case where the predetermined type judgment rule is that the URL of the target page contains URL-related feature information. Here, the URL-related feature information includes, but is not limited to, at least one of the following: 1) URL content, that is, everything that makes up the URL, such as the protocol type, host name, path and file name; 2) URL suffix, that is, the characters at the end of the URL, such as htm, html, shtml, asp, jsp or php; 3) URL depth, that is, the directory level of the URL, the link depth between pages, and so on; 4) URL pattern, that is, the URL pattern of a page type obtained by clustering multiple pages that have been labelled with that page type. Suppose the URL of the target page obtained by the page acquisition means 11 is http://www.tianyabook.com/; the type determination means 12 then determines that the page type information of the target page is novel type from URL-related feature information contained in the specific content of http://www.tianyabook.com/, such as "tianyabook". Those skilled in the art will understand that the above URL-related feature information is only an example; other URL-related feature information, whether existing now or appearing in the future, that is applicable to the invention should also fall within the scope of protection of the invention and is incorporated herein by reference.
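The first three kinds of URL-related feature information can be read directly off a parsed URL. The sketch below assumes that "depth" means the number of non-empty path segments, which is one possible interpretation of directory level; the function name `url_features` and the dictionary keys are hypothetical.

```python
from typing import Dict, Union
from urllib.parse import urlparse


def url_features(url: str) -> Dict[str, Union[str, int]]:
    """Extract simple URL-related features: content, suffix and depth."""
    parts = urlparse(url)
    segments = [s for s in parts.path.split("/") if s]
    # Treat the last segment as a file name only if it has an extension.
    filename = segments[-1] if segments and "." in segments[-1] else ""
    suffix = filename.rpartition(".")[2] if filename else ""
    return {
        "scheme": parts.scheme,        # protocol type
        "host": parts.hostname or "",  # host name
        "path": parts.path,
        "filename": filename,
        "suffix": suffix,              # e.g. htm, html, shtml, asp, jsp, php
        "depth": len(segments),        # assumed: count of path segments
    }
```

These features could then feed whatever rule or model decides the page type; the fourth feature, the URL pattern, requires labelled pages and is outside this sketch.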
As a further example, consider the case where the predetermined type judgment rule is that the URL of the target page matches a predetermined web page template. Suppose the URL of the target page obtained by the page acquisition means 11 is http://xinzhi.baidu.com/pub?next=%2F. The type determination means 12 judges from http://xinzhi.baidu.com/pub?next=%2F that it matches a predetermined web page template, namely the question-and-answer template, and therefore determines that the type information of the target page is question-and-answer type.
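The text does not specify what a predetermined web page template looks like or how matching is done. As one illustrative possibility, the sketch below treats each template as a shell-style wildcard over the URL; the template table and the name `match_template` are assumptions made for this example.

```python
from fnmatch import fnmatchcase
from typing import Optional, Sequence, Tuple

# Hypothetical template table: (wildcard over the URL, page type).
PAGE_TEMPLATES: Sequence[Tuple[str, str]] = (
    ("http://xinzhi.baidu.com/*", "qa"),
)


def match_template(
    url: str, templates: Sequence[Tuple[str, str]] = PAGE_TEMPLATES
) -> Optional[str]:
    """Return the page type of the first template the URL matches."""
    for pattern, page_type in templates:
        if fnmatchcase(url, pattern):
            return page_type
    return None
```

Wildcards are only one option; a template could equally be a regular expression or a structural description of the page.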
Those skilled in the art will understand that the above ways of determining the page type information of the target page are only examples; other ways of determining the page type information of the target page, whether existing now or appearing in the future, that are applicable to the invention should also fall within the scope of protection of the invention and are incorporated herein by reference.
Those skilled in the art will understand that the type determination means 12 may also determine the page type information of the target page according to any combination of the above predetermined type judgment rules.
Those skilled in the art will understand that the above predetermined type judgment rules are only examples; other predetermined type judgment rules, whether existing now or appearing in the future, that are applicable to the invention should also fall within the scope of protection of the invention and are incorporated herein by reference.
The element determination means 13 determines, according to the page type information, one or more page reconstruction elements corresponding to the target page, for example by means of a correspondence between the page type information field contained in the URL of the target page and the configured page reconstruction elements. Here, the page reconstruction elements include key page information such as the page body content, page reconstruction nodes and page reconstruction blocks. For example, suppose the page type information of a target page determined by the type determination means 12, such as http://news.sina.com.cn/, is news type. The one or more page reconstruction elements that the element determination means 13 determines for that target page then include the blocks of different content in the target page, such as "top news, domestic news channel, world news channel, sports channel, finance channel", together with the page tags for the news headline, body, source and publication time contained in the target page, such as the heading tags <h1>-<h6>, the document body tag <body> and the paragraph tag <p>, and the corresponding text content. As another example, suppose the page type information of a target page determined by the type determination means 12, such as http://bbs.dospy.com/, is forum type. The one or more page reconstruction elements that the element determination means 13 determines for that target page then include, for the forum home page, the forum name "Symbian Smartphone Net" and section areas such as "Nokia WP7 discussion area, Windows Phone 7 operating system discussion area, Apple iPhone model discussion area, Android discussion area/popular, Android discussion area/Motorola, Symbian^3 (symbian3) model discussion area"; for the forum list page, items such as sub-section names, topic categories, section topic/reply counts and author/time; and for the forum post page, items such as post author, posting time and post body. As another example, suppose the page type information of a target page determined by the type determination means 12, such as http://xinzhi.baidu.com/, is question-and-answer type. The one or more page reconstruction elements that the element determination means 13 determines for that target page then include the blocks of different content in the target page, such as the home page, plaza/popular Q&A, plaza/latest questions and discover/browse. As a further example, suppose the page type information of a target page determined by the type determination means 12, such as http://www.readnovel.com/book/73144/, is novel type. The one or more page reconstruction elements that the element determination means 13 determines for that target page then include, for the novel cover page, items such as the novel title "Work title: Dream of the Red Chamber", "Author: Cao Xueqin", the synopsis and the update time 2010-04-04 18:02:10; for the novel table of contents, entries such as "23. Chapter 23: Fine lines from The Romance of the Western Chamber convey playful words; sensuous songs from The Peony Pavilion stir a tender heart"; and for the novel text, items such as the chapter title "Chapter 23: Fine lines from The Romance of the Western Chamber convey playful words; sensuous songs from The Peony Pavilion stir a tender heart", the novel body content "After Jia Yuanchun returned to the palace following her visit to Grand View Garden that day, she ordered that all the verses composed that day be copied out in order by Tanchun ..., truly: no heart for dressing at dawn or embroidering at night; facing the moon and the breeze, only sorrow remains.", "Work title: Dream of the Red Chamber", "Author: Cao Xueqin", and the chapter links such as "[Previous page] [Back to contents] [Next page]". Here, the correspondence between the page type information field and the configured page reconstruction elements may be stored in the form of a table or database on the page reconstruction device 1 itself, or on a third-party device connected to the page reconstruction device 1 over a network.
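The correspondence between page type and page reconstruction elements is described as a table or database. A minimal in-memory version might look like the following sketch, where the type keys and element names are hypothetical labels loosely based on the examples above rather than identifiers from the patent.

```python
from typing import Dict, List

# Hypothetical correspondence table: page type -> page reconstruction elements.
ELEMENTS_BY_TYPE: Dict[str, List[str]] = {
    "news": ["headline", "body", "source", "publish_time"],
    "forum": ["post_author", "post_time", "post_body"],
    "qa": ["home", "popular_questions", "latest_questions"],
    "novel": ["chapter_title", "chapter_body", "chapter_links"],
}


def elements_for(page_type: str) -> List[str]:
    """Return the elements configured for a page type (empty if unknown)."""
    # Return a copy so that callers cannot mutate the shared table.
    return list(ELEMENTS_BY_TYPE.get(page_type, []))
```

Keeping the mapping as data rather than code matches the statement that it may live in a table or database, possibly on a separate networked device.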
Those skilled in the art will understand that the above ways of determining the one or more page reconstruction elements corresponding to the target page are only examples; other ways of determining the one or more page reconstruction elements corresponding to the target page, whether existing now or appearing in the future, that are applicable to the invention should also fall within the scope of protection of the invention and are incorporated herein by reference.
Those skilled in the art will understand that the above correspondence between page type information and page reconstruction elements is only an example; other correspondences between page type information and page reconstruction elements, whether existing now or appearing in the future, that are applicable to the invention should also fall within the scope of protection of the invention and are incorporated herein by reference.
Next, the page generation means 14 generates a reconstructed page corresponding to the target page, according to the one or more page reconstruction elements determined by the element determination means 13, by extracting from the target page the page reconstruction content corresponding to the page reconstruction elements. Specifically, the page generation means 14 first extracts from the target page the page reconstruction content corresponding to the one or more page reconstruction elements determined by the element determination means 13, for example by parsing the HTML of the target page. For example, suppose the one or more page reconstruction elements that the element determination means 13 has determined for the target page http://www.readnovel.com/novel/73144/23.html include the novel text of the target page, such as the chapter title "Chapter 23: Fine lines from The Romance of the Western Chamber convey playful words; sensuous songs from The Peony Pavilion stir a tender heart", the novel body content "After Jia Yuanchun returned to the palace following her visit to Grand View Garden that day, she ordered that all the verses composed that day be copied out in order by Tanchun ..., truly: no heart for dressing at dawn or embroidering at night; facing the moon and the breeze, only sorrow remains.", and the chapter links such as "[Previous page] [Back to contents] [Next page]". The page generation means 14 then extracts the page reconstruction content corresponding to these page reconstruction elements, such as the specific text content, for example by parsing the HTML document of the target page.
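Extraction by parsing the HTML can be sketched with Python's standard `html.parser` module. The sketch assumes, purely for illustration, that each reconstruction element corresponds to a single tag name (for example `<h1>` for the chapter title and `<p>` for the body); real pages would need richer selectors. `ElementExtractor` and `extract` are hypothetical names.

```python
from html.parser import HTMLParser
from typing import Dict, List, Optional


class ElementExtractor(HTMLParser):
    """Collect the text inside tags that map to reconstruction elements."""

    def __init__(self, tag_to_element: Dict[str, str]) -> None:
        super().__init__()
        self.tag_to_element = tag_to_element
        self.current: Optional[str] = None
        self.chunks: Dict[str, List[str]] = {
            name: [] for name in tag_to_element.values()
        }

    def handle_starttag(self, tag, attrs):
        if tag in self.tag_to_element:
            self.current = self.tag_to_element[tag]

    def handle_endtag(self, tag):
        if tag in self.tag_to_element and self.tag_to_element[tag] == self.current:
            self.current = None

    def handle_data(self, data):
        text = data.strip()
        if self.current and text:
            self.chunks[self.current].append(text)


def extract(html: str, tag_to_element: Dict[str, str]) -> Dict[str, str]:
    parser = ElementExtractor(tag_to_element)
    parser.feed(html)
    parser.close()
    return {name: "\n".join(parts) for name, parts in parser.chunks.items()}
```

Text outside the mapped tags, such as advertising or navigation blocks, is simply never collected, which is where the traffic saving described in the text would come from.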
Those skilled in the art will understand that the above way of extracting, from the target page, the page reconstruction content corresponding to the page reconstruction elements is only an example. Other ways of extracting such content from the target page, whether existing now or arising in the future, that are applicable to the present invention also fall within its scope of protection and are incorporated herein by reference.
Then, webpage generating device 14 arranges the page reconstruction content according to a predefined page reconstruction mode, or according to the original layout of the target page, to generate the reconstruction page corresponding to the target page. Continuing the example above, webpage generating device 14 takes the extracted reconstruction content, namely the chapter title "Chapter 23: Fine Words from The Romance of the Western Chamber Pass as Playful Banter; Moving Songs from The Peony Pavilion Stir a Tender Heart", the novel body "After Jia Yuanchun returned to the palace from her visit to the Grand View Garden that day, she ordered that all the verses composed that day be copied out in order by Tanchun ..., indeed: no heart for the morning toilette or the evening embroidery; facing the moon and the wind, only sorrow remains.", and the novel chapter links "[Previous page] [Back to contents] [Next page]", and arranges it in a predefined order: chapter title, novel text, chapter links.
Those skilled in the art will understand that the above way of generating the reconstruction page corresponding to the target page is only an example. Other ways of generating a reconstruction page corresponding to the target page, whether existing now or arising in the future, that are applicable to the present invention also fall within its scope of protection and are incorporated herein by reference.
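The extract-then-arrange step described above can be illustrated with a minimal Python sketch. It is not the patented implementation: the tag and class selectors in ELEMENT_SELECTORS, the sample markup in SAMPLE, and all function names are assumptions made for illustration, and the sketch assumes well-formed HTML.

```python
from html.parser import HTMLParser

# Hypothetical selectors: reconstruction element -> (tag, class) expected to hold it.
ELEMENT_SELECTORS = {
    "chapter_title": ("h1", None),
    "novel_text": ("div", "content"),
    "chapter_links": ("div", "nav"),
}
# Predefined order from the example: chapter title, novel text, chapter links.
PREDEFINED_ORDER = ["chapter_title", "novel_text", "chapter_links"]
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class ElementExtractor(HTMLParser):
    """Collects the text found inside the tags named in ELEMENT_SELECTORS."""

    def __init__(self):
        super().__init__()
        self.content = {name: [] for name in ELEMENT_SELECTORS}
        self._open = []  # one entry per open tag: matched element name or None

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return  # void tags never get a closing tag, so do not track them
        css_class = dict(attrs).get("class")
        matched = None
        for name, (want_tag, want_class) in ELEMENT_SELECTORS.items():
            if tag == want_tag and (want_class is None or want_class == css_class):
                matched = name
        self._open.append(matched)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._open:
            self._open.pop()

    def handle_data(self, data):
        text = data.strip()
        if not text:
            return
        # Attribute the text to the innermost enclosing matched element, if any.
        for name in reversed(self._open):
            if name is not None:
                self.content[name].append(text)
                break


def build_reconstruction_page(html):
    """Extract the content of each element and arrange it in the predefined order."""
    parser = ElementExtractor()
    parser.feed(html)
    parts = [" ".join(parser.content[name]) for name in PREDEFINED_ORDER]
    return "\n".join(part for part in parts if part)


# Hypothetical target page: the elements appear out of order, next to an advertisement.
SAMPLE = (
    '<html><body><div class="ad">Buy now</div>'
    '<div class="nav"><a href="22.html">Previous</a> <a href="24.html">Next</a></div>'
    '<div class="content"><p>Body text.</p></div>'
    "<h1>Chapter 23</h1></body></html>"
)
print(build_reconstruction_page(SAMPLE))
```

In this sketch the advertisement text is dropped because it matches no reconstruction element, and the output order follows PREDEFINED_ORDER rather than the order in the source markup.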
Preferably, webpage generating device 14 may also, according to the one or more page reconstruction elements, extract from the target page the page reconstruction content corresponding to the page reconstruction elements and, in combination with the terminal-related attributes of the mobile terminal, generate a reconstruction page corresponding to the mobile terminal. Specifically, webpage generating device 14 first extracts from the target page the page reconstruction content corresponding to the one or more page reconstruction elements determined by element determining device 13, and then combines it with the terminal-related attributes of the mobile terminal to generate the reconstruction page corresponding to the mobile terminal. The terminal-related attributes include at least any one of the following:
The page visibility region of the mobile terminal;
The screen available work region of the mobile terminal;
The screen resolution of the mobile terminal;
The system configuration attribute of the mobile terminal.
For example, consider generating the reconstruction page corresponding to the mobile terminal when the terminal-related attributes include the page visibility region of the mobile terminal. Assume that element determining device 13 determines that the one or more page reconstruction elements corresponding to the target page http://tech.sina.com.cn/i/m/2012-05-31/03497194247.shtml include the news headline of the target page, "Internet Queen's D10 report: mobile Internet users will surpass desktop users", the publication time "03:49, May 31, 2012", the news source "http://www.sina.com.cn", and the news body "Sina Tech, morning of May 31, Beijing time: Mary Meeker, partner at the well-known Silicon Valley venture capital firm Kleiner Perkins Caufield Byers (hereinafter "KPCB") and known as the "Queen of the Internet", said at the D10 conference on Wednesday ..., Facebook also needs to have a sound "war chest". (wind vertical bamboo flute Vygen)". Webpage generating device 14 may obtain the page visibility region of the mobile terminal through the JS resources in the HTML document of the target page, for example obtaining the width of the page visibility region from availWidth=parseInt(document.body.clientWidth) and the height of the page visibility region from availHeight=parseInt(document.body.clientHeight). Webpage generating device 14 then uses availWidth and availHeight to generate the reconstruction page corresponding to the mobile terminal. As another example, consider generating the reconstruction page corresponding to the mobile terminal when the terminal-related attributes include the system configuration attributes of the mobile terminal, such as the operating system type and version and the processor configuration information. Assume that the system configuration attributes of the mobile terminal include "dual-core 1.2 GHz". Webpage generating device 14 then determines from the system configuration attributes that the mobile terminal is a high-end device, and the reconstruction page generated for the mobile terminal includes the page reconstruction content extracted from the target page according to the one or more page reconstruction elements determined by element determining device 13; for a news-type page, this includes the news headline, news body, news source, and news publication time. Assume instead that the system configuration attributes of the mobile terminal include "1 GHz Qualcomm Snapdragon processor, running the Android 2.3 operating system". Webpage generating device 14 then determines from the system configuration attributes that the mobile terminal is a low-end device, and the reconstruction page generated for the mobile terminal includes the page information of the target page with advertisements removed, that is, all content other than advertisements; for a news-type page, this includes the news headline, news pictures, news body, news source, and news publication time.
Those skilled in the art will understand that the above way of generating the reconstruction page corresponding to the mobile terminal in combination with the terminal-related attributes of the mobile terminal is only an example. Other ways of doing so, whether existing now or arising in the future, that are applicable to the present invention also fall within its scope of protection and are incorporated herein by reference.
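As a rough illustration of combining terminal-related attributes with the reconstruction step, the following Python sketch classifies a terminal and selects the content listed for each case in the example above. The class name, field names, and the classification threshold are assumptions; the threshold was chosen only so that the two example configurations from the text classify as described.

```python
from dataclasses import dataclass


@dataclass
class TerminalAttributes:
    avail_width: int   # width of the page visibility region, in pixels
    avail_height: int  # height of the page visibility region, in pixels
    cpu_cores: int
    cpu_ghz: float


# Content included per device class for a news-type page, as listed in the example.
CONTENT_BY_CLASS = {
    "high-end": ["news_headline", "news_body", "news_source", "publication_time"],
    "low-end": ["news_headline", "news_picture", "news_body", "news_source",
                "publication_time"],
}


def classify_terminal(attrs):
    """Assumed rule: dual-core at 1.2 GHz or better counts as high-end."""
    if attrs.cpu_cores >= 2 and attrs.cpu_ghz >= 1.2:
        return "high-end"
    return "low-end"


def plan_reconstruction(attrs):
    """Return the device class, the content to include, and the target viewport."""
    device_class = classify_terminal(attrs)
    return {
        "device_class": device_class,
        "content": CONTENT_BY_CLASS[device_class],
        "viewport": (attrs.avail_width, attrs.avail_height),
    }


dual_core = TerminalAttributes(avail_width=480, avail_height=800, cpu_cores=2, cpu_ghz=1.2)
single_core = TerminalAttributes(avail_width=320, avail_height=480, cpu_cores=1, cpu_ghz=1.0)
print(plan_reconstruction(dual_core))
print(plan_reconstruction(single_core))
```

The viewport values stand in for the availWidth and availHeight values that the text obtains from the page's JS resources; how they influence the final layout is left open here.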
Providing device 15 then provides the reconstruction page generated by webpage generating device 14 to the mobile terminal through an agreed communication mode, such as the HTTP or HTTPS communication protocol, for the user to read and browse.
The devices of page reconstructing arrangement 1 work together continuously. Specifically, page acquisition device 11 continuously obtains the target pages to be provided to the mobile terminal; type determination device 12 continuously determines the page type information of the target pages; element determining device 13 continuously determines, according to the page type information, one or more page reconstruction elements corresponding to the target pages; webpage generating device 14 continuously extracts from the target pages, according to the one or more page reconstruction elements, the page reconstruction content corresponding to the page reconstruction elements and generates the reconstruction pages corresponding to the target pages; and providing device 15 continuously provides the reconstruction pages to the mobile terminal. Here, those skilled in the art will understand that "continuously" means that the devices of page reconstructing arrangement 1 keep performing, respectively, the acquisition of target pages, the determination of page type information, the determination of page reconstruction elements, and the generation and provision of reconstruction pages, until page reconstructing arrangement 1 stops acquiring target pages for a relatively long time.
Preferably, element determining device 13 may also determine, according to the page type information, one or more page reconstruction elements corresponding to the target page together with their page reconstruction pattern, for example through a mapping between the page type information field contained in the URL of the target page and the configured page reconstruction elements and their page reconstruction patterns. Here, the page reconstruction pattern includes but is not limited to: 1) page layout; 2) page presentation mode. For example, assume that type determination device 12 determines that the page type information of the target page, such as http://news.sina.com.cn/, is the news type. The one or more page reconstruction elements that element determining device 13 determines for the target page then include the page blocks for the different content of the target page, such as "top news, domestic news channel, international news channel, sports channel, finance channel", and the page tags contained in the target page for the news headline, news body, news source, and publication time, such as the heading tags <h1>-<h6>, the document body tag <body>, and the paragraph tag <p>, together with the corresponding text content. The corresponding page reconstruction pattern determined by element determining device 13 includes, for example, arranging the top news, domestic news channel, international news channel, sports channel, and finance channel from top to bottom, with each channel containing the headline text and headline links. As another example, assume that type determination device 12 determines that the page type information of the target page, such as http://bbs.dospy.com/, is the forum type. The one or more page reconstruction elements that element determining device 13 determines for the target page then include, on the forum home page of the target page, the forum title "Symbian Smartphone Network" and the board sections, such as the Nokia WP7 discussion section, the Windows Phone 7 operating system discussion area, the Apple iPhone model discussion area, the Android discussion area/popular, the Android discussion area/Motorola, and the Symbian 3 (symbian3) model discussion area. The corresponding page reconstruction pattern determined by element determining device 13 includes, for example, arranging the Nokia WP7 discussion section, the Windows Phone 7 operating system discussion area, the Apple iPhone model discussion area, the Android discussion area/popular, the Android discussion area/Motorola, and the Symbian 3 (symbian3) model discussion area from top to bottom, with each section containing the discussion title text and title links. As a further example, assume that type determination device 12 determines that the page type information of the target page, such as http://www.readnovel.com/novel/73144/23.html, is the novel type. The one or more page reconstruction elements that element determining device 13 determines for the target page then include the novel text of the target page, such as the chapter title "Chapter 23: Fine Words from The Romance of the Western Chamber Pass as Playful Banter; Moving Songs from The Peony Pavilion Stir a Tender Heart" and the novel body "After Jia Yuanchun returned to the palace from her visit to the Grand View Garden that day, she ordered that all the verses composed that day be copied out in order by Tanchun ..., indeed: no heart for the morning toilette or the evening embroidery; facing the moon and the wind, only sorrow remains.", and the novel chapter link block, such as "[Previous page] [Back to contents] [Next page]". The corresponding page reconstruction pattern determined by element determining device 13 includes, for example, arranging the content in the order: chapter title, novel text, chapter links. Here, the mapping between the page type information field and the configured page reconstruction elements and their page reconstruction patterns may be stored as a table or a database on page reconstructing arrangement 1, or on a third-party device connected to page reconstructing arrangement 1 through a network.
Those skilled in the art will understand that the above way of determining the one or more page reconstruction elements corresponding to the target page and their page reconstruction pattern is only an example. Other ways of determining the one or more page reconstruction elements corresponding to the target page or their page reconstruction pattern, whether existing now or arising in the future, that are applicable to the present invention also fall within its scope of protection and are incorporated herein by reference.
Those skilled in the art will understand that the above mapping between page type information and page reconstruction elements and their page reconstruction patterns is only an example. Other mappings between page type information and page reconstruction elements or their page reconstruction patterns, whether existing now or arising in the future, that are applicable to the present invention also fall within its scope of protection and are incorporated herein by reference.
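The table-based mapping from a page type field in the URL to page reconstruction elements and a page reconstruction pattern could be sketched in Python as follows. The token table TYPE_FIELDS, the element identifiers, and the function names are assumptions for illustration; the text does not specify how the type field is located within the URL.

```python
from urllib.parse import urlparse

# Hypothetical URL tokens that act as the page type information field.
TYPE_FIELDS = {"news": "news", "bbs": "forum", "novel": "novel"}

# Mapping table: page type -> reconstruction elements and reconstruction pattern.
RECONSTRUCTION_TABLE = {
    "news": {
        "elements": ["top_news", "domestic_channel", "international_channel",
                     "sports_channel", "finance_channel"],
        "pattern": "channels top to bottom, each with headline text and link",
    },
    "forum": {
        "elements": ["forum_title", "board_sections"],
        "pattern": "sections top to bottom, each with title text and link",
    },
    "novel": {
        "elements": ["chapter_title", "novel_text", "chapter_links"],
        "pattern": "chapter title, then novel text, then chapter links",
    },
}


def page_type_from_url(url):
    """Return the page type named by the first known token in the host or path."""
    parts = urlparse(url)
    tokens = (parts.hostname or "").split(".")
    tokens += [segment for segment in parts.path.split("/") if segment]
    for token in tokens:
        if token in TYPE_FIELDS:
            return TYPE_FIELDS[token]
    return None


def lookup_reconstruction(url):
    """Look up the elements and pattern for a target page, or None if unknown."""
    page_type = page_type_from_url(url)
    return RECONSTRUCTION_TABLE.get(page_type)


print(page_type_from_url("http://www.readnovel.com/novel/73144/23.html"))
print(lookup_reconstruction("http://news.sina.com.cn/")["pattern"])
```

Keeping the table as plain data mirrors the statement that the mapping may live in a table or database, locally or on a networked third-party device.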
Then, webpage generating device 14 first extracts from the target page, according to the one or more page reconstruction elements, the page reconstruction content corresponding to the page reconstruction elements; it then generates the reconstruction page corresponding to the target page from the page reconstruction content in combination with the page reconstruction pattern. For example, assume that the page reconstruction content that webpage generating device 14 extracts from the target page, such as http://www.readnovel.com/novel/73144/23.html, for the page reconstruction elements chapter title, novel text, and chapter links includes the chapter title "Chapter 23: Fine Words from The Romance of the Western Chamber Pass as Playful Banter; Moving Songs from The Peony Pavilion Stir a Tender Heart", the novel body "After Jia Yuanchun returned to the palace from her visit to the Grand View Garden that day, she ordered that all the verses composed that day be copied out in order by Tanchun ..., indeed: no heart for the morning toilette or the evening embroidery; facing the moon and the wind, only sorrow remains.", and the novel chapter links "[Previous page] [Back to contents] [Next page]". Webpage generating device 14 then arranges this content according to the page reconstruction pattern determined by element determining device 13, for example in the order: chapter title, novel text, chapter links.
Preferably, page reconstructing arrangement 1 further includes a block acquisition device (not shown). Specifically, the block acquisition device obtains the page blocks of the target page; webpage generating device 14, according to the one or more page reconstruction elements, extracts from the page blocks the page reconstruction content corresponding to the page reconstruction elements and generates reconstructed blocks corresponding to the page blocks; and providing device 15 provides the reconstructed blocks to the mobile terminal.
Specifically, the block acquisition device obtains the page blocks of the target page obtained by page acquisition device 11, based on an HTML tag analysis method or on the VIPS (Vision-based Page Segmentation) algorithm. For example, following the VIPS algorithm, the block acquisition device uses visual features such as the foreground color, background color, font color and size of the web page, borders, and the spacing and element positions between logical blocks, and, by establishing relevant rules, divides the target page http://news.sina.com.cn/ obtained by page acquisition device 11 into visual information blocks, such as a news topic block, a news body block, a navigation block, and an advertisement block. Those skilled in the art will understand that the above way of obtaining the page blocks of the target page is only an example. Other ways of obtaining the page blocks of the target page, whether existing now or arising in the future, that are applicable to the present invention also fall within its scope of protection and are incorporated herein by reference.
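For the HTML-tag-based alternative mentioned above, a very simple Python sketch is to treat each outermost block-level container as one page block. This is only one possible reading of "HTML tag analysis" and is not VIPS, which relies on rendered visual features; the set BLOCK_TAGS, the sample markup in PAGE, and the function names are assumptions, and well-formed HTML is assumed.

```python
from html.parser import HTMLParser

BLOCK_TAGS = {"div", "section", "article", "nav", "table", "ul"}  # assumed containers
VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class TopLevelSegmenter(HTMLParser):
    """Splits a page into blocks, one per outermost block-level container."""

    def __init__(self):
        super().__init__()
        self.blocks = []  # each block: {"id": ..., "text": [...]}
        self._depth = 0   # nesting depth inside the current block (0 = outside)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        if self._depth == 0:
            if tag in BLOCK_TAGS:
                self.blocks.append({"id": dict(attrs).get("id"), "text": []})
                self._depth = 1
        else:
            self._depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._depth > 0:
            self._depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._depth > 0 and text:
            self.blocks[-1]["text"].append(text)


def segment_page(html):
    """Return (block id, block text) pairs for the outermost containers."""
    segmenter = TopLevelSegmenter()
    segmenter.feed(html)
    return [(block["id"], " ".join(block["text"])) for block in segmenter.blocks]


# Hypothetical page with a topic block, a body block, and a navigation block.
PAGE = (
    '<html><body><div id="topic"><h1>Top story</h1></div>'
    '<div id="body"><p>Para one.</p><p>Para two.</p></div>'
    '<div id="nav"><a href="/">Home</a></div></body></html>'
)
print(segment_page(PAGE))
```

A production segmenter would also need to cope with malformed markup and with layouts where visual blocks do not line up with container tags, which is the case VIPS is designed for.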
Then, webpage generating device 14 first extracts from the page blocks, according to the one or more page reconstruction elements determined by element determining device 13, the page reconstruction content corresponding to the page reconstruction elements, and then generates the reconstructed blocks corresponding to the page blocks. For example, assume that element determining device 13 determines that the page reconstruction elements corresponding to the target page http://news.sina.com.cn/ include the page channel blocks for the different content of the target page, such as "top news, domestic news channel, international news channel, sports channel, finance channel", and the page tags contained in the different channel blocks of the target page for the news headline, news body, news source, and publication time, such as the heading tags <h1>-<h6>, the document body tag <body>, and the paragraph tag <p>, together with the corresponding text content. Continuing the example above, webpage generating device 14 first extracts, for example by parsing the HTML of the target page, the page reconstruction content corresponding to the reconstruction elements determined by element determining device 13 from the page blocks obtained by the block acquisition device, such as the news topic block, news body block, navigation block, and advertisement block. For instance, the page reconstruction content that webpage generating device 14 extracts from the news topic block for the page reconstruction element "page channel block" includes the news contained in the top news channel of the target page http://news.sina.com.cn/, such as "Forum on deepening the online media 'Go, Transform, Change' campaign held in Beijing" and "Children's Day special topic". As another example, the page reconstruction content that webpage generating device 14 extracts from the news body block for the page reconstruction element "page channel block" includes the news contained in the sports channel of the target page http://news.sina.com.cn/, such as "Durant scores 22 as Thunder pull back to 1-2; GDP all off form as Spurs' 20-game winning streak ends" and "Euro finals 16-team squad lists announced: all 368 players at a glance, and you can edit too".
Then, webpage generating device 14 arranges the page reconstruction content extracted from the page blocks for the page reconstruction elements according to a predefined page block reconstruction mode, or according to the original layout of the page blocks, to generate the reconstructed blocks corresponding to the page blocks.
Those skilled in the art will understand that the above way of generating the reconstructed blocks corresponding to the page blocks is only an example. Other ways of generating reconstructed blocks corresponding to the page blocks, whether existing now or arising in the future, that are applicable to the present invention also fall within its scope of protection and are incorporated herein by reference.
Providing device 15 then provides the reconstructed blocks generated by webpage generating device 14 to the mobile terminal through an agreed communication mode, such as the HTTP or HTTPS communication protocol, for the user to read and browse.
Preferably, providing device 15 may also provide the reconstructed blocks generated by webpage generating device 14 to the mobile terminal according to the block importance of the reconstructed blocks, through an agreed communication mode, such as the HTTP or HTTPS communication protocol, for the user to read and browse. Here, the block importance includes but is not limited to at least any one of the following: 1) the ratio of the text characters of the reconstructed block to the text characters of the entire <body> block; 2) the ratio of the text characters without links in the reconstructed block to the total text characters of the entire page; 3) the ratio of the block area of the reconstructed block to the area of the entire <body> block of the page. Those skilled in the art will understand that the above block importance measures are only examples. Other block importance measures, whether existing now or arising in the future, that are applicable to the present invention also fall within its scope of protection and are incorporated herein by reference.
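The three block importance ratios listed above can be computed directly once the character counts and areas are known. The Python sketch below does so for hypothetical blocks; the Block fields and sample numbers are assumptions, and averaging the three ratios into a single ordering score is an assumption of this sketch, since the text does not say how the ratios are combined.

```python
from dataclasses import dataclass


@dataclass
class Block:
    name: str
    text_chars: int       # text characters in the reconstructed block
    link_text_chars: int  # text characters in the block that sit inside links
    area: float           # rendered area of the block


def block_importance(block, body_text_chars, page_text_chars, body_area):
    """Return the three ratios listed in the text as a tuple."""
    return (
        block.text_chars / body_text_chars,                            # ratio 1
        (block.text_chars - block.link_text_chars) / page_text_chars,  # ratio 2
        block.area / body_area,                                        # ratio 3
    )


def order_by_importance(blocks, body_text_chars, page_text_chars, body_area):
    """Sort blocks from most to least important (mean of the ratios, an assumption)."""
    def score(block):
        ratios = block_importance(block, body_text_chars, page_text_chars, body_area)
        return sum(ratios) / len(ratios)
    return sorted(blocks, key=score, reverse=True)


# Hypothetical measurements for two blocks of one page.
nav_block = Block("nav", text_chars=100, link_text_chars=100, area=1000.0)
body_block = Block("body", text_chars=800, link_text_chars=40, area=6000.0)
ordered = order_by_importance([nav_block, body_block], 1000, 1000, 10000.0)
print([block.name for block in ordered])
```

A navigation block made up entirely of links scores zero on the second ratio, which is why link-free text is a useful signal for separating body content from navigation.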
In a preferred embodiment (refer to Fig. 1), page reconstructing arrangement 1 includes page acquisition device 11, type determination device 12, element determining device 13, webpage generating device 14, and providing device 15, wherein element determining device 13 includes a model determination unit (not shown) and a node extraction unit (not shown). The preferred embodiment is described below with reference to Fig. 1. Specifically, page acquisition device 11 obtains the target page to be provided to the mobile terminal; type determination device 12 determines the page type information of the target page; the model determination unit determines the common page document object model corresponding to the page type information; the node extraction unit extracts the page reconstruction nodes of the target page according to the common page document object model, to serve as the page reconstruction elements; webpage generating device 14, according to the one or more page reconstruction elements, extracts from the target page the page reconstruction content corresponding to the page reconstruction elements and generates the reconstruction page corresponding to the target page; and providing device 15 provides the reconstruction page to the mobile terminal. Page acquisition device 11, type determination device 12, webpage generating device 14, and providing device 15 are identical or similar to the corresponding devices shown in Fig. 1, so they are not described again here and are incorporated herein by reference.
Specifically, the model determination unit identifies, for example, the DOM tree nodes that share a common node path across multiple pages corresponding to the page type information, and then, based on those DOM tree nodes, determines the common page document object model corresponding to the page type information. For example, assume that the multiple pages corresponding to the type information, such as the novel type, are:
A: Chapter 9: Instructor Lin at the Mountain Spirit Temple in Wind and Snow; Captain Lu Sets Fire to the Fodder Depot
http://www.readnovel.com/novel/73145/12.html
B: Water Margin, Chapter 10: Instructor Lin at the Mountain Spirit Temple in Wind and Snow; Captain Lu Sets Fire to the Fodder Depot
http://www.purepen.com/shz/010.htm
C: Chapter 82: The Liangshan Marsh Shares Out Its Gold and Holds a Great Market; Song Gongming and His Whole Band Accept Amnesty
http://www.cuiweiju.com/fulltext/97/97926.html#5383832
If these pages have DOM tree nodes with a common node path, such as D1-Dn, the model determination unit generates a corresponding DOM tree, such as DOM-D, from D1-Dn, to serve as the common page document object model Common-DOM-D corresponding to novel-type pages.
Those skilled in the art will understand that the above way in which the model determination unit determines the common page document object model corresponding to the page type information is only an example. Other ways of determining the common page document object model corresponding to the page type information, whether existing now or arising in the future, that are applicable to the present invention also fall within its scope of protection and are incorporated herein by reference.
Then, the node extraction unit extracts the page reconstruction nodes of the target page according to the common page document object model, to serve as the page reconstruction elements. For example, assume that the target page obtained by page acquisition device 11 is a novel-type page, such as Water Margin, Chapter 17: The Flowery Monk Single-Handedly Attacks Two-Dragon Mountain; The Blue-Faced Beast and He Together Seize Baozhu Temple, at http://www.purepen.com/shz/017.htm. According to the common page document object model Common-DOM-D that the model determination unit determined for the page type information, such as the novel type, the node extraction unit extracts from the DOM tree of the page the nodes whose node name and node XPath are both identical to those of nodes in Common-DOM-D, to serve as the page reconstruction nodes and thus as the page reconstruction elements.
Preferably, the model determination unit first extracts the common nodes of multiple reference pages corresponding to the page type information, according to the document object model of each of the reference pages, and then generates the common page document object model corresponding to the page type information. For example, assume that the multiple reference pages corresponding to the page type information, such as the news type, are:
I: sina news homepage http://news.sina.com.cn/,
II: sina domestic news http://news.sina.com.cn/china/,
III: sina international news http://news.sina.com.cn/world/,
IV: sohu news homepage http://news.sohu.com/,
The model determination unit first parses the HTML document of each of the reference pages and converts the HTML tags into nodes of the corresponding DOM tree, thereby generating the respective DOM trees DOM-I, DOM-II, DOM-III, and DOM-IV. By extracting the nodes whose node name and node XPath are identical across DOM-I, DOM-II, DOM-III, and DOM-IV, it obtains the common nodes of the reference pages, such as E1-En. The model determination unit then generates, from the common nodes E1-En, the common page document object model corresponding to the page type information, such as Common-DOM-E.
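The common-node idea described above, keeping the nodes whose name and XPath are identical across all reference pages and then selecting the matching nodes of a target page, can be sketched in Python as follows. The positional XPath format, the sample markup, and the function names are assumptions for illustration; well-formed HTML is assumed.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "hr", "img", "input", "link", "meta"}


class XPathCollector(HTMLParser):
    """Records a positional XPath, such as /html[1]/body[1]/div[2], for every element."""

    def __init__(self):
        super().__init__()
        self.paths = set()
        self._steps = []      # XPath steps of the currently open elements
        self._counts = [{}]   # per-level counters of child tags seen so far

    def handle_starttag(self, tag, attrs):
        siblings = self._counts[-1]
        siblings[tag] = siblings.get(tag, 0) + 1
        step = f"{tag}[{siblings[tag]}]"
        self.paths.add("/" + "/".join(self._steps + [step]))
        if tag not in VOID_TAGS:
            self._steps.append(step)
            self._counts.append({})

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._steps:
            self._steps.pop()
            self._counts.pop()


def node_paths(html):
    collector = XPathCollector()
    collector.feed(html)
    return collector.paths


def common_dom(reference_pages):
    """Nodes whose name and XPath are identical in every reference page."""
    return set.intersection(*(node_paths(page) for page in reference_pages))


def reconstruction_nodes(target_page, common_model):
    """Nodes of the target page that also appear in the common model."""
    return node_paths(target_page) & common_model


# Hypothetical reference pages of one page type, and a target page of the same type.
REF_A = "<html><body><h1>T1</h1><div><p>a</p></div><div>ad</div></body></html>"
REF_B = "<html><body><h1>T2</h1><div><p>b</p><p>c</p></div></body></html>"
TARGET = "<html><body><h1>T3</h1><div><p>x</p></div><span>extra</span></body></html>"

model = common_dom([REF_A, REF_B])
print(sorted(reconstruction_nodes(TARGET, model)))
```

Because the last XPath step carries the tag name, comparing XPaths here covers both the node-name and node-XPath conditions in the text.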
Fig. 2 shows a page reconstructing arrangement 1 for providing a reconstruction page corresponding to a target page according to a preferred embodiment of the present invention. Page reconstructing arrangement 1 includes page acquisition device 11', type determination device 12', element determining device 13', webpage generating device 14', and providing device 15'. The preferred embodiment is described below with reference to Fig. 2. Specifically, page acquisition device 11' obtains the target page to be provided to the mobile terminal; type determination device 12' determines the page type information of the target page; element determining device 13' performs a matching query in a page reconstruction element database according to the page type information, to obtain one or more page reconstruction elements corresponding to the target page; webpage generating device 14', according to the one or more page reconstruction elements, extracts from the target page the page reconstruction content corresponding to the page reconstruction elements and generates the reconstruction page corresponding to the target page; and providing device 15' provides the reconstruction page to the mobile terminal. Page acquisition device 11', type determination device 12', webpage generating device 14', and providing device 15' are identical or similar to the corresponding devices shown in Fig. 1, so they are not described again here and are incorporated herein by reference.
Specifically, element determining device 13' performs a matching query in the page reconstruction element database according to the page type information, to obtain one or more page reconstruction elements corresponding to the target page. For example, assume that the target page obtained by page acquisition device 11' is http://xinzhi.baidu.com/pub?next=%2F and that type determination device 12' determines that the page type information of the target page is the question-and-answer type. Element determining device 13' then performs a matching query in the page reconstruction element database according to the question-and-answer type determined by type determination device 12', to obtain the one or more page reconstruction elements corresponding to the target page. Here, the page reconstruction element database may be located in page reconstructing arrangement 1, or in a third-party device, such as a server, connected to page reconstructing arrangement 1 through a network.
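A matching query against a page reconstruction element database might look like the following Python sketch, which uses an in-memory SQLite database. The table name, column names, and element identifiers are assumptions; the rows reuse the news and novel elements named elsewhere in the text. The text does not list the elements for the question-and-answer type, so no rows are invented for it here.

```python
import sqlite3


def open_element_database():
    """Create an in-memory element database with an assumed schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE page_elements (page_type TEXT, position INTEGER, element TEXT)"
    )
    rows = [
        ("news", 1, "news_headline"),
        ("news", 2, "news_body"),
        ("news", 3, "news_source"),
        ("news", 4, "publication_time"),
        ("novel", 1, "chapter_title"),
        ("novel", 2, "novel_text"),
        ("novel", 3, "chapter_links"),
    ]
    conn.executemany("INSERT INTO page_elements VALUES (?, ?, ?)", rows)
    return conn


def query_elements(conn, page_type):
    """Return the reconstruction elements stored for a page type, in order."""
    cursor = conn.execute(
        "SELECT element FROM page_elements WHERE page_type = ? ORDER BY position",
        (page_type,),
    )
    return [row[0] for row in cursor]


conn = open_element_database()
print(query_elements(conn, "novel"))
print(query_elements(conn, "question-and-answer"))  # empty until rows are added
```

The same query would work unchanged against a database hosted on a networked server, which matches the two deployment options described in the text.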
Preferably, page reconstructing arrangement 1 further includes classification acquisition device 16', element acquisition device 17', and database update device 18'. Specifically, classification acquisition device 16' classifies multiple training pages that have been labeled with a page type according to page type, to obtain one or more page classifications, wherein each page classification includes at least one training page; element acquisition device 17' obtains, from the training pages included in a page classification and through predetermined page element training rules, the one or more page reconstruction elements corresponding to the page type of that page classification; and database update device 18' establishes or updates the page reconstruction element database according to the one or more page reconstruction elements corresponding to the page type of the page classification.
Specifically, the classification acquisition device 16' classifies a plurality of training pages labeled with page types by page type, obtaining one or more page classes, wherein each page class includes at least one of the training pages. For example, assume that there are a plurality of training pages labeled with page types, such as:
V: sina sports news, http://sports.sina.com.cn/, news type
VI: sina finance news, http://finance.sina.com.cn/, news type
VII: sina/reading/novel shop/world classics/"The Count of Monte Cristo", http://vip.book.sina.com.cn/book/index_81300.html, novel type
VIII: sina/reading/book serialization/novels/local novels/"Ordinary World", http://vip.book.sina.com.cn/book/index_86819.html, novel type
IX: sohu/reading/book serialization/literature collection/classical fiction/"Romance of the Sui and Tang Dynasties" (full text), http://lz.book.sohu.com/serialize-id-13706.html, novel type
The classification acquisition device 16' then classifies these labeled training pages by page type, obtaining one or more page classes, such as the news type pages V and VI and the novel type pages VII, VIII and IX, wherein each page class includes at least one training page.
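The classification of labeled training pages into page classes amounts to grouping by label. A minimal sketch follows, using the five example URLs from the text; the function name and the short type labels "news" and "novel" are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def group_by_page_type(labelled_pages: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (url, page_type) pairs into page classes keyed by page type."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for url, page_type in labelled_pages:
        groups[page_type].append(url)
    return dict(groups)


# Training pages V to IX from the example, with their labeled page types.
training_pages = [
    ("http://sports.sina.com.cn/", "news"),
    ("http://finance.sina.com.cn/", "news"),
    ("http://vip.book.sina.com.cn/book/index_81300.html", "novel"),
    ("http://vip.book.sina.com.cn/book/index_86819.html", "novel"),
    ("http://lz.book.sohu.com/serialize-id-13706.html", "novel"),
]
page_classes = group_by_page_type(training_pages)
```

Because a class is only created when a page carries its label, every resulting page class contains at least one training page, as the text requires.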
Then, according to the training pages included in the page classes obtained by the classification acquisition device 16', the element acquisition device 17' obtains, by predetermined page element training rules, one or more page reconstruction elements corresponding to the page type of each page class. The predetermined page element training rules include at least any one of the following:
performing Bayesian estimation analysis on the training pages in the page class, to obtain one or more page reconstruction elements corresponding to the page type of the page class;
performing maximum likelihood estimation analysis on the training pages in the page class, to obtain one or more page reconstruction elements corresponding to the page type of the page class.
For example, continuing the example above, the element acquisition device 17' takes the training pages included in the page classes obtained by the classification acquisition device 16', such as the training pages V and VI in the news type class and the training pages VII, VIII and IX in the novel type class, and performs Bayesian estimation analysis on the training pages in each page class, or maximum likelihood estimation analysis on the training data of the plurality of page nodes, so as to obtain one or more page reconstruction elements corresponding to the page type of the page class. For instance, the page reconstruction elements corresponding to the news type include the news subject block, the body block, and page tags for the news headline, body text and the like, such as the heading tags <h1>-<h6>, the document body tag <body>, the paragraph tag <p> and their corresponding text content; the page reconstruction elements corresponding to the novel type include the novel text, the novel author, chapter directory links and the like.
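The text names Bayesian estimation and maximum likelihood estimation without fixing a model. One possible reading, sketched below purely as an assumption, treats each candidate element (for example a tag) as present or absent in a training page and estimates its presence probability per page class: the maximum likelihood estimate is count/n, and the Bayesian estimate is the posterior mean under a Beta(alpha, beta) prior. The function names and the selection threshold are hypothetical.

```python
from typing import Dict, List, Set


def estimate_element_probabilities(pages: List[Set[str]],
                                   prior_alpha: float = 0.0,
                                   prior_beta: float = 0.0) -> Dict[str, float]:
    """Estimate how likely each candidate element is to appear in a page class.

    pages: one set of observed candidate elements per training page.
    With both prior parameters at 0 this is the maximum likelihood estimate
    (count / n); with positive parameters it is the posterior mean under a
    Beta(prior_alpha, prior_beta) prior, i.e. a Bayesian estimate.
    """
    n = len(pages)
    if n == 0:
        return {}
    counts: Dict[str, int] = {}
    for features in pages:
        for feature in features:
            counts[feature] = counts.get(feature, 0) + 1
    denominator = n + prior_alpha + prior_beta
    return {f: (c + prior_alpha) / denominator for f, c in counts.items()}


def select_elements(pages: List[Set[str]], threshold: float = 0.5,
                    prior_alpha: float = 0.0, prior_beta: float = 0.0) -> List[str]:
    """Keep the candidates whose estimated probability reaches the threshold."""
    probabilities = estimate_element_probabilities(pages, prior_alpha, prior_beta)
    return sorted(f for f, p in probabilities.items() if p >= threshold)
```

With two news training pages, one containing {"h1", "p", "ad"} and the other {"h1", "p"}, the maximum likelihood estimate gives 1.0 for "h1" and "p" and 0.5 for "ad", so a threshold of 0.6 keeps only the heading and paragraph tags.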
Those skilled in the art will understand that the above manner of obtaining page reconstruction elements according to predetermined page element training rules is only an example; other existing or future manners of obtaining page reconstruction elements according to predetermined page element training rules, if applicable to the present invention, should also be included within the scope of protection of the present invention and are incorporated herein by reference.
The database update device 18' establishes or updates the page reconstruction element database according to the one or more page reconstruction elements corresponding to the page type of the page class. For example, according to the one or more page reconstruction elements obtained by the element acquisition device 17' for the page type of each page class, the database update device 18' establishes a page reconstruction element database containing the correspondence between each page type and its page reconstruction elements.
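The establish-or-update behaviour can be sketched with an embedded SQL store. The table name, column names and helper functions below are assumptions chosen for illustration; the specification only requires that the database hold the correspondence between a page type and its page reconstruction elements.

```python
import sqlite3
from typing import Dict, List


def update_element_database(conn: sqlite3.Connection,
                            elements_by_type: Dict[str, List[str]]) -> None:
    """Create the table if needed, then replace the stored elements per page type."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS page_elements ("
        "page_type TEXT NOT NULL, element TEXT NOT NULL, "
        "PRIMARY KEY (page_type, element))"
    )
    for page_type, elements in elements_by_type.items():
        # Updating a page type discards its previously trained elements.
        conn.execute("DELETE FROM page_elements WHERE page_type = ?", (page_type,))
        conn.executemany(
            "INSERT OR IGNORE INTO page_elements (page_type, element) VALUES (?, ?)",
            [(page_type, element) for element in elements],
        )
    conn.commit()


def query_elements(conn: sqlite3.Connection, page_type: str) -> List[str]:
    """Return the stored elements for a page type in alphabetical order."""
    rows = conn.execute(
        "SELECT element FROM page_elements WHERE page_type = ? ORDER BY element",
        (page_type,),
    ).fetchall()
    return [row[0] for row in rows]
```

Calling update_element_database a second time for the same page type overwrites its earlier entries, which matches the "update" half of "establish or update".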
In another preferred embodiment, the above page reconstruction device 1 for providing a reconstructed page corresponding to a target page may be combined with an existing browser to constitute a new browser. Existing browsers include, for example, the IE browser of Microsoft, the Netscape browser of Netscape, the Firefox browser of Mozilla, the Chrome browser of Google, the Maxthon browser of Maxthon, the Opera browser of Opera, the 360 browser of 360, the Sogou browser of Sohu, and the Tencent TT browser of Tencent.
In another preferred embodiment, the above page reconstruction device 1 for providing a reconstructed page corresponding to a target page may be combined with an existing browser plug-in to constitute a new browser plug-in. Existing browser plug-ins include, for example, the Flash plug-in, the RealPlayer plug-in, the MMS plug-in, the MIDI staff-notation plug-in, and ActiveX plug-ins.
Fig. 3 shows a flowchart of a method for providing a reconstructed page corresponding to a target page according to another aspect of the present invention.
Specifically, in step S1, the page reconstruction device 1 obtains the target page to be provided to a mobile terminal; in step S2, the page reconstruction device 1 determines the page type information of the target page; in step S3, the page reconstruction device 1 determines, according to the page type information, one or more page reconstruction elements corresponding to the target page; in step S4, the page reconstruction device 1 generates, according to the one or more page reconstruction elements, a reconstructed page corresponding to the target page by extracting from the target page the page reconstruction content corresponding to the page reconstruction elements; in step S5, the page reconstruction device 1 provides the reconstructed page to the mobile terminal. Here, the mobile terminal is any electronic product that can interact with a user through a keyboard, a touch pad, a handwriting device or the like, such as a smartphone, a portable game console, a PDA, a Pocket PC (PPC) or a tablet computer. The page reconstruction device 1 includes but is not limited to a mobile terminal, a network host, a single network server, a set of multiple network servers, or a cloud formed by multiple servers. Here, the cloud consists of a large number of hosts or network servers based on cloud computing, where cloud computing is a form of distributed computing: a super virtual computer composed of a set of loosely coupled computers. Those skilled in the art will understand that the above page reconstruction device 1 is only an example; other existing or future network devices or mobile terminals, if applicable to the present invention, should also be included within the scope of protection of the present invention and are incorporated herein by reference.
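Steps S1 to S5 form a linear pipeline, which the following sketch makes explicit. Each stage is passed in as a callable, so the function and parameter names here are hypothetical placeholders for whatever concrete acquisition, classification, extraction and delivery mechanisms an implementation chooses.

```python
from typing import Callable, Dict, List

Content = Dict[str, List[str]]


def reconstruct_page(url: str,
                     fetch: Callable[[str], str],
                     classify: Callable[[str, str], str],
                     elements_for: Callable[[str], List[str]],
                     extract: Callable[[str, List[str]], Content],
                     render: Callable[[Content], str],
                     deliver: Callable[[str], None]) -> str:
    """Run the five steps in order and return the reconstructed page."""
    target_page = fetch(url)                   # S1: obtain the target page
    page_type = classify(url, target_page)     # S2: determine the page type
    elements = elements_for(page_type)         # S3: determine reconstruction elements
    content = extract(target_page, elements)   # S4: extract reconstruction content
    reconstructed = render(content)            # S4: generate the reconstructed page
    deliver(reconstructed)                     # S5: provide it to the mobile terminal
    return reconstructed
```

Keeping the stages as injected callables means the same driver works whether the device runs on the terminal itself, on a single server, or in a cloud deployment.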
Specifically, in step S1, the page reconstruction device 1 obtains the target page to be provided to the mobile terminal through an application programming interface (API) provided by a third-party device such as a news website, a novel website, a question-and-answer website or a forum website; or, through dynamic web page technologies such as JSP or ASP, obtains the query sequence that the user inputs via the mobile terminal, submits the query sequence to a search engine, and receives the search results fed back by the search engine for that query sequence as the target page to be provided to the mobile terminal; or obtains the target page to be provided to the mobile terminal through an agreed communication protocol such as http or https. The target page includes but is not limited to at least any one of the following: 1) a news page; 2) a novel page; 3) a question-and-answer page; 4) a forum page. Those skilled in the art will understand that the above target pages are only examples; other existing or future target pages, if applicable to the present invention, should also be included within the scope of protection of the present invention and are incorporated herein by reference.
For example, the user enters the address http://news.sina.com.cn/ in the browser address bar and presses the Enter key; in step S1, the page reconstruction device 1 obtains the web page corresponding to the address http://news.sina.com.cn/ through an application programming interface (API) provided by a third-party device such as the news website. As another example, the user enters the keyword "Water Margin novel" in the search box of the mobile terminal and clicks the search button; in step S1, the page reconstruction device 1 obtains the query sequence input by the user from the mobile terminal through dynamic web page technologies such as JSP or ASP, submits a search request to the search engine based on the query sequence, and obtains, through an application programming interface (API) provided by the search engine, one or more search results that the search engine has matched to the keyword "Water Margin novel", such as "Water Margin txt download, Water Margin full-text reading - 'Novel Reading Net'" and "Water Margin novel online reading", as the target pages to be provided to the mobile terminal.
Those skilled in the art will understand that the above manners of obtaining the target page to be provided to the mobile terminal are only examples; other existing or future manners of obtaining the target page to be provided to the mobile terminal, if applicable to the present invention, should also be included within the scope of protection of the present invention and are incorporated herein by reference.
In step S2, the page reconstruction device 1 determines the page type information of the target page, for example from URL-related feature information of the URL corresponding to the target page, such as the specific content of the URL; or from the third-party site-building tool used to build the target page, such as discuz or phpwind; or from forum page features contained in the source code of the target page. Here, the page type information includes but is not limited to at least any one of the following: 1) news type; 2) novel type; 3) question-and-answer type; 4) forum type. The forum page features include but are not limited to at least any one of the following: 1) forum homepage: forum title, section area title, board title, number of posts today, login, registration, search; 2) forum list page: board title, sub-section title, topic name, number of board topics/replies, topic category, topic title, author/time; 3) forum post page: post author, posting time, post text, floor number, page-turning links. For example, assume that in step S1 the URL of the target page obtained by the page reconstruction device 1 is http://news.sina.com.cn/; then in step S2 the page reconstruction device 1 determines that the type of the target page is news, according to URL-related feature information such as "news" contained in the specific content of http://news.sina.com.cn/. As another example, assume that in step S1 the target page obtained by the page reconstruction device 1 is the New Oriental study-abroad site http://www.66xue.com/, and that this page uses Discuz! to build an SNS+BBS interactive platform; then in step S2 the page reconstruction device 1 determines that the page type information of the page is the forum type, according to the site-building tool Discuz! used to build the page http://www.66xue.com/. As a further example, assume that in step S1 the target page obtained by the page reconstruction device 1 is http://bbs.sina.com.cn/; then in step S2 the page reconstruction device 1 determines that the page type information of the page is the forum type, according to forum page features contained in the source code of the target page, such as a forum list and forum posts. Those skilled in the art will understand that the above page type information and forum page features are only examples; other existing or future page type information or forum page features, if applicable to the present invention, should also be included within the scope of protection of the present invention and are incorporated herein by reference.
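One way to detect a forum site-building tool from page source is sketched below. It assumes the tool announces itself in a generator meta tag; that is an assumption about the page, not something the text specifies, and a real detector would likely combine several signals. The class and function names are hypothetical.

```python
from html.parser import HTMLParser

# Tool names taken from the examples in the text, lower-cased for matching.
FORUM_BUILD_TOOLS = ("discuz", "phpwind")


class GeneratorFinder(HTMLParser):
    """Record the content of a <meta name="generator"> tag, if present."""

    def __init__(self) -> None:
        super().__init__()
        self.generator = ""

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        if (attributes.get("name") or "").lower() == "generator":
            self.generator = (attributes.get("content") or "").lower()


def built_with_forum_tool(html_source: str) -> bool:
    """Return True when the generator meta tag names a known forum tool."""
    finder = GeneratorFinder()
    finder.feed(html_source)
    finder.close()
    return any(tool in finder.generator for tool in FORUM_BUILD_TOOLS)
```

A page for which this returns True would be assigned the forum type in step S2; pages without such a tag fall through to the other signals, such as forum page features in the source code.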
Preferably, in step S2, the page reconstruction device 1 may also determine the page type information of the target page according to whether the target page obtained in step S1 satisfies a predetermined type judgment rule;
wherein the predetermined type judgment rule includes at least any one of the following:
when the target page is a page built with a forum site-building tool, or the source code of the target page contains forum page features, determining that the page type information of the target page is the forum type;
when the URL corresponding to the target page belongs to a page type database, determining the page type information of the target page according to the page type database;
when there is a reference page whose URL is similar to the URL corresponding to the target page, determining the page type information of the target page according to the page type information of the reference page;
when the URL corresponding to the target page contains URL-related feature information, determining the page type information of the target page according to the URL-related feature information;
when the URL corresponding to the target page matches a predetermined web page template, determining the page type information of the target page according to the predetermined web page template.
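The rules listed above can be applied in sequence, with the first rule that yields a type deciding the result. The sketch below shows that arrangement under stated assumptions: the rule signature, the exact-URL page type database and the host-prefix keyword rule are all hypothetical simplifications.

```python
from typing import Callable, Dict, Iterable, Optional
from urllib.parse import urlparse

# A rule inspects the URL and page source and returns a page type or None.
Rule = Callable[[str, str], Optional[str]]

# Hypothetical page type database keyed by exact URL.
PAGE_TYPE_DATABASE: Dict[str, str] = {"http://www.tianyabook.com/": "novel"}


def database_rule(url: str, html_source: str) -> Optional[str]:
    """Look the URL up in the page type database."""
    return PAGE_TYPE_DATABASE.get(url)


def url_keyword_rule(url: str, html_source: str) -> Optional[str]:
    """Use a keyword in the host name as URL-related feature information."""
    host = urlparse(url).hostname or ""
    if host.startswith("news."):
        return "news"
    if host.startswith("bbs."):
        return "forum"
    return None


def determine_page_type(url: str, html_source: str,
                        rules: Iterable[Rule]) -> Optional[str]:
    """Return the type from the first rule that matches, or None if none does."""
    for rule in rules:
        page_type = rule(url, html_source)
        if page_type is not None:
            return page_type
    return None
```

The order of the rules list sets their priority, so a deployment can favour the page type database over keyword heuristics simply by listing it first.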
For example, consider the case where the predetermined type judgment rule is that the target page is a page built with a forum site-building tool or that the source code of the target page contains forum page features, and the page reconstruction device 1 determines the page type information of the target page in step S2. Here, forum site-building tools include, for example, discuz and phpwind. Assume that in step S1 the target page obtained by the page reconstruction device 1 is the New Oriental study-abroad site http://www.66xue.com/, and that this page uses Discuz! to build an SNS+BBS interactive platform; then in step S2 the page reconstruction device 1 determines that the page type information of the page is the forum type, according to the site-building tool Discuz! used to build the page http://www.66xue.com/. Assume instead that in step S1 the target page obtained by the page reconstruction device 1 is http://bbs.sina.com.cn/; then in step S2 the page reconstruction device 1 determines that the page type information of the page is the forum type, according to forum page features contained in the source code of the target page, such as a forum list and forum posts.
As another example, consider the case where the predetermined type judgment rule is that the URL corresponding to the target page belongs to a page type database. Assume that in step S1 the URL of the target page obtained by the page reconstruction device 1 is http://news.163.com/12/0604/02/834D02M300014AED.html. In step S2, the page reconstruction device 1 computes the URL pattern of this URL, obtaining http://news\.163\.com/[0-9]+/[0-9]+/[0-9]+/[0-9a-zA-Z]+\.html as the URL pattern of the page http://news.163.com/12/0604/02/834D02M300014AED.html. Based on this URL pattern, it performs a matching query in a page type database such as a news library, and finds that the news library contains an entry whose value is http://news\.163\.com/[0-9]+/[0-9]+/[0-9]+/[0-9a-zA-Z]+\.html; the page reconstruction device 1 therefore judges that the page type information of the page http://news.163.com/12/0604/02/834D02M300014AED.html is the news type. As a further example, consider the case where the predetermined type judgment rule is that there is a reference page whose URL is similar to the URL corresponding to the target page. Assume that in step S1 the target page obtained by the page reconstruction device 1 is http://news.sina.com.cn/china/; then in step S2 the page reconstruction device 1 judges that the page type information of the target page http://news.sina.com.cn/china/ is the news type, according to the page type information (news type) of a reference page similar to the target page, such as http://news.sina.com.cn/.
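The text does not say how the URL pattern is computed. The sketch below shows one plausible generalisation that reproduces the pattern given in the example: the host is kept literal with its dots escaped, all-digit path segments become [0-9]+, and mixed alphanumeric file stems become [0-9a-zA-Z]+. The function names and the generalisation rules themselves are assumptions.

```python
import re
from urllib.parse import urlparse


def _generalise(stem: str) -> str:
    """Replace a variable-looking path component with a character class."""
    if re.fullmatch(r"[0-9]+", stem):
        return "[0-9]+"
    if re.fullmatch(r"[0-9a-zA-Z]+", stem) and any(c.isdigit() for c in stem):
        return "[0-9a-zA-Z]+"
    return re.escape(stem)  # plain words such as "book" stay literal


def url_pattern(url: str) -> str:
    """Derive a regular-expression URL pattern from a concrete URL."""
    parts = urlparse(url)
    pattern_segments = []
    for segment in parts.path.split("/"):
        if not segment:
            continue
        stem, dot, suffix = segment.rpartition(".")
        if dot:
            pattern_segments.append(_generalise(stem) + re.escape("." + suffix))
        else:
            pattern_segments.append(_generalise(segment))
    return parts.scheme + "://" + re.escape(parts.netloc) + "/" + "/".join(pattern_segments)


# Hypothetical news library holding known news URL patterns.
NEWS_LIBRARY = {r"http://news\.163\.com/[0-9]+/[0-9]+/[0-9]+/[0-9a-zA-Z]+\.html"}

pattern = url_pattern("http://news.163.com/12/0604/02/834D02M300014AED.html")
is_news = pattern in NEWS_LIBRARY
```

Because the result is a valid regular expression, the same pattern can also be used with re.fullmatch to test other article URLs on the same site.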
As a further example, consider the case where the URL corresponding to the target page contains URL-related feature information, and the page reconstruction device 1 determines the page type information of the target page in step S2. Here, the URL-related feature information includes but is not limited to at least any one of the following: 1) the specific content of the URL, that is, the full content making up the URL, such as the protocol type, host name, path and file name; 2) the URL suffix, that is, the characters at the end of the URL, such as htm, html, shtml, asp, jsp and php; 3) the URL depth, that is, the directory level of the URL, the link depth between pages, and the like; 4) the URL pattern, that is, the URL pattern of a page type obtained by clustering a plurality of pages labeled with page types. Assume that in step S1 the URL of the target page obtained by the page reconstruction device 1 is http://www.tianyabook.com/; then in step S2 the page reconstruction device 1 determines that the page type information of the target page is the novel type, according to URL-related feature information contained in the specific content of http://www.tianyabook.com/, such as "tianyabook". Those skilled in the art will understand that the above URL-related feature information is only an example; other existing or future URL-related feature information, if applicable to the present invention, should also be included within the scope of protection of the present invention and are incorporated herein by reference.
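The first three kinds of URL-related feature information listed above (specific content, suffix and depth) can be read directly off a parsed URL. A minimal sketch follows; the function name and the dictionary keys are illustrative assumptions.

```python
import posixpath
from typing import Dict, Union
from urllib.parse import urlparse


def url_features(url: str) -> Dict[str, Union[str, int]]:
    """Split a URL into protocol, host, path, file name, suffix and depth."""
    parts = urlparse(url)
    filename = posixpath.basename(parts.path)
    suffix = posixpath.splitext(filename)[1].lstrip(".")
    # Depth is counted here as the number of non-empty path segments.
    depth = len([segment for segment in parts.path.split("/") if segment])
    return {
        "protocol": parts.scheme,
        "host": parts.hostname or "",
        "path": parts.path,
        "filename": filename,
        "suffix": suffix,
        "depth": depth,
    }
```

For http://www.tianyabook.com/ the host feature carries the "tianyabook" keyword used in the example, while the file name and suffix are empty and the depth is 0.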
As a further example, consider the case where the predetermined type judgment rule is that the URL corresponding to the target page matches a predetermined web page template. Assume that in step S1 the URL of the target page obtained by the page reconstruction device 1 is http://xinzhi.baidu.com/pub?next=%2F; in step S2, the page reconstruction device 1 judges from http://xinzhi.baidu.com/pub?next=%2F that it matches the predetermined question-and-answer web page template, and therefore determines that the type information of the target page is the question-and-answer type.
Those skilled in the art will understand that the above manners of determining the page type information of the target page are only examples; other existing or future manners of determining the page type information of the target page, if applicable to the present invention, should also be included within the scope of protection of the present invention and are incorporated herein by reference.
Those skilled in the art will understand that in step S2 the page reconstruction device 1 may also determine the page type information of the target page according to any combination of the above predetermined type judgment rules.
Those skilled in the art will understand that the above predetermined type judgment rules are only examples; other existing or future predetermined type judgment rules, if applicable to the present invention, should also be included within the scope of protection of the present invention and are incorporated herein by reference.
In step S3, the page reconstruction device 1 determines, according to the page type information, one or more page reconstruction elements corresponding to the target page, for example by means of a correspondence between the page type information field contained in the URL of the target page and preset page reconstruction elements. Here, the page reconstruction elements include key page information such as the page body content, page reconstruction nodes and page reconstruction blocks. For example, assume that in step S2 the page reconstruction device 1 determines that the page type information of a target page such as http://news.sina.com.cn/ is the news type; then in step S3 the one or more page reconstruction elements determined by the page reconstruction device 1 for the target page include the page blocks for the different content of the target page, such as "top news, domestic news channel, world news channel, sports channel, finance channel", and the page tags in the target page for the news headline, body text, news source, publication time and the like, including the heading tags <h1>-<h6>, the document body tag <body>, the paragraph tag <p> and their corresponding text content. As another example, assume that in step S2 the page reconstruction device 1 determines that the page type information of a target page such as http://bbs.dospy.com/ is the forum type; then in step S3 the one or more page reconstruction elements determined by the page reconstruction device 1 for the target page include, on the forum homepage of the target page, the forum title "Symbian Smartphone Network" and section areas such as "Nokia WP7 discussion area, Windows Phone 7 operating system discussion area, Apple iPhone model discussion area, Android discussion area/hot topics, Android discussion area/Motorola, Symbian 3 (symbian3) model discussion area"; on the forum list page, items such as the sub-section title, topic category, number of section topics/replies, and author/time; and on the forum post page, items such as the post author, posting time and post text. As another example, assume that in step S2 the page reconstruction device 1 determines that the page type information of a target page such as http://xinzhi.baidu.com/ is the question-and-answer type; then in step S3 the one or more page reconstruction elements determined by the page reconstruction device 1 for the target page include the page blocks for the different content of the target page, such as homepage, square/hot questions and answers, square/latest questions, and discover and browse. As a further example, assume that in step S2 the page reconstruction device 1 determines that the page type information of a target page such as http://www.readnovel.com/book/73144/ is the novel type; then in step S3 the one or more page reconstruction elements determined by the page reconstruction device 1 for the target page include: the novel cover page of the target page, such as the novel title "Title: A Dream of Red Mansions", the novel author "Author: Cao Xueqin", the introduction and the update time 2010-04-04 18:02:10; the novel catalogue, such as "23. Chapter 23: Fine Lines from The Romance of the West Chamber Convey Playful Words; Lovely Songs from The Peony Pavilion Stir a Tender Heart"; and the novel text, such as the chapter title "Chapter 23: Fine Lines from The Romance of the West Chamber Convey Playful Words; Lovely Songs from The Peony Pavilion Stir a Tender Heart", the novel body content "After Jia Yuanchun returned to the palace from her visit to Grand View Garden that day, she ordered Tanchun to copy out in order all the inscriptions written that day ..., truly: no heart for dressing at dawn or embroidering at night; facing the moon and the wind, there is only regret.", "Title: A Dream of Red Mansions", "Author: Cao Xueqin", and novel chapter links such as "[Previous page] [Back to contents] [Next page]". Here, the correspondence between the page type information field and the preset page reconstruction elements may be stored, in the form of a table or a database, on the page reconstruction device 1, or on a third-party device connected to the page reconstruction device 1 via a network.
Those skilled in the art will understand that the above manner of determining one or more page reconstruction elements corresponding to the target page is only an example; other existing or future manners of determining one or more page reconstruction elements corresponding to the target page, if applicable to the present invention, should also be included within the scope of protection of the present invention and are incorporated herein by reference.
Those skilled in the art will understand that the above correspondence between page type information and page reconstruction elements is only an example; other existing or future correspondences between page type information and page reconstruction elements, if applicable to the present invention, should also be included within the scope of protection of the present invention and are incorporated herein by reference.
Then, in step S4, the page reconstruction device 1 generates, according to the one or more page reconstruction elements determined in step S3, a reconstructed page corresponding to the target page by extracting from the target page the page reconstruction content corresponding to the page reconstruction elements. Specifically, in step S4, the page reconstruction device 1 first extracts from the target page, according to the one or more page reconstruction elements determined in step S3, the page reconstruction content corresponding to the page reconstruction elements, for example by parsing the HTML of the target page. For example, assume that in step S3 the page reconstruction device 1 determines that the one or more page reconstruction elements corresponding to the target page http://www.readnovel.com/novel/73144/23.html include the novel text of the target page, such as the chapter title "Chapter 23: Fine Lines from The Romance of the West Chamber Convey Playful Words; Lovely Songs from The Peony Pavilion Stir a Tender Heart", the novel body content "After Jia Yuanchun returned to the palace from her visit to Grand View Garden that day, she ordered Tanchun to copy out in order all the inscriptions written that day ..., truly: no heart for dressing at dawn or embroidering at night; facing the moon and the wind, there is only regret.", and novel chapter links such as "[Previous page] [Back to contents] [Next page]"; then in step S4 the page reconstruction device 1 extracts the page reconstruction content corresponding to these page reconstruction elements, such as the specific text content in the page, for example by parsing the HTML document of the target page.
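Extracting reconstruction content by parsing HTML can be sketched with a small event-driven parser. The sketch assumes, for illustration only, that the page reconstruction elements are expressed as tag names such as h1 and p; the class and function names are hypothetical, and nested wanted tags are deliberately not handled.

```python
from html.parser import HTMLParser
from typing import Dict, Iterable, List, Optional


class ElementExtractor(HTMLParser):
    """Collect the text inside each wanted tag, ignoring everything else."""

    def __init__(self, wanted_tags: Iterable[str]) -> None:
        super().__init__()
        self.content: Dict[str, List[str]] = {tag: [] for tag in wanted_tags}
        self.current: Optional[str] = None   # wanted tag currently open
        self.buffer: List[str] = []

    def handle_starttag(self, tag, attrs):
        if self.current is None and tag in self.content:
            self.current = tag
            self.buffer = []

    def handle_data(self, data):
        if self.current is not None:
            self.buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self.current:
            text = "".join(self.buffer).strip()
            if text:
                self.content[tag].append(text)
            self.current = None


def extract_content(html_source: str, wanted_tags: Iterable[str]) -> Dict[str, List[str]]:
    """Return the text found in each wanted tag of the HTML source."""
    extractor = ElementExtractor(wanted_tags)
    extractor.feed(html_source)
    extractor.close()
    return extractor.content
```

Text outside the wanted tags, such as an advertisement block in a div, is simply never buffered, which is how unrelated page content drops out of the reconstructed page.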
Those skilled in the art will understand that the above manner of extracting from the target page the page reconstruction content corresponding to the page reconstruction elements is only an example; other existing or future manners of extracting from the target page the page reconstruction content corresponding to the page reconstruction elements, if applicable to the present invention, should also be included within the scope of protection of the present invention and are incorporated herein by reference.
Then, in step S4, the page reconstruction device 1 arranges the page reconstruction content according to a predefined page reconstruction mode, or according to the original layout of the target page, to generate the reconstructed page corresponding to the target page. Continuing the example above, in step S4 the page reconstruction device 1 arranges the extracted reconstruction content, including the chapter title "Chapter 23: Fine Lines from The Romance of the West Chamber Convey Playful Words; Lovely Songs from The Peony Pavilion Stir a Tender Heart", the novel body content "After Jia Yuanchun returned to the palace from her visit to Grand View Garden that day, she ordered Tanchun to copy out in order all the inscriptions written that day ..., truly: no heart for dressing at dawn or embroidering at night; facing the moon and the wind, there is only regret.", and the novel chapter links "[Previous page] [Back to contents] [Next page]", in a predefined order, for example chapter title, then novel text, then chapter links.
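Arranging the extracted content in a predefined order can be sketched as follows. The element keys, the choice of h1 for the chapter title and p for everything else, and the function name are assumptions made for illustration; the predefined order itself (chapter title, novel text, chapter links) comes from the example.

```python
from html import escape
from typing import Dict, List, Sequence

# Order from the example: chapter title, then novel text, then chapter links.
PREDEFINED_ORDER = ("chapter_title", "novel_text", "chapter_links")


def render_reconstructed_page(content: Dict[str, List[str]],
                              order: Sequence[str] = PREDEFINED_ORDER) -> str:
    """Emit a minimal HTML page with the content laid out in the given order."""
    parts = ["<html><body>"]
    for key in order:
        tag = "h1" if key == "chapter_title" else "p"
        for text in content.get(key, []):
            # Escape extracted text so it cannot inject markup into the new page.
            parts.append(f"<{tag}>{escape(text)}</{tag}>")
    parts.append("</body></html>")
    return "\n".join(parts)
```

Passing a different order sequence would reproduce the alternative mentioned in the text, where the content follows the original layout of the target page instead.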
Those skilled in the art will understand that the above manner of generating the reconstructed page corresponding to the target page is only an example; other existing or future manners of generating the reconstructed page corresponding to the target page, if applicable to the present invention, should also be included within the scope of protection of the present invention and are incorporated herein by reference.
Preferably, in step S4, the page reconstruction device 1 may also generate a reconstructed page corresponding to the mobile terminal, according to the one or more page reconstruction elements, by extracting from the target page the page reconstruction content corresponding to the page reconstruction elements and combining it with terminal-related attributes of the mobile terminal. Specifically, in step S4, the page reconstruction device 1 first extracts from the target page, according to the one or more page reconstruction elements determined in step S3, the page reconstruction content corresponding to the page reconstruction elements, and then, in combination with the terminal-related attributes of the mobile terminal, generates the reconstructed page corresponding to the mobile terminal. The terminal-related attributes include at least any one of the following:
the visible page area of the mobile terminal;
the available screen working area of the mobile terminal;
the screen resolution of the mobile terminal;
the system configuration attributes of the mobile terminal.
For example, consider the case where the terminal-related attributes include the visible page area of the mobile terminal. Assume that in step S3 the page reconstruction device 1 determines that the one or more page reconstruction elements corresponding to the target page http://tech.sina.com.cn/i/m/2012-05-31/03497194247.shtml include the news headline of the target page, "Internet Queen's D10 report: mobile Internet users will surpass desktop users"; the publication time, "03:49, May 31, 2012"; the news source, "http://www.sina.com.cn"; and the body text, "Sina Technology News, early morning of May 31, Beijing time: Mary Meeker, the 'Internet Queen' and a partner at the well-known Silicon Valley venture capital firm Kleiner Perkins Caufield Byers (hereinafter 'KPCB'), said at the D10 conference on Wednesday ..., Facebook also needs to have a sound 'war chest'." In step S4, the page reconstruction device 1 may obtain the visible page area of the mobile terminal from the js resources in the HTML document of the target page, for example obtaining the width of the visible page area from availWidth = parseInt(document.body.clientWidth) and the height of the visible page area from availHeight = parseInt(document.body.clientHeight); then, in step S4, the page reconstruction device 1 generates the reconstructed page corresponding to the mobile terminal in combination with availWidth and availHeight. As another example, consider the case where the terminal-related attributes include the system configuration attributes of the mobile terminal, such as the operating system type and version and the processor configuration. Assume that the system configuration attributes of the mobile terminal include "dual-core 1.2GHz"; then in step S4 the page reconstruction device 1 determines from the system configuration attributes that the mobile terminal is a high-end device, and the reconstructed page generated for the mobile terminal includes the page reconstruction content extracted from the target page according to the one or more page reconstruction elements determined in step S3, for example, for a news type page, the news headline, body text, news source and publication time. Assume instead that the system configuration attributes of the mobile terminal include "1GHz Qualcomm Snapdragon processor, Android 2.3 operating system"; then in step S4 the page reconstruction device 1 determines from the system configuration attributes that the mobile terminal is a low-end device, and the reconstructed page generated for the mobile terminal includes all page information of the target page other than advertisements, for example, for a news type page, the news headline, news pictures, body text, news source and publication time.
Those skilled in the art will understand that the above manner of generating a reconstructed page corresponding to the mobile terminal in combination with the terminal-related attributes of the mobile terminal is only an example. Other existing or future manners of generating a reconstructed page corresponding to the mobile terminal in combination with its terminal-related attributes, if applicable to the present invention, shall also fall within the scope of protection of the present invention and are incorporated herein by reference.
In step S5, page reconstruction device 1 then provides the reconstructed page generated in step S4 to the mobile terminal via an agreed communication mode, such as the HTTP or HTTPS communication protocol, for the user to read and browse.
The steps performed by page reconstruction device 1 operate continuously with respect to one another. Specifically, in step S1, page reconstruction device 1 continuously obtains target pages to be provided to the mobile terminal; in step S2, page reconstruction device 1 continuously determines the page type information of the target pages; in step S3, page reconstruction device 1 continuously determines, according to the page type information, one or more page reconstruction elements corresponding to the target page; in step S4, page reconstruction device 1 continuously generates, according to the one or more page reconstruction elements, a reconstructed page corresponding to the target page by extracting from the target page the page reconstruction content corresponding to the page reconstruction elements; in step S5, page reconstruction device 1 provides the reconstructed page to the mobile terminal. Here, those skilled in the art will understand that "continuously" means that the steps of page reconstruction device 1 keep performing, respectively, the obtaining of target pages, the determination of page type information, the determination of page reconstruction elements, and the generation and provision of reconstructed pages, until page reconstruction device 1 stops obtaining target pages for an extended period of time.
Preferably, in step S3, page reconstruction device 1 can also determine, according to the page type information, one or more page reconstruction elements corresponding to the target page together with their page reconstruction pattern, e.g., through a mapping between the page type information field contained in the URL of the target page and the page reconstruction elements and their preset page reconstruction pattern. Here, the page reconstruction pattern includes but is not limited to: 1) page layout; 2) page presentation mode.
For example, assume that in step S2 page reconstruction device 1 determines that the page type information of the target page, e.g., http://news.sina.com.cn/, is the news type. Then in step S3, the one or more page reconstruction elements that page reconstruction device 1 determines for the target page include the page blocks for the different content of the target page, such as "top news, domestic news channel, international news channel, sports channel, finance channel", and the page tags for the news headline, body text, news source, publication time and the like contained in the target page, such as the heading tags <h1>-<h6>, the document body tag <body> and the paragraph tag <p>, together with the corresponding text content. In step S3, the corresponding page reconstruction pattern determined by page reconstruction device 1 includes, e.g., arranging top news, the domestic news channel, the international news channel, the sports channel and the finance channel in order from top to bottom, with each channel including the news headline text, headline links and the like.
As another example, assume that in step S2 page reconstruction device 1 determines that the page type information of the target page, e.g., http://bbs.dospy.com/, is the forum type. Then in step S3, the one or more page reconstruction elements that page reconstruction device 1 determines for the target page include the forum title on the forum home page of the target page, "Symbian Smartphone Network", and the section divisions, such as "Nokia WP7 discussion section, Windows Phone 7 operating system discussion area, Apple iPhone model discussion area, Android discussion area/Popular, Android discussion area/Motorola, Symbian^3 (symbian3) model discussion area". The corresponding page reconstruction pattern determined by element determining device 13 includes, e.g., arranging the Nokia WP7 discussion section, the Windows Phone 7 operating system discussion area, the Apple iPhone model discussion area, the Android discussion area/Popular, the Android discussion area/Motorola and the Symbian^3 (symbian3) model discussion area in order from top to bottom, with each section including the discussion title text and title links.
As yet another example, assume that in step S2 page reconstruction device 1 determines that the page type information of the target page, e.g., http://www.readnovel.com/novel/73144/23.html, is the novel type. Then in step S3, the one or more page reconstruction elements that page reconstruction device 1 determines for the target page include the chapter title of the novel text in the target page, e.g., "Chapter 23: Fine lines from The Romance of the Western Chamber supply a playful jest; a moving aria from The Peony Pavilion stirs a tender heart"; the novel body text, "It is said that after Jia Yuanchun returned to the palace following her visit that day to the Grand View Garden, she ordered that all the verses composed that day be copied out in order by Tanchun ..., Truly: at morning toilette and evening embroidery her heart is idle; facing the moon and the wind, she has her sorrows."; and the novel chapter link block, such as "[Previous page] [Back to contents] [Next page]". In step S3, the corresponding page reconstruction pattern determined by page reconstruction device 1 includes, e.g., arranging the elements in the order: chapter title, novel body text, chapter links.
Here, the mapping between the page type information field and the page reconstruction elements and their preset page reconstruction pattern can reside, in the form of a table or a database, on page reconstruction device 1 or on a third-party device connected to page reconstruction device 1 over a network.
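The mapping described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that the page type field appears as a host label or path segment of the URL; the names FIELD_TO_TYPE, TYPE_TO_PATTERN, url_fields and lookup_pattern, and the element identifiers, are hypothetical and are not taken from the original text.

```python
from urllib.parse import urlparse

# Hypothetical mapping from a URL field to a page type.
FIELD_TO_TYPE = {"news": "news", "bbs": "forum", "novel": "novel"}

# Hypothetical mapping table: page type -> reconstruction elements and layout order.
TYPE_TO_PATTERN = {
    "news": {
        "elements": ["channel_blocks", "headline", "body", "source", "publish_time"],
        "layout": ["top_news", "domestic", "international", "sports", "finance"],
    },
    "forum": {
        "elements": ["forum_title", "sections"],
        "layout": ["forum_title", "sections"],
    },
    "novel": {
        "elements": ["chapter_title", "novel_text", "chapter_links"],
        "layout": ["chapter_title", "novel_text", "chapter_links"],
    },
}


def url_fields(url):
    """Split a URL into host labels and path segments."""
    parts = urlparse(url)
    return [f for f in parts.netloc.split(".") + parts.path.split("/") if f]


def lookup_pattern(url):
    """Return the elements and layout for the first URL field with a known type."""
    for field in url_fields(url):
        page_type = FIELD_TO_TYPE.get(field)
        if page_type is not None:
            return TYPE_TO_PATTERN[page_type]
    return None  # no known page type field in this URL


pattern = lookup_pattern("http://www.readnovel.com/novel/73144/23.html")
print(pattern["layout"])
```

In practice the table could equally live in a database on a third-party device, as the paragraph above notes; the in-memory dictionaries here only stand in for that store.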
Those skilled in the art will understand that the above manner of determining one or more page reconstruction elements corresponding to the target page and their page reconstruction pattern is only an example. Other existing or future manners of determining one or more page reconstruction elements corresponding to the target page or their page reconstruction pattern, if applicable to the present invention, shall also fall within the scope of protection of the present invention and are incorporated herein by reference.
Those skilled in the art will understand that the above mapping between page type information and page reconstruction elements and their page reconstruction pattern is only an example. Other existing or future mappings between page type information and page reconstruction elements or their page reconstruction pattern, if applicable to the present invention, shall also fall within the scope of protection of the present invention and are incorporated herein by reference.
Then, in step S4, page reconstruction device 1 first extracts from the target page, according to the one or more page reconstruction elements, the page reconstruction content corresponding to the page reconstruction elements; it then generates the reconstructed page corresponding to the target page from the page reconstruction content in combination with the page reconstruction pattern. For example, assume that in step S4 the page reconstruction content that page reconstruction device 1 extracts from the target page, e.g., http://www.readnovel.com/novel/73144/23.html, for the page reconstruction elements such as chapter title, novel body text and chapter links includes the chapter title "Chapter 23: Fine lines from The Romance of the Western Chamber supply a playful jest; a moving aria from The Peony Pavilion stirs a tender heart", the novel body text "It is said that after Jia Yuanchun returned to the palace following her visit that day to the Grand View Garden, she ordered that all the verses composed that day be copied out in order by Tanchun ..., Truly: at morning toilette and evening embroidery her heart is idle; facing the moon and the wind, she has her sorrows.", and the novel chapter links "[Previous page] [Back to contents] [Next page]". Page reconstruction device 1 then combines this content with the page reconstruction pattern it determined in step S3, e.g., arranging the content in the order: chapter title, novel body text, chapter links.
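As a rough illustration of this step, the following Python sketch assembles a reconstructed page from already extracted content by following a layout order. The function name build_reconstructed_page, the element identifiers and the output markup are assumptions made for the example; the original text does not prescribe any particular output format.

```python
from html import escape


def build_reconstructed_page(content, layout):
    """Arrange extracted content in layout order and wrap it in minimal HTML.

    content: dict mapping a (hypothetical) element name to its extracted text.
    layout:  list of element names giving the top-to-bottom order.
    """
    parts = []
    for element in layout:
        text = content.get(element)
        if text is None:
            continue  # element not present in this target page
        parts.append('<div class="%s">%s</div>' % (element, escape(text)))
    return "<html><body>" + "".join(parts) + "</body></html>"


content = {
    "novel_text": "It is said that after Jia Yuanchun returned to the palace ...",
    "chapter_links": "[Previous page] [Back to contents] [Next page]",
    "chapter_title": "Chapter 23",
}
layout = ["chapter_title", "novel_text", "chapter_links"]
print(build_reconstructed_page(content, layout))
```

The order of the output follows the layout list rather than the order in which the content was extracted, which is the point of keeping the pattern separate from the content.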
Preferably, the method performed by page reconstruction device 1 further includes step S9 (not shown). Specifically, in step S9, page reconstruction device 1 obtains the page blocks of the target page; in step S4, page reconstruction device 1 generates, according to the one or more page reconstruction elements, reconstructed blocks corresponding to the page blocks by extracting from the page blocks the page reconstruction content corresponding to the page reconstruction elements; in step S5, page reconstruction device 1 provides the reconstructed blocks to the mobile terminal.
Specifically, in step S9, page reconstruction device 1 obtains the page blocks of the target page it obtained in step S1, based on an HTML tag analysis method or on the VIPS (Vision-based Page Segmentation) algorithm. For example, in step S9, page reconstruction device 1 applies the VIPS algorithm, using visual features such as the page's foreground color, background color, font color and size, borders, the spacing between logical blocks, and element positions, and, by establishing corresponding rules, divides the target page http://news.sina.com.cn/ obtained in step S1 into visual information blocks, such as a news topic block, a body block, a navigation block and an advertisement block. Those skilled in the art will understand that the above manner of obtaining the page blocks of the target page is only an example. Other existing or future manners of obtaining the page blocks of the target page, if applicable to the present invention, shall also fall within the scope of protection of the present invention and are incorporated herein by reference.
Then, in step S4, page reconstruction device 1 first extracts from the page blocks, according to the one or more page reconstruction elements it determined in step S3, the page reconstruction content corresponding to the page reconstruction elements, and then generates the reconstructed blocks corresponding to the page blocks. For example, assume that in step S3 the page reconstruction elements that page reconstruction device 1 determines for the target page http://news.sina.com.cn/ include the page channel blocks for the different content of the target page, such as "top news, domestic news channel, international news channel, sports channel, finance channel", and the page tags for the news headline, body text, news source, publication time and the like contained in the different channel blocks of the target page, such as the heading tags <h1>-<h6>, the document body tag <body> and the paragraph tag <p>, together with the corresponding text content. Continuing the example, in step S4 page reconstruction device 1 first extracts the page reconstruction content corresponding to the page reconstruction elements determined in step S3 from the page blocks obtained by the block obtaining device, e.g., by parsing the HTML of the target page, such as the news topic block, body block, navigation block and advertisement block. For instance, in step S4, the page reconstruction content that page reconstruction device 1 extracts from the news topic block for the page reconstruction element "page channel block" includes the news items contained in the top news channel of the target page http://news.sina.com.cn/, such as "Forum on deepening the online media 'go to the grassroots, change work style, change writing style' campaign held in Beijing" and "Children's Day special feature". As another example, in step S4, the page reconstruction content that page reconstruction device 1 extracts from the news topic block for the page reconstruction element "page channel block" includes the news items contained in the sports channel of the target page http://news.sina.com.cn/, such as "Durant scores 22 as Thunder win comfortably to pull back to 1-2; with 'GDP' slumping, Spurs' 20-game winning streak ends" and "European Championship finals 16-team squad lists announced: full overview of 368 players, which you can also edit".
Then, in step S4, page reconstruction device 1 takes the page reconstruction content corresponding to the page reconstruction elements that was extracted from the page blocks and generates the reconstructed blocks corresponding to the page blocks, either according to a predefined block reconstruction mode or according to the original layout of the page blocks.
Those skilled in the art will understand that the above manner of generating reconstructed blocks corresponding to the page blocks is only an example. Other existing or future manners of generating reconstructed blocks corresponding to the page blocks, if applicable to the present invention, shall also fall within the scope of protection of the present invention and are incorporated herein by reference.
In step S5, page reconstruction device 1 then provides the reconstructed blocks generated in step S4 to the mobile terminal via an agreed communication mode, such as the HTTP or HTTPS communication protocol, for the user to read and browse.
Preferably, in step S5, page reconstruction device 1 can also provide the reconstructed blocks generated in step S4 to the mobile terminal according to the block importance of the reconstructed blocks, via an agreed communication mode, such as the HTTP or HTTPS communication protocol, for the user to read and browse. Here, the block importance includes but is not limited to at least any one of the following: 1) the ratio of the text characters of the reconstructed block to the text characters of the entire <body> block; 2) the ratio of the non-link text characters in the reconstructed block to the total text characters of the whole page; 3) the ratio of the block area of the reconstructed block to the area of the entire <body> block of the page. Those skilled in the art will understand that the above block importance measures are only examples. Other existing or future block importance measures, if applicable to the present invention, shall also fall within the scope of protection of the present invention and are incorporated herein by reference.
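The three ratios listed above can be computed as in the following Python sketch. The Block class, its field names, and the choice to sort blocks in descending order of a single ratio are assumptions made for illustration; the original text only states that blocks are provided "according to" block importance and does not say how the ratios are combined or used.

```python
from dataclasses import dataclass


@dataclass
class Block:
    """Hypothetical summary of one reconstructed block."""
    name: str
    text_chars: int   # all text characters in the block
    link_chars: int   # text characters that sit inside links
    area: float       # rendered block area, in any consistent unit


def importance(block, body_text_chars, page_text_chars, body_area):
    """Return the three ratios named in the text for one block."""
    return {
        # 1) block text characters / text characters of the whole <body>
        "text_ratio": block.text_chars / body_text_chars,
        # 2) non-link text characters in the block / total text characters of the page
        "non_link_ratio": (block.text_chars - block.link_chars) / page_text_chars,
        # 3) block area / area of the whole <body>
        "area_ratio": block.area / body_area,
    }


def order_by_importance(blocks, body_text_chars, page_text_chars, body_area, key="text_ratio"):
    """Sort blocks from most to least important by one chosen ratio (an assumption)."""
    def score(block):
        return importance(block, body_text_chars, page_text_chars, body_area)[key]
    return sorted(blocks, key=score, reverse=True)


blocks = [Block("navigation", 100, 100, 200.0), Block("body", 700, 50, 800.0)]
ordered = order_by_importance(blocks, 1000, 1000, 1200.0, key="non_link_ratio")
print([b.name for b in ordered])
```

A navigation block made up entirely of links scores zero on the second ratio, which is why that ratio is a common way to push boilerplate blocks behind the main content.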
In a preferred embodiment (refer to Fig. 3), the method performed by page reconstruction device 1 includes step S1, step S2, step S3, step S4 and step S5, wherein step S3 includes step S31 (not shown) and step S32 (not shown). This preferred embodiment is described below with reference to Fig. 3. Specifically, in step S1, page reconstruction device 1 obtains the target page to be provided to the mobile terminal; in step S2, page reconstruction device 1 determines the page type information of the target page; in step S31, page reconstruction device 1 determines the page common document object model corresponding to the page type information; in step S32, page reconstruction device 1 extracts the page reconstruction nodes of the target page according to the page common document object model, to serve as the page reconstruction elements; in step S4, page reconstruction device 1 generates, according to the one or more page reconstruction elements, a reconstructed page corresponding to the target page by extracting from the target page the page reconstruction content corresponding to the page reconstruction elements; in step S5, page reconstruction device 1 provides the reconstructed page to the mobile terminal. Here, step S1, step S2, step S4 and step S5 are identical or similar to the corresponding steps shown in Fig. 3, so they are not described again here and are incorporated herein by reference.
Specifically, in step S31, page reconstruction device 1 identifies, for example, the DOM tree nodes that share a common node path across multiple pages corresponding to the page type information, and then, based on the DOM tree nodes with the common node path, determines the page common document object model corresponding to the page type information. For example, assume that multiple pages correspond to the type information, e.g., the novel type, such as:
A: Chapter 9: Instructor Lin at the Mountain Spirit Temple in wind and snow; Captain Lu sets fire to the fodder depot
http://www.readnovel.com/novel/73145/12.html
B: Water Margin, Chapter 10: Instructor Lin at the Mountain Spirit Temple in wind and snow; Captain Lu sets fire to the fodder depot
http://www.purepen.com/shz/010.htm
C: Chapter 82: Liangshan Marsh distributes gold and holds a great market; Song Gongming and his whole band accept the amnesty
http://www.cuiweiju.com/fulltext/97/97926.html#5383832
and that these pages have DOM tree nodes with a common node path, such as D1-Dn. Then in step S31, page reconstruction device 1 generates the corresponding DOM tree, e.g., DOM-D, from D1-Dn, to serve as the page common document object model Common-DOM-D corresponding to novel-type pages.
Those skilled in the art will understand that the above manner in which the determination unit determines the page common document object model corresponding to the page type information is only an example. Other existing or future manners in which the determination unit determines the page common document object model corresponding to the page type information, if applicable to the present invention, shall also fall within the scope of protection of the present invention and are incorporated herein by reference.
Then, in step S32, page reconstruction device 1 extracts the page reconstruction nodes of the target page according to the page common document object model, to serve as the page reconstruction elements. For example, assume that in step S1 the target page obtained by page reconstruction device 1 is a novel-type page, such as Water Margin, Chapter 17: "The Flowery Monk single-handedly attacks Two-Dragon Mountain; the Blue-Faced Beast joins him to seize Baozhu Temple", http://www.purepen.com/shz/017.htm. Then in step S32, page reconstruction device 1 uses the page common document object model Common-DOM-D that it determined in step S31 for the page type information, e.g., the novel type, and extracts from the DOM tree of the page the nodes whose node name and node XPath are both identical to those in the page common document object model Common-DOM-D, to serve as the page reconstruction nodes and thus as the page reconstruction elements.
Preferably, in step S31, page reconstruction device 1 first extracts the common nodes of multiple reference pages corresponding to the page type information, according to the document object model of each of the reference pages, and then generates the page common document object model corresponding to the page type information. For example, assume that there are multiple reference pages corresponding to the page type information, e.g., the news type, such as:
I: Sina news home page http://news.sina.com.cn/,
II: Sina domestic news http://news.sina.com.cn/china/,
III: Sina international news http://news.sina.com.cn/world/,
IV: Sohu news home page http://news.sohu.com/,
In step S31, page reconstruction device 1 first parses the HTML document of each of the reference pages, converting the HTML tags into the nodes of the corresponding DOM tree, so as to generate the respective DOM trees DOM-I, DOM-II, DOM-III and DOM-IV; by extracting the nodes whose node name and node XPath are identical in DOM-I, DOM-II, DOM-III and DOM-IV, it obtains the common nodes of the reference pages, e.g., E1-En. Then, in step S31, page reconstruction device 1 generates from the common nodes E1-En the page common document object model corresponding to the page type information, e.g., Common-DOM-E.
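The comparison of node names and node XPaths across reference pages can be sketched in Python with the standard-library HTML parser. This is a simplified illustration under stated assumptions: each node is represented by its tag name and a positional XPath such as /html[1]/body[1]/h1[1], the "common model" is reduced to the set of (name, XPath) pairs shared by all reference pages, and the class and function names (PathCollector, node_paths, common_nodes, reconstruction_nodes) are hypothetical.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag in HTML.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class PathCollector(HTMLParser):
    """Collect (tag name, positional XPath) for every element in a document."""

    def __init__(self):
        super().__init__()
        self.stack = []        # path steps of the currently open elements
        self.counters = [{}]   # per-depth sibling counters, keyed by tag name
        self.paths = set()

    def handle_starttag(self, tag, attrs):
        counts = self.counters[-1]
        counts[tag] = counts.get(tag, 0) + 1
        step = "%s[%d]" % (tag, counts[tag])
        self.paths.add((tag, "/" + "/".join(self.stack + [step])))
        if tag not in VOID_TAGS:
            self.stack.append(step)
            self.counters.append({})

    def handle_endtag(self, tag):
        # Close the nearest open element with this name; ignore stray end tags.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i].split("[")[0] == tag:
                del self.stack[i:]
                del self.counters[i + 1:]
                break


def node_paths(html_text):
    collector = PathCollector()
    collector.feed(html_text)
    collector.close()
    return collector.paths


def common_nodes(reference_pages):
    """Nodes whose name and XPath are identical in every reference page."""
    sets = [node_paths(page) for page in reference_pages]
    return set.intersection(*sets) if sets else set()


def reconstruction_nodes(target_page, common):
    """Nodes of the target page that also appear in the common model."""
    return node_paths(target_page) & common


page_1 = "<html><body><h1>A</h1><p>x</p><div>ad</div></body></html>"
page_2 = "<html><body><h1>B</h1><p>y</p></body></html>"
common = common_nodes([page_1, page_2])
target = "<html><body><h1>C</h1><span>z</span></body></html>"
print(sorted(reconstruction_nodes(target, common)))
```

A real implementation would likely need a more tolerant notion of path equality, since positional indexes shift whenever a page inserts an extra sibling; the exact-match rule here follows the wording of the text.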
Fig. 4 shows a flowchart of a method for providing a reconstructed page corresponding to a target page according to a preferred embodiment of the present invention.
Specifically, in step S1', page reconstruction device 1 obtains the target page to be provided to the mobile terminal; in step S2', page reconstruction device 1 determines the page type information of the target page; in step S3', page reconstruction device 1 performs a matching query in a page reconstruction element database according to the page type information, to obtain one or more page reconstruction elements corresponding to the target page; in step S4', page reconstruction device 1 generates, according to the one or more page reconstruction elements, a reconstructed page corresponding to the target page by extracting from the target page the page reconstruction content corresponding to the page reconstruction elements; in step S5', page reconstruction device 1 provides the reconstructed page to the mobile terminal. Here, step S1', step S2', step S4' and step S5' are identical or similar to the corresponding steps shown in Fig. 3, so they are not described again here and are incorporated herein by reference.
Specifically, in step S3', page reconstruction device 1 performs a matching query in the page reconstruction element database according to the page type information, to obtain one or more page reconstruction elements corresponding to the target page. For example, assume that in step S1' the target page obtained by page reconstruction device 1 is http://xinzhi.baidu.com/pub?next=%2F, and that in step S2' page reconstruction device 1 determines that the page type information of the target page is the question-and-answer type. Then in step S3', page reconstruction device 1 performs a matching query in the page reconstruction element database according to the question-and-answer type it determined in step S2', to obtain one or more page reconstruction elements corresponding to the target page. Here, the page reconstruction element database can be located in page reconstruction device 1, or in a third-party device, such as a server, connected to page reconstruction device 1 over a network.
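A matching query of this kind can be sketched with Python's built-in sqlite3 module. The table name page_elements, its columns, and the sample rows are hypothetical; in particular, the original text does not list the reconstruction elements of a question-and-answer page, so "question" and "answer" below are placeholders only.

```python
import sqlite3


def open_element_db():
    """Create an in-memory stand-in for the page reconstruction element database."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE page_elements ("
        " page_type TEXT NOT NULL,"
        " element TEXT NOT NULL,"
        " PRIMARY KEY (page_type, element))"
    )
    conn.executemany(
        "INSERT INTO page_elements (page_type, element) VALUES (?, ?)",
        [
            ("qa", "question"),      # placeholder element names
            ("qa", "answer"),
            ("news", "headline"),
            ("news", "body"),
        ],
    )
    return conn


def query_elements(conn, page_type):
    """Return the reconstruction elements stored for one page type."""
    rows = conn.execute(
        "SELECT element FROM page_elements WHERE page_type = ? ORDER BY element",
        (page_type,),
    ).fetchall()
    return [row[0] for row in rows]


conn = open_element_db()
print(query_elements(conn, "qa"))
```

An unknown page type simply yields an empty list, which leaves the caller free to fall back to another way of determining the elements.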
Preferably, the method performed by page reconstruction device 1 further includes step S6', step S7' and step S8'. Specifically, in step S6', page reconstruction device 1 classifies multiple training pages whose page types have been labeled by page type, obtaining one or more page classes, wherein each page class includes at least one of the training pages; in step S7', page reconstruction device 1 obtains, from the training pages included in a page class and by means of predetermined page element training rules, one or more page reconstruction elements corresponding to the page type of that page class; in step S8', page reconstruction device 1 establishes or updates the page reconstruction element database according to the one or more page reconstruction elements corresponding to the page type of the page class.
Specifically, in step S6', page reconstruction device 1 classifies multiple training pages whose page types have been labeled by page type, obtaining one or more page classes, wherein each page class includes at least one of the training pages. For example, assume that there are multiple training pages whose page types have been labeled, such as:
V: Sina sports news http://sports.sina.com.cn/, news type
VI: Sina finance news http://finance.sina.com.cn/, news type
VII: Sina / Reading / Novel shop / World classics / "The Count of Monte Cristo"
http://vip.book.sina.com.cn/book/index_81300.html, novel type
VIII: Sina / Reading / Book serialization / Novels / Rural novels / "Ordinary World"
http://vip.book.sina.com.cn/book/index_86819.html, novel type
IX: Sohu / Reading / Book serialization / Literature / Classical fiction / "Romance of the Sui and Tang Dynasties" (full text)
http://lz.book.sohu.com/serialize-id-13706.html, novel type
Then in step S6', page reconstruction device 1 classifies these labeled training pages by page type and obtains one or more page classes, e.g., the news-type pages V and VI and the novel-type pages VII, VIII and IX, wherein each page class includes at least one of the training pages.
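The grouping in step S6' amounts to bucketing labeled pages by their label, as in this short Python sketch. The list TRAINING_PAGES reuses the URLs and labels from the example above; the function name classify_by_type and the label strings "news" and "novel" are illustrative choices.

```python
from collections import defaultdict

# Labeled training pages from the example: (URL, labeled page type).
TRAINING_PAGES = [
    ("http://sports.sina.com.cn/", "news"),
    ("http://finance.sina.com.cn/", "news"),
    ("http://vip.book.sina.com.cn/book/index_81300.html", "novel"),
    ("http://vip.book.sina.com.cn/book/index_86819.html", "novel"),
    ("http://lz.book.sohu.com/serialize-id-13706.html", "novel"),
]


def classify_by_type(pages):
    """Group labeled training pages into page classes keyed by page type."""
    classes = defaultdict(list)
    for url, page_type in pages:
        classes[page_type].append(url)
    return dict(classes)


page_classes = classify_by_type(TRAINING_PAGES)
for page_type, urls in page_classes.items():
    print(page_type, len(urls))
```

Because a class is only created when a page carries its label, every resulting class contains at least one training page, matching the condition stated in the text.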
Then, in step S7', page reconstruction device 1 obtains, from the training pages included in the page classes it obtained in step S6' and by means of predetermined page element training rules, one or more page reconstruction elements corresponding to the page type of each page class. Here, the predetermined page element training rules include at least any one of the following:
performing Bayesian estimation analysis on the training pages in the page class, to obtain one or more page reconstruction elements corresponding to the page type of the page class;
performing maximum likelihood estimation analysis on the training pages in the page class, to obtain one or more page reconstruction elements corresponding to the page type of the page class.
For example, continuing the example above, in step S7' page reconstruction device 1 takes the training pages included in the page classes it obtained in step S6', e.g., the training pages V and VI included in the news-type class and the training pages VII, VIII and IX included in the novel-type class, and performs Bayesian estimation analysis on the training pages in each page class, or performs maximum likelihood estimation analysis on the multiple page node training data, to obtain one or more page reconstruction elements corresponding to the page type of the page class. For instance, the one or more page reconstruction elements corresponding to the page type of the news-type page class include the news topic block, the body block, and the page tags for the news headline, body text and the like, such as the heading tags <h1>-<h6>, the document body tag <body> and the paragraph tag <p>, together with the corresponding text content; the one or more page reconstruction elements corresponding to the page type of the novel-type page class include the novel body text, the novel author, the chapter directory links and the like.
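The original text names the two estimation methods but does not say how they are applied. One possible reading, shown here strictly as an assumption, treats the presence of each candidate element in a training page as a Bernoulli trial and keeps the elements whose estimated presence probability reaches a threshold. The maximum likelihood estimate is then k/n, and a Bayesian estimate with a Beta(alpha, beta) prior is the posterior mean (k + alpha)/(n + alpha + beta). All names, the prior and the threshold in this Python sketch are hypothetical.

```python
from collections import Counter


def mle_estimate(k, n):
    """Maximum likelihood estimate of a Bernoulli presence probability."""
    return k / n


def bayes_estimate(k, n, alpha=1.0, beta=1.0):
    """Posterior mean under a Beta(alpha, beta) prior (assumed uniform by default)."""
    return (k + alpha) / (n + alpha + beta)


def train_elements(pages, estimator, threshold=0.5):
    """Pick elements whose estimated presence probability meets the threshold.

    pages: list of sets, each holding the candidate element names seen in
           one training page of a single page class.
    """
    n = len(pages)
    if n == 0:
        return []
    counts = Counter(element for page in pages for element in page)
    return sorted(e for e, k in counts.items() if estimator(k, n) >= threshold)


# Hypothetical observations for three novel-type training pages.
novel_pages = [
    {"novel_text", "author", "chapter_links"},
    {"novel_text", "author", "ad_banner"},
    {"novel_text", "chapter_links"},
]
print(train_elements(novel_pages, mle_estimate))
print(train_elements(novel_pages, bayes_estimate))
```

With few training pages the Bayesian estimate is pulled toward the prior mean, so a single stray element is less likely to be promoted to a reconstruction element than under the plain frequency estimate.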
Those skilled in the art will understand that the above manner of obtaining the page reconstruction elements according to predetermined page element training rules is only an example. Other existing or future manners of obtaining the page reconstruction elements according to predetermined page element training rules, if applicable to the present invention, shall also fall within the scope of protection of the present invention and are incorporated herein by reference.
In step S8', page reconstruction device 1 establishes or updates the page reconstruction element database according to the one or more page reconstruction elements corresponding to the page type of the page class. For example, in step S8', page reconstruction device 1 uses the one or more page reconstruction elements it obtained in step S7' for the page type of each page class to establish a page reconstruction element database containing the correspondence between page types and their page reconstruction elements.
It should be noted that the present invention can be implemented in software and/or in a combination of software and hardware, for example, using an application-specific integrated circuit (ASIC), a general-purpose computer or any other similar hardware device. In one embodiment, the software program of the present invention can be executed by a processor to implement the steps or functions described above. Likewise, the software program of the present invention (including related data structures) can be stored in a computer-readable recording medium, for example, RAM, a magnetic or optical drive, a floppy disk or a similar device. In addition, some steps or functions of the present invention may be implemented in hardware, for example, as a circuit that cooperates with a processor to perform each step or function.
In addition, part of the present invention can be applied as a computer program product, such as computer program instructions which, when executed by a computer, can invoke or provide the method and/or technical solution according to the present invention through the operation of that computer. The program instructions that invoke the method of the present invention may be stored in a fixed or removable recording medium, and/or transmitted through a data stream in a broadcast or other signal-bearing medium, and/or stored in the working memory of a computer device that runs according to the program instructions. Here, one embodiment of the present invention includes a device comprising a memory for storing computer program instructions and a processor for executing the program instructions, wherein, when the computer program instructions are executed by the processor, the device is triggered to run the methods and/or technical solutions based on the foregoing embodiments of the present invention.
It is obvious to a person skilled in the art that the present invention is not limited to the details of the above exemplary embodiments, and that the present invention can be realized in other specific forms without departing from its spirit or essential attributes. Therefore, from whatever point of view, the embodiments are to be considered illustrative and not restrictive, and the scope of the present invention is defined by the appended claims rather than by the above description; all changes that fall within the meaning and scope of equivalents of the claims are therefore intended to be embraced by the present invention. Any reference signs in the claims should not be construed as limiting the claims concerned. In addition, the word "comprising" clearly does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or devices recited in a device claim may also be implemented by a single unit or device through software or hardware. Words such as "first" and "second" are used to denote names and do not indicate any particular order.

Claims (17)

1. A method for providing a mobile terminal with a reconstruction page corresponding to a target page, wherein the method comprises the following steps:
a. obtaining the target page to be provided to the mobile terminal;
b. determining page type information of the target page according to whether the target page satisfies a predetermined type judgment rule, wherein the predetermined type judgment rule includes at least any one of the following:
when the target page is a page built by a forum site-building tool, or the source code of the target page contains forum page features, determining that the page type information of the target page is forum page;
when the URL corresponding to the target page belongs to a page type database, determining the page type information of the target page according to the page type database;
when there is a reference page similar to the URL corresponding to the target page, determining the page type information of the target page according to the page type information of the reference page;
when the URL corresponding to the target page includes URL-related feature information, determining the page type information of the target page according to the URL-related feature information;
when the URL corresponding to the target page matches a predetermined web page template, determining the page type information of the target page according to the predetermined web page template;
c. determining, according to the page type information, one or more page main reconfigurable factors corresponding to the target page, wherein the page main reconfigurable factors include at least any one of the following:
page body content;
page reconstruction node;
page reconstruction segment;
d. generating the reconstruction page corresponding to the target page by extracting from the target page, according to the one or more page main reconfigurable factors, the page reconstruction content corresponding to the page main reconfigurable factors;
e. providing the reconstruction page to the mobile terminal.
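Step b of claim 1 can be illustrated with a small rule chain. The sketch below is hedged: the marker strings, the URL feature table, the template patterns, the function name `determine_page_type`, and the order in which the rules are tried are all assumptions chosen for illustration, since the claim only requires at least any one of the rules. The rule based on a similar reference page is omitted for brevity.

```python
import re
from urllib.parse import urlparse

# All three tables are hypothetical examples, not values taken from the patent.
FORUM_MARKERS = ("discuz", "phpwind")                 # traces left by forum site-building tools
URL_FEATURES = {"/thread": "forum", "/news/": "news"}  # URL-related feature information
URL_TEMPLATES = {r"^/item/\d+\.html$": "product"}      # predetermined web page templates


def determine_page_type(url, source, type_db):
    """Return a page type label for the target page, or None if no rule fires."""
    lowered = source.lower()
    # Rule: the source code contains forum page features.
    if any(marker in lowered for marker in FORUM_MARKERS):
        return "forum"
    # Rule: the URL belongs to the page type database.
    if url in type_db:
        return type_db[url]
    path = urlparse(url).path
    # Rule: the URL includes URL-related feature information.
    for feature, page_type in URL_FEATURES.items():
        if feature in path:
            return page_type
    # Rule: the URL matches a predetermined web page template.
    for pattern, page_type in URL_TEMPLATES.items():
        if re.match(pattern, path):
            return page_type
    return None
```

The returned label would then drive step c, where the page main reconfigurable factors for that page type are determined.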
2. The method according to claim 1, wherein step c comprises:
determining, according to the page type information, one or more page main reconfigurable factors corresponding to the target page and the page reconstruction style thereof;
wherein step d comprises:
extracting from the target page, according to the one or more page main reconfigurable factors, the page reconstruction content corresponding to the page main reconfigurable factors;
generating the reconstruction page corresponding to the target page according to the page reconstruction content in combination with the page reconstruction style.
3. The method according to claim 1 or 2, wherein step c comprises:
performing a matching query in a page main reconfigurable factors database according to the page type information, to obtain the one or more page main reconfigurable factors corresponding to the target page.
4. The method according to claim 3, wherein the method further comprises:
classifying, by page type, multiple training pages whose page types have been labeled, to obtain one or more page classifications, wherein each page classification includes at least one of the training pages;
obtaining, according to the training pages included in the page classification and by a predetermined page element training rule, one or more page main reconfigurable factors corresponding to the page type of the page classification;
establishing or updating the page main reconfigurable factors database according to the one or more page main reconfigurable factors corresponding to the page type of the page classification;
wherein the predetermined page element training rule includes at least any one of the following:
performing Bayesian estimation analysis on the training pages in the page classification to obtain the one or more page main reconfigurable factors corresponding to the page type of the page classification;
performing maximum likelihood estimation analysis on the training pages in the page classification to obtain the one or more page main reconfigurable factors corresponding to the page type of the page classification.
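The claim does not spell out how Bayesian estimation or maximum likelihood estimation is applied to the training pages, so the following is only one plausible reading, stated as an assumption: each candidate node is treated as a Bernoulli variable (present or absent in a training page of the classification), its presence probability is estimated, and nodes above a threshold are kept as page main reconfigurable factors. The function name, the node labels, and the threshold are hypothetical.

```python
from collections import Counter


def estimate_factors(training_pages, threshold=0.8, bayesian=False):
    """Select nodes whose estimated presence probability reaches the threshold.

    training_pages -- list of pages of one classification, each a list of node labels
    bayesian       -- False: maximum likelihood estimate k / n
                      True:  posterior mean under a uniform Beta(1, 1) prior,
                             (k + 1) / (n + 2)
    """
    n = len(training_pages)
    if n == 0:
        return []
    # Count, for each node, the number of training pages that contain it.
    counts = Counter(node for page in training_pages for node in set(page))
    selected = []
    for node, k in counts.items():
        probability = (k + 1) / (n + 2) if bayesian else k / n
        if probability >= threshold:
            selected.append(node)
    return sorted(selected)
```

With few training pages the Bayesian variant pulls estimates toward 0.5, which makes it more conservative than the maximum likelihood variant about nodes seen only a handful of times.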
5. The method according to claim 1 or 2, wherein the method further comprises:
obtaining a page segment of the target page;
wherein step d comprises:
generating a reconstruction segment corresponding to the page segment by extracting from the page segment, according to the one or more page main reconfigurable factors, the page reconstruction content corresponding to the page main reconfigurable factors;
wherein step e comprises:
providing the reconstruction segment to the mobile terminal.
6. The method according to claim 1 or 2, wherein step c comprises:
x. determining a page common document object model corresponding to the page type information;
extracting the page reconstruction node of the target page according to the page common document object model, to serve as the page main reconfigurable factor.
7. The method according to claim 6, wherein step x comprises:
extracting the common nodes of multiple reference pages corresponding to the page type information according to the document object model of each of the multiple reference pages;
generating, according to the common nodes, the page common document object model corresponding to the page type information.
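The common-node extraction of claim 7 can be approximated by intersecting the sets of node paths found in each reference page. This is a simplified sketch under stated assumptions: nodes are identified by their tag path plus first class name, which is a choice made for this example rather than something the patent specifies, and the names `PathCollector`, `node_paths`, and `common_nodes` are hypothetical. It uses only the standard-library `html.parser` module.

```python
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}  # tags that never get an end tag


class PathCollector(HTMLParser):
    """Collect a path such as 'html/body/div.post' for every element seen."""

    def __init__(self):
        super().__init__()
        self.stack = []
        self.paths = set()

    def handle_starttag(self, tag, attrs):
        label = tag
        class_names = (dict(attrs).get("class") or "").split()
        if class_names:
            label += "." + class_names[0]
        self.paths.add("/".join(self.stack + [label]))
        if tag not in VOID_TAGS:
            self.stack.append(label)

    def handle_endtag(self, tag):
        # Pop back to the nearest open element with this tag name, if any.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i].split(".")[0] == tag:
                del self.stack[i:]
                break


def node_paths(html):
    collector = PathCollector()
    collector.feed(html)
    return collector.paths


def common_nodes(reference_pages):
    """Return the node paths shared by every reference page of one page type."""
    path_sets = [node_paths(html) for html in reference_pages]
    return set.intersection(*path_sets) if path_sets else set()
```

The resulting set of shared paths plays the role of the page common document object model in this sketch; a fuller implementation would likely compare tree structure rather than flat paths.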
8. The method according to claim 1 or 2, wherein step d comprises:
generating the reconstruction page corresponding to the mobile terminal by extracting from the target page, according to the one or more page main reconfigurable factors, the page reconstruction content corresponding to the page main reconfigurable factors, in combination with terminal-related attributes of the mobile terminal;
wherein the terminal-related attributes include at least any one of the following:
the page visible region of the mobile terminal;
the available screen working area of the mobile terminal;
the screen resolution of the mobile terminal;
the system configuration attributes of the mobile terminal.
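As one hedged example of combining extracted content with terminal-related attributes, the sketch below scales an image so that it fits the page visible region of the mobile terminal. The `TerminalAttributes` class, its field names, and the `fit_image` helper are hypothetical; the claim itself only lists the kinds of attributes that may be taken into account.

```python
from dataclasses import dataclass


@dataclass
class TerminalAttributes:
    """Hypothetical container for a few terminal-related attributes, in pixels."""
    visible_width: int   # width of the page visible region
    visible_height: int  # height of the page visible region


def fit_image(width, height, terminal):
    """Shrink an image proportionally if it is wider than the visible region."""
    if width <= terminal.visible_width:
        return width, height
    scale = terminal.visible_width / width
    return terminal.visible_width, round(height * scale)
```

The same pattern could be applied to font sizes or column counts, depending on which attributes the terminal reports.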
9. A page reconstructing arrangement for providing a reconstruction page corresponding to a target page, wherein the page reconstructing arrangement comprises:
a page acquisition device, for obtaining the target page to be provided to a mobile terminal;
a type determination device, for determining page type information of the target page according to whether the target page satisfies a predetermined type judgment rule, wherein the predetermined type judgment rule includes at least any one of the following:
when the target page is a page built by a forum site-building tool, or the source code of the target page contains forum page features, determining that the page type information of the target page is forum page;
when the URL corresponding to the target page belongs to a page type database, determining the page type information of the target page according to the page type database;
when there is a reference page similar to the URL corresponding to the target page, determining the page type information of the target page according to the page type information of the reference page;
when the URL corresponding to the target page includes URL-related feature information, determining the page type information of the target page according to the URL-related feature information;
when the URL corresponding to the target page matches a predetermined web page template, determining the page type information of the target page according to the predetermined web page template;
an element determining device, for determining, according to the page type information, one or more page main reconfigurable factors corresponding to the target page, wherein the page main reconfigurable factors include at least any one of the following:
page body content;
page reconstruction node;
page reconstruction segment;
a webpage generating device, for generating the reconstruction page corresponding to the target page by extracting from the target page, according to the one or more page main reconfigurable factors, the page reconstruction content corresponding to the page main reconfigurable factors;
a providing device, for providing the reconstruction page to the mobile terminal.
10. The page reconstructing arrangement according to claim 9, wherein the element determining device is used for:
determining, according to the page type information, one or more page main reconfigurable factors corresponding to the target page and the page reconstruction style thereof;
wherein the webpage generating device is used for:
extracting from the target page, according to the one or more page main reconfigurable factors, the page reconstruction content corresponding to the page main reconfigurable factors;
generating the reconstruction page corresponding to the target page according to the page reconstruction content in combination with the page reconstruction style.
11. The page reconstructing arrangement according to claim 9 or 10, wherein the element determining device is used for:
performing a matching query in a page main reconfigurable factors database according to the page type information, to obtain the one or more page main reconfigurable factors corresponding to the target page.
12. The page reconstructing arrangement according to claim 11, wherein the page reconstructing arrangement further comprises:
a classification acquisition device, for classifying, by page type, multiple training pages whose page types have been labeled, to obtain one or more page classifications, wherein each page classification includes at least one of the training pages;
an element acquisition device, for obtaining, according to the training pages included in the page classification and by a predetermined page element training rule, one or more page main reconfigurable factors corresponding to the page type of the page classification;
a database update device, for establishing or updating the page main reconfigurable factors database according to the one or more page main reconfigurable factors corresponding to the page type of the page classification;
wherein the predetermined page element training rule includes at least any one of the following:
performing Bayesian estimation analysis on the training pages in the page classification to obtain the one or more page main reconfigurable factors corresponding to the page type of the page classification;
performing maximum likelihood estimation analysis on the training pages in the page classification to obtain the one or more page main reconfigurable factors corresponding to the page type of the page classification.
13. The page reconstructing arrangement according to claim 9 or 10, wherein the page reconstructing arrangement further comprises:
a segment acquisition device, for obtaining a page segment of the target page;
wherein the webpage generating device is used for:
generating a reconstruction segment corresponding to the page segment by extracting from the page segment, according to the one or more page main reconfigurable factors, the page reconstruction content corresponding to the page main reconfigurable factors;
wherein the providing device is used for:
providing the reconstruction segment to the mobile terminal.
14. The page reconstructing arrangement according to claim 9 or 10, wherein the element determining device comprises:
a model determination unit, for determining a page common document object model corresponding to the page type information;
a node extraction unit, for extracting the page reconstruction node of the target page according to the page common document object model, to serve as the page main reconfigurable factor.
15. The page reconstructing arrangement according to claim 14, wherein the model determination unit is used for:
extracting the common nodes of multiple reference pages corresponding to the page type information according to the document object model of each of the multiple reference pages;
generating, according to the common nodes, the page common document object model corresponding to the page type information.
16. The page reconstructing arrangement according to claim 9 or 10, wherein the webpage generating device is used for:
generating the reconstruction page corresponding to the mobile terminal by extracting from the target page, according to the one or more page main reconfigurable factors, the page reconstruction content corresponding to the page main reconfigurable factors, in combination with terminal-related attributes of the mobile terminal;
wherein the terminal-related attributes include at least any one of the following:
the page visible region of the mobile terminal;
the available screen working area of the mobile terminal;
the screen resolution of the mobile terminal;
the system configuration attributes of the mobile terminal.
17. A browser, comprising the page reconstructing arrangement for providing a reconstruction page corresponding to a target page according to any one of claims 9 to 16.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210244986.8A CN103544178B (en) 2012-07-13 2012-07-13 Method and apparatus for providing a reconstruction page corresponding to a target page

Publications (2)

Publication Number Publication Date
CN103544178A CN103544178A (en) 2014-01-29
CN103544178B true CN103544178B (en) 2019-04-12

Family

ID=49967641

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210244986.8A Active CN103544178B (en) 2012-07-13 2012-07-13 Method and apparatus for providing a reconstruction page corresponding to a target page

Country Status (1)

Country Link
CN (1) CN103544178B (en)

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104050273B (en) * 2014-06-24 2018-07-10 北京奇虎科技有限公司 For recording newest network file, the installation method for changing search result
CN105282104A (en) * 2014-07-01 2016-01-27 武汉科技大学 Data retrieval system based on cloud computing and webpage parsing technology
KR20160014463A (en) * 2014-07-29 2016-02-11 삼성전자주식회사 Server, providing metheod of server, display apparatus, controlling metheod of display apparatus and informatino providing system
CN104331512B (en) * 2014-11-25 2017-10-20 南京烽火星空通信发展有限公司 A kind of BBS pages automatic acquiring method
CA2989462A1 (en) * 2015-06-18 2016-12-22 Tylio Inc. System and method for generating an electronic page
CN105260420B (en) * 2015-09-25 2019-05-10 百度在线网络技术(北京)有限公司 A kind of method and apparatus for the offer target pages in mobile application
CN105512296B (en) * 2015-12-11 2019-01-22 宁波中青华云新媒体科技有限公司 Webpage analysis method and system based on webpage difference
CN107085578B (en) * 2016-02-16 2020-05-12 腾讯科技(深圳)有限公司 Webpage editing method and device
JP2017157083A (en) * 2016-03-03 2017-09-07 富士ゼロックス株式会社 File reconstruction device and program
CN108959325B (en) * 2017-05-26 2021-06-29 珠海金山办公软件有限公司 Uniform resource locator display method, information display method and related products thereof
CN107861982A (en) * 2017-09-29 2018-03-30 五八有限公司 It is dynamically determined method, terminal, server and the system of application program page layout
CN108446116B (en) * 2018-02-26 2021-10-08 平安普惠企业管理有限公司 Application program page generation method and device, computer equipment and storage medium
CN110750739B (en) * 2018-07-04 2022-07-05 北京国双科技有限公司 Page type determination method and device
CN109086064B (en) * 2018-08-01 2022-01-14 南京茂毓通软件科技有限公司 General extraction method of HTTP (hyper text transport protocol) protocol elements based on custom tag language
CN109657130A (en) * 2018-12-10 2019-04-19 陆少杰 Querying method, device and the electronic equipment of automobile information
CN109670133B (en) * 2018-12-22 2021-04-02 网宿科技股份有限公司 Method for determining public component of page, server and storage medium
CN111611476B (en) * 2020-04-13 2023-08-29 百度在线网络技术(北京)有限公司 Thematic page display method and device
CN112035830A (en) * 2020-08-04 2020-12-04 郑州阿帕斯数云信息科技有限公司 Browser page reconstruction method, device and equipment
CN113569044B (en) * 2021-06-28 2023-07-18 南京大学 Method for classifying webpage text content based on natural language processing technology
CN113407889B (en) * 2021-07-15 2023-10-20 北京百度网讯科技有限公司 Novel transcoding method, device, equipment and storage medium
CN114327731B (en) * 2021-12-31 2023-11-14 北京字跳网络技术有限公司 Information display method, device, equipment and medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101815093A (en) * 2010-03-11 2010-08-25 深圳市嘉讯软件有限公司 Method for adapting webpage to mobile terminal and mobile terminal page adaptation device
CN101859322A (en) * 2010-05-26 2010-10-13 卓望数码技术(深圳)有限公司 Webpage display method for mobile terminal
CN101894168A (en) * 2010-06-30 2010-11-24 优视科技有限公司 Method and system for layout display of web page of mobile terminal
CN102033944A (en) * 2010-12-21 2011-04-27 重庆新媒农信科技有限公司 Mobile terminal-based web page display system and method
CN102043861A (en) * 2010-12-29 2011-05-04 重庆新媒农信科技有限公司 Web page data structured display method based on mobile terminal
CN102184249A (en) * 2011-05-23 2011-09-14 广州市动景计算机科技有限公司 Webpage layout method and device based on mobile terminal

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant