CN101777081A - Method and device for improving webpage access speed - Google Patents

Method and device for improving webpage access speed

Info

Publication number
CN101777081A
Authority
CN
China
Prior art keywords
page
set
entry
link
entries
Prior art date
Application number
CN201010128121A
Other languages
Chinese (zh)
Inventor
阚光远
Original Assignee
中兴通讯股份有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 中兴通讯股份有限公司 filed Critical 中兴通讯股份有限公司
Priority to CN201010128121A priority Critical patent/CN101777081A/en
Publication of CN101777081A publication Critical patent/CN101777081A/en

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G06F16/9574 Browsing optimisation, e.g. caching or content distillation of access to content, e.g. by caching

Abstract

The invention discloses a method and a device for improving webpage access speed. The method comprises the following steps: reading recorded historical webpage data to obtain a page set, extracting stems from the pages in the page set, and splitting the stems into terms to obtain a term set corresponding to each page; extracting the link points in the pages of the page set to obtain a link point set corresponding to each page, extracting the link stems of the link points, and splitting those stems to obtain a link term set corresponding to each link point; generating interest association rules between terms in the term sets corresponding to the pages in the page set, or between terms in the term sets corresponding to the pages and terms in the link term sets; and, according to the page the user is currently accessing and the interest association rules, predicting, from the link point set corresponding to that page, the next link point the user will access, and downloading and caching the webpage data of that link point.

Description

Method and device for improving webpage access speed

Technical Field

[0001] The present invention relates to the field of mobile communication technology, and in particular to a method and device for improving webpage access speed.

Background

[0002] With the arrival of the 3G (third-generation mobile communication technology) era, the number of mobile Internet users is steadily growing, and users demand ever higher speed from mobile browsers. However, because mobile browsers are constrained by conditions such as handset signal strength and real-time bandwidth limits, quality of service cannot be guaranteed when users browse on a handset.

[0003] Current mobile browsers generally use a caching mechanism that exploits the temporal locality of web browsing: documents that have been visited are kept in the mobile browser cache, which avoids sending a request to the remote server, or avoids having the remote server send a complete response.

[0004] Pure cache technology exploits only the temporal locality of browsing patterns. Content that has never been visited cannot be cached, so response performance still does not improve. This is especially noticeable when the user discovers a new popular server or when a server's pages are updated frequently.

[0005] In addition, because the space a mobile browser has for caching web content is small, previously visited content gets overwritten, so a pure cache mechanism does not deliver good response performance either. How to effectively improve the access speed of mobile browsers has become the problem browser vendors are currently most concerned with.

Summary of the Invention

[0006] The problem to be solved by the present invention is to provide a method and device for improving webpage access speed that speeds up web browsing and thereby further improves the user experience.

[0007] To solve the above technical problem, a method for improving webpage access speed according to the present invention comprises:

[0008] reading the saved historical webpage data to obtain a page set, extracting stems from the pages in the page set, and splitting the stems into terms to obtain a term set corresponding to each page;

[0009] extracting the link points in each page of the page set to obtain a link point set corresponding to the page, extracting the link stem of each link point, and splitting the link stem to obtain a link term set corresponding to the link point;

[0010] generating interest association rules between terms in the term sets corresponding to pages in the page set, or interest association rules between terms in the term set corresponding to a page in the page set and terms in a link term set, the interest association rules together forming an interest association rule database;

[0011] according to the page the user is currently accessing and the interest association rules, predicting, from the link point set corresponding to that page, the link point the user will enter next, and downloading and caching the webpage data of that link point.

[0012] Further, a term is represented as a 2-tuple that records the term and its weight, where the weight equals the freshness of the term multiplied by the frequency with which the term occurs.

[0013] Further, generating the interest association rules between terms in the term sets corresponding to pages in the page set comprises:

[0014] traversing the page set; for any saved page, traversing the link point set of that page and determining, one by one, whether the target page of each link point belongs to the page set; if it does, traversing the term sets of the saved page and the target page, combining the terms into pairs, and computing the association support of the two terms in each pair to obtain an interest association rule between them, where the association support equals the sum of the weights of the two terms; when the terms recur in multiple pages, the weights of the two terms are accumulated into the association support accordingly.

[0015] Further, generating the interest association rules between terms in the term set corresponding to a page in the page set and terms in a link term set comprises:

[0016] if the target page does not belong to the page set, traversing the term set of the saved page and the link term set, combining the terms into pairs, and computing the association support of the two terms in each pair to obtain an interest association rule between them, where the association support equals the weight of the term in the saved page; when a link term appears in multiple link term sets, the weight of the term in the saved page is accumulated into the association support accordingly.
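The two rule-generation cases in [0014] and [0016] can be sketched in Python as follows. This is only a minimal illustration under stated assumptions, not the patent's implementation: the function name `build_rules`, the dictionary layout of `pages`, and the use of page addresses as keys are all hypothetical choices made for the example.

```python
from collections import defaultdict


def build_rules(pages):
    """Accumulate association support for term pairs.

    `pages` (hypothetical layout) maps a page address to a dict with:
      "terms": {term: weight} for the saved page
      "links": list of (target_address, link_terms) for its link points
    Returns {(source_term, target_term): support}.
    """
    rules = defaultdict(float)
    for page in pages.values():
        for target_address, link_terms in page["links"]:
            target = pages.get(target_address)
            if target is not None:
                # Target page is in the page set: support grows by the
                # sum of the two term weights.
                for tp, wp in page["terms"].items():
                    for tq, wq in target["terms"].items():
                        rules[(tp, tq)] += wp + wq
            else:
                # Target page is not saved: pair page terms with the link
                # terms; support grows by the page term's weight only.
                for tp, wp in page["terms"].items():
                    for tq in link_terms:
                        rules[(tp, tq)] += wp
    return dict(rules)


example_pages = {
    "a": {"terms": {"sports": 0.5},
          "links": [("b", ["football"]), ("x", ["news"])]},
    "b": {"terms": {"football": 0.25}, "links": []},
}
example_rules = build_rules(example_pages)
```

Because support is accumulated with `+=`, a term pair that recurs across several pages or link term sets ends up with a larger support, which matches the accumulation described in the text.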

[0017] Further, predicting, according to the page the user is currently accessing and the interest association rules, the link point the user will enter next from the link point set corresponding to that page comprises:

[0018] looking up, in the interest association rule database, the interest association rules between terms in the term set corresponding to the currently accessed page and terms in the link term sets, and computing transfer degrees, where a transfer degree equals the weight of a term in the term set of the currently accessed page multiplied by the association support in the rule found; after the transfer degrees have been computed, sorting all of them; the link point with the largest transfer degree is the link point the user will enter next.

[0019] Further, an interest association rule is represented as a 3-tuple that records two terms and the association support of those two terms.

[0020] Further, a device for improving webpage access speed comprises a data storage module, a data mining module, a webpage prediction module and a webpage download module connected in sequence, wherein:

[0021] the data storage module is configured to save historical webpage data;

[0022] the data mining module is configured to read the historical webpage data from the data storage module to obtain a page set, extract stems from the pages in the page set, and split the stems into terms to obtain a term set corresponding to each page; it also generates interest association rules between terms in the term sets corresponding to pages in the page set, or interest association rules between terms in the term set corresponding to a page in the page set and terms in a link term set, the interest association rules together forming an interest association rule database;

[0023] the webpage prediction module is configured to predict, according to the page the user is currently accessing and the interest association rules read from the data mining module, the link point the user will enter next from the link point set corresponding to that page, and to send that link point to the webpage download module;

[0024] the webpage download module is configured to download and cache the webpage data of the link point, according to the received link point that the user will enter next.

[0025] Further, a term is represented as a 2-tuple that records the term and its weight, where the weight equals the freshness of the term multiplied by the frequency with which the term occurs.

[0026] Further, the data mining module generates the interest association rules between terms in the term sets corresponding to pages in the page set as follows:

[0027] traversing the page set; for any saved page, traversing the link point set of that page and determining, one by one, whether the target page of each link point belongs to the page set; if it does, traversing the term sets of the saved page and the target page, combining the terms into pairs, and computing the association support of the two terms in each pair to obtain an interest association rule between them, where the association support equals the sum of the weights of the two terms; when the terms recur in multiple pages, the weights of the two terms are accumulated into the association support accordingly;

[0028] if the target page does not belong to the page set, traversing the term set of the saved page and the link term set, combining the terms into pairs, and computing the association support of the two terms in each pair to obtain an interest association rule between them, where the association support equals the weight of the term in the saved page; when a link term appears in multiple link term sets, the weight of the term in the saved page is accumulated into the association support accordingly.

[0029] Further, the webpage prediction module predicts, according to the page the user is currently accessing and the interest association rules, the link point the user will enter next from the link point set corresponding to that page as follows:

[0030] looking up, in the interest association rule database, the interest association rules between terms in the term set corresponding to the currently accessed page and terms in the link term sets, and computing transfer degrees, where a transfer degree equals the weight of a term in the term set of the currently accessed page multiplied by the association support in the rule found; after the transfer degrees have been computed, sorting all of them; the link point with the largest transfer degree is the link point the user will enter next.

[0031] Further, an interest association rule is represented as a 3-tuple that records two terms and the association support of those two terms.

[0032] In summary, by predicting the page the user is likely to access next and downloading its data in advance, the present invention can speed up web browsing, reduce the user's waiting time, and improve the user experience.

Brief Description of the Drawings

[0033] Fig. 1 is a flowchart of the method for improving webpage access speed according to the present invention;

[0034] Fig. 2 is an architecture diagram of the device for improving webpage access speed according to the present invention.

Detailed Description

[0035] The purpose of this embodiment is to increase webpage access speed and improve the quality of service users get from the browser. This embodiment obtains the historical webpage data saved in the browser cache. Because this data implicitly contains the user's interests and access habits, it can be mined to obtain interest association rules that reflect them. Based on the interest association rules and the page the user is currently visiting, the access requests the user is likely to issue are predicted, and the predicted content is downloaded into the browser cache while the user is still reading the current page. This is an active cache service: when the user actually accesses these pages, they only need to be loaded from the mobile browser cache, which greatly reduces the user's access latency.

[0036] In this embodiment, prefetching web pages into the handset cache by interest association rule mining is implemented in essentially three stages: saving the historical webpage data in the terminal browser cache; mining the saved historical webpage data for interest association rules; and downloading the predicted content into the handset cache based on the mining results and the page the current user is accessing.

[0037] Specific embodiments of the present invention are described below with reference to the drawings.

[0038] Fig. 1 shows the method for improving webpage access speed of this embodiment, comprising:

[0039] 101: Save and read the historical webpage data in the browser cache to obtain the page set C = {Y_1, Y_2, ..., Y_k, ..., Y_n}, where 1 ≤ k ≤ n;

[0040] 102: Perform interest association rule data mining on the saved historical webpage data;

[0041] The data mining specifically comprises the following steps:

[0042] (1) Define a term as a node. A node is represented as a 2-tuple (t, weight), abbreviated Node(t), where weight is the weight of term t;

[0043] weight = freshness × frequency of occurrence (f_i).

[0044] Freshness reflects how long a term has existed: terms in recently visited pages have relatively high freshness, and during prediction, the more recently a page was visited, the greater the role its terms play in the prediction. Freshness equals the sequence number, within the page set, of the page containing the term; it may also equal, for example, the square of the sequence number. The later a page was visited, the larger its sequence number in the page set.

[0045] f_i is the frequency with which a term occurs in a page. For example, if a term occurs 8 times in a page whose total term count is 100 (including repeats), then f_i = 8/100.
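The weight definition in [0043] to [0045] can be illustrated with a short Python sketch. The function name `term_weights` and its parameters are hypothetical; the default freshness is the page's sequence number in the page set, and the optional `freshness` argument stands in for variants such as the square of the sequence number.

```python
from collections import Counter


def term_weights(page_terms, page_index, freshness=lambda i: i):
    """Return {term: weight} for one page.

    page_terms: list of terms in the page, repeats included
    page_index: 1-based sequence number of the page in the page set
    freshness:  maps the sequence number to a freshness value
                (identity by default; e.g. lambda i: i * i for the square)
    """
    if not page_terms:
        return {}
    total = len(page_terms)
    fresh = freshness(page_index)
    counts = Counter(page_terms)
    # weight = freshness x frequency of occurrence (count / total terms)
    return {term: fresh * (count / total) for term, count in counts.items()}


# A page with 100 terms in which "news" occurs 8 times, visited third.
sample_terms = ["news"] * 8 + ["other"] * 92
sample_weights = term_weights(sample_terms, page_index=3)
```

With these inputs the frequency of "news" is 8/100, so its weight is three times that value under the default freshness.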

[0046] Terms of interest may be, for example, entertainment, sports, news, weather, consulting and finance.

[0047] (2) Define the connection between nodes as an interest association rule, represented as a 3-tuple [Node(t_i), support, Node(t_j)] and abbreviated Rule[Node(t_i), Node(t_j)], where support is called the association support and expresses the likelihood of moving from node Node(t_i) to node Node(t_j);

[0048] (3) Data preprocessing: extract stems from each page in the page set C and split them, obtaining for each page Y_k the term set K(Y_k) = {(t_i', weight) | t_i' ∈ T (the Chinese vocabulary), i ∈ N (the natural numbers)};

[0049] K(Y_k) denotes the set of all terms appearing in page Y_k, and t_i' is one of those terms.

[0050] Historical webpage data in the cache is usually represented using the WWW data model; depending on the implementation, the historical webpage data in the WWW data model may also need to be converted into the required data format.

[0051] For stem extraction and splitting, see the IEEE (Institute of Electrical and Electronics Engineers) publication "Application of data mining in Web pre-fetching".

[0052] (4) From each page Y_k in the page set C, extract the link points of that page to obtain the link point set of the page, L(Y_k) = {l_{k,i} | l_{k,i} is a link point in page Y_k};

[0053] The link point set is the set of addresses of all pages that can be entered by clicking within page Y_k; clicking a link point in page Y_k takes the user to the next page.

[0054] (5) While extracting the link points of a page, obtain the link stem of each link point and split it, obtaining the link term set of link point l_{k,i} in the page, Q(l_{k,i}.string) = {t_j'' | t_j'' ∈ l_{k,i}.string, j ∈ N};

[0055] Q(l_{k,i}.string) denotes the set of terms obtained by splitting the link stem of a link l_{k,i} in Y_k.

[0056] The data processing above yields four kinds of sets: the page set, the term set of each page, the link point set of each page, and the link term set of each link point in a page. These four sets are obtained in order to compute the interest association rules [Node(t_i), support, Node(t_j)] below, that is, the likelihood of moving from one term to another, and from that the likelihood of moving from a page to one of its links.
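Steps (4) and (5) can be sketched with Python's standard `html.parser` module. This is a rough illustration resting on two assumptions that the text does not confirm: the "link stem" is taken to be the visible anchor text of the link, and whitespace splitting stands in for real Chinese word segmentation. The names `LinkExtractor` and `link_term_sets` are hypothetical.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect (href, anchor text) pairs from <a> elements."""

    def __init__(self):
        super().__init__()
        self.links = []      # list of (href, anchor_text)
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self._href = href
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None


def link_term_sets(html, split=str.split):
    """Return {link address: set of link terms} for one page.

    `split` is a placeholder tokenizer; a real system would plug in a
    proper word segmenter here.
    """
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    result = {}
    for href, text in parser.links:
        result.setdefault(href, set()).update(split(text))
    return result


sample_html = '<p><a href="/sports">Football news</a> and <a href="/w">Weather</a></p>'
sample_links = link_term_sets(sample_html)
```

The keys of the returned dictionary play the role of the link point set L(Y_k), and each value plays the role of a link term set Q(l_{k,i}.string).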

[0057] (6) Generate the interest association rules; the set of interest association rules forms the interest association database.

[0058] The specific process of generating the interest association rules is as follows:

[0059] Traverse the page set C. For each saved page Y_k, traverse the link point set L(Y_k) of that page and determine, one by one, whether the target page Y_j of each link point (the page the link point links to) belongs to the page set C. If it does, traverse the term sets of pages Y_k and Y_j, pair the terms of Y_k with those of Y_j, and compute for each pair the association support for moving from one term to the other; this association support equals the sum of the two term weights. When terms recur across multiple pages, the weights of the two terms are accumulated into the association support accordingly.

[0060] If the target page Y_j of the link point does not belong to the page set C, traverse the term set of page Y_k and the link term set of the link point, pair the terms of Y_k with the terms in the link term set, and compute for each pair the association support for moving from one term to the other; this association support equals the weight of the term in page Y_k. When a link term appears in the link term sets of multiple link points, the weight of the term in page Y_k is accumulated into the association support.

[0061] The pseudocode for generating the interest association rules is as follows:

    for each page Y_k in the saved page set C
    {
        for each link l_{k,r} in the link set L(Y_k)
        {
            let Y_j be the target page of l_{k,r};
            if Y_j ∈ C then
                for each term (t_p', weight_p) in the term set K(Y_k) of page Y_k
                    for each term (t_q', weight_q) in the term set K(Y_j) of page Y_j
                        support of Rule[Node(t_p'), Node(t_q')] += g(weight_p, weight_q);
                        where (t_p', weight_p) ∈ K(Y_k), (t_q', weight_q) ∈ K(Y_j)
            else
                for each term (t_p', weight_p) in the term set K(Y_k) of page Y_k
                    for each term t_q' in Q(l_{k,r}.string)
                        support of Rule[Node(t_p'), Node(t_q')] += weight_p;
                        where (t_p', weight_p) ∈ K(Y_k), t_q' ∈ Q(l_{k,r}.string)
        }
    }

[0092] Here g(weight_p, weight_q) is a function, taken to be (weight_p + weight_q), which expresses the influence that a cached page's link points, and the pages those link points point to, have on the interest association rules in the interest association database. The support of Rule[Node(t_i), Node(t_j)] computed by the association rule mining algorithm above reflects the current browser user's browsing interests and habits and serves as the basis for the prediction in the next step.

[0093] 103: According to the page the user is currently accessing and the interest association rule database, predict the link point the user will access next, and download the webpage data of the predicted link point and cache it in the mobile browser, thereby achieving active caching and service and increasing the browser's browsing speed.

[0094] The prediction works as follows: look up, in the interest association rule database, the interest association rules between the terms in the currently accessed page and the link terms, and compute transfer degrees, where a transfer degree equals the weight of a term in the currently accessed page multiplied by the association support in the rule found. After the transfer degrees have been computed, sort all of them; the link point with the largest transfer degree is the predicted page the user will access next.
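The prediction step can be sketched in Python as follows. The text defines a transfer degree per term pair (page term weight × association support) but does not say how the pair values are combined into one score per link point; summing them is an assumption made here for illustration. The function name `rank_links` and the data layouts are likewise hypothetical.

```python
def rank_links(current_terms, link_terms, rules):
    """Rank the link points of the current page by transfer degree.

    current_terms: {term: weight} for the page being viewed
    link_terms:    {link address: iterable of link terms}
    rules:         {(page_term, link_term): association support}
    Returns a list of (link address, score), highest score first.
    """
    scores = {}
    for link, terms in link_terms.items():
        score = 0.0
        for tp, wp in current_terms.items():
            for tq in terms:
                support = rules.get((tp, tq))
                if support is not None:
                    # Transfer degree of one pair: weight x support.
                    # Summing over pairs is an assumption of this sketch.
                    score += wp * support
        scores[link] = score
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


sample_ranking = rank_links(
    {"sports": 0.5},
    {"/a": ["football"], "/b": ["weather"]},
    {("sports", "football"): 0.75, ("sports", "weather"): 0.5},
)
# The first entry is the link point to prefetch.
predicted_link = sample_ranking[0][0] if sample_ranking else None
```

Returning the full ranking rather than only the top link leaves room for prefetching more than one candidate when cache space allows, although the text itself only prefetches the link point with the largest transfer degree.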

[0095] Fig. 2 shows a device for improving webpage access speed according to an embodiment of the present invention, comprising a data storage module, a data mining module, a webpage prediction module and a webpage download module connected in sequence, wherein:

[0096] The data storage module is configured to save the historical webpage data in the browser;

[0097] The data mining module is configured to read the historical webpage data from the data storage module to obtain the page set C = {Y_1, Y_2, ..., Y_k, ..., Y_n}, where 1 ≤ k ≤ n, and to perform interest association rule data mining on the saved historical webpage data;

[0098] The data mining specifically comprises the following steps:

[0099] (1) Define a term as a node. A node is represented as a 2-tuple (t, weight), abbreviated Node(t), where weight is the weight of term t;

[0100] weight = freshness × frequency of occurrence (f_i).

[0101] Freshness reflects how long a term has existed: terms in recently visited pages have relatively high freshness, and during prediction, the more recently a page was visited, the greater the role its terms play in the prediction. Freshness equals the sequence number, within the page set, of the page containing the term; it may also equal, for example, the square of the sequence number. The later a page was visited, the larger its sequence number in the page set.

[0102] f_i is the frequency with which a term occurs in a page. For example, if a term occurs 8 times in a page whose total term count is 100 (including repeats), then f_i = 8/100.

[0103] Terms of interest may be, for example, entertainment, sports, news, weather, consulting and finance.

[0104] (2) Define the connection between nodes as an interest association rule, represented as a 3-tuple [Node(t_i), support, Node(t_j)] and abbreviated Rule[Node(t_i), Node(t_j)], where support is called the association support and expresses the likelihood of moving from node Node(t_i) to node Node(t_j);

[0105] (3) Data preprocessing: extract stems from each page in the page set C and split them, obtaining for each page Y_k the term set K(Y_k) = {(t_i', weight) | t_i' ∈ T (the Chinese vocabulary), i ∈ N (the natural numbers)};

[0106] K(Y_k) denotes the set of all terms appearing in page Y_k, and t_i' is one of those terms.

[0107] Historical webpage data in the cache is usually represented using the WWW data model; depending on the implementation, the historical webpage data in the WWW data model may also need to be converted into the required data format.

[0108] For stem extraction and splitting, see the IEEE (Institute of Electrical and Electronics Engineers) publication "Application of data mining in Web pre-fetching".

[0109] (4) From each page Y_k in the page set C, extract the link points of that page to obtain the link point set of the page, L(Y_k) = {l_{k,i} | l_{k,i} is a link point in page Y_k};

[0110] The link point set is the set of addresses of all pages that can be entered by clicking within page Y_k; clicking a link point in page Y_k takes the user to the next page.

[0111] (5) While extracting the link points of a page, obtain the link stem of each link point and split it, obtaining the link term set of link point l_{k,i} in the page, Q(l_{k,i}.string) = {t_j'' | t_j'' ∈ l_{k,i}.string, j ∈ N};

[0112] Q(l_{k,i}.string) denotes the set of terms obtained by splitting the link stem of a link l_{k,i} in Y_k.

[0113] The data processing above yields four kinds of sets: the page set, the term set of each page, the link point set of each page, and the link term set of each link point in a page. These four sets are obtained in order to compute the interest association rules [Node(t_i), support, Node(t_j)] below, that is, the likelihood of moving from one term to another, and from that the likelihood of moving from a page to one of its links.

[0114] (6) Generate the interest association rules; the set of interest association rules constitutes the interest association database;

[0115] The specific process of generating the interest association rules includes:

[0116] Traverse the page set C. For each stored page Yk, traverse its link point set L(Yk) and determine, one by one, whether the target page Yj of each link point (the page the link point links to) belongs to the page set C. If it does, traverse the entry sets of pages Yk and Yj, combine the entries of Yk with those of Yj, and calculate, for each combination, the association support for moving from one entry to the other. The association support equals the sum of the weights of the two entries; when the entries recur in multiple pages, the weights of the two entries are accumulated into the association support accordingly;

[0117] If the target page Yj of the link point does not belong to the page set C, traverse the entry set of page Yk and the link entry set of the link point, combine the entries of Yk with the entries in the link entry set, and calculate, for each combination, the association support for moving from one entry to the other. The association support equals the weight of the entry in page Yk; when a link entry occurs in the link entry sets of multiple link points, the weight of the entry in page Yk is accumulated into the association support.
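The two cases above can be sketched in a few lines. This is an illustrative reading under stated assumptions, not the patent's own implementation: the function name `build_rules` and the dictionary layouts are hypothetical, entry weights are taken as given inputs, and accumulation is done by adding to the support each time the same entry pair is encountered.

```python
from collections import defaultdict


def build_rules(pages, entries, links):
    """
    pages:   set of stored page ids (the page set C)
    entries: page id -> {term: weight}               (entry set of each page)
    links:   page id -> {target id: set of terms}    (L(Yk) with each link's entry set Q)
    Returns {(t_i, t_j): support}, one item per rule [Node(t_i), support, Node(t_j)].
    """
    support = defaultdict(float)
    for yk in pages:
        for yj, link_terms in links.get(yk, {}).items():
            if yj in pages:
                # Target page is stored: pair entries of Yk with entries of Yj,
                # adding the sum of the two weights.
                for ti, wi in entries[yk].items():
                    for tj, wj in entries[yj].items():
                        support[(ti, tj)] += wi + wj
            else:
                # Target page is not stored: pair entries of Yk with the link
                # entries, adding only the weight of the Yk entry.
                for ti, wi in entries[yk].items():
                    for tj in link_terms:
                        support[(ti, tj)] += wi
    return dict(support)
```

For example, if page A (entry `sport`, weight 2.0) links to stored page B (entry `football`, weight 1.0) and to two unstored pages whose link entries both contain `tennis`, the sketch yields a support of 3.0 for (`sport`, `football`) and 4.0 for (`sport`, `tennis`).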

[0118] A web page prediction module, configured to predict the link point the user will access next, according to the web page the user is currently accessing and the interest association rules read from the data mining module, and to send that link point to the web page download module;

[0119] A web page download module, configured to download and cache the web page data corresponding to the received link point that the user will access next.
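A rough sketch of how the prediction and download modules could fit together is given here. It uses the transfer degree defined in claim 5 (the weight of an entry in the current page multiplied by the association support of the matching rule) and picks the link point with the largest single transfer degree, which is one possible reading of the sorting step. The names `predict_next_link` and `PrefetchCache` are hypothetical, and the `fetch` callable is injected by the caller so that no particular network library is assumed.

```python
def predict_next_link(page_entries, link_entries, rules):
    """
    page_entries: {term: weight} for the page currently being accessed
    link_entries: {link: set of terms} for that page's link points
    rules:        {(t_i, t_j): support} interest association rules
    Returns the link point with the largest transfer degree, or None.
    """
    scored = []
    for link, terms in link_entries.items():
        for ti, wi in page_entries.items():
            for tj in terms:
                s = rules.get((ti, tj))
                if s is not None:
                    # Transfer degree = entry weight x association support.
                    scored.append((wi * s, link))
    if not scored:
        return None
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[0][1]


class PrefetchCache:
    """Downloads and caches the predicted page ahead of the user's click."""

    def __init__(self, fetch):
        self._fetch = fetch   # callable: url -> page data (supplied by caller)
        self._cache = {}

    def prefetch(self, url):
        if url is not None and url not in self._cache:
            self._cache[url] = self._fetch(url)

    def get(self, url):
        return self._cache.get(url)
```

Keeping prediction as a pure function and the cache as a separate object mirrors the module split in the description: the prediction module only names a link point, and the download module decides whether it still needs to be fetched.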

[0120] The foregoing describes only preferred embodiments of the present invention and is not intended to limit it; those skilled in the art may make various modifications and variations to the invention. Any modification, equivalent substitution, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims (11)

  1. A method for improving web page access speed, comprising: reading stored historical web page data to obtain a page set, extracting stems from the pages of the page set, and segmenting the stems into entries to obtain an entry set corresponding to each page; extracting the link points in each page of the page set to obtain a link point set corresponding to the page, extracting the link stems of the link points, and segmenting those stems to obtain a link entry set corresponding to each link point; generating interest association rules between entries in the entry sets corresponding to pages of the page set, or interest association rules between entries in the entry sets corresponding to pages of the page set and entries in the link entry sets, the interest association rules together constituting an interest association rule database; and, according to the page the user is currently accessing and the interest association rules, predicting from the link point set corresponding to that page the link point the user will enter next, and downloading and caching the web page data of that link point.
  2. The method of claim 1, wherein each entry is represented as a two-tuple recording the entry and its weight, the weight being equal to the freshness of the entry multiplied by the frequency with which the entry occurs.
  3. The method of claim 2, wherein generating the interest association rules between entries in the entry sets corresponding to pages of the page set comprises: traversing the page set; for any stored page, traversing the link point set of that page and determining, one by one, whether the target page of each link point belongs to the page set; if it does, traversing the entry sets of the stored page and of the target page, combining their entries, and calculating the association support of the two entries in each combination to obtain the interest association rule between them, the association support being equal to the sum of the weights of the two entries; and, when the entries recur in multiple pages, accumulating the weights of the two entries into the association support accordingly.
  4. The method of claim 3, wherein generating the interest association rules between entries in the entry sets corresponding to pages of the page set and entries in the link entry sets comprises: if the target page does not belong to the page set, traversing the entry set of the stored page and the link entry set, combining their entries, and calculating the association support of the two entries in each combination to obtain the interest association rule between them, the association support being equal to the weight of the entry in the stored page; and, when a link entry occurs in multiple link entry sets, accumulating the weight of the entry in the stored page into the association support accordingly.
  5. The method of claim 2, wherein predicting, according to the page the user is currently accessing and the interest association rules, the link point the user will enter next from the link point set corresponding to that page comprises: looking up, in the interest association rule database, the interest association rules between entries in the entry set corresponding to the page the user is currently accessing and entries in the link entry sets; calculating a transfer degree, the transfer degree being equal to the weight of the entry in the entry set corresponding to the currently accessed page multiplied by the association support in the interest association rule found; and, after the transfer degrees have been calculated, sorting all of them, the link point with the largest transfer degree being the link point the user will enter next.
  6. The method of any one of claims 1 to 5, wherein each interest association rule is represented as a three-tuple recording two entries and the association support of those two entries.
  7. A device for improving web page access speed, comprising a data storage module, a data mining module, a web page prediction module, and a web page download module connected in sequence, wherein: the data storage module is configured to store historical web page data; the data mining module is configured to read the historical web page data from the data storage module to obtain a page set, extract stems from the pages of the page set, and segment the stems into entries to obtain an entry set corresponding to each page, and is further configured to generate interest association rules between entries in the entry sets corresponding to pages of the page set, or interest association rules between entries in the entry sets corresponding to pages of the page set and entries in the link entry sets, the interest association rules together constituting an interest association rule database; the web page prediction module is configured to predict, according to the page the user is currently accessing and the interest association rules read from the data mining module, the link point the user will enter next from the link point set corresponding to that page, and to send that link point to the web page download module; and the web page download module is configured to download and cache the web page data of the received link point that the user will enter next.
  8. The device of claim 7, wherein each entry is represented as a two-tuple recording the entry and its weight, the weight being equal to the freshness of the entry multiplied by the frequency with which the entry occurs.
  9. The device of claim 8, wherein the data mining module generates the interest association rules between entries in the entry sets corresponding to pages of the page set by: traversing the page set; for any stored page, traversing the link point set of that page and determining, one by one, whether the target page of each link point belongs to the page set; if it does, traversing the entry sets of the stored page and of the target page, combining their entries, and calculating the association support of the two entries in each combination to obtain the interest association rule between them, the association support being equal to the sum of the weights of the two entries, and, when the entries recur in multiple pages, accumulating the weights of the two entries into the association support accordingly; and, if the target page does not belong to the page set, traversing the entry set of the stored page and the link entry set, combining their entries, and calculating the association support of the two entries in each combination to obtain the interest association rule between them, the association support being equal to the weight of the entry in the stored page, and, when a link entry occurs in multiple link entry sets, accumulating the weight of the entry in the stored page into the association support accordingly.
  10. The device of claim 8, wherein the web page prediction module predicts, according to the page the user is currently accessing and the interest association rules, the link point the user will enter next from the link point set corresponding to that page by: looking up, in the interest association rule database, the interest association rules between entries in the entry set corresponding to the page the user is currently accessing and entries in the link entry sets; calculating a transfer degree, the transfer degree being equal to the weight of the entry in the entry set corresponding to the currently accessed page multiplied by the association support in the interest association rule found; and, after the transfer degrees have been calculated, sorting all of them, the link point with the largest transfer degree being the link point the user will enter next.
  11. The device of any one of claims 7 to 10, wherein each interest association rule is represented as a three-tuple recording two entries and the association support of those two entries.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201010128121A CN101777081A (en) 2010-03-08 2010-03-08 Method and device for improving webpage access speed

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201010128121A CN101777081A (en) 2010-03-08 2010-03-08 Method and device for improving webpage access speed
PCT/CN2010/073143 WO2011109957A1 (en) 2010-03-08 2010-05-24 Method and apparatus for improving web page access speed

Publications (1)

Publication Number Publication Date
CN101777081A true CN101777081A (en) 2010-07-14

Family

ID=42513542

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201010128121A CN101777081A (en) 2010-03-08 2010-03-08 Method and device for improving webpage access speed

Country Status (2)

Country Link
CN (1) CN101777081A (en)
WO (1) WO2011109957A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102647481B (en) * 2012-03-31 2016-04-06 北京奇虎科技有限公司 An access preset apparatus and method for network address

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1522418A (en) * 2001-03-08 2004-08-18 国际商业机器公司 Predictive caching and highlighting of web pages
WO2009085664A2 (en) * 2007-12-27 2009-07-09 Microsoft Corporation Relevancy sorting of users browser history
CN101493832A (en) * 2009-03-06 2009-07-29 辽宁般若网络科技有限公司 Website content combine recommendation system and method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6871218B2 (en) * 2001-11-07 2005-03-22 Oracle International Corporation Methods and systems for preemptive and predictive page caching for improved site navigation
CN101369280A (en) * 2008-10-10 2009-02-18 深圳市茁壮网络技术有限公司 Method and device for web page browsing in digital television terminal

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101930475A (en) * 2010-09-14 2010-12-29 中兴通讯股份有限公司 Web page display method and browser
CN102123168A (en) * 2011-01-14 2011-07-13 广州市动景计算机科技有限公司 Web page pre-reading and integration method and system based on relay server
CN102123168B (en) 2011-01-14 2012-07-18 广州市动景计算机科技有限公司 Web page pre-reading and integration method and system based on relay server
WO2012119496A1 (en) * 2011-03-07 2012-09-13 腾讯科技(深圳)有限公司 Pre-reading method and equipment
CN102681996A (en) * 2011-03-07 2012-09-19 腾讯科技(深圳)有限公司 Pre-reading method and device
CN102681996B (en) * 2011-03-07 2015-12-16 腾讯科技(深圳)有限公司 Pre-reading method and apparatus
CN102737037A (en) * 2011-04-07 2012-10-17 北京搜狗科技发展有限公司 Webpage pre-reading method, device and browser
CN103460205B (en) * 2011-08-01 2016-11-02 华为技术有限公司 A method and apparatus for prefetching pages
CN103460205A (en) * 2011-08-01 2013-12-18 华为技术有限公司 Method and apparatus for web page prefetching
CN102957712A (en) * 2011-08-17 2013-03-06 阿里巴巴集团控股有限公司 Method and system for loading website resources
CN102957712B (en) * 2011-08-17 2016-04-20 阿里巴巴集团控股有限公司 Website resource loading method and system
CN104272306B (en) * 2012-05-11 2018-04-27 微软技术许可有限责任公司 The flip ahead
CN104272306A (en) * 2012-05-11 2015-01-07 微软公司 Flip ahead
CN103530295A (en) * 2012-07-05 2014-01-22 腾讯科技(深圳)有限公司 Webpage pre-reading method and device
CN103530295B (en) * 2012-07-05 2018-12-07 腾讯科技(深圳)有限公司 Method and apparatus for pre-reading the page
CN102902805A (en) * 2012-10-15 2013-01-30 东软集团股份有限公司 Page access method and device
CN103077225A (en) * 2012-12-31 2013-05-01 华为技术有限公司 Data reading method, device and system
CN103886038B (en) * 2014-03-10 2017-11-03 中标软件有限公司 Method and apparatus for data cache
CN103886038A (en) * 2014-03-10 2014-06-25 中标软件有限公司 Data caching method and device
CN104980311A (en) * 2014-04-14 2015-10-14 腾讯科技(深圳)有限公司 Method, device and system for predicting network access
CN104462567A (en) * 2014-12-26 2015-03-25 北京奇虎科技有限公司 Web page switching method and device and comprehensive page providing device
CN104462567B (en) * 2014-12-26 2018-01-09 北京奇虎科技有限公司 Method and apparatus for switching and integrated page web page providing means
CN105868207A (en) * 2015-01-21 2016-08-17 方正宽带网络服务有限公司 Network resource pushing method and apparatus
EP3457289A1 (en) 2017-09-15 2019-03-20 ProphetStor Data Services, Inc. Method for determining data in cache memory of cloud storage architecture and cloud storage system using the same

Also Published As

Publication number Publication date
WO2011109957A1 (en) 2011-09-15

Legal Events

Date Code Title Description
C06 Publication
C10 Entry into substantive examination
C12 Rejection of a patent application after its publication