CN106326261A - Pre-reading method and device for webpage and intelligent terminal device - Google Patents


Info

Publication number
CN106326261A
Authority
CN
China
Prior art keywords
webpage
click
path
web page
clicking
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510368430.3A
Other languages
Chinese (zh)
Inventor
梁捷
蒋喻新
姚文清
梁延俊
芦焱
许延伟
仇家伟
吴伙成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba China Co Ltd
Original Assignee
Guangzhou Dongjing Computer Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Dongjing Computer Technology Co Ltd
Priority to CN201510368430.3A
Publication of CN106326261A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90: Details of database functions independent of the retrieved data types
    • G06F16/95: Retrieval from the web
    • G06F16/957: Browsing optimisation, e.g. caching or content distillation
    • G06F16/9574: Browsing optimisation, e.g. caching or content distillation of access to content, e.g. by caching

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention provides a web page pre-reading method and device and an intelligent terminal device. The web page pre-reading method includes: acquiring access information on all web pages accessed by a user within a time period; analyzing the access information and determining a plurality of parameter values; determining a hot value for each click path according to the determined parameter values; compiling the click paths into a web page pre-read list on the basis of the hot values; and looking up the click paths of the currently browsed web page in the pre-read list so as to pre-read the data of the corresponding target web page to be browsed. The benefit of this technical scheme is that web page data with a high probability of being accessed can be pre-read for the user and saved in a local cache, which increases the speed at which the user opens web pages and improves the user experience.

Description

Web page pre-reading method and device, and intelligent terminal device
Technical field
The present invention relates to the field of web browsing technology, and in particular to a web page pre-reading method and device and an intelligent terminal device.
Background
With the spread of the Internet, the computer has become an essential product for home and work, and more and more people access the Internet through browsers on computers. In particular, with the rapid spread of smart terminal products such as smartphones and tablet computers, accessing the Internet through a terminal browser has become a daily activity. When a user opens a web page in a browser, a limited network environment or a limited terminal running speed slows the opening of the page and makes the user wait, which gives a very poor browsing experience. The main factor affecting the opening speed of a web page is that downloading its content takes too long.
At present, pre-reading the web pages that may be needed is the usual way to let users open web pages quickly. While the user browses the current web page, the server side obtains the web pages the user may need next, and the URLs and resources of these pages are loaded into the local cache before the user browses them. When the user accesses one of these pages, the relevant data is read directly from the local cache and displayed, which avoids waiting for the page to download and shortens the page response time after the user initiates the access.
Two existing web page pre-reading methods are common:
First, when the user browses one of a series of web pages with continuous content, the method pre-reads one or more keywords on that page that link to adjacent pages, such as "next page", then fetches the linked page content in turn and puts it into the local cache.
Second, a web page list is obtained from the server side, and the content of each web page in the list is read in turn and put into the local cache.
As can be seen from the above, the first method applies only to particular web pages: it is effective only where a long piece of content has been split into multiple pages chained together by hyperlinks. For the many web pages that have no page-number order, such as news pages, it cannot pre-read the pages the user may access next and therefore cannot speed up page display in the browser. The second method has to pre-read a large amount of data and load it into the local cache, which occupies a great deal of cache space and can even exhaust it.
In addition, it has been proposed to "guess" the web pages a user may access based on the hot links on a web page, and then to pre-read those pages selectively in order to improve the effectiveness of pre-reading. For example, Chinese patent document Application No. 201210074771.6, published on September 12, 2012 and entitled "Webpage preloading method and system", determines the hot links contained in a source web page and preloads the target web pages corresponding to those hot links. However, the ways of determining hot links disclosed in that document are ones that readily occur to those skilled in the art, for example deciding whether a link on source web page A is a hot link according to the total number of times it has been clicked, or according to the order in which the user clicks the links. The accuracy of determining hot links in this way is low. For example, paragraph 0099 of that document itself states: "In practical applications, a user may click a link without liking it, or may click it by mistake; in such cases, judging whether a link is a hot link purely by the number of clicks may make the result insufficiently accurate." Determining hot links from the order in which the user clicks the links is likewise inaccurate. A user may be interested in several hot links on a source web page without clicking them in a fixed order every day; when hot information appears under one hot link, the user tends to click that link first. For example, during the football World Cup users may click World Cup news first; if a well-known brand holds a new product launch during the World Cup, users may first click the hot link about that product. Therefore, even if the two approaches are combined, the accuracy of determining hot links remains low.
Furthermore, patent document Application No. 201210074771.6 does not consider that the popularity of a hot link decays over time. For example, for some time after the NBA Finals end, users' attention to the NBA column drops markedly. Likewise, hot news and headlines appear every day; today's hot news may already be less popular tomorrow, and may attract no attention at all after a week.
Chinese patent document Application No. 201310461879.5, published on January 8, 2014 and entitled "Webpage hot resource update method and device based on pre-reading", mainly proposes a method of applying popularity decay to links. However, that document defines popularity as follows: "popularity H may represent the number of times the click-out page was clicked within a past time window of predetermined duration" (see paragraph 0047). This is the same as the way of determining hot links disclosed in Chinese patent document Application No. 201210074771.6, and so has the same defect of low accuracy in determining popularity. That document also proposes a time decay factor and a popularity decay factor and uses them for popularity decay processing, but it does not state how to set or obtain these factors. It says only in general terms that they are set after data mining of client logs, and that the time decay factor X and popularity decay factor Y configured for different source addresses are generally not the same. In Application No. 201310461879.5, obtaining correct time and popularity decay factors is crucial to the result of processing link popularity, yet no concrete method of setting them is given, so popularity obtained by decay processing with these factors suffers from low accuracy. Moreover, its time decay processing is N = N × X × T / Now, where X is the time decay factor, T is the time at which the click-out page in the link information was first accessed, and Now is the current time (for example, the current server time); both times may include year, month, day, hour, minute and second. This time decay processing is not universal and cannot effectively determine the popularity of all web pages. For example, during the football World Cup even users who are not football fans follow or click World Cup news, but fans and non-fans follow it for different lengths of time: after the World Cup ends, fans still click related news, including replay columns, whereas non-fans no longer do.
Summary of the invention
The object of the present invention is to provide a web page pre-reading method and device and an intelligent terminal device that address the above problems.
An embodiment of the present invention provides a web page pre-reading method, characterized by comprising:
obtaining access information of one or more users for all web pages within a time period;
analyzing the access information and determining a plurality of parameter values;
determining, according to the determined parameter values, the hot value of each click path from a first web page to a second web page;
compiling the click paths into a web page pre-read list based on the hot values;
looking up the click paths of the currently browsed web page in the pre-read list, and pre-reading the data of the target web page to be browsed.
Preferably, when the access information of one or more users for all web pages within a time period is obtained, the data of each web page undergoes a preprocessing step of data cleansing.
Preferably, the parameters include: the click volume of each web page among all web pages; the number of clicks, the proportion and the click-through rate of the click path from the first web page to the second web page; the number of users who click from the first web page to the second web page; and the number of second web pages reached by clicking from the first web page.
Preferably, before the hot value of each click path is determined, web pages whose click volume is below a web page click volume threshold are screened out, and click paths whose proportion is below a proportion threshold are screened out.
Preferably, in the step of determining, according to the determined parameter values, the hot value of each click path from the first web page to the second web page, the hot value is calculated as:
$$\mathrm{hot}(\mathrm{refer},\mathrm{url}) = \frac{\log(pv+1)\cdot\log(pv\_ru+1)\cdot\sqrt{ctr}\cdot\sqrt{ratio}\cdot\log(uv+1)}{\log(url\_num+1)}$$
In the formula, hot(refer, url) is the hot value of the click path refer -> url;
pv: the click volume of the first web page refer;
pv_ru: the number of clicks from the first web page refer to the second web page url;
ratio: the proportion of the click path (refer -> url);
ctr: the click-through rate of the current path (refer -> url);
uv: the number of users who access the click path refer -> url;
url_num: the number of second web pages url reached by clicking from the first web page refer.
Preferably, in the step of determining, according to the determined parameter values, the hot value of each click path from the first web page to the second web page, the hot value is calculated as:
$$\mathrm{hot}(\mathrm{refer},\mathrm{url}) = \sum_{i}\frac{1.0}{2^{i}}\cdot\frac{\log(pv+1)\cdot\log(pv\_ru+1)\cdot\sqrt{ctr}\cdot\sqrt{ratio}\cdot\log(uv+1)}{\log(url\_num+1)}$$
In the formula, hot(refer, url) is the hot value of the click path refer -> url;
i: the number of time units between the calculation time and the current time;
pv: the click volume of the first web page refer;
pv_ru: the number of clicks from the first web page refer to the second web page url;
ratio: the proportion of the click path (refer -> url);
ctr: the click-through rate of the current path (refer -> url);
uv: the number of users who access the click path refer -> url;
url_num: the number of second web pages url reached by clicking from the first web page refer.
Preferably, a confidence interval is calculated for the parameter ctr using the Wilson interval formula, and the lower bound of the interval is taken as the final value of ctr.
Preferably, in the step of compiling the click paths into a web page pre-read list based on the hot values, the click paths are arranged in order of hot value to form the pre-read list.
Preferably, before the web page pre-read list is compiled, click paths whose hot value is below a preset hot value threshold are screened out.
Preferably, in the step of looking up the click paths of the currently browsed web page in the pre-read list and pre-reading the data of the target web page to be browsed: where the pre-read list contains several click paths leading from the currently browsed web page to several target web pages, either the click path with the largest hot value is selected and its target web page data is pre-read, or the data of several target web pages is pre-read in descending order of hot value; where the pre-read list contains no click path leading from the currently browsed web page to a target web page, no pre-read operation is triggered.
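The list construction and lookup described above can be sketched as follows. This is a minimal illustration rather than the patented implementation: the function names, the dictionary layout, the example URLs and the threshold value are all assumptions introduced here.

```python
from collections import defaultdict


def build_preread_list(hot_values, hot_threshold=0.0):
    """Group click paths by source page, sorted by descending hot value.

    hot_values maps (refer, url) -> hot value (hypothetical layout).
    Paths whose hot value is below hot_threshold are screened out first.
    """
    by_source = defaultdict(list)
    for (refer, url), hot in hot_values.items():
        if hot >= hot_threshold:
            by_source[refer].append((url, hot))
    return {
        refer: sorted(targets, key=lambda t: t[1], reverse=True)
        for refer, targets in by_source.items()
    }


def targets_to_preread(preread_list, current_url, top_n=1):
    """Return up to top_n target URLs for the page being browsed.

    An empty result means no click path starts at current_url,
    so no pre-read operation should be triggered.
    """
    return [url for url, _ in preread_list.get(current_url, [])[:top_n]]


# Hypothetical hot values for illustration only.
hot_values = {
    ("/home", "/sports"): 2.4,
    ("/home", "/news"): 3.1,
    ("/home", "/finance"): 0.2,
    ("/sports", "/nba"): 1.7,
}
preread = build_preread_list(hot_values, hot_threshold=0.5)
print(targets_to_preread(preread, "/home"))           # highest hot value only
print(targets_to_preread(preread, "/home", top_n=3))  # descending hot value
print(targets_to_preread(preread, "/about"))          # no path: nothing to pre-read
```

Setting `top_n=1` corresponds to pre-reading only the path with the largest hot value, while a larger `top_n` corresponds to pre-reading several targets in descending order.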
An embodiment of the present invention also provides a web page pre-reading device, characterized by comprising:
an acquisition module, an analysis and processing module, a determination module, a generation module and a pre-read module, wherein:
the acquisition module is used to obtain the user's access information for all web pages within a time period;
the analysis and processing module is used to analyze the access information and determine a plurality of parameter values;
the determination module is used to determine, according to the determined parameter values, the hot value of each click path from a first web page to a second web page;
the generation module is used to compile the click paths into a web page pre-read list based on the hot values;
the pre-read module is used to look up the click paths of the currently browsed web page in the pre-read list and to pre-read the data of the target web page to be browsed.
Preferably, the device further comprises a preprocessing module for performing data cleansing on the web page data.
Preferably, the device further comprises a first screening module for screening out web pages whose click volume is below the web page click volume threshold and click paths whose proportion is below the proportion threshold.
Preferably, the device further comprises a second screening module for screening out click paths whose hot value is below the preset hot value threshold.
An embodiment of the present invention also provides an intelligent terminal device, characterized by comprising the web page pre-reading device described above.
The beneficial effect of the technical scheme provided by the embodiments of the present invention is as follows: by combining the multiple web page parameter values determined through analysis, the popularity trend of the user's web page accesses is obtained, which greatly improves the hit probability of the web page data pre-read for the user and thus ensures high accuracy and efficiency of pre-reading. Once high-probability web page data has been pre-read and saved in the local cache, the user opens web pages faster and the user experience is improved.
Brief description of the drawings
Fig. 1 is a flow chart of the web page pre-reading method of the present invention;
Fig. 2 is a structural diagram of the web page pre-reading device of the present invention;
Fig. 3 is a structural diagram of a preferred embodiment of the web page pre-reading device of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. The components of the embodiments generally described and illustrated in the drawings may be arranged and designed in a variety of configurations. The following detailed description of the embodiments provided in the drawings is therefore not intended to limit the scope of the claimed invention, but merely represents selected embodiments. All other embodiments obtained by those skilled in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
The technical solution is described clearly and completely below with reference to the drawings and specific embodiments of the present invention.
Fig. 1 is a flow chart of the web page pre-reading method of the present invention.
As shown in Fig. 1, the first embodiment of the present invention provides a web page pre-reading method, including:
Step S101: obtain the user's access information for all web pages within a time period.
The access information may be obtained by reading the client-side browsing log that records the network accesses of the user's terminal device, or by reading the browsing log of web page accesses kept by the server. If the log does not record the number of visits to each page, the number of visits to each web page, their distribution over time and similar statistics can be computed after the information on all web pages has been obtained. There may be one user or several. In general, the access information of multiple unspecified users for all web pages within a time period is preferably obtained by reading the browsing log kept by the server. The content recorded in such logs is well known to those skilled in the art and is not illustrated here. Likewise, any known method may be used to record and obtain a user's web browsing, for example Chinese patent document Application No. 201310364722.0, published on December 11, 2013 and entitled "Method for recording and reading user operation log information", or Chinese patent document Application No. 201510038747.0, published on April 15, 2015 and entitled "Method and system for generating mobile user browsing records". The time period can be set according to the practical application. For example, when the access behavior of multiple users is to be obtained, the period may be set to a number of hours, such as 12, 24, 30 or 36 hours; when the access behavior of a single user is to be obtained, the period may be set to a number of days, such as 5, 7, 10 or 15 days.
In another embodiment, when the user's access information for all web pages within a time period is obtained, the data of each web page undergoes a preprocessing step, which includes data cleansing. Whether kept by the client or by the server, a browsing log usually records a large amount of information, including non-network request data, irregular data, requests for non-main documents, and inconsistent or irrelevant data. For example, log data may include attributes such as the IP address, user ID, requested URL, request method, access time, transport protocol, number of bytes transferred, error code and user agent. A single page request from the user may cause the browser to download several files automatically, such as images; all the downloaded files together make up one page view, so that one request corresponds to several log entries.
The logged data therefore first has to be cleansed so that irrelevant data is removed. One example is cleansing by URL extension. On a typical news site only the text of a web page is related to the user's request, whereas requests for images on the page (extensions such as gif and jpg) and for script-type files (extensions such as js, cgi and css) are considered unrelated to the user's request and should be deleted. Users do not normally request all the images and script files on a page explicitly. Most image and script entries in the log come from the page's own configuration, and are downloaded automatically as auxiliary files when the user browses the text content of the page. Because these files do not reflect the user's real request behavior, they are deleted during data cleansing.
Data cleansing removes the data unrelated to user requests and yields reliable, accurate data suitable for the subsequent statistics and analysis, which helps to produce accurate results and also reduces the amount of computation.
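As a rough sketch of the extension-based cleansing described above, the following filter drops log entries whose URL path ends in one of the auxiliary extensions named in the text (gif, jpg, js, cgi, css). The entry layout (a dictionary with a "url" key) and the example URLs are assumptions made for illustration; real log formats differ.

```python
import posixpath
from urllib.parse import urlparse

# Extensions named in the description as unrelated to the user's request.
AUXILIARY_EXTENSIONS = {".gif", ".jpg", ".js", ".cgi", ".css"}


def is_page_request(url):
    """Return True when the URL does not point at an auxiliary file."""
    path = urlparse(url).path  # ignores the query string and fragment
    extension = posixpath.splitext(path)[1].lower()
    return extension not in AUXILIARY_EXTENSIONS


def clean_log(entries):
    """Keep only the log entries that represent page requests."""
    return [entry for entry in entries if is_page_request(entry["url"])]


# Hypothetical log entries for illustration only.
entries = [
    {"url": "http://example.com/news/1.html"},
    {"url": "http://example.com/img/logo.GIF"},
    {"url": "http://example.com/static/app.js?v=3"},
    {"url": "http://example.com/sports/"},
]
for entry in clean_log(entries):
    print(entry["url"])
```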
Step S102: analyze the access information and determine a plurality of parameter values.
The parameters may include: the click volume of each web page among all web pages; the number of clicks, the proportion and the click-through rate of the click path from the first web page to the second web page; the number of users who click from the first web page to the second web page; and the number of second web pages reached by clicking from the first web page.
The analysis determines the number of times each web page among all web pages is accessed, that is, its click volume.
The analysis determines the number of clicks on the click path from the first web page to the second web page, that is, the click volume of the path.
The analysis determines the proportion of the click path from the first web page to the second web page. The proportion is the number of clicks from the first web page to the second web page within the set time period, expressed as a percentage of the number of clicks from the first web page to all web pages.
The analysis determines the click-through rate of the click path from the first web page to the second web page, that is, the click volume of the click path divided by the click volume of the first web page.
The analysis determines the number of users who click from the first web page to the second web page.
The analysis determines the number of second web pages reached by clicking from the first web page.
Generally, after a web page is opened in a browser such as a PC browser or a smart terminal browser, the elements that make up the page are displayed in the browser interface. These may include text, images, audio, video and other content, as well as links, a common web page element. Clicking a link in the page jumps automatically to the link target, which is usually another web page. The information on the web pages a user browses is recorded in the client-side browsing log of the terminal device, for example the access time (including the start and end times of the access), the URL and the type of content transferred. For ease of distinction, the web page a link points to may be called the next web page or target web page, and the web page whose content contains the link may be called the current web page or source web page. A click path is the path from the current (source) web page to another (target) web page; as a rule, the tail of the path is the current (source) web page and the head of the path is the target web page the user clicks to.
In this embodiment, assume that the time period is 12 hours. The click volume of each web page is then the number of times that page is clicked within the 12 hours, and the proportion of the click path from the first web page to the second web page is the number of clicks from the first web page to the second web page within the 12 hours as a percentage of the number of clicks from the first web page to all web pages. For example, suppose that within 12 hours the user clicks 3 times from an information center page (the first web page), such as the home page of the UC browser, to the sports column page (the second web page), and that the total number of clicks from the information center page to all web pages within those 12 hours is 20, the pages clicked to being, for example, the news, sports, entertainment, finance, society and NBA column pages. The proportion of the click path from the information center page to the sports column page is then 15%. Likewise, if within the 12 hours the user clicks 5 times from the information center page to the entertainment column page (the second web page), the proportion of that click path is 25%.
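The parameter definitions above can be illustrated with a small sketch that derives pv, pv_ru, ratio, ctr, uv and url_num from a list of click records. The record layout (user, source page, target page), the separate page view table and the page names are assumptions introduced for the example; the counts reuse the 3-of-20 and 5-of-20 figures from the text, while the page view count of 40 is invented.

```python
from collections import Counter, defaultdict


def path_parameters(page_views, clicks):
    """Compute per-path parameters from click records.

    page_views: dict mapping page -> click volume (pv) in the period.
    clicks: list of (user, refer, url) tuples, one per click (hypothetical layout).
    """
    path_clicks = Counter((refer, url) for _, refer, url in clicks)
    out_clicks = Counter(refer for _, refer, _ in clicks)
    users = defaultdict(set)
    targets = defaultdict(set)
    for user, refer, url in clicks:
        users[(refer, url)].add(user)
        targets[refer].add(url)

    params = {}
    for (refer, url), pv_ru in path_clicks.items():
        pv = page_views[refer]
        params[(refer, url)] = {
            "pv": pv,                               # click volume of refer
            "pv_ru": pv_ru,                         # clicks on refer -> url
            "ratio": pv_ru / out_clicks[refer],     # share of clicks leaving refer
            "ctr": pv_ru / pv,                      # path clicks / page clicks
            "uv": len(users[(refer, url)]),         # distinct users on the path
            "url_num": len(targets[refer]),         # distinct targets from refer
        }
    return params


# 20 clicks leave the information center page: 3 to sports, 5 to
# entertainment, 12 to news. The pv of 40 is an invented figure.
clicks = (
    [("u1", "info_center", "sports")] * 3
    + [("u1", "info_center", "entertainment")] * 5
    + [("u2", "info_center", "news")] * 12
)
params = path_parameters({"info_center": 40}, clicks)
print(params[("info_center", "sports")]["ratio"])         # 3 / 20
print(params[("info_center", "entertainment")]["ratio"])  # 5 / 20
```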
The parameters involved in step S102 are all parameters known in the art and are not explained further here.
Step S103: determine, according to the parameter values determined above, the hot value of each click path from the first web page to the second web page.
Several examples below illustrate how the hot value of a click path is determined.
Example 1:
The hot value of each click path is determined directly from the parameter values determined above as follows:
$$\mathrm{hot}(\mathrm{refer},\mathrm{url}) = \frac{\log(pv+1)\cdot\log(pv\_ru+1)\cdot\sqrt{ctr}\cdot\sqrt{ratio}\cdot\log(uv+1)}{\log(url\_num+1)} \tag{1}$$
In the formula, hot(refer, url) is the hot value of the click path refer -> url;
pv: the click volume of the first web page refer;
pv_ru: the number of clicks from the first web page refer to the second web page url, that is, the click volume of the click path refer -> url;
ratio: the proportion of the click path (refer -> url);
ctr: the click-through rate of the current click path (refer -> url), that is, the click volume of the current click path divided by the click volume of the first web page refer;
uv: the number of users who access the click path refer -> url;
url_num: the number of second web pages url reached by clicking from the first web page refer.
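Formula (1) translates directly into code. The sketch below assumes natural logarithms, since the text does not state the base of log; a different base would only rescale the result. The function name and the sample numbers are illustrative.

```python
import math


def hot(pv, pv_ru, ctr, ratio, uv, url_num):
    """Hot value of a click path refer -> url, following formula (1).

    Assumes natural logarithms (the base is not specified in the text)
    and url_num >= 1, so that the denominator is positive.
    """
    numerator = (
        math.log(pv + 1)
        * math.log(pv_ru + 1)
        * math.sqrt(ctr)
        * math.sqrt(ratio)
        * math.log(uv + 1)
    )
    return numerator / math.log(url_num + 1)


# Two hypothetical paths leaving the same source page.
busy = hot(pv=40, pv_ru=12, ctr=12 / 40, ratio=12 / 20, uv=9, url_num=3)
quiet = hot(pv=40, pv_ru=3, ctr=3 / 40, ratio=3 / 20, uv=2, url_num=3)
print(busy > quiet)
```

The logarithms damp the raw counts, so a path cannot dominate purely through volume, while the square roots of ctr and ratio reward paths that capture a large share of the traffic leaving the source page.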
Example 2:
When the set time period is short, the number of web page accesses obtained may be too small. To offset this adverse effect, the inventors apply a confidence interval calculation to the parameter ctr, using the well-known Wilson interval formula and taking the lower bound of the interval as the final value of ctr. Formula (1) above is then used to calculate the hot value of each click path.
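A minimal sketch of the Wilson lower bound for ctr follows. The text does not state a confidence level, so the default z = 1.96 (about 95%) is an assumption, as is treating pv as the number of trials and pv_ru as the number of successes.

```python
import math


def wilson_lower_bound(clicks, views, z=1.96):
    """Lower end of the Wilson score interval for clicks / views.

    clicks: click volume of the path (pv_ru); views: click volume of the
    source page (pv). z = 1.96 is an assumed confidence level (~95%).
    """
    if views == 0:
        return 0.0
    p = clicks / views
    z2 = z * z
    centre = p + z2 / (2 * views)
    margin = z * math.sqrt(p * (1 - p) / views + z2 / (4 * views * views))
    # Clamp at zero to absorb floating-point rounding when clicks == 0.
    return max(0.0, (centre - margin) / (1 + z2 / views))


# The same raw ctr of 0.3 is trusted more when it rests on more data.
print(wilson_lower_bound(3, 10))
print(wilson_lower_bound(300, 1000))
```

With few observations the lower bound sits well below the raw rate, which keeps a path with a handful of lucky clicks from receiving a high hot value.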
Example 3:
The factor decayed over time in view of temperature, the invention provides use and above-mentioned determines Multiple parameter values determine that every second method of hot value clicking on path is:
h o t ( r e f e r , u r l ) = Σ i 1.0 2 i * log ( p v + 1 ) * log ( p v _ r u + 1 ) * s q r t ( c t r ) * s q r t ( r a t i o ) * log ( u v + 1 ) log ( u r l _ n u m + 1 ) - - - ( 2 )
In the formula, hot(refer, url) denotes the hot value of the click path refer --> url;
i: the number of time units from the current calculation time; the time unit may be hours or days;
pv: the click volume of the first web page refer;
pv_ru: the number of clicks from the first web page refer into the second web page url, i.e., the click volume of the click path refer --> url;
ratio: the share (accounting rate) of the click path refer --> url;
ctr: the click-through rate of the current click path refer --> url, i.e., the click volume of the current click path divided by the click volume of the first web page refer;
uv: the number of users who accessed the click path refer --> url, i.e., the number of users who clicked from the first web page refer into the second web page url;
url_num: the number of second web pages url clicked into from the first web page refer.
In formula (2) above, when the hour is taken as the time unit of heat decay, i is the number of hours from the current calculation time; when the day is taken as the time unit of heat decay, i is the number of days from the current calculation time.
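The time-decayed sum of formula (2) can be sketched as follows, purely as an illustration. Several points are assumptions rather than statements of the text: that the parameters are measured separately for each time unit i, that the decay weight is 1.0 / 2^i, that the logarithm is natural, and that a time unit with url_num below 1 contributes nothing. The names period_score and decayed_hot_value are hypothetical.

```python
import math


def period_score(pv, pv_ru, ctr, ratio, uv, url_num):
    """Undecayed score of one time unit (the product inside the sum)."""
    if url_num < 1:
        return 0.0  # assumed: no outgoing clicks in this unit, no contribution
    numerator = (
        math.log(pv + 1)
        * math.log(pv_ru + 1)
        * math.sqrt(ctr)
        * math.sqrt(ratio)
        * math.log(uv + 1)
    )
    return numerator / math.log(url_num + 1)


def decayed_hot_value(periods):
    """Sum the per-unit scores, halving the weight for each unit of age.

    periods maps i (hours or days before the calculation time) to a dict
    holding the parameter values assumed to be measured in that unit.
    """
    return sum(period_score(**params) / (2 ** i) for i, params in periods.items())


# Hypothetical data: identical traffic today (i = 0) and yesterday (i = 1).
params = dict(pv=100, pv_ru=20, ctr=0.2, ratio=0.5, uv=15, url_num=4)
total = decayed_hot_value({0: params, 1: params})
```

With these weights, traffic from one unit ago counts half as much as current traffic, traffic from two units ago a quarter as much, and so on.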
Of course, the approaches of Example 2 and Example 3 can also be combined to calculate the hot value of each click path.
In another embodiment, before the hot value of each click path is determined, web pages whose click volume is below a web page click-volume threshold are screened out, and click paths whose share is below a share threshold are screened out.
The web page click-volume threshold and the share threshold can be set according to the actual application. For example, when the time period set for collecting user web page accesses takes different values, such as 12 hours, 24 hours or 30 hours, or 5 days, 7 days or 10 days, the web page click-volume threshold and the share threshold can be set to correspondingly different values. When the number of users whose web page accesses are collected differs, the web page click-volume threshold and the share threshold can likewise be set to correspondingly different values.
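The screening step described above might look like the following sketch. The function name filter_paths, the data layout, and the threshold values in the example are all hypothetical. Applying the click-volume threshold to the source page refer of each path is also an assumption, since the text does not say which page of a path is tested.

```python
def filter_paths(page_pv, path_ratio, pv_threshold, ratio_threshold):
    """Keep only click paths that pass both screening thresholds.

    page_pv:    dict mapping a page to its click volume
    path_ratio: dict mapping (refer, url) to the share of that click path
    Assumption: the click-volume threshold is applied to the source page.
    """
    kept = {}
    for (refer, url), ratio in path_ratio.items():
        if page_pv.get(refer, 0) < pv_threshold:
            continue  # source page has too few clicks
        if ratio < ratio_threshold:
            continue  # click path share is too small
        kept[(refer, url)] = ratio
    return kept


# Hypothetical data and thresholds.
page_pv = {"home": 1000, "rare": 3}
path_ratio = {("home", "news"): 0.4, ("home", "about"): 0.01, ("rare", "news"): 0.9}
kept = filter_paths(page_pv, path_ratio, pv_threshold=50, ratio_threshold=0.05)
```

Screening before scoring means the hot value only has to be computed for paths that have enough data behind them.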
Step S104: make a pre-read list of web pages from the multiple click paths based on their hot values.
The multiple click paths are made into the pre-read list of web pages, and the click paths in this pre-read list may be arranged in order. After step S103, each click path has a hot value, so the click paths can be sorted by hot value when they are made into the pre-read list; that is, the click paths in the pre-read list can be ordered by the size of their hot values.
In another embodiment, if a large number of click paths has been counted, click paths with low hot values can be screened out first. This reduces the amount of computation and also avoids pre-reading into the local cache web page data that the user will not browse, thereby saving cache space. A heat threshold can be preset based on empirical values, and click paths whose hot value is below the preset heat threshold are screened out before the pre-read list of web pages is made.
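One possible way to build such a list is sketched below. The function name build_preread_list and the choice of a dictionary keyed by source page are assumptions made for illustration; the text only requires that the click paths can be ordered by hot value and that low-heat paths can be dropped.

```python
from collections import defaultdict


def build_preread_list(hot_values, hot_threshold=0.0):
    """Group click paths by source page and sort each group by hot value.

    hot_values:    dict mapping (refer, url) to the hot value of that path
    hot_threshold: paths with a hot value below it are screened out
    Returns a dict mapping refer to a list of (url, hot) pairs, hottest first.
    """
    preread = defaultdict(list)
    for (refer, url), hot in hot_values.items():
        if hot < hot_threshold:
            continue  # screen out low-heat click paths before listing
        preread[refer].append((url, hot))
    for targets in preread.values():
        targets.sort(key=lambda pair: pair[1], reverse=True)
    return dict(preread)


# Hypothetical hot values for four click paths.
hot_values = {
    ("a", "b"): 3.0,
    ("a", "c"): 5.0,
    ("a", "d"): 0.5,
    ("x", "y"): 1.0,
}
preread_list = build_preread_list(hot_values, hot_threshold=1.0)
```

Keying the list by source page keeps the later lookup for the currently browsed page to a single dictionary access.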
Step S105: query the pre-read list for the click paths of the web page currently being browsed, and pre-read the target web page data to be browsed.
When the user browses the current web page or first web page, the pre-read list is queried for click paths that have the current web page or first web page as their source page. When the pre-read list contains multiple click paths leading from the current web page or first web page to multiple target web pages, the click path with the largest hot value can be selected, and its target web page is pre-read and loaded into the local cache. Of course, to ensure the accuracy of pre-reading, multiple target web pages, for example 2, 3, 4 or more, can be pre-read in descending order of hot value and loaded into the local cache. These target web pages are the web pages to be browsed referred to in step S105.
In addition, when the pre-read list contains no click path leading from the currently browsed web page to a target web page, no pre-read operation is triggered; when the user clicks on the next web page, the corresponding web page information is obtained from the web server.
If the user does not request a new web page and directly stops browsing, for example by closing the browser, the pre-read web page data is released, so that unused pre-read web page data does not occupy a large amount of local cache resources.
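The lookup in step S105 can be illustrated with the sketch below. The function name select_targets, the parameter top_k, and the dictionary layout of the pre-read list are hypothetical. Fetching the pages and releasing the cache are left out because the text does not describe how they are performed.

```python
def select_targets(preread_list, current_url, top_k=1):
    """Return up to top_k target pages to pre-read for the current page.

    preread_list maps a source page to a list of (url, hot) pairs.
    An empty result means no click path exists for the current page,
    in which case no pre-read operation is triggered.
    """
    candidates = preread_list.get(current_url, [])
    # Sort here as well so the function does not rely on a pre-sorted list.
    ranked = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    return [url for url, _hot in ranked[:top_k]]


# Hypothetical pre-read list for one source page.
preread_list = {"home": [("sports", 3.0), ("news", 5.0), ("about", 1.0)]}
hottest = select_targets(preread_list, "home")
top_two = select_targets(preread_list, "home", top_k=2)
nothing = select_targets(preread_list, "unknown")
```

With top_k=1 only the hottest target is pre-read; a larger top_k trades cache space for a higher chance that the page the user opens next is already cached.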
The web page pre-reading method provided by the embodiment of the present invention has the following beneficial effects: the multiple parameter values of the web pages determined through analysis are combined to derive the heat trend of user web page accesses, so that the hit probability of the web page data pre-read for the user is greatly improved, which in turn ensures high accuracy and high efficiency of pre-reading; after the high-probability web page data is pre-read and saved in the local cache, the speed at which the user opens a web page is increased and the user experience is improved.
Fig. 2 is a structural schematic diagram of the web page pre-reading device of the present invention. As shown in Fig. 2, the web page pre-reading device of the present invention includes: an acquisition module 201, an analysis and processing module 202, a determination module 203, a generation module 204 and a pre-reading module 205, wherein:
the acquisition module 201 is used to obtain access information of users to all web pages within a period of time;
the analysis and processing module 202 is used to analyze the access information and determine multiple parameter values. The parameters include: the click volume of each web page among all web pages; the number of clicks, the share and the click-through rate of a click path from a first web page into a second web page; the number of users who clicked from the first web page into the second web page; and the number of second web pages clicked into from the first web page;
the determination module 203 is used to determine, according to the determined parameter values, the hot value of each click path from the first web page into the second web page;
the generation module 204 is used to make a pre-read list of web pages from the multiple click paths based on the hot values;
the pre-reading module 205 is used to query the pre-read list for the click path of the web page currently being browsed, and to pre-read the target web page data to be browsed.
For the specific functions of the modules of the web page pre-reading device in this embodiment and the way they interact, refer to the description of the embodiment corresponding to Fig. 1; details are not repeated here.
Further, the acquisition module includes a preprocessing module for performing data cleaning on the web page data.
Fig. 3 is a structural schematic diagram of a preferred embodiment of the web page pre-reading device of the present invention. As shown in Fig. 3:
Further, the web page pre-reading device also includes a first screening module 206, used to screen out web pages whose click volume is below the web page click-volume threshold and to screen out click paths whose share is below the share threshold.
Further, the web page pre-reading device also includes a second screening module 207, used to screen out click paths whose hot value is below the preset heat threshold.
An embodiment of the present invention further provides an intelligent terminal device, which includes the web page pre-reading device described above.
The web page pre-reading device based on user access heat provided by the embodiment of the present invention has the following beneficial effects: the multiple parameter values of the web pages determined through analysis are combined to derive the heat trend of user web page accesses, so that the hit probability of the web page data pre-read for the user is greatly improved, which in turn ensures high accuracy and high efficiency of pre-reading; after the high-probability web page data is pre-read and saved in the local cache, the speed at which the user opens a web page is increased and the user experience is improved.
The computer program product of the web page pre-reading method based on user access heat provided by the embodiment of the present invention includes a computer-readable storage medium storing program code. The instructions included in the program code can be used to execute the method described in the foregoing method embodiment; for the specific implementation, refer to the method embodiment, and details are not repeated here.
Those skilled in the art can clearly understand that, for convenience and brevity of description, the specific working process of the device described above may refer to the corresponding process in the foregoing method embodiment, and is not repeated here.
If the functions are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part that contributes to the prior art, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash disk, a portable hard disk, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk or an optical disc.
The above is only a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (17)

1. A web page pre-reading method, characterized by comprising:
obtaining access information of one or more users to all web pages within a period of time;
analyzing the access information and determining multiple parameter values;
determining, according to the determined parameter values, the hot value of each click path from a first web page into a second web page;
making a pre-read list of web pages from the multiple click paths based on the hot values;
querying the pre-read list for the click path of the web page currently being browsed, thereby pre-reading the target web page data to be browsed.
2. The web page pre-reading method according to claim 1, characterized by further comprising: a preprocessing step of performing data cleaning on the data of each web page when obtaining the access information of one or more users to all web pages within a period of time.
3. The web page pre-reading method according to claim 1, characterized in that the parameters include: the click volume of each web page among all web pages; the number of clicks, the share and the click-through rate of a click path from a first web page into a second web page; the number of users who clicked from the first web page into the second web page; and the number of second web pages clicked into from the first web page.
4. The web page pre-reading method according to claim 1, characterized by further comprising: before determining the hot value of each click path, screening out web pages whose click volume is below a web page click-volume threshold, and screening out click paths whose share is below a share threshold.
5. The web page pre-reading method according to claim 1, characterized by further comprising: in the step of determining, according to the determined parameter values, the hot value of each click path from the first web page into the second web page, the calculation method is:
$$hot(refer, url) = \frac{\log(pv+1) \cdot \log(pv\_ru+1) \cdot \sqrt{ctr} \cdot \sqrt{ratio} \cdot \log(uv+1)}{\log(url\_num+1)}$$
In the formula, hot(refer, url) denotes the hot value of the click path refer --> url;
pv: the click volume of the first web page refer;
pv_ru: the number of clicks from the first web page refer into the second web page url;
ratio: the share of the click path refer --> url;
ctr: the click-through rate of the current click path refer --> url;
uv: the number of users who accessed the click path refer --> url;
url_num: the number of second web pages url clicked into from the first web page refer.
6. The web page pre-reading method according to claim 5, characterized by further comprising:
performing a confidence interval calculation on the parameter ctr using the Wilson interval formula, and taking the lower bound of the interval as the final value of the parameter ctr.
7. The web page pre-reading method according to claim 1, characterized by further comprising: in the step of determining, according to the determined parameter values, the hot value of each click path from the first web page into the second web page, the calculation method is:
$$hot(refer, url) = \sum_{i} \frac{1.0}{2^{i}} \cdot \frac{\log(pv+1) \cdot \log(pv\_ru+1) \cdot \sqrt{ctr} \cdot \sqrt{ratio} \cdot \log(uv+1)}{\log(url\_num+1)}$$
In the formula, hot(refer, url) denotes the hot value of the click path refer --> url;
i: the number of time units from the current calculation time;
pv: the click volume of the first web page refer;
pv_ru: the number of clicks from the first web page refer into the second web page url;
ratio: the share of the click path refer --> url;
ctr: the click-through rate of the current click path refer --> url;
uv: the number of users who accessed the click path refer --> url;
url_num: the number of second web pages url clicked into from the first web page refer.
8. The web page pre-reading method according to claim 7, characterized by further comprising:
performing a confidence interval calculation on the parameter ctr using the Wilson interval formula, and taking the lower bound of the interval as the final value of the parameter ctr.
9. The web page pre-reading method according to claim 1, characterized by further comprising: in the step of making a pre-read list of web pages from the multiple click paths based on the hot values, making the multiple click paths into the pre-read list of web pages in an order sorted by the size of the hot values.
10. The web page pre-reading method according to claim 1, characterized by further comprising: before making the pre-read list of web pages, screening out click paths whose hot value is below a preset heat threshold.
11. The web page pre-reading method according to claim 1, characterized by further comprising: in the step of querying the pre-read list for the click path of the web page currently being browsed, thereby pre-reading the target web page data to be browsed,
when the pre-read list is found to contain multiple click paths leading from the currently browsed web page to multiple target web pages, selecting the click path with the largest hot value to pre-read the target web page data, or pre-reading the data of multiple target web pages in descending order of hot value.
12. The web page pre-reading method according to claim 11, characterized by further comprising:
when the pre-read list contains no click path leading from the currently browsed web page to a target web page, not triggering a pre-read operation.
13. A web page pre-reading device, characterized by comprising:
an acquisition module, an analysis and processing module, a determination module, a generation module and a pre-reading module, wherein:
the acquisition module is used to obtain access information of users to all web pages within a period of time;
the analysis and processing module is used to analyze the access information and determine multiple parameter values;
the determination module is used to determine, according to the determined parameter values, the hot value of each click path from a first web page into a second web page;
the generation module is used to make a pre-read list of web pages from the multiple click paths based on the hot values;
the pre-reading module is used to query the pre-read list for the click path of the web page currently being browsed, thereby pre-reading the target web page data to be browsed.
14. The web page pre-reading device according to claim 13, characterized by further comprising: a preprocessing module for performing data cleaning on the web page data.
15. The web page pre-reading device according to claim 13, characterized by further comprising: a first screening module, used to screen out web pages whose click volume is below a web page click-volume threshold, and to screen out click paths whose share is below a share threshold.
16. The web page pre-reading device according to claim 13, characterized by further comprising:
a second screening module, used to screen out click paths whose hot value is below a preset heat threshold.
17. An intelligent terminal device, characterized by comprising the web page pre-reading device according to any one of claims 13-16.
CN201510368430.3A 2015-06-26 2015-06-26 Pre-reading method and device for webpage and intelligent terminal device Pending CN106326261A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510368430.3A CN106326261A (en) 2015-06-26 2015-06-26 Pre-reading method and device for webpage and intelligent terminal device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510368430.3A CN106326261A (en) 2015-06-26 2015-06-26 Pre-reading method and device for webpage and intelligent terminal device

Publications (1)

Publication Number Publication Date
CN106326261A true CN106326261A (en) 2017-01-11

Family

ID=57722701

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510368430.3A Pending CN106326261A (en) 2015-06-26 2015-06-26 Pre-reading method and device for webpage and intelligent terminal device

Country Status (1)

Country Link
CN (1) CN106326261A (en)

Legal Events

Date Code Title Description
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20200526

Address after: 310052 room 508, floor 5, building 4, No. 699, Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province

Applicant after: Alibaba (China) Co.,Ltd.

Address before: 510627 Guangdong city of Guangzhou province Whampoa Tianhe District Road No. 163 Xiping Yun Lu Yun Ping B radio 14 floor tower square

Applicant before: GUANGZHOU UCWEB COMPUTER TECHNOLOGY Co.,Ltd.

RJ01 Rejection of invention patent application after publication

Application publication date: 20170111
