CN103714093B - A kind of method for digging and device of the website emphasis page - Google Patents

Info

Publication number
CN103714093B
Authority
CN
China
Prior art keywords
link
emphasis
pair
page
link pair
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210380363.3A
Other languages
Chinese (zh)
Other versions
CN103714093A (en)
Inventor
张冲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201210380363.3A priority Critical patent/CN103714093B/en
Publication of CN103714093A publication Critical patent/CN103714093A/en
Application granted granted Critical
Publication of CN103714093B publication Critical patent/CN103714093B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides a method and device for mining the key pages of a website. The mining method comprises: extracting a navigation link string from each web page of the website; splitting each extracted navigation link string into link pairs, where each link pair consists of two links at adjacent positions in the navigation link string; and determining key link pairs from among the link pairs and taking the pages corresponding to the key link pairs as the key pages of the website. In this way, the present invention can improve the recall and accuracy of key-page mining for a website.

Description

A method and device for mining the key pages of a website
【Technical field】
The present invention relates to data mining technology, and in particular to a method and device for mining the key pages of a website.
【Background】
Web page authority is an important factor that a search engine uses when ranking results. To compute it, all participating web pages are treated as one set, and the authority of each page is computed iteratively from the link relationships between the pages in the set. As the Internet grows, however, the number of web pages keeps increasing. If every page on the Internet took part in the authority computation, the demands on the architecture of the computing system would be very high. Therefore, usually only those pages of each website that have link relationships with external websites are selected to take part in the computation. This prior-art approach leaves some high-quality pages inside each website without an authority value, and it also reduces the accuracy of the authority values obtained for the pages that do take part.
To mitigate this problem, one prior-art approach extracts both the pages of a website that have link relationships with external websites and some important pages within the website, and uses them together as the pages that take part in the authority computation. In the prior art, the importance of a page is determined by its number of in-site backlinks, that is, links from other pages of the same website. For example, the pages of a website whose in-site backlink count exceeds a set threshold are extracted, and if the pages they point to also have in-site backlink counts above the threshold, both the extracted pages and the pages they point to are taken as key pages. This prior-art method, however, has low recall and poor accuracy.
【Summary of the invention】
The technical problem to be solved by the present invention is to provide a method and device for mining the key pages of a website, so as to improve recall and accuracy when mining those key pages.
The technical solution adopted by the present invention to solve this problem is a method for mining the key pages of a website, comprising: extracting a navigation link string from each web page of the website; splitting each extracted navigation link string into link pairs, where each link pair consists of two links at adjacent positions in the navigation link string; and determining key link pairs from among the link pairs and taking the pages corresponding to the key link pairs as the key pages of the website.
According to a preferred embodiment of the present invention, the step of determining key link pairs from among the link pairs comprises: counting the number of occurrences of each link pair, and taking the link pairs whose occurrence count satisfies a preset condition as key link pairs.
According to a preferred embodiment of the present invention, the preset condition comprises: the occurrence count is greater than a set value; or the occurrence count ranks above a set proportion of all link pairs.
According to a preferred embodiment of the present invention, the step of determining key link pairs from among the link pairs comprises: classifying each link pair with a classification model trained in advance, and taking the link pairs classified into the important class as key link pairs, where the classification feature parameters of the classification model include the occurrence count of the link pair.
According to a preferred embodiment of the present invention, the classification feature parameters of the classification model further include at least one of the following: the out-degree of the page corresponding to the pointing link of the link pair, the depth of the pointing link of the link pair, the depth of the pointed-to link of the link pair, the difference between the depth of the pointing link and the depth of the pointed-to link, and the number of anchor-text words corresponding to the link pair.
According to a preferred embodiment of the present invention, the method further comprises: calculating the web page authority of the key pages, where the web page authority is the basis on which a search engine ranks the key pages when returning them as search results.
The present invention also provides a device for mining the key pages of a website, comprising: a mining unit for extracting a navigation link string from each web page of the website; a splitting unit for splitting each extracted navigation link string into link pairs, where each link pair consists of two links at adjacent positions in the navigation link string; and a determining unit for determining key link pairs from among the link pairs and taking the pages corresponding to the key link pairs as the key pages of the website.
According to a preferred embodiment of the present invention, the determining unit comprises: a counting unit for counting the number of occurrences of each link pair and taking the link pairs whose occurrence count satisfies a preset condition as key link pairs.
According to a preferred embodiment of the present invention, the preset condition comprises: the occurrence count is greater than a set value; or the occurrence count ranks above a set proportion of all link pairs.
According to a preferred embodiment of the present invention, the determining unit comprises: a classification unit for classifying each link pair with a classification model trained in advance and taking the link pairs classified into the important class as key link pairs, where the classification feature parameters of the classification model include the occurrence count of the link pair.
According to a preferred embodiment of the present invention, the classification feature parameters of the classification model further include at least one of the following: the out-degree of the page corresponding to the pointing link of the link pair, the depth of the pointing link of the link pair, the depth of the pointed-to link of the link pair, the difference between the depth of the pointing link and the depth of the pointed-to link, and the number of anchor-text words corresponding to the link pair.
According to a preferred embodiment of the present invention, the device further comprises: a computing unit for calculating the web page authority of the key pages, where the web page authority is the basis on which a search engine ranks the key pages when returning them as search results.
As can be seen from the above technical solutions, when determining the key pages of a website the present invention does not rely on the in-site backlink counts of the pages, but instead analyzes the navigation link strings of the pages in the website. Experimental data show that, after the major websites on the Internet are mined with the method of the present invention, the number of key pages recalled increases by 20 million compared with the prior art. Most of the recalled key pages are directory pages of their websites, which means that the pages recalled by the method reflect page importance well, i.e. the accuracy of the method is high.
【Description of the drawings】
Fig. 1 is a flow chart of the method for mining the key pages of a website in the present invention;
Fig. 2 is a schematic diagram of a navigation link string in the present invention;
Fig. 3 is a schematic diagram of a web page source file in the present invention;
Fig. 4 is a schematic structural block diagram of embodiment one of the device for mining the key pages of a website in the present invention;
Fig. 5 is a schematic structural block diagram of embodiment two of the device for mining the key pages of a website in the present invention;
Fig. 6 is a schematic structural block diagram of an embodiment of the model training device in the present invention;
Fig. 7 is a schematic structural block diagram of embodiment three of the device for mining the key pages of a website in the present invention.
【Detailed description】
To make the objectives, technical solutions, and advantages of the present invention clearer, the present invention is described in detail below with reference to the drawings and specific embodiments.
Please refer to Fig. 1, which is a flow chart of the method for mining the key pages of a website in the present invention. As shown in Fig. 1, the method comprises:
Step S101: Extract a navigation link string from each web page of the website.
Step S102: Split each extracted navigation link string into link pairs.
Step S103: Determine key link pairs from among the link pairs, and take the pages corresponding to the key link pairs as the key pages of the website.
These steps are described in detail below.
Please refer to Fig. 2, which is a schematic diagram of a navigation link string in the present invention. As shown in Fig. 2, a navigation link string is the string of links at the top of a web page that are joined by the ">" symbol.
Please refer to Fig. 3, which is a schematic diagram of a web page source file in the present invention. As shown in Fig. 3, the ">" symbol can be used to locate several adjacent hyperlink tags in the page source. In step S101, the link addresses in these hyperlink tags are extracted, which yields the navigation link string of the page.
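The extraction in step S101 can be sketched roughly as follows. This is a minimal illustration rather than the patented implementation: it uses Python's standard `html.parser`, and the class name `BreadcrumbParser`, the function name `extract_navigation_link_string`, and the rule "take the first run of links separated only by `>`" are assumptions introduced here.

```python
from html.parser import HTMLParser


class BreadcrumbParser(HTMLParser):
    """Collects hrefs of <a> tags that are separated only by a ">" symbol."""

    def __init__(self):
        super().__init__()      # convert_charrefs=True, so "&gt;" arrives as ">"
        self.chains = []        # every run of ">"-separated links found so far
        self._current = []      # the run currently being built
        self._between = ""      # text seen since the last </a>
        self._in_anchor = False

    def handle_starttag(self, tag, attrs):
        if tag != "a" or self._in_anchor:
            return
        href = dict(attrs).get("href")
        if href is None:
            return
        # A link continues the run only if exactly ">" separates it
        # from the previous link; otherwise a new run starts.
        if self._current and self._between.strip() != ">":
            self._flush()
        self._current.append(href)
        self._in_anchor = True

    def handle_endtag(self, tag):
        if tag == "a" and self._in_anchor:
            self._in_anchor = False
            self._between = ""

    def handle_data(self, data):
        if not self._in_anchor:
            self._between += data

    def _flush(self):
        if len(self._current) >= 2:
            self.chains.append(self._current)
        self._current = []

    def close(self):
        super().close()
        self._flush()


def extract_navigation_link_string(html):
    """Return the link addresses of the page's navigation link string."""
    parser = BreadcrumbParser()
    parser.feed(html)
    parser.close()
    # Assumption: the first ">"-separated run is the navigation link string.
    return parser.chains[0] if parser.chains else []
```

Real pages vary widely in how they mark up this navigation, so a production extractor would likely need site-specific rules on top of a sketch like this.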
A link pair in step S102 consists of two links at adjacent positions in a navigation link string. For example, from a navigation link string of the form "A->B->C->D", three link pairs can be extracted: "A->B", "B->C" and "C->D".
After the navigation link strings of all pages of the website have been split, a set of link pairs is obtained. This set may contain repeated elements: for example, if the link pair "A->B" appears in the navigation link strings of several pages, it becomes a repeated element of the set.
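The splitting step and the resulting collection with repeated elements can be sketched as below. The function names are hypothetical, and a `Counter` is used here as one convenient way to represent a set with repeats; the source does not prescribe a data structure.

```python
from collections import Counter


def split_into_link_pairs(navigation_link_string):
    """Split ["A", "B", "C", "D"] into adjacent pairs (A,B), (B,C), (C,D)."""
    return list(zip(navigation_link_string, navigation_link_string[1:]))


def collect_link_pairs(navigation_link_strings):
    """Gather the link pairs of all pages, keeping track of repeats."""
    counts = Counter()
    for link_string in navigation_link_strings:
        counts.update(split_into_link_pairs(link_string))
    return counts
```

A pair such as ("A", "B") that appears in the navigation link strings of two pages ends up with a count of 2.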
In one implementation, the way of determining key link pairs from among the link pairs in step S103 comprises:
counting the number of occurrences of each link pair, and taking the link pairs whose occurrence count satisfies a preset condition as key link pairs.
The occurrence count of a link pair is the number of times that element occurs in the set described above, i.e. the number of times the link pair appears across all navigation link strings. Once the occurrence count of each element of the set has been obtained, key link pairs can be determined from these counts, for example by taking the link pairs whose occurrence count satisfies the preset condition as key link pairs.
The preset condition comprises: the occurrence count is greater than a set value; or the occurrence count ranks above a set proportion of all link pairs.
For example, link pairs that occur more than 100 times are taken as key link pairs. Alternatively, when there are 600 link pairs in total and the set proportion is 70%, then since 600 × 70% = 420, the 180 link pairs with the highest occurrence counts (those ranking above 70% of all link pairs) are taken as key link pairs.
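Both preset conditions can be sketched as small selection functions over a mapping from link pair to occurrence count. The function names are hypothetical, and two details are assumptions made here: the "total" is read as the number of distinct link pairs, and ties at the cut-off are broken by sort order.

```python
def select_by_threshold(counts, set_value):
    """Keep link pairs whose occurrence count is greater than set_value."""
    return {pair for pair, n in counts.items() if n > set_value}


def select_by_rank_ratio(counts, ratio):
    """Keep link pairs whose count ranks above `ratio` of all link pairs."""
    ranked = sorted(counts, key=counts.get, reverse=True)
    outranked = round(len(ranked) * ratio)   # e.g. 600 * 0.7 -> 420
    keep = len(ranked) - outranked           # e.g. 600 - 420 -> 180
    return set(ranked[:keep])
```

With 600 link pairs and a ratio of 0.7, the second function keeps the 180 highest-ranked pairs, matching the worked example in the text.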
In a preferred implementation, the way of determining key link pairs from among the link pairs in step S103 comprises:
classifying each link pair with a classification model trained in advance, and taking the link pairs classified into the important class as key link pairs.
The classification feature parameters of the trained classification model include the occurrence count of the link pair. In addition, they may further include at least one of the following: the out-degree of the page corresponding to the pointing link of the link pair, the depth of the pointing link of the link pair, the depth of the pointed-to link of the link pair, the difference between the depth of the pointing link and the depth of the pointed-to link, and the number of anchor-text words corresponding to the link pair.
An embodiment of training the classification model in advance is introduced first. In the present invention, the trained classification model may be obtained in this way, or a model trained by a third party may be used as the classification model, provided that the classification feature parameters of that model satisfy the above definition.
The method of training the classification model comprises:
S1: Obtain labeled link-pair samples. The samples include positive samples and negative samples: a positive sample is one labeled as an important link pair, and a negative sample is one labeled as an unimportant link pair.
S2: Extract the classification features of each sample, and use the samples with their classification features to train the corresponding classification feature parameters of the classification model, so as to determine the classification feature parameter range of important link pairs and the classification feature parameter range of unimportant link pairs.
After training, the classification feature parameters of the classification model are able to describe important link pairs.
When the trained classification model is used in step S103 to classify each link pair, the classification features of the link pair to be classified are extracted first and then compared with the classification feature parameters of the trained model. If the extracted features fall within the classification feature parameter range of important link pairs, the link pair is assigned to the important class; otherwise it is assigned to the unimportant class.
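As a deliberately simplified illustration of "learning parameter ranges and checking whether features fall within them", the sketch below learns a per-feature minimum and maximum from the positive samples only and treats everything outside those ranges as unimportant. This is an assumption made for illustration, not the model the source describes in detail; the source notes that ordinary machine learning methods such as SVM can be used instead. The function names and feature names are hypothetical.

```python
def train_ranges(samples):
    """samples: list of (features, is_important), features being a dict of numbers.

    Returns {feature_name: (low, high)} learned from the positive samples.
    """
    positives = [features for features, important in samples if important]
    if not positives:
        raise ValueError("at least one positive sample is required")
    ranges = {}
    for name in positives[0]:
        values = [features[name] for features in positives]
        ranges[name] = (min(values), max(values))
    return ranges


def is_important(features, ranges):
    """A link pair is important if every feature falls inside its learned range."""
    return all(low <= features[name] <= high
               for name, (low, high) in ranges.items())
```

A range check like this ignores the negative samples and any interaction between features, which is one reason a trained classifier would normally be preferred in practice.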
Each of the above classification feature parameters is described in detail below.
The occurrence count of a link pair has the same meaning as in the previous implementation of step S103, i.e. the number of times the link pair appears across the navigation link strings obtained in step S101.
In a link pair of the form "A->B", link A is the pointing link of the pair and link B is the pointed-to link. In the present invention, the out-degree of the page corresponding to the pointing link is the total number of links on that page that point to other pages. For the link pair "A->B" above, if the page corresponding to link A contains three links to other pages, then the out-degree of the page corresponding to the pointing link A is 3.
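The out-degree feature can be sketched as a simple count. The function name and the `links_on_page` mapping (page address to the list of link targets found on that page) are assumptions introduced for illustration.

```python
def out_degree(page, links_on_page):
    """Count the links on `page` that point to other pages."""
    targets = links_on_page.get(page, ())
    return sum(1 for target in targets if target != page)
```

Whether repeated links to the same target should be counted once or several times is not stated in the source; this sketch counts every link.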
In the present invention, the depth of the pointing link of a link pair is the minimum number of hops needed to reach the page corresponding to the pointing link from the home page of the website. For example, suppose the home page of the website is F, the page corresponding to the pointing link is X, and the link relationship is "F->T1->T2->X", meaning that home page F has a link to page T1, page T1 has a link to page T2, and page T2 has a link to page X. The number of hops from home page F to page X is then 3, and if this is the minimum number of hops from F to X, the depth of the pointing link corresponding to page X is 3.
Similarly, in the present invention, the depth of the pointed-to link of a link pair is the minimum number of hops needed to reach the page corresponding to the pointed-to link from the home page of the website.
Suppose that in the link pair "A->B" the depth of the pointing link A is 3 and the depth of the pointed-to link B is 1. The difference between the depth of the pointing link and the depth of the pointed-to link is then 3 - 1 = 2.
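Because depth is defined as a minimum number of hops from the home page, a breadth-first search over the site's link graph is a natural way to compute it. The sketch below assumes a hypothetical `links_on_page` mapping from each page to the pages it links to; the function names are likewise illustrative.

```python
from collections import deque


def page_depths(home, links_on_page):
    """Minimum number of hops from `home` to every reachable page (BFS)."""
    depths = {home: 0}
    queue = deque([home])
    while queue:
        page = queue.popleft()
        for target in links_on_page.get(page, ()):
            if target not in depths:          # first visit is the shortest path
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


def depth_difference(link_pair, depths):
    """Depth of the pointing link minus depth of the pointed-to link."""
    pointing, pointed_to = link_pair
    return depths[pointing] - depths[pointed_to]
```

Pages that cannot be reached from the home page do not appear in the result, so a caller would need to decide how to treat them.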
The number of anchor-text words corresponding to a link pair is the total number of words obtained after word segmentation of the anchor texts of the two links of the pair. For example, for a link pair of the form "computer repair->software fault", the anchor texts are "computer repair" and "software fault". Word segmentation yields "computer", "repair", "software" and "fault", so the number of anchor-text words corresponding to this link pair is 4.
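The anchor-text word count can be sketched as follows. The function name is hypothetical, and whitespace splitting is used only as a stand-in segmenter; for Chinese anchor text a proper word segmenter would have to be supplied through the `segment` parameter, which is an assumption of this sketch.

```python
def anchor_text_word_count(pointing_anchor, pointed_to_anchor, segment=str.split):
    """Total number of words in the anchor texts of both links of a pair."""
    return len(segment(pointing_anchor)) + len(segment(pointed_to_anchor))
```

Passing the segmenter as a parameter keeps the counting logic independent of the language of the anchor text.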
In the present invention, the steps of training the classification model and of classifying link pairs with the trained model can be implemented with any of the various machine learning methods of the prior art, such as SVM (support vector machine), and are not described further here.
After step S103 has been executed, the key pages of the website have been determined. Further, the present invention also comprises calculating the web page authority of the key pages, where the web page authority is the basis on which a search engine ranks the key pages of the website when returning them as search results. Many methods of calculating web page authority are known in the art; for example, U.S. Patent No. 6,285,999 discloses one such method.
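The source does not spell out the authority computation and only points to U.S. Patent No. 6,285,999. As one possible reading, the sketch below runs a PageRank-style power iteration over the link graph of the participating pages. The function name, the damping factor of 0.85, the fixed iteration count, and the handling of pages without outgoing links are all assumptions made here for illustration.

```python
def page_authority(links, damping=0.85, iterations=50):
    """PageRank-style scores for a graph given as {page: [linked pages]}."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    if not pages:
        return {}
    n = len(pages)
    rank = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new_rank = {page: (1.0 - damping) / n for page in pages}
        for page in pages:
            # A page without outgoing links spreads its score over all pages.
            targets = links.get(page) or pages
            share = damping * rank[page] / len(targets)
            for target in targets:
                new_rank[target] += share
        rank = new_rank
    return rank
```

In this sketch the scores always sum to 1, and a page that receives links from many well-scored pages ends up with a higher score.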
In addition, the key pages determined by the present invention can also be used to generate the skeleton of a website. The link relationships among the key pages reflect how the pages of a website are distributed, so a skeleton generated from these relationships can be used to classify the pages of the website by type. The skeleton of a website usually forms a tree structure, and the pages on the same branch can be placed in the same class.
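One way to picture the skeleton is to treat each key link pair as a parent-to-child edge and group pages by the first-level branch they fall under. The source gives no construction procedure, so the sketch below, including its function names and the choice of first-level branches as classes, is an assumption for illustration only.

```python
from collections import defaultdict


def build_skeleton(key_link_pairs):
    """Turn (pointing link, pointed-to link) pairs into a parent -> children map."""
    children = defaultdict(list)
    for parent, child in key_link_pairs:
        if child not in children[parent]:
            children[parent].append(child)
    return dict(children)


def branch_of(root, skeleton):
    """Map every page below `root` to the first-level branch it belongs to."""
    branch = {}
    for top in skeleton.get(root, []):
        stack = [top]
        while stack:
            page = stack.pop()
            if page in branch:        # already classified, also guards cycles
                continue
            branch[page] = top
            stack.extend(skeleton.get(page, []))
    return branch
```

Pages that map to the same branch would then be treated as one class.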
Please refer to Fig. 4, which is a schematic structural block diagram of embodiment one of the device for mining the key pages of a website in the present invention. As shown in Fig. 4, the device comprises a mining unit 201, a splitting unit 202 and a determining unit 203.
The mining unit 201 is used to extract a navigation link string from each web page of the website. The splitting unit 202 is used to split each extracted navigation link string into link pairs. The determining unit 203 is used to determine key link pairs from among the link pairs and to take the pages corresponding to the key link pairs as the key pages of the website.
Please refer to Fig. 2, which is a schematic diagram of a navigation link string in the present invention. As shown in Fig. 2, a navigation link string is the string of links at the top of a web page that are joined by the ">" symbol.
Please refer to Fig. 3, which is a schematic diagram of a web page source file in the present invention. As shown in Fig. 3, the ">" symbol can be used to locate several adjacent hyperlink tags in the page source. The mining unit 201 extracts the link addresses in these hyperlink tags, which yields the navigation link string of the page.
A link pair in the present invention consists of two links at adjacent positions in a navigation link string. For example, from a navigation link string of the form "A->B->C->D", three link pairs can be extracted: "A->B", "B->C" and "C->D".
After the splitting unit 202 has split the navigation link strings of all pages of the website, a set of link pairs is obtained. This set may contain repeated elements: for example, if the link pair "A->B" appears in the navigation link strings of several pages, then after processing by the splitting unit 202 it becomes a repeated element of the set.
In this embodiment, the determining unit 203 comprises a counting unit 2031, which is used to count the number of occurrences of each link pair and to take the link pairs whose occurrence count satisfies a preset condition as key link pairs.
The preset condition comprises: the occurrence count is greater than a set value; or the occurrence count ranks above a set proportion of all link pairs.
For example, the counting unit 2031 takes link pairs that occur more than 100 times as key link pairs. Alternatively, when there are 600 link pairs in total and the set proportion is 70%, then since 600 × 70% = 420, it takes the 180 link pairs with the highest occurrence counts (those ranking above 70% of all link pairs) as key link pairs.
Please refer to Fig. 5, which is a schematic structural block diagram of embodiment two of the device for mining the key pages of a website in the present invention. As shown in Fig. 5, this embodiment differs from embodiment one in that the determining unit 203 comprises a classification unit 2032, which is used to classify each link pair with a classification model 204 trained in advance and to take the link pairs classified into the important class as key link pairs.
The classification feature parameters of the trained classification model 204 include the occurrence count of the link pair. In addition, they may further include at least one of the following: the out-degree of the page corresponding to the pointing link of the link pair, the depth of the pointing link of the link pair, the depth of the pointed-to link of the link pair, the difference between the depth of the pointing link and the depth of the pointed-to link, and the number of anchor-text words corresponding to the link pair.
The trained classification model in the present invention may be a model trained by a third party, or a model obtained in advance by a model training device. Please refer to Fig. 6, which is a schematic structural block diagram of an embodiment of the model training device in the present invention.
As shown in Fig. 6, the model training device 301 comprises a sample acquisition unit 3011 and a training unit 3012. The sample acquisition unit 3011 is used to obtain labeled link-pair samples. The training unit 3012 is used to extract the classification features of each sample and to use the samples with their classification features to train the corresponding classification feature parameters of the classification model, so as to determine the classification feature parameter range of important link pairs and the classification feature parameter range of unimportant link pairs.
Please refer to Fig. 7, which is a schematic structural block diagram of embodiment three of the device for mining the key pages of a website in the present invention. As shown in Fig. 7, this embodiment further comprises a computing unit 205, which is used to calculate the web page authority of the key pages, where the web page authority is the basis on which a search engine ranks the key pages when returning them to a user. For an implementation of the computing unit 205, reference may be made to U.S. Patent No. 6,285,999; it is not described further here.
The foregoing describes only preferred embodiments of the present invention and is not intended to limit the invention. Any modification, equivalent substitution or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the invention.

Claims (12)

1. A method for mining the key pages of a website, comprising:
extracting a navigation link string from each web page of the website, the navigation link string being the string of links at the top of a web page that are joined by the ">" symbol;
splitting each extracted navigation link string into link pairs, wherein each link pair consists of two links at adjacent positions in the navigation link string;
determining key link pairs from among the link pairs, and taking the pages corresponding to the key link pairs as the key pages of the website.
2. The method according to claim 1, characterized in that the step of determining key link pairs from among the link pairs comprises:
counting the number of occurrences of each link pair, and taking the link pairs whose occurrence count satisfies a preset condition as key link pairs.
3. The method according to claim 2, characterized in that the preset condition comprises:
the occurrence count is greater than a set value; or the occurrence count ranks above a set proportion of all link pairs.
4. The method according to claim 1, characterized in that the step of determining key link pairs from among the link pairs comprises:
classifying each link pair with a classification model trained in advance, and taking the link pairs classified into the important class as key link pairs, wherein the classification feature parameters of the classification model include the occurrence count of the link pair.
5. The method according to claim 4, characterized in that the classification feature parameters of the classification model further include at least one of the following:
the out-degree of the page corresponding to the pointing link of the link pair, the depth of the pointing link of the link pair, the depth of the pointed-to link of the link pair, the difference between the depth of the pointing link and the depth of the pointed-to link, and the number of anchor-text words corresponding to the link pair.
6. The method according to claim 1, characterized in that the method further comprises:
calculating the web page authority of the key pages, wherein the web page authority is the basis on which a search engine ranks the key pages when returning them as search results.
7. A device for mining the key pages of a website, comprising:
a mining unit for extracting a navigation link string from each web page of the website, the navigation link string being the string of links at the top of a web page that are joined by the ">" symbol;
a splitting unit for splitting each extracted navigation link string into link pairs, wherein each link pair consists of two links at adjacent positions in the navigation link string;
a determining unit for determining key link pairs from among the link pairs and taking the pages corresponding to the key link pairs as the key pages of the website.
8. The device according to claim 7, characterized in that the determining unit comprises:
a counting unit for counting the number of occurrences of each link pair and taking the link pairs whose occurrence count satisfies a preset condition as key link pairs.
9. The device according to claim 8, characterized in that the preset condition comprises:
the occurrence count is greater than a set value; or the occurrence count ranks above a set proportion of all link pairs.
10. The device according to claim 7, characterized in that the determining unit comprises:
a classification unit for classifying each link pair with a classification model trained in advance and taking the link pairs classified into the important class as key link pairs, wherein the classification feature parameters of the classification model include the occurrence count of the link pair.
11. The device according to claim 10, characterized in that the classification feature parameters of the classification model further include at least one of the following:
the out-degree of the page corresponding to the pointing link of the link pair, the depth of the pointing link of the link pair, the depth of the pointed-to link of the link pair, the difference between the depth of the pointing link and the depth of the pointed-to link, and the number of anchor-text words corresponding to the link pair.
12. The device according to claim 7, characterized in that the device further comprises:
a computing unit for calculating the web page authority of the key pages, wherein the web page authority is the basis on which a search engine ranks the key pages when returning them as search results.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210380363.3A CN103714093B (en) 2012-09-29 2012-09-29 A kind of method for digging and device of the website emphasis page

Publications (2)

Publication Number Publication Date
CN103714093A (en) 2014-04-09
CN103714093B (en) 2018-10-16

Family

ID=50407078

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210380363.3A Active CN103714093B (en) 2012-09-29 2012-09-29 A kind of method for digging and device of the website emphasis page

Country Status (1)

Country Link
CN (1) CN103714093B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103914550B (en) * 2014-04-11 2017-08-18 百度在线网络技术(北京)有限公司 Show the method and apparatus of content recommendation
CN105243091B (en) * 2015-09-11 2018-11-13 晶赞广告(上海)有限公司 Page Semantic features extraction method and system based on Hypertext Link
CN106649337A (en) * 2015-10-30 2017-05-10 北京国双科技有限公司 Method and device for identifying webpage column
CN105608133B (en) * 2015-12-16 2019-07-02 北京神州绿盟信息安全科技股份有限公司 A kind of determination method and device of the key page
CN106095979B (en) * 2016-06-20 2020-05-08 百度在线网络技术(北京)有限公司 URL merging processing method and device

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101601036A (en) * 2006-12-29 2009-12-09 诺基亚公司 Navigation spots on the Web page
CN102043805A (en) * 2009-10-19 2011-05-04 阿里巴巴集团控股有限公司 Method and device for generating Internet navigation page
CN102439586A (en) * 2009-04-14 2012-05-02 自由科学有限公司 Document navigation method
CN102663091A (en) * 2012-04-11 2012-09-12 广东华大集成技术有限责任公司 WEB application navigation management method and system thereof

Also Published As

Publication number Publication date
CN103714093A (en) 2014-04-09

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by SIPO to initiate substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant