CN103399918B - A method for improving the search visibility of a website - Google Patents

A method for improving the search visibility of a website

Info

Publication number
CN103399918B
Authority
CN
China
Prior art keywords
website
size
page
score
ratio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201310330651.2A
Other languages
Chinese (zh)
Other versions
CN103399918A (en
Inventor
王冬琦
魏小淞
黄新宇
王静
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeastern University China
Original Assignee
Northeastern University China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeastern University China
Priority to CN201310330651.2A
Publication of CN103399918A
Application granted
Publication of CN103399918B
Expired - Fee Related
Anticipated expiration

Landscapes

  • Information Transfer Between Computers (AREA)

Abstract

The invention provides a method for improving the search visibility of a website and belongs to the field of the Internet and search engines. The method collects the factors and patterns that affect search engines and uses a linear iterative algorithm to obtain a weight for each factor, so that a website's ranking is expressed as a quantifiable score. From the product of each influence factor's value and its weight, reasonable suggestions for optimizing the website can be made, which reduces cost and improves the efficiency of website promotion.

Description

A method for improving the search visibility of a website
Technical field
The invention belongs to the field of the Internet and search engines, and specifically relates to a method for improving the search visibility of a website.
Background
With the development of the Internet economy, using search engines as a platform and adjusting a page's position in the search results to bring in traffic has become a marketing model known as search engine marketing (SEM). SEM mainly takes two forms: paid commercial promotion and search engine optimization (SEO). Paid SEM generally refers to bid-ranking mechanisms, whereas SEO refers to optimization carried out to increase the number of a site's pages included in a search engine's natural (non-commercial) results and to improve their position in the ranking. SEO can improve a website's user experience and conversion rate, and at the same time gives the website a self-sustaining marketing approach that helps it reach a leading position in its industry and build brand value. In particular, for an enterprise website that lacks the funds for publicity, SEO is undoubtedly a low-cost, fast-acting and long-lasting way to promote the site, and it is therefore valued by more and more enterprises. Moreover, as Baidu has audited bid rankings ever more strictly, many bid rankings have been adjusted, which has led more website operators to turn to website optimization to improve their ranking in search engines.
In practice, however, webmasters mostly carry out SEO by experience, which consumes the administrator's energy, is slow and costly, and reduces the enterprise's own creativity. It is therefore necessary to develop a tool that automatically analyzes the SEO quality of a website and assists the site administrator in carrying out SEO.
Summary of the invention
To address the deficiencies of the prior art, the invention proposes a method for improving the search visibility of a website, with the aims of quantifying website ranking, reducing cost, identifying concrete remedies and improving the efficiency of website promotion.
A method for improving the search visibility of a website comprises the following steps:
Step 1: determine the factors that affect a website's search-engine quality, namely the page file size of the tested website, the ratio of script code lines to total code lines in the page, the degree of match between the title of the tested website and the keywords in its pages, the ratio of the size of the animated image files in the website to the page file size, and the ratio of the size of the animation files in the website to the page size; and determine the score threshold of each of these five factors by iteration;
Step 1-1: the user sets, as required, the initial weights of five factors of a first known website, namely the page file size, the ratio of script code lines to total code lines in the page, the degree of match between the website title and the keywords in its pages, the ratio of the size of the animated image files in the website to the page file size, and the ratio of the size of the animation files in the website to the page size, ensuring that the initial weights of the five factors sum to 1;
Step 1-2: calculate the initial score of the first known website from the values of its five factors and their initial weights;
Step 1-3: calculate the initial score of a second known website from the values of its five factors and the initial weights;
Step 1-4: obtain five new weights for the five factors by calculating the ratio of each factor's score to the total score of the first known website;
Step 1-5: recalculate the score of the second known website with the five new weights, and take the difference between this new score and the initial score of the second website;
Step 1-6: judge whether the difference obtained in step 1-5 is approximately 0. If so, the weights are final, the score thresholds of the five factors are obtained from the five factors of the first or the second known website, and the iteration ends; otherwise, substitute the new weights into the first known website and repeat steps 1-4 and 1-5;
Step 2: from the URL of the website to be tested, collect the page file size of the website, the number of script code lines in the page, the total number of code lines, the title of the tested website, the keywords in its pages, the size of the animated image files in the website and the size of the animation files in the website; from these seven parameters, determine the five factors of the tested website: the page file size, the ratio of script code lines to total code lines in the page, the degree of match between the website title and the keywords in its pages, the ratio of the size of the animated image files to the page file size, and the ratio of the size of the animation files to the page size;
Step 3: compare each of the five computed factor scores with its threshold in order to adjust the website and its pages:
If the page file size score of the tested website is greater than the threshold, reduce the page file size; otherwise, display the result;
If the score for the ratio of script code lines to total code lines in the page is greater than the threshold, reduce the amount of script code in the page; otherwise, display the result;
If the score for the degree of match between the website title and the keywords in its pages is greater than the threshold, optimize the website keywords; otherwise, display the result;
If the score for the ratio of the size of the animated image files to the page file size is greater than the threshold, reduce the number of animated image files; otherwise, display the result;
If the score for the ratio of the size of the animation files to the page size is greater than the threshold, reduce the number of animation files; otherwise, display the result.
In step 1-2, the initial score of the website is calculated from the values of its five factors and their initial weights by the following formula:
$$a = \sum_{i=1}^{5} \left(p_{1i} \times w_i\right) \tag{1}$$
where a is the score obtained with the initial weights, p_{1i} is the computed value of the i-th influence factor of the first known website, and w_i is the weight of that influence factor.
In step 1-3, the initial score of the second known website is calculated from the values of its five factors and the initial weights by the following formula:
$$b = \sum_{i=1}^{5} \left(p_{2i} \times w_i\right) \tag{2}$$
where b is the score of the second known website, p_{2i} is the computed value of the i-th influence factor of the second known website, and w_i is the weight of that influence factor.
In step 1-4, the five new weights of the five factors are obtained from the ratio of each factor's score to the total score of the first known website by the following formula:
$$w'_i = \frac{p_{1i} \times w_i}{\sum_{j=1}^{5} \left(p_{1j} \times w_j\right)} \tag{3}$$
where w'_i is the adjusted weight of the i-th influence factor p_{1i}, and w_i is the weight of that influence factor before adjustment.
Advantages of the invention:
The method of the invention collects the factors and patterns that affect search engines and uses a linear iterative algorithm to obtain a weight for each factor, so that a website's ranking is expressed as a quantifiable score. From the product of each influence factor's value and its weight, reasonable suggestions for optimizing the website can be made, which reduces cost and improves the efficiency of website promotion.
Brief description of the drawings
Fig. 1 is a flow chart of the method for improving the search visibility of a website according to an embodiment of the invention;
Fig. 2 is a plot of the product of each influence factor and its initial weight for the first website according to an embodiment of the invention;
Fig. 3 is a plot of the product of each influence factor and its weight after one iteration for the second website according to an embodiment of the invention;
Fig. 4 is a plot of the score thresholds according to an embodiment of the invention.
Detailed description
An embodiment of the invention is described further below with reference to the drawings.
The development environment of the embodiment is as follows. Operating system: Microsoft Windows 7; CPU: Intel Centrino 2; memory: 2 GB; hard disk: 320 GB.
A method for improving the search visibility of a website, whose flow chart is shown in Fig. 1, comprises the following steps:
Step 1: determine the factors that affect a website's search-engine quality, namely the page file size of the tested website, the ratio of script code lines to total code lines in the page, the degree of match between the title of the tested website and the keywords in its pages, the ratio of the size of the animated image files in the website to the page file size, and the ratio of the size of the animation files in the website to the page size; and determine the score threshold of each of these five factors by iteration;
The embodiment uses an iterative algorithm. Websites ranked for the same keyword in a search engine are examined to obtain data plots (shown in Figs. 2 to 4), in which the abscissa is the influence factor and the ordinate is the evaluation value corresponding to that factor. The difference between the maximum and minimum of each influence factor is then taken as the initial value of that factor's weight. The results in the data set are fed into a BP iterative algorithm for training; the weights obtained from the trained BP algorithm are added to the initial weights and the two are averaged; finally the weights are balanced, to avoid obvious errors caused by chance. The main steps of the weight-balancing process are as follows:
Step 1-1: determine a first known website. This embodiment uses the website of Northeastern University, for which the page file size factor is p_1 = 0.48 (written as 25/50 in the original text; 0.48 is the value used in formulas (1) and (3)), the ratio of script code lines to total code lines in the page is p_2 = 12/250 = 0.048, the degree of match between the website title and the keywords in its pages is p_3 = 0.8, the ratio of the size of the animated image files to the page file size is p_4 = 0, and the ratio of the size of the animation files to the page size is p_5 = 0. Each of the five factors is given the initial weight w_i = 0.2 (i = 1, 2, ..., 5), as shown in Table 1, so that the initial weights of the five factors sum to 1;
Table 1
The initial weights are defined by the system and have no practical meaning; they serve only as the starting values of the weights in the iterative algorithm.
Step 1-2: calculate the initial score of the website from the values of its five factors and their initial weights;
The formula is as follows:
$$a = \sum_{i=1}^{5} \left(p_{1i} \times w_i\right) = 0.2656 \tag{1}$$
where a is the score obtained with the initial weights, p_{1i} is the computed value of the i-th influence factor of the first known website, and w_i is the weight of that influence factor.
Formula (1) gives the score of the website's ranking. It is a linear iterative model: with this iterative formula, simple changes to the initial values can produce a wide range of widely differing weight and threshold vectors. Hence, once initial values w_i are chosen, repeated iteration progressively refines the weight of each influence factor. The invention thus uses a pattern-recognition approach in which the unknown weights are refined step by step by iteration.
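As a check on formula (1), the weighted sum can be evaluated directly. The following is a minimal sketch, not the patent's implementation; the function name `weighted_score` is hypothetical, and the factor values are those given in step 1-1.

```python
def weighted_score(factors, weights):
    """Formula (1): the sum of each factor value times its weight."""
    if len(factors) != len(weights):
        raise ValueError("factors and weights must have the same length")
    return sum(p * w for p, w in zip(factors, weights))


# Factor values of the first known website (step 1-1) and equal initial weights.
p1 = [0.48, 0.048, 0.8, 0.0, 0.0]
w = [0.2] * 5

a = weighted_score(p1, w)
print(round(a, 4))  # 0.2656
```

The same function applies unchanged to any other site once its five factor values are known.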
Step 1-3: determine a second known website. This embodiment uses the website of the software institute of Northeastern University, whose five factor values p_j (j = 1, 2, ..., 5) are p_1 = 1.02, p_2 = 0.1875, p_3 = 0.7, p_4 = 0.3 and p_5 = 25.6. Using these values and the initial weights w_i = 0.2 (i = 1, 2, ..., 5) from step 1-1, calculate the initial score of this website. The first known website of step 1-1 and the second known website of step 1-3 are both websites that are retrieved often and rank near the top of the search results.
The formula is as follows:
$$b = \sum_{i=1}^{5} \left(p_{2i} \times w_i\right) = 5.5615 \tag{2}$$
where b is the score of the second known website, p_{2i} is the computed value of the i-th influence factor of the second known website, and w_i is the weight of that influence factor.
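To see how the individual factors contribute to b, the per-factor products in formula (2) can be listed before they are summed. This is an illustrative sketch using the values from step 1-3; the variable names are assumptions.

```python
# Factor values of the second known website (step 1-3) and equal initial weights.
p2 = [1.02, 0.1875, 0.7, 0.3, 25.6]
w = [0.2] * 5

# Per-factor products p_2i * w_i, then their sum b as in formula (2).
contributions = [p * w_i for p, w_i in zip(p2, w)]
b = sum(contributions)

# 0-based index of the factor that contributes most to the score.
dominant = max(range(len(contributions)), key=contributions.__getitem__)

print(round(b, 4))   # 5.5615
print(dominant + 1)  # 5
```

With equal weights the fifth factor (animation file ratio, 25.6) accounts for 5.12 of the 5.5615 total, so the initial score of this site is driven almost entirely by that one factor.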
Step 1-4: obtain five new weights for the five factors by taking the ratio of each factor's score to the total score of the first known website. If the adjusted weight of an influence factor is 0, the remainder left after subtracting the non-zero weights from the total is divided equally among the factors whose weight is 0;
The formula is as follows:
$$w'_i = \frac{p_{1i} \times w_i}{\sum_{j=1}^{5} \left(p_{1j} \times w_j\right)} \tag{3}$$
The calculation gives:
w'_1 = 0.361
w'_2 = 0.036
w'_3 = 0.602
w'_4 = 0.0005
w'_5 = 0.0005
where w'_i is the adjusted weight of the i-th influence factor p_{1i}.
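The weight update of step 1-4, including the rule for factors whose weight becomes 0, can be sketched as follows. The function name `update_weights` is hypothetical, and rounding the non-zero weights to three decimals is an assumption: the text does not state it, but it is what leaves a remainder of 0.001 to share between the two zero-weight factors and so reproduces the listed values.

```python
def update_weights(factors, weights, ndigits=3):
    """Formula (3), followed by the zero-weight rule of step 1-4."""
    products = [p * w for p, w in zip(factors, weights)]
    total = sum(products)
    if total == 0:
        raise ValueError("total score is zero; weights cannot be renormalized")
    # Assumption: weights are rounded to `ndigits` decimals, as in the listed results.
    new = [round(x / total, ndigits) for x in products]
    zeros = [i for i, x in enumerate(new) if x == 0]
    if zeros:
        # Share whatever is left of the total weight 1 equally among zero weights.
        remainder = max(1.0 - sum(new), 0.0)
        for i in zeros:
            new[i] = remainder / len(zeros)
    return new


p1 = [0.48, 0.048, 0.8, 0.0, 0.0]
w_new = update_weights(p1, [0.2] * 5)
print([round(x, 4) for x in w_new])  # [0.361, 0.036, 0.602, 0.0005, 0.0005]
```

Without the rounding step the three non-zero ratios already sum to 1, so nothing would remain to redistribute.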
Step 1-5: recalculate the score of the second known website with the five new weights, take the difference between this new score and the initial score of the second website, and then take the absolute value;
The formula is as follows:
$$s = \left| \sum_{i=1}^{5} \left(p_{2i} \times w'_i\right) - b \right| = 4.75218 \tag{4}$$
where s is the difference between the score of the second tested website after the weights are revised and its score before revision.
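The value of s can be verified from the numbers already given: the second website is re-scored with the adjusted weights of step 1-4 and compared with its initial score b = 5.5615. The snippet below is only an arithmetic check; the variable names are assumptions.

```python
# Second known website (step 1-3), adjusted weights (step 1-4), initial score (formula 2).
p2 = [1.02, 0.1875, 0.7, 0.3, 25.6]
w_new = [0.361, 0.036, 0.602, 0.0005, 0.0005]
b = 5.5615

# Formula (4): absolute difference between the re-computed score and b.
rescored = sum(p * w for p, w in zip(p2, w_new))
s = abs(rescored - b)

print(round(rescored, 5))  # 0.80932
print(round(s, 5))         # 4.75218
```

The result matches the stated 4.75218 only when b is subtracted once from the whole sum rather than inside each term.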
Step 1-6: judge whether the difference obtained in step 1-5 is approximately 0. If so, the weights are final, the score thresholds of the five factors are obtained from the five factors of the first or the second known website, and the iteration ends; otherwise, substitute the new weights into the first known website and repeat steps 1-4 and 1-5;
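Steps 1-4 to 1-6 together form a loop, which might be sketched as below under a literal reading of the text. The tolerance `tol` and the cap `max_iter` are assumptions, since the text does not say how close to 0 the difference must be or how many rounds are allowed; the function name is hypothetical, and the zero-weight redistribution and the BP averaging mentioned earlier are omitted for brevity.

```python
def iterate_weights(p1, p2, weights, tol=1e-6, max_iter=100):
    """Literal reading of steps 1-4 to 1-6.

    Returns (weights, diff, rounds, converged).
    """
    # Initial score of the second site, formula (2); kept fixed as the reference.
    b = sum(p * w for p, w in zip(p2, weights))
    diff = float("inf")
    for rounds in range(1, max_iter + 1):
        # Step 1-4: renormalize the per-factor scores of the first site.
        products = [p * w for p, w in zip(p1, weights)]
        total = sum(products)
        if total == 0:
            raise ValueError("first site scores sum to zero")
        weights = [x / total for x in products]
        # Step 1-5: re-score the second site and compare with b.
        rescored = sum(p * w for p, w in zip(p2, weights))
        diff = abs(rescored - b)
        # Step 1-6: stop once the difference is approximately 0.
        if diff < tol:
            return weights, diff, rounds, True
    return weights, diff, max_iter, False


# Toy input (hypothetical): equal factor values leave the weights unchanged.
toy = iterate_weights([0.5] * 5, [1.0, 2.0, 3.0, 4.0, 5.0], [0.2] * 5)
print(toy[3], toy[2])  # True 1

# Embodiment values from steps 1-1 and 1-3.
p1 = [0.48, 0.048, 0.8, 0.0, 0.0]
p2 = [1.02, 0.1875, 0.7, 0.3, 25.6]
result = iterate_weights(p1, p2, [0.2] * 5)
print(result[3])  # False
```

With the embodiment's values this simplified loop reaches the cap without meeting the tolerance, because repeated renormalization concentrates the weight on the third factor; the final weights used in the remainder of the embodiment are therefore taken as given rather than reproduced by this sketch.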
Based on the score of each influence factor of the target website, the directions in which the website can be optimized are identified and reasonable suggestions are provided.
$$S_i = p_{2i} \times w''_i \tag{5}$$
where S_i is the score threshold of the i-th factor of the second known website and w''_i is the final weight of the i-th factor;
w''_1 = 0.104
w''_2 = 0.39
w''_3 = 0.286
w''_4 = 0.109
w''_5 = 0.201
Taking the second website as the standard website, the calculation gives
S_1 = 0.10608
S_2 = 0.73125
S_3 = 0.2022
S_4 = 0.0327
S_5 = 5.1456
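Formula (5) is a per-factor product, so the thresholds can be recomputed from the second website's factor values and the final weights. The snippet below is a plain arithmetic sketch with assumed variable names.

```python
# Second known website (step 1-3) and the final weights listed for formula (5).
p2 = [1.02, 0.1875, 0.7, 0.3, 25.6]
w_final = [0.104, 0.39, 0.286, 0.109, 0.201]

# Formula (5): threshold of each factor is its value times its final weight.
thresholds = [p * w for p, w in zip(p2, w_final)]

for i, s in enumerate(thresholds, start=1):
    print(f"S_{i} = {s:.6f}")
```

Evaluated this way, the first, fourth and fifth products agree with the listed S_1, S_4 and S_5, while the second and third come out as 0.073125 and 0.2002.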
Step 2: from the URL of the website to be tested (this embodiment uses the CNKI home page, http://epub.cnki.net/kns/default.htm), collect the page file size of the website, the number of script code lines in the page, the total number of code lines, the title of the tested website, the keywords in its pages, the size of the animated image files in the website and the size of the animation files in the website; from these seven parameters, determine the five factors of the tested website: the page file size, the ratio of script code lines to total code lines in the page, the degree of match between the website title and the keywords in its pages, the ratio of the size of the animated image files to the page file size, and the ratio of the size of the animation files to the page size;
$$C_i = p'_i \cdot w''_i \tag{6}$$
where C_i is the score of the i-th influence factor of the target website, w''_i is the final weight, and p'_i is the i-th factor of the tested website;
C_1 = 0.1248
C_2 = 0
C_3 = 0
C_4 = 0
C_5 = 0
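How the first three factors might be derived from a page's HTML is sketched below with Python's standard `html.parser`. Several details are assumptions, because the text does not define them: the size factor is taken as the page size divided by a 50 KB reference, script lines are counted as non-blank lines inside `<script>` elements, and the degree of match is taken as the share of meta keywords that occur in the title. The animated image and animation file ratios would need the sizes of the linked resources and are left out.

```python
from html.parser import HTMLParser


class PageStats(HTMLParser):
    """Collects the title, the meta keywords and the number of script lines."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.keywords = []
        self.script_lines = 0
        self._in_title = False
        self._in_script = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "title":
            self._in_title = True
        elif tag == "script":
            self._in_script = True
        elif tag == "meta" and (attrs.get("name") or "").lower() == "keywords":
            content = attrs.get("content") or ""
            self.keywords = [k.strip() for k in content.split(",") if k.strip()]

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False
        elif tag == "script":
            self._in_script = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data
        elif self._in_script:
            # Count non-blank lines of inline script (approximate).
            self.script_lines += sum(1 for line in data.splitlines() if line.strip())


def page_factors(html, reference_bytes=50 * 1024):
    """Returns (size factor, script line ratio, title/keyword match)."""
    stats = PageStats()
    stats.feed(html)
    stats.close()
    total_lines = sum(1 for line in html.splitlines() if line.strip())
    size_factor = len(html.encode("utf-8")) / reference_bytes
    script_ratio = stats.script_lines / total_lines if total_lines else 0.0
    title = stats.title.lower()
    hits = sum(1 for k in stats.keywords if k.lower() in title)
    match = hits / len(stats.keywords) if stats.keywords else 0.0
    return size_factor, script_ratio, match


# Hypothetical sample page, used only to exercise the sketch.
sample = """<html>
<head>
<title>Example University Library</title>
<meta name="keywords" content="library, university, sports">
<script>
var a = 1;
var b = 2;
</script>
</head>
<body>
<p>Hello</p>
</body>
</html>"""

size_factor, script_ratio, match = page_factors(sample)
print(round(script_ratio, 4), round(match, 4))
```

Each factor value obtained this way would then be multiplied by its final weight, as in formula (6), to give the score C_i.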
Step 3: compare each of the five computed factor scores with its threshold in order to adjust the website and its pages:
If the page file size score of the tested website is greater than the threshold, reduce the page file size to 50 KB; otherwise, display the result;
An oversized page downloads slowly. In addition, some search engines crawl only part of a page's content, so the expected ranking effect cannot be achieved.
If the score for the ratio of script code lines to total code lines in the page is greater than the threshold, reduce the amount of script code in the page; otherwise, display the result;
Too much script interferes with the search-engine crawler's analysis of the page content, effectively lowers the keyword density, and affects how the page's weight is distributed.
If the score for the degree of match between the website title and the keywords in its pages is greater than the threshold, optimize the website keywords; otherwise, display the result;
If the website title does not agree with the page keywords, the website's relevance in searches for those keywords is reduced.
If the score for the ratio of the size of the animated image files to the page file size is greater than the threshold, reduce the number of animated image files; otherwise, display the result;
Search engines cannot parse the content of animated images well, so part of the website's content is lost to them.
If the score for the ratio of the size of the animation files to the page size is greater than the threshold, reduce the number of animation files; otherwise, display the result.
Search engines are not friendly to Flash and cannot find the links hidden inside it, so this must be avoided. Although Google has begun to index the content of Flash files, important links such as the main navigation must never be built in Flash; Flash is also unintuitive and slow to download, and is unfriendly to both search engines and users.
A remedy is provided for each influence factor's score and fed back to the user, who then modifies their website according to these suggestions.
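The comparison in step 3 amounts to checking each score against its threshold and collecting the matching suggestion. The sketch below uses the scores C_i and thresholds S_i listed in the embodiment; the function name `suggest` and the wording of the suggestions are illustrative assumptions.

```python
# (factor name, suggested action) in the order of the five factors.
SUGGESTIONS = [
    ("page file size", "reduce the page file size"),
    ("script line ratio", "reduce the amount of script code in the page"),
    ("title/keyword match", "optimize the website keywords"),
    ("animated image ratio", "reduce the number of animated image files"),
    ("animation file ratio", "reduce the number of animation files"),
]


def suggest(scores, thresholds):
    """Returns one suggestion for every factor whose score exceeds its threshold."""
    advice = []
    for (name, action), c, s in zip(SUGGESTIONS, scores, thresholds):
        if c > s:
            advice.append(f"{name}: {action}")
    return advice


# Scores C_i of the tested website and thresholds S_i as listed in the embodiment.
scores = [0.1248, 0.0, 0.0, 0.0, 0.0]
thresholds = [0.10608, 0.73125, 0.2022, 0.0327, 5.1456]

advice = suggest(scores, thresholds)
print(advice)  # ['page file size: reduce the page file size']
```

For the tested page only the first score (0.1248) exceeds its threshold (0.10608), so the single suggestion is to reduce the page file size.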

Claims (4)

1. A method for improving the search visibility of a website, characterized in that it comprises the following steps:
Step 1: determine the factors that affect a website's search-engine quality, namely the page file size of the tested website, the ratio of script code lines to total code lines in the page, the degree of match between the title of the tested website and the keywords in its pages, the ratio of the size of the animated image files in the website to the page file size, and the ratio of the size of the animation files in the website to the page size; and determine the score threshold of each of these five factors by iteration;
Step 1-1: the user sets, as required, the initial weights of five factors of a first known website, namely the page file size, the ratio of script code lines to total code lines in the page, the degree of match between the website title and the keywords in its pages, the ratio of the size of the animated image files in the website to the page file size, and the ratio of the size of the animation files in the website to the page size, ensuring that the initial weights of the five factors sum to 1;
Step 1-2: calculate the initial score of the first known website from the values of its five factors and their initial weights;
Step 1-3: calculate the initial score of a second known website from the values of its five factors and the initial weights;
Step 1-4: obtain five new weights for the five factors by calculating the ratio of each factor's score to the total score of the first known website;
Step 1-5: recalculate the score of the second known website with the five new weights, and take the difference between this new score and the initial score of the second website;
Step 1-6: judge whether the difference obtained in step 1-5 is approximately 0; if so, the weights are final, the score thresholds of the five factors are obtained from the five factors of the first or the second known website, and the iteration ends; otherwise, substitute the new weights into the first known website and repeat steps 1-4 and 1-5;
Step 2: from the URL of the website to be tested, collect the page file size of the website, the number of script code lines in the page, the total number of code lines, the title of the tested website, the keywords in its pages, the size of the animated image files in the website and the size of the animation files in the website; from these seven parameters, determine the five factors of the tested website: the page file size, the ratio of script code lines to total code lines in the page, the degree of match between the website title and the keywords in its pages, the ratio of the size of the animated image files to the page file size, and the ratio of the size of the animation files to the page size;
Step 3: compare each of the five computed factor scores with its threshold in order to adjust the website and its pages;
If the page file size score of the tested website is greater than the threshold, reduce the page file size; otherwise, display the result;
If the score for the ratio of script code lines to total code lines in the page is greater than the threshold, reduce the amount of script code in the page; otherwise, display the result;
If the score for the degree of match between the website title and the keywords in its pages is greater than the threshold, optimize the website keywords; otherwise, display the result;
If the score for the ratio of the size of the animated image files to the page file size is greater than the threshold, reduce the number of animated image files; otherwise, display the result;
If the score for the ratio of the size of the animation files to the page size is greater than the threshold, reduce the number of animation files; otherwise, display the result.
2. The method for improving the search visibility of a website according to claim 1, characterized in that, in step 1-2, the initial score of the website is calculated from the values of its five factors and their initial weights by the following formula:
$$a = \sum_{i=1}^{5} \left(p_{1i} \times w_i\right) \tag{1}$$
where a is the score obtained with the initial weights, p_{1i} is the computed value of the i-th influence factor of the first known website, and w_i is the weight of that influence factor.
3. The method for improving the search visibility of a website according to claim 1, characterized in that, in step 1-3, the initial score of the second known website is calculated from the values of its five factors and the initial weights by the following formula:
$$b = \sum_{i=1}^{5} \left(p_{2i} \times w_i\right) \tag{2}$$
where b is the score of the second known website, p_{2i} is the computed value of the i-th influence factor of the second known website, and w_i is the weight of that influence factor.
4. The method for improving the search visibility of a website according to claim 1, characterized in that, in step 1-4, the five new weights of the five factors are obtained from the ratio of each factor's score to the total score of the first known website by the following formula:
$$w'_i = \frac{p_{1i} \times w_i}{\sum_{j=1}^{5} \left(p_{1j} \times w_j\right)} \tag{3}$$
where w'_i is the adjusted weight of the i-th influence factor p_{1i}, and w_i is the weight of that influence factor before adjustment.
CN201310330651.2A 2013-07-31 2013-07-31 A method for improving the search visibility of a website Expired - Fee Related CN103399918B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201310330651.2A CN103399918B (en) 2013-07-31 2013-07-31 A method for improving the search visibility of a website

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201310330651.2A CN103399918B (en) 2013-07-31 2013-07-31 A method for improving the search visibility of a website

Publications (2)

Publication Number Publication Date
CN103399918A CN103399918A (en) 2013-11-20
CN103399918B true CN103399918B (en) 2016-08-17

Family

ID=49563546

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201310330651.2A Expired - Fee Related CN103399918B (en) 2013-07-31 2013-07-31 A method for improving the search visibility of a website

Country Status (1)

Country Link
CN (1) CN103399918B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106156033B (en) * 2015-03-25 2019-09-03 阿里巴巴集团控股有限公司 A kind of search engine optimization SEO page generation method and equipment
CN107229631B (en) * 2016-03-24 2020-11-03 北京京东尚科信息技术有限公司 Method and device for capturing website data

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102968447A (en) * 2012-10-24 2013-03-13 西安工程大学 SEO (search engine optimization) keyword competition level computing method based on decision tree algorithm

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
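Analysis of SEO strategies based on Google ranking factors; 张欢, 夏圆, 齐向楠; 《现代情报》 (Journal of Modern Information); 20111115; Vol. 31 (No. 11); full text *
Factors influencing website ranking in search engines; 王煜; 《中国科技信息》 (China Science and Technology Information); 20070201 (No. 3); full text *
Research and application of search engine technology; 范君剑; 《中国优秀硕士学位论文全文数据库信息科技辑》 (China Master's Theses Full-text Database, Information Science and Technology); 20130715 (No. 7); full text *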

Also Published As

Publication number Publication date
CN103399918A (en) 2013-11-20

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20160817

CF01 Termination of patent right due to non-payment of annual fee