Summary of the Invention
The purpose of the present invention is to solve the above problems by providing a method for optimizing a retrieval recommendation model based on the query click graph.
The present invention achieves this purpose through the following technical solution:
The present invention comprises optimization-objective construction, weight reconstruction, and recommendation-algorithm optimization.
Optimization-objective construction:
From the above analysis, the most-clicked page among the search results is the most important search result for a query.
We first give a formal description of the relationships among the elements of the query-click bipartite graph:
Definition 1. Let the query-click bipartite graph be G = {Q ∪ U, E, W}, where Q denotes the set of query session nodes, U the set of result web pages, E the set of edges in the graph, and W the set of edge weights. The weight w_ij of edge e_ij in the query-click bipartite graph is then constructed as follows:
The optimization objective of the query-click bipartite graph:
Formula (1) states: when the query session node is q_i (q_i ∈ Q), the binary optimization variable c_ij indicates whether edge e_ij of the query click graph is selected. The objective function maximizes the total weight of the selected edges, subject to the constraint that every retained edge carries the maximum query-page relevance weight, i.e. when c_ij = 1, w_ij ≥ w_ik and w_ij ≥ w_kj. When this objective is met, the query click graph retains as many as possible of the edges with the largest click counts between queries and pages.
Under objective (1), a query or a web page may select several edges that share the same maximum weight. If the degree of each node is introduced, d(i) = Σ_j δ(i, j) and d(j) = Σ_i δ(i, j), then formula (1) is equivalent to formula (2), where δ(i, j) indicates whether an edge exists between query node q_i and web page node u_j (1 if it exists, 0 otherwise).
Equivalent form of the optimization objective for the query-click core graph:
The constraints of objective (2) explicitly allow a query node of the query-click core graph to connect to several web page nodes at once, and likewise allow a web page node of the core graph to connect to several query nodes.
Weight reconstruction:
As in Definition 1, consider the query-click bipartite graph G = {Q ∪ U, E, W}. First, suppose that a_ij users have clicked page u_j for query q_i. The conventional way to build the weight of an edge connecting a query and a page is to use the number of clicks c_ij on page u_j for query q_i, i.e. w_ij = c_ij. Our analysis shows that, when browsing search results, some users are more active and click many times while others click rarely. Because of this difference in user activity, the click count cannot faithfully reflect the relevance between a query and a page. To avoid this bias, we replace the click count with the user frequency, i.e. w_ij = a_ij.
Second, suppose that for the same query users click two pages u_1 and u_2 with equal click counts. If u_1 has also been clicked from more queries, the clicks on u_1 are less important than those on u_2, i.e. u_1 is less relevant to the query. An inverse query frequency can therefore be defined for each page, i.e.:
In the formula, N is the number of queries and N_q is the number of queries from which the page was clicked. The weight is then w_ij = c_ij · iqf(u_j).
On this basis, the weights can also be built from transition-probability theory. Two probabilities are computed first:
(1) the probability of moving from a query session to a related web page:
(2) the probability of moving from a related web page to a query session:
Because the transition probabilities are asymmetric, i.e. P(u_j|q_i) ≠ P(q_i|u_j), linear interpolation or a product can be used to symmetrize the weight, for example w_ij = α·P(q_i|u_j) + (1-α)·P(u_j|q_i) (where α is an adjustable parameter), or w_ij = P(q_i|u_j)·P(u_j|q_i).
Recommendation-algorithm optimization:
(1) Basic model: The most basic query recommendation method recommends queries according to click co-occurrence in the query-click bipartite graph. Extending this idea, queries that share the same clicks are similar, and we propagate this similarity with a random walk. Starting from the initial query, the walk moves to adjacent queries according to the click probabilities in the query-click bipartite graph and continues from those queries, iterating until termination. The random-walk model has two modes, forward and backward, and both can be expressed with the same set of definitions.
As before, the query-click bipartite graph is defined as G = {Q ∪ U, E, W}. Let M be the number of query nodes, N the number of web page nodes, and w_ij the click weight between query q_i and page u_j. Build the probability transition matrix A of size (M+N) × (M+N), whose entry A[i, j] = P(q_j|q_i) is the node transition probability. Then introduce the self-transition probability s, so that the new transition probability P(v_j|v_i) is defined as in formula (6):
Given a start node v_i, forward or backward random-walk iteration can be performed. The difference is as follows. The forward walk finds the query q' most likely to be reached from query q on the query-click bipartite graph; it considers the probability of walking from the start node v_i to other nodes, i.e.:
The backward walk finds the nodes most likely to reach the initial query node q; it considers the probability of walking from other nodes to the start node v_i, i.e.:
(2) Problem identification: On the basis of the above algorithm, the parameters n and s are set. n denotes the number of nodes introduced in the bipartite graph. s denotes the self-transition probability: the walk should not move to other nodes too quickly during transfer, so s is set to 0.9. In query recommendation, the larger n is, the more nodes are brought into the walk, possibly even all nodes of the whole graph. This causes the "recommendation topic drift" problem, in which the queries reached by the walk are not highly relevant to the user's query. Specifically, the following problems exist:
For the forward walk, after several iterations the transition probability concentrates on the more popular queries, so the recommended queries are inaccurate or irrelevant. For example, for the query "People Weekly", the final recommendations may be more popular publications such as "Global People" and "Times People". When the backward walk is used for propagation, the probabilities tend to become uniform, so misspelled or low-frequency queries may be recommended.
Traditional recommendation models cannot effectively distinguish queries with different intents. Query recommendation in the random-walk model propagates similarity through probabilities, so some tightly associated or nearly identical queries are ranked at the very top, which makes the results monotonous and reduces the diversity of the recommendations.
(3) Algorithm optimization: To solve the above problems of the traditional random-walk recommendation model, a random-walk recommendation model based on the query click graph is proposed, which prunes the inaccurately described and unrepresentative recommendations of the traditional model. The iterative random-walk algorithm yields the probability distribution over query and web page nodes; for each web page, the corresponding query in the query click graph can then be selected and recommended to the user.
Random-walk recommendation algorithm based on the query click graph:
The convergence of the forward and backward random-walk algorithms is as follows:
In the forward random walk, the transition probability matrix converges according to the stationary distribution of a Markov chain. Given the transition matrix A, if there exists an iteration count n such that A^n[i, j] > 0 for all i and j, then the Markov chain formed by all nodes is homogeneous, aperiodic and irreducible, and has a unique stationary distribution. The forward random-walk iteration can then be written as v^T(n+1) = v^T(n) A = v^T(0) A^(n+1). As A^n approaches the stationary distribution, A^n[i, j] tends to π_j, where the stationary distribution is π^T = [π_1, π_2, ..., π_{M+N}], so lim_{n→∞} v(n) = π. It is easy to see that when v(0) is a probability distribution, v^T(n) A is also a probability distribution, because every row of A sums to 1.
For the backward random walk, the literature that first proposed the model did not give a convergence proof. We again assume that the stochastic matrix A has a stationary distribution. Even if v(0) is a probability distribution, A v(n) is not necessarily one, so the vector v is normalized at every iteration, letting v(n+1) = norm(A v(n)). Because every row of the probability transition matrix A sums to 1 and all transition probabilities in A are greater than 0, when v(0) is the uniform distribution the iteration normalizes according to the rows of A, i.e. norm(A v(n)) = v(0), and the algorithm keeps returning the uniform distribution. If the whole query-click bipartite graph is strongly connected, so that any two nodes communicate, every entry of v becomes greater than zero during the iteration, and v can be normalized at each step. Formally, each iteration multiplies by A on the left, so after the n-th iteration the value is v(n) = A^n v(0) / Z, where Z is the normalization factor. If A has a stationary distribution, every row of A^n tends to [π_1, π_2, ..., π_{M+N}], so A^n v(0) tends to a vector with identical entries, of the same length as v; hence, if v(0) is a probability distribution, v(n) tends to the uniform distribution.
The maximum-entropy state of a system is its uniform distribution, so the backward random-walk model essentially tends to return to the most primitive starting state of the system, whereas the forward random-walk model extends forward through repeated iteration and eventually finds a steady state. In query recommendation, the forward model reaches its steady state when the query nodes with the most clicks in the whole graph are ranked first, and the backward model reaches its stationary distribution when all nodes in the graph have the same probability. Convergence to uniform probabilities or to popular nodes is therefore unfavourable for query recommendation. A suitable iteration count and self-transition probability, for example n = 10 and s = 0.9, are set in advance to control the range of the random walk in the graph.
The beneficial effects of the present invention are as follows:
The present invention is a method for optimizing a retrieval recommendation model based on the query click graph. Compared with the prior art, the invention first analyzes users' search behaviour and intent, studies how search-behaviour data are extracted and represented, and, through in-depth mining of query sessions, proposes a query-term association method based on user query logs. Second, it analyzes the theory and computation of the traditional query-click bipartite graph recommendation model. Because the query-click bipartite graph has a simple structure, is practical, and can be implemented without computing the similarity between query terms and web pages, it is widely used in search engines. The invention proposes building the edge weights of the bipartite graph from user click frequency instead of click counts, which prevents the weights from being biased by excessive invalid clicks and lets the recommendation system reach a stable state as far as possible. Finally, experiments and data analysis demonstrate the superiority of the improved model in three respects.
Detailed Description of the Embodiments
The invention is further described below:
The present invention comprises optimization-objective construction, weight reconstruction, and recommendation-algorithm optimization.
Optimization-objective construction:
From the above analysis, the most-clicked page among the search results is the most important search result for a query.
We first give a formal description of the relationships among the elements of the query-click bipartite graph:
Definition 1. Let the query-click bipartite graph be G = {Q ∪ U, E, W}, where Q denotes the set of query session nodes, U the set of result web pages, E the set of edges in the graph, and W the set of edge weights. The weight w_ij of edge e_ij in the query-click bipartite graph is then constructed as follows:
The optimization objective of the query-click bipartite graph:
Formula (1) states: when the query session node is q_i (q_i ∈ Q), the binary optimization variable c_ij indicates whether edge e_ij of the query click graph is selected. The objective function maximizes the total weight of the selected edges, subject to the constraint that every retained edge carries the maximum query-page relevance weight, i.e. when c_ij = 1, w_ij ≥ w_ik and w_ij ≥ w_kj. When this objective is met, the query click graph retains as many as possible of the edges with the largest click counts between queries and pages.
Under objective (1), a query or a web page may select several edges that share the same maximum weight. If the degree of each node is introduced, d(i) = Σ_j δ(i, j) and d(j) = Σ_i δ(i, j), then formula (1) is equivalent to formula (2), where δ(i, j) indicates whether an edge exists between query node q_i and web page node u_j (1 if it exists, 0 otherwise).
Equivalent form of the optimization objective for the query-click core graph:
The constraints of objective (2) explicitly allow a query node of the query-click core graph to connect to several web page nodes at once, and likewise allow a web page node of the core graph to connect to several query nodes.
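As a non-limiting illustration, the edge-selection constraint described for formula (1) can be sketched in Python. The sketch assumes that an edge is retained (c_ij = 1) exactly when its weight is the largest among the edges of its query and also the largest among the edges of its page, with ties kept; the function name `select_core_edges` and the sample weights are hypothetical and are not taken from the specification.

```python
from collections import defaultdict


def select_core_edges(weights):
    """Keep edge (query, page) when its weight is maximal for both endpoints.

    `weights` maps (query, page) pairs to a positive edge weight w_ij.
    Ties are kept, so a node may retain several edges of equal maximal weight.
    """
    best_for_query = defaultdict(float)
    best_for_page = defaultdict(float)
    for (query, page), w in weights.items():
        best_for_query[query] = max(best_for_query[query], w)
        best_for_page[page] = max(best_for_page[page], w)
    return {
        (query, page): w
        for (query, page), w in weights.items()
        if w >= best_for_query[query] and w >= best_for_page[page]
    }


# Hypothetical weights: q1 has two tied maximal edges, both of which are kept.
example = {
    ("q1", "u1"): 5, ("q1", "u2"): 5, ("q1", "u3"): 1,
    ("q2", "u2"): 3, ("q2", "u4"): 4,
}
core = select_core_edges(example)
```

In this sample, q1 remains connected to two pages, which corresponds to the case in which one query node of the core graph is connected to several web page nodes.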
Analysis of the above optimization objective shows that the problem is related to, yet distinct from, the traditional stable matching problem. The core idea of stable matching is to reach a stable state in which, once matching is finished, no two agents would both prefer each other to the partners they have been assigned. Familiar real-life examples such as matchmaking between men and women, companies and interns, and buyers and sellers were all developed from stable market-matching theory. The two-sided model and the deferred-acceptance algorithm are the two cornerstones of stable matching theory.
Two-sided matching model: The main function of many markets and social systems is to match one agent with another, for example students with schools, employees with companies, and men with women of marriageable age. Such market matching is divided into "one-sided market matching" (Single-Sided Market Match) and "two-sided market matching" (Two-Sided Market Match). "One-sided market matching" means that the market contains a single set whose individuals match with each other according to their own preferences. However, the "roommate" phenomenon in one-sided market matching can make the matching unstable. Suppose there are four roommates {A, B, C, D}, where A most prefers B, B most prefers C, C most prefers A, and all three rank D last. In this case no pairing is stable, because whoever is paired with D will break the current match to pair with someone who is already matched, and that new match will succeed, so the market never stabilizes (Gale & Shapley, 1962). The "two-sided matching model" was first proposed by Gale and Shapley (1962) in their study of college admissions and the stable marriage problem. A "two-sided market" is a market with two classes of individuals in which individuals of the first class can only match with individuals of the second class. They proved that in such a two-sided market, as long as individual preferences are complete and transitive and the market is free enough to allow any potentially possible match, the whole process can proceed iteratively until every individual has a match and the whole market is stable. This stable-matching property has led to the wide application of the two-sided matching model in theory and practice.
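For reference, the deferred-acceptance procedure mentioned above can be sketched as follows. This is a minimal sketch of the standard Gale-Shapley algorithm and not part of the claimed method; it assumes two sides of equal size with complete, strict preference lists, and the names `deferred_acceptance`, `students` and `schools` are illustrative only.

```python
def deferred_acceptance(proposer_prefs, receiver_prefs):
    """Return a stable matching {receiver: proposer}.

    Both arguments map an agent to its preference list, most preferred first.
    Assumes equal-sized sides and complete, strict preferences.
    """
    # rank[r][p] is the position of proposer p in receiver r's list.
    rank = {
        r: {p: i for i, p in enumerate(prefs)}
        for r, prefs in receiver_prefs.items()
    }
    next_choice = {p: 0 for p in proposer_prefs}
    engaged = {}  # receiver -> proposer currently held
    free = list(proposer_prefs)
    while free:
        p = free.pop()
        r = proposer_prefs[p][next_choice[p]]
        next_choice[p] += 1
        current = engaged.get(r)
        if current is None:
            engaged[r] = p            # receiver tentatively accepts
        elif rank[r][p] < rank[r][current]:
            engaged[r] = p            # receiver trades up, old partner is freed
            free.append(current)
        else:
            free.append(p)            # proposal rejected, try the next choice
    return engaged


students = {"s1": ["a", "b"], "s2": ["a", "b"]}
schools = {"a": ["s2", "s1"], "b": ["s1", "s2"]}
matching = deferred_acceptance(students, schools)
```

Both students prefer school a, but a prefers s2, so s1 is deferred to b and neither side has an incentive to deviate.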
The improved recommendation model on the query-click bipartite graph proposed in this chapter resembles "one-sided market matching" in the stable matching problem, but the matching of query and web page nodes differs from stable matching in the following respects:
(1) The numbers of query session nodes and of returned web page nodes may differ, so it cannot be guaranteed that every node has a partner.
(2) Most query sessions have click preferences only for the web pages related to them, not for all web pages.
(3) Edges with identical click counts (weights) may occur in the query-click bipartite graph, in which case no strict matching can be obtained.
Weight reconstruction:
As in Definition 1, consider the query-click bipartite graph G = {Q ∪ U, E, W}. First, suppose that a_ij users have clicked page u_j for query q_i. The conventional way to build the weight of an edge connecting a query and a page is to use the number of clicks c_ij on page u_j for query q_i, i.e. w_ij = c_ij. Our analysis shows that, when browsing search results, some users are more active and click many times while others click rarely. Because of this difference in user activity, the click count cannot faithfully reflect the relevance between a query and a page. To avoid this bias, we replace the click count with the user frequency, i.e. w_ij = a_ij.
Second, suppose that for the same query users click two pages u_1 and u_2 with equal click counts. If u_1 has also been clicked from more queries, the clicks on u_1 are less important than those on u_2, i.e. u_1 is less relevant to the query. An inverse query frequency can therefore be defined for each page, i.e.:
In the formula, N is the number of queries and N_q is the number of queries from which the page was clicked. The weight is then w_ij = c_ij · iqf(u_j).
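By way of example, the two weight constructions above might be computed from a click log as sketched below. The sketch assumes that each log record is a (user, query, page) triple and that the inverse query frequency takes the logarithmic form iqf(u) = log(N / N_q) by analogy with inverse document frequency; that exact form is an assumption, as are the function name `build_weights` and the sample log.

```python
import math
from collections import defaultdict


def build_weights(click_log):
    """Compute user-frequency and iqf-scaled weights from (user, query, page) clicks."""
    users = defaultdict(set)            # (query, page) -> distinct users, gives a_ij
    clicks = defaultdict(int)           # (query, page) -> raw click count c_ij
    queries_per_page = defaultdict(set)
    all_queries = set()
    for user, query, page in click_log:
        users[(query, page)].add(user)
        clicks[(query, page)] += 1
        queries_per_page[page].add(query)
        all_queries.add(query)
    n = len(all_queries)
    user_freq = {edge: len(us) for edge, us in users.items()}
    # Assumed form of the inverse query frequency: log(N / N_q).
    iqf = {page: math.log(n / len(qs)) for page, qs in queries_per_page.items()}
    iqf_weight = {(q, u): c * iqf[u] for (q, u), c in clicks.items()}
    return user_freq, dict(clicks), iqf, iqf_weight


# Hypothetical log: "alice" clicks u1 three times for q1, inflating c_ij but not a_ij.
log = [
    ("alice", "q1", "u1"), ("alice", "q1", "u1"), ("alice", "q1", "u1"),
    ("bob", "q1", "u1"), ("bob", "q1", "u2"), ("carol", "q2", "u1"),
]
user_freq, clicks, iqf, iqf_weight = build_weights(log)
```

In the sample, the edge (q1, u1) has four clicks but only two distinct users, and page u1, which is clicked from every query, receives an inverse query frequency of zero.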
On this basis, the weights can also be built from transition-probability theory. Two probabilities are computed first:
(1) the probability of moving from a query session to a related web page:
(2) the probability of moving from a related web page to a query session:
Because the transition probabilities are asymmetric, i.e. P(u_j|q_i) ≠ P(q_i|u_j), linear interpolation or a product can be used to symmetrize the weight, for example w_ij = α·P(q_i|u_j) + (1-α)·P(u_j|q_i) (where α is an adjustable parameter), or w_ij = P(q_i|u_j)·P(u_j|q_i).
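The symmetrization above can be illustrated with the following sketch. It assumes that the two transition probabilities are obtained by normalizing the edge weight over the edges of the query and over the edges of the page respectively, i.e. P(u_j|q_i) = w_ij / Σ_k w_ik and P(q_i|u_j) = w_ij / Σ_k w_kj; this normalization, the function name `symmetric_weights` and the sample weights are assumptions made for illustration.

```python
from collections import defaultdict


def symmetric_weights(weights, alpha=0.5, method="interpolate"):
    """Symmetrize edge weights from the two directional transition probabilities."""
    query_total = defaultdict(float)
    page_total = defaultdict(float)
    for (query, page), w in weights.items():
        query_total[query] += w
        page_total[page] += w
    result = {}
    for (query, page), w in weights.items():
        p_page_given_query = w / query_total[query]   # P(u_j | q_i)
        p_query_given_page = w / page_total[page]     # P(q_i | u_j)
        if method == "interpolate":
            result[(query, page)] = (
                alpha * p_query_given_page + (1 - alpha) * p_page_given_query
            )
        else:  # product form
            result[(query, page)] = p_query_given_page * p_page_given_query
    return result


raw = {("q1", "u1"): 3, ("q1", "u2"): 1, ("q2", "u1"): 1}
interpolated = symmetric_weights(raw, alpha=0.5)
product = symmetric_weights(raw, method="product")
```

For the edge (q1, u2) the two directional probabilities are 0.25 and 1.0, so the interpolated weight with α = 0.5 is 0.625 and the product weight is 0.25.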
Building the edge weights of the query-click bipartite graph from user frequency, as proposed here, prevents the weights from being biased by excessive invalid clicks. A further benefit of this construction is that all edge weights in the bipartite graph are integers, which simplifies solving the subsequent optimization. In addition, using the number of users in the search log as the edge weight makes the whole query-click bipartite graph and its results intuitive and easy to understand.
Recommendation-algorithm optimization:
Through the above analysis of the mathematical model of the query-click bipartite graph and the corresponding algorithms, we propose a new recommendation algorithm based on the query click graph. The algorithm filters out inaccurate and unrepresentative candidate queries, avoids the traditional algorithms' neglect of the equivalence and typicality of queries within the same group, and avoids the bias caused by excessive invalid clicks, thereby improving the precision of the query recommendation model.
(1) Basic model: The most basic query recommendation method recommends queries according to click co-occurrence in the query-click bipartite graph. Extending this idea, queries that share the same clicks are similar, and we propagate this similarity with a random walk. Starting from the initial query, the walk moves to adjacent queries according to the click probabilities in the query-click bipartite graph and continues from those queries, iterating until termination. The random-walk model has two modes, forward and backward, and both can be expressed with the same set of definitions.
As before, the query-click bipartite graph is defined as G = {Q ∪ U, E, W}. Let M be the number of query nodes, N the number of web page nodes, and w_ij the click weight between query q_i and page u_j. Build the probability transition matrix A of size (M+N) × (M+N), whose entry A[i, j] = P(q_j|q_i) is the node transition probability. Then introduce the self-transition probability s, so that the new transition probability P(v_j|v_i) is defined as in formula (6):
Given a start node v_i, forward or backward random-walk iteration can be performed. The difference is as follows. The forward walk finds the query q' most likely to be reached from query q on the query-click bipartite graph; it considers the probability of walking from the start node v_i to other nodes, i.e.:
The backward walk finds the nodes most likely to reach the initial query node q; it considers the probability of walking from other nodes to the start node v_i, i.e.:
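The construction of the transition matrix and one forward step may be sketched as follows. Since formula (6) is not reproduced in this text, the sketch assumes a common form in which a node stays in place with probability s and otherwise moves to a neighbour with probability proportional to the edge weight; the function names and sample graph are hypothetical, and s = 0.5 is used only to keep the numbers easy to follow.

```python
def transition_matrix(nodes, weights, s=0.9):
    """Build a row-stochastic matrix with self-transition probability s.

    Assumed form: A[i][i] = s and A[i][j] = (1 - s) * w_ij / sum_k w_ik for i != j.
    """
    index = {node: i for i, node in enumerate(nodes)}
    size = len(nodes)
    raw = [[0.0] * size for _ in range(size)]
    for (query, page), w in weights.items():
        raw[index[query]][index[page]] += w
        raw[index[page]][index[query]] += w   # edges are walked in both directions
    matrix = []
    for i, row in enumerate(raw):
        total = sum(row)
        if total == 0:                        # isolated node: the walk stays put
            new_row = [0.0] * size
            new_row[i] = 1.0
        else:
            new_row = [(1.0 - s) * x / total for x in row]
            new_row[i] += s
        matrix.append(new_row)
    return matrix


def forward_step(v, matrix):
    """One forward iteration: v^T(n+1) = v^T(n) A."""
    size = len(v)
    return [sum(v[i] * matrix[i][j] for i in range(size)) for j in range(size)]


nodes = ["q1", "q2", "u1"]
A = transition_matrix(nodes, {("q1", "u1"): 3, ("q2", "u1"): 1}, s=0.5)
v1 = forward_step([1.0, 0.0, 0.0], A)   # start the walk at query q1
v2 = forward_step(v1, A)
```

After two steps the walk that started at q1 has placed some probability on q2, the query that shares a clicked page with q1, which is the co-click similarity the basic model propagates.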
(2) Problem identification: On the basis of the above algorithm, the parameters n and s are set. n denotes the number of nodes introduced in the bipartite graph. s denotes the self-transition probability: the walk should not move to other nodes too quickly during transfer, so s is set to 0.9. In query recommendation, the larger n is, the more nodes are brought into the walk, possibly even all nodes of the whole graph. This causes the "recommendation topic drift" problem, in which the queries reached by the walk are not highly relevant to the user's query. Specifically, the following problems exist:
For the forward walk, after several iterations the transition probability concentrates on the more popular queries, so the recommended queries are inaccurate or irrelevant. For example, for the query "People Weekly", the final recommendations may be more popular publications such as "Global People" and "Times People". When the backward walk is used for propagation, the probabilities tend to become uniform, so misspelled or low-frequency queries may be recommended.
Traditional recommendation models cannot effectively distinguish queries with different intents. Query recommendation in the random-walk model propagates similarity through probabilities, so some tightly associated or nearly identical queries are ranked at the very top, which makes the results monotonous and reduces the diversity of the recommendations.
The traditional random-walk recommendation algorithm is as follows:
(3) Algorithm optimization: To solve the above problems of the traditional random-walk recommendation model, a random-walk recommendation model based on the query click graph is proposed, which prunes the inaccurately described and unrepresentative recommendations of the traditional model. The iterative random-walk algorithm yields the probability distribution over query and web page nodes; for each web page, the corresponding query in the query click graph can then be selected and recommended to the user.
Random-walk recommendation algorithm based on the query click graph:
The convergence of the forward and backward random-walk algorithms is as follows:
In the forward random walk, the transition probability matrix converges according to the stationary distribution of a Markov chain. Given the transition matrix A, if there exists an iteration count n such that A^n[i, j] > 0 for all i and j, then the Markov chain formed by all nodes is homogeneous, aperiodic and irreducible, and has a unique stationary distribution. The forward random-walk iteration can then be written as v^T(n+1) = v^T(n) A = v^T(0) A^(n+1). As A^n approaches the stationary distribution, A^n[i, j] tends to π_j, where the stationary distribution is π^T = [π_1, π_2, ..., π_{M+N}], so lim_{n→∞} v(n) = π. It is easy to see that when v(0) is a probability distribution, v^T(n) A is also a probability distribution, because every row of A sums to 1.
For the backward random walk, the literature that first proposed the model did not give a convergence proof. We again assume that the stochastic matrix A has a stationary distribution. Even if v(0) is a probability distribution, A v(n) is not necessarily one, so the vector v is normalized at every iteration, letting v(n+1) = norm(A v(n)). Because every row of the probability transition matrix A sums to 1 and all transition probabilities in A are greater than 0, when v(0) is the uniform distribution the iteration normalizes according to the rows of A, i.e. norm(A v(n)) = v(0), and the algorithm keeps returning the uniform distribution. If the whole query-click bipartite graph is strongly connected, so that any two nodes communicate, every entry of v becomes greater than zero during the iteration, and v can be normalized at each step. Formally, each iteration multiplies by A on the left, so after the n-th iteration the value is v(n) = A^n v(0) / Z, where Z is the normalization factor. If A has a stationary distribution, every row of A^n tends to [π_1, π_2, ..., π_{M+N}], so A^n v(0) tends to a vector with identical entries, of the same length as v; hence, if v(0) is a probability distribution, v(n) tends to the uniform distribution.
The maximum-entropy state of a system is its uniform distribution, so the backward random-walk model essentially tends to return to the most primitive starting state of the system, whereas the forward random-walk model extends forward through repeated iteration and eventually finds a steady state. In query recommendation, the forward model reaches its steady state when the query nodes with the most clicks in the whole graph are ranked first, and the backward model reaches its stationary distribution when all nodes in the graph have the same probability. Convergence to uniform probabilities or to popular nodes is therefore unfavourable for query recommendation. A suitable iteration count and self-transition probability, for example n = 10 and s = 0.9, are set in advance to control the range of the random walk in the graph.
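The two convergence behaviours can be observed numerically with the following sketch. It assumes a small hand-written row-stochastic matrix for a three-node graph and takes norm(·) to be division by the sum of the entries; the function names and the matrix are hypothetical and serve only to illustrate why the number of iterations is capped in advance.

```python
def forward_walk(v0, matrix, n):
    """Iterate v^T(k+1) = v^T(k) A for n steps."""
    v = list(v0)
    size = len(v)
    for _ in range(n):
        v = [sum(v[i] * matrix[i][j] for i in range(size)) for j in range(size)]
    return v


def backward_walk(v0, matrix, n):
    """Iterate v(k+1) = norm(A v(k)) for n steps, normalizing by the entry sum."""
    v = list(v0)
    size = len(v)
    for _ in range(n):
        v = [sum(matrix[i][j] * v[j] for j in range(size)) for i in range(size)]
        z = sum(v)                      # normalization factor Z
        v = [x / z for x in v]
    return v


# Hypothetical transition matrix over nodes (q1, q2, u1); each row sums to 1.
A = [
    [0.5, 0.0, 0.5],
    [0.0, 0.5, 0.5],
    [0.375, 0.125, 0.5],
]
start = [1.0, 0.0, 0.0]                 # all probability on the initial query q1
forward_10 = forward_walk(start, A, 10)
backward_10 = backward_walk(start, A, 10)
backward_200 = backward_walk(start, A, 200)
```

For this matrix the forward walk approaches the stationary distribution (0.375, 0.125, 0.5), which favours the more heavily connected nodes, while the backward walk approaches the uniform distribution, so stopping after a small number of iterations keeps the scores tied to the start node.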
Experiments and Analysis
This section experimentally verifies the performance of the optimized query-click-graph recommendation model proposed in this chapter. Using the queries and click graph data of the experimental data set, the traditional and optimized recommendation methods are compared in three respects: query-click relevance, recommendation performance, and diversity of the recommendation results. The comparison verifies the effectiveness of the optimized retrieval recommendation algorithm based on the query click graph.
Analysis of experimental data
The experimental data set is a web query log provided by the Beijing Zhongke laboratory. After cleaning and analysis, the log file is 47 MB in size and contains 1,135,274 user query records, 3,675,413 clicks in total, and 176,687 distinct query terms. Statistical analysis of query-term frequency shows that 28,745 query terms were searched more than 5 times. We classify these as high-frequency terms; together they account for 883,752 query records. It follows that the high-frequency terms make up 16.3% of all query terms but 77.8% of all queries. During preprocessing we set the threshold to 5 and filter out the 83.7% of terms that are low-frequency terms searched fewer than 5 times, which removes only 22.2% of the query records. A larger data set makes the recommendation model more complex; after 83.7% of the query terms are discarded, the sampling space of the model is about 1/6 of the original. Low-frequency query terms also correspond to low-quality queries, and using them as iteration starting points generally does not yield good recommendations. The basic information of the data after pruning is given in Table 1.
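The percentages quoted above follow directly from the reported counts, as the short check below shows. The helper `keep_frequent` is a hypothetical illustration of the threshold filter and assumes that a term is kept when its count is strictly greater than the threshold.

```python
total_terms = 176_687
total_queries = 1_135_274
high_freq_terms = 28_745
high_freq_queries = 883_752

term_share = high_freq_terms / total_terms        # share of distinct query terms kept
query_share = high_freq_queries / total_queries   # share of query records kept
shrink_factor = total_terms / high_freq_terms     # reduction of the sampling space


def keep_frequent(term_counts, threshold=5):
    """Keep query terms whose search count is strictly above the threshold."""
    return {term: c for term, c in term_counts.items() if c > threshold}


kept = keep_frequent({"weather": 120, "rare typo": 1, "song lyrics": 6})
```

The shrink factor is slightly above 6, which matches the statement that the sampling space falls to about 1/6 of the original.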
Table 1. Statistics of the query click log data
Based on the query click graph mined from the data set, we sampled some edges of different frequencies for case analysis. Users mainly interact with a search engine in three ways: (1) searching for a website by its main domain name; (2) searching for a person's name or a fixed term to find an authoritative page, for example using a Baidu Baike page for related queries; (3) searching for the main description of a piece of information to find its source, for example searching for lyrics by song title. We found that the query click graph retains the pages with the most clicks, which are the pages users are most interested in, and that the corresponding submitted query terms accurately describe the user's need.
To analyze how the optimized query click graph is distributed over different frequencies, we classify the edges by the weight of the edge between the "query" and "web page" nodes into strong-weight, high-weight, medium-weight, low-weight and weak-weight edges, whose user click frequencies are [1000, +∞), [100, 1000], [10, 100], [2, 10] and [1, 1] respectively. We then compute the distribution of the traditional and improved bipartite graphs over these five classes, as shown in Table 2; the numbers in parentheses are the proportion of the improved bipartite graph relative to the traditional bipartite graph. The table shows that: (1) the improved graph retains a higher proportion of strong-weight and weak-weight edges than of the other three types, because strong-weight and weak-weight edges represent the most relevant query clicks; (2) the proportions of high-weight, medium-weight and low-weight edges are lower, which indicates lower relevance, so these edges can be removed.
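The five-way classification of edges can be sketched as follows. Because the stated intervals share their boundary values (10, 100 and 1000), the sketch assumes that a boundary value belongs to the higher class; this tie-breaking rule, the function name `classify_edge` and the sample frequencies are assumptions.

```python
from collections import Counter


def classify_edge(user_frequency):
    """Map a user click frequency to one of the five edge classes."""
    if user_frequency >= 1000:
        return "strong"
    if user_frequency >= 100:
        return "high"
    if user_frequency >= 10:
        return "medium"
    if user_frequency >= 2:
        return "low"
    if user_frequency == 1:
        return "weak"
    raise ValueError("user frequency must be a positive integer")


# Hypothetical edge frequencies, including the shared boundary values.
frequencies = [1, 1, 2, 9, 10, 100, 999, 1000, 25000]
distribution = Counter(classify_edge(f) for f in frequencies)
```

Applying the same counter to the traditional graph and to the improved graph yields the two columns whose ratio is reported in parentheses in Table 2.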
Table 2. Distribution of the traditional query-click bipartite graph and its improved graph over the edge types
The basic principles, main features and advantages of the present invention have been shown and described above. Those skilled in the art should understand that the invention is not limited to the above embodiments; the embodiments and the description merely illustrate the principles of the invention. Various changes and improvements may be made without departing from the spirit and scope of the invention, and all such changes and improvements fall within the scope of the claimed invention. The scope of protection of the invention is defined by the appended claims and their equivalents.