CN103577486A - Method and equipment of sorting search results - Google Patents


Info

Publication number
CN103577486A
CN103577486A (application CN201210279565.9A)
Authority
CN
China
Prior art keywords
vector
attribute
value
result
searchers
Prior art date
Legal status (assumed; not a legal conclusion)
Pending
Application number
CN201210279565.9A
Other languages
Chinese (zh)
Inventor
王建新 (Wang Jianxin)
Current Assignee (may be inaccurate)
Beijing Oak Pacific Interactive Technology Development Co Ltd
Original Assignee
Beijing Oak Pacific Interactive Technology Development Co Ltd
Priority date (assumed; not a legal conclusion)
Filing date
Publication date
Application filed by Beijing Oak Pacific Interactive Technology Development Co Ltd
Priority to CN201210279565.9A
Publication of CN103577486A
Legal status: Pending

Classifications

    • G — Physics
    • G06 — Computing; calculating or counting
    • G06F — Electric digital data processing
    • G06F 16/00 — Information retrieval; database structures therefor; file system structures therefor
    • G06F 16/90 — Details of database functions independent of the retrieved data types
    • G06F 16/95 — Retrieval from the web
    • G06F 16/951 — Indexing; web crawling techniques

Abstract

An embodiment of the invention relates to a method of sorting search results. The method comprises: building an attribute vector X = (x1, x2, …, xn) for each result, where n is a natural number; assigning a weight vector ω = (ω1, ω2, …, ωn); and sorting the results according to ω·X. An embodiment of the invention also relates to equipment for sorting search results, comprising a device for building the attribute vector X = (x1, x2, …, xn) for each result, where n is a natural number, a device for assigning the weight vector ω = (ω1, ω2, …, ωn), and a device for sorting the results according to ω·X.

Description

Method and apparatus for sorting search results
Technical field
The present invention relates to web search, relate more specifically to Search Results to sort.
Background technology
With the rapid development and spread of the internet, it now carries a massive amount of information, and users rely on web search technology to find the information they care about. Search technology collects and organizes the information resources on the internet and serves them in response to user queries; it comprises three parts: information gathering, information organization, and user querying. However, owing to the way search engines work and the rapid growth of the internet, users are increasingly dissatisfied with the results of their searches.
The existing problem is illustrated here with the example of searching for a person on a social network service (SNS). A user of a social network may search for people in the network to find classmates, colleagues, friends, and so on. Taking Renren as an example, a search for a person named Wang Wei matches more than 80,000 Renren users named Wang Wei, which raises the problem of identifying which Wang Wei is the searcher's actual target. The search results therefore need to be sorted sensibly by their relevance to the searcher, so that the searcher finds the intended target as early as possible, improving both search efficiency and user experience.
Summary of the invention
The present invention has been proposed in view of the above technical problem. Its objects include providing a method and apparatus for sorting search results.
According to some embodiments of one aspect of the invention, a method of sorting search results is provided: for each result, build an attribute vector X = (x1, x2, …, xn), where n is a natural number; assign a weight vector ω = (ω1, ω2, …, ωn); and sort the results according to ω·X.
In some embodiments of the invention, the attribute vector covers at least one of: hometown, primary school, middle school, university, major, employer, and location. Each attribute is assigned a value between 0 and 1 according to the correlation between the result's attribute and the searcher's corresponding attribute.
In some embodiments of the invention, the weight values are trained from the searchers' selections. Preferably a linear classifier is used for training; preferably a support vector machine. Preferably, for linearly inseparable cases, a penalty factor C is introduced, and the value of C is trained from the searchers' selections.
In some embodiments of the invention, a cross-validation model is used during training. The training may be parallelized with Hadoop.
According to some embodiments of one aspect of the invention, equipment for sorting search results is provided, comprising: a device for building an attribute vector X = (x1, x2, …, xn) for each result, where n is a natural number; a device for assigning a weight vector ω = (ω1, ω2, …, ωn); and a device for sorting the results according to ω·X.
In some embodiments of the invention, the attribute vector covers at least one of: hometown, primary school, middle school, university, major, employer, and location. The device for building the attribute vector X = (x1, x2, …, xn) for each result assigns each attribute a value between 0 and 1 according to the correlation between the result's attribute and the searcher's corresponding attribute.
In some embodiments of the invention, the device for assigning the weight vector ω = (ω1, ω2, …, ωn) is configured to train the weight values from the searchers' selections; to train using a linear classifier; and to train using a support vector machine. The equipment for sorting search results further comprises a device for training the value of the penalty factor C from the searchers' selections, the penalty factor having been introduced for linearly inseparable cases.
In some embodiments of the invention, the device for assigning the weight vector ω = (ω1, ω2, …, ωn) is configured to use a cross-validation model during training, and to parallelize the training with Hadoop.
Brief description of the drawings
The above and other objects, features, and advantages of the exemplary embodiments of the invention will become easier to understand by reading the detailed description below with reference to the accompanying drawings. The drawings show some embodiments of the invention in an exemplary and non-restrictive manner, wherein:
Fig. 1 shows a schematic flow diagram of a method of sorting search results according to an embodiment of the invention.
Fig. 2 shows a schematic diagram of training on samples with a linear classifier.
Fig. 3 shows a schematic diagram of hyperplanes that separate the sample points.
Fig. 4 shows a schematic diagram of the hyperplanes of a support vector machine.
Fig. 5 shows a schematic diagram of sample points in the linearly inseparable case.
Fig. 6 shows a cross-validation (CV) curve of the training results according to an embodiment of the invention.
Fig. 7 shows a structural block diagram of equipment for sorting search results according to an embodiment of the invention.
Fig. 8 schematically shows a structural block diagram of a computing device that can implement embodiments of the invention.
In the drawings, identical or corresponding labels denote identical or corresponding parts.
Detailed description
The principles and spirit of the invention are described below with reference to several illustrative embodiments. It should be appreciated that these embodiments are provided only so that those skilled in the art can better understand and then implement the invention, and do not limit the scope of the invention in any way.
An embodiment of one aspect of the invention provides a method of sorting search results. Fig. 1 shows a schematic flow diagram of such a method according to an embodiment of the invention. Referring to Fig. 1, at step S110 of the sorting method 100, an attribute vector X = (x1, x2, …, xn) is built for each result, where n is a natural number.
Specifically, for the problem of sorting search results, the intuitive approach is to give priority to the results that share relevant information with the searcher, for example: same school, same major, identical or close enrollment years (schoolmates of nearby classes), same hometown, and so on. A candidate search result is represented by an attribute vector as follows:
X = (x1, …, xn)
where, for example, x1 represents the university: same school x1 = 1, different school x1 = 0;
x2 represents the major: same major x2 = 1, different major x2 = 0;
x3 represents enrollment-year closeness: given an enrollment-year gap d, x3 is a value that decreases as d grows (the exact formula appears in the original only as an image);
x4 represents the hometown: same hometown x4 = 1, otherwise x4 = 0; and so on.
The specific embodiment given above is only for exemplary purposes and does not restrict the protection scope of the invention in any way. Those skilled in the art will understand that the attribute vector can also include components representing other attributes, including but not limited to: primary school, middle school, employer, location, etc. Further, each attribute is assigned a value between 0 and 1 according to the correlation between the result's attribute and the searcher's corresponding attribute. A sample list of the attribute-vector components used when sorting search results according to an embodiment of the invention, and their values, is given in Table 1 (the table itself appears in the original only as images).

Table 1
From the above analysis, a search result whose components of X = (x1, …, xn) all take their maximum values would be the best match; in practice, however, such a result does not necessarily exist, and most real search results maximize different components.
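The attribute-vector construction described above can be sketched in Python as follows. This is an illustrative sketch, not part of the patent: the profile field names and the 1/(1+d) form for the enrollment-year closeness x3 (whose exact formula appears in the original only as an image) are assumptions.

```python
def enrollment_year_closeness(year_a, year_b):
    """Map the enrollment-year gap d to a value in (0, 1].
    The 1/(1+d) form is an assumption; the patent's formula is an image."""
    d = abs(year_a - year_b)
    return 1.0 / (1.0 + d)

def build_attribute_vector(searcher, result):
    """Return X = (x1, x2, x3, x4): university, major, year closeness, hometown."""
    x1 = 1.0 if searcher["university"] == result["university"] else 0.0
    x2 = 1.0 if searcher["major"] == result["major"] else 0.0
    x3 = enrollment_year_closeness(searcher["year"], result["year"])
    x4 = 1.0 if searcher["hometown"] == result["hometown"] else 0.0
    return [x1, x2, x3, x4]

searcher = {"university": "A", "major": "CS", "year": 2008, "hometown": "Beijing"}
result = {"university": "A", "major": "EE", "year": 2010, "hometown": "Beijing"}
print(build_attribute_vector(searcher, result))
# → [1.0, 0.0, 0.3333333333333333, 1.0]
```

Every component lands in [0, 1], matching the valuation rule stated above.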
To judge overall which search results correlate more strongly with the searcher, the sorting method 100 proceeds to step S120, where each vector component xi is assigned a weight ωi, giving the weight vector
ω = (ω1, …, ωn)
where ω1 represents the weight associated with the university;
ω2 the weight associated with the major;
ω3 the weight associated with enrollment-year closeness;
ω4 the weight associated with the hometown; and so on.
The specific embodiment given above is only for exemplary purposes and does not restrict the protection scope of the invention in any way. Those skilled in the art will understand that the weight vector can also include weights for other components, including but not limited to the weights for the components representing primary school, middle school, employer, location, etc.
The sorting method 100 then enters step S130, where the results are sorted according to ω·X. Specifically, from the attribute vector and weight vector above, define a function f:

f = Σ(i=1..n) ωi·xi = ω·X

The search results are sorted by the value of f: the larger the value of f, the stronger the correlation between the corresponding search result and the searcher, and the higher it ranks.
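Steps S110–S130 taken together amount to scoring each candidate by the dot product ω·X and sorting by that score. A minimal sketch (the weight values and result identifiers below are illustrative, not the trained values reported later in the patent):

```python
def score(weights, x):
    """f = ω·X, the dot product of weight and attribute vectors."""
    return sum(w * xi for w, xi in zip(weights, x))

def rank_results(weights, results):
    """results: list of (result_id, attribute_vector); highest f first."""
    return sorted(results, key=lambda r: score(weights, r[1]), reverse=True)

omega = [0.32, 0.07, 0.23, 0.23]           # university, major, year, hometown
candidates = [
    ("wang_jun_1", [0.0, 1.0, 0.5, 0.0]),  # same major, close year
    ("wang_jun_2", [1.0, 1.0, 1.0, 1.0]),  # matches on everything
    ("wang_jun_3", [0.0, 0.0, 0.0, 1.0]),  # same hometown only
]
print([rid for rid, _ in rank_results(omega, candidates)])
# → ['wang_jun_2', 'wang_jun_3', 'wang_jun_1']
```

Note that under these weights the hometown-only match outranks the major-plus-year match: the ranking depends entirely on ω, which is why the patent trains ω from user behavior.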
As for setting the weights ω = (ω1, …, ωn): in an embodiment of the invention, the weight values are trained from the searchers' selections. Specifically, the behavior of a large number of users is aggregated for training, so as to find the best ω.
The process of training the weight values from the searchers' selections is illustrated with a concrete example. Suppose a searcher searches for "Wang Jun" and gets ten results named Wang Jun, sorted by the value of f computed with the current ω. The searcher, however, adds the second Wang Jun as a friend. This exposes a problem with the current ranking model: the second Wang Jun is the more valuable result for this searcher. That is:

according to the current model, f(X1) > f(X2),
yet in fact f(X1) < f(X2).
Similarly, aggregating the search and friend-adding behavior of a large number of users yields a large body of knowledge:
f(X1)>f(X2)
f(X3)>f(X4)
f(X5)>f(X6)
Further, since

f(X) = Σ(i=1..n) ωi·xi = ω·X

f is a linear function, and it follows that
f(X1)>f(X2)→f(X1)-f(X2)=f(X1-X2)=ω·(X1-X2)>0
f(X3)>f(X4)→f(X3)-f(X4)=f(X3-X4)=ω·(X3-X4)>0
f(X5)>f(X6)→f(X5)-f(X6)=f(X5-X6)=ω·(X5-X6)>0
......
This yields a large number of items of knowledge, called samples:
f(Y1)>0 f(Y4)<0 Y1=X1-X2 Y4=-Y1
f(Y2)>0 f(Y5)<0 Y2=X3-X4 Y5=-Y2
f(Y3)>0 f(Y6)<0 Y3=X5-X6 Y6=-Y3
......
From the knowledge f(Y) > 0 and f(Z) < 0 above, the best ω can be obtained by training.
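The construction of training samples from user behavior can be sketched as follows (illustrative; the event format, a list of chosen/skipped attribute-vector pairs, is an assumption):

```python
def pairwise_samples(click_events):
    """Turn "searcher chose result with vector X_chosen over X_skipped" events
    into samples Y = X_chosen - X_skipped labelled +1, plus -Y labelled -1,
    following the f(Y) > 0 / f(-Y) < 0 construction."""
    samples = []
    for chosen, skipped in click_events:
        y = [c - s for c, s in zip(chosen, skipped)]
        samples.append((y, +1))                # f(Y) > 0
        samples.append(([-v for v in y], -1))  # f(-Y) < 0
    return samples

# One event: the friended Wang Jun shared the searcher's school and hometown.
events = [([1.0, 1.0, 0.5, 1.0], [0.0, 1.0, 0.5, 0.0])]
samples = pairwise_samples(events)
for y, label in samples:
    print(label, y)
```

Feeding both Y and -Y to the classifier gives it a balanced two-class problem whose separating direction is ω.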
For the above training of ω, in particular, according to embodiments of the invention, a linear classifier can be used. In the field of machine learning, objects need to be classified; the goal of classification is to group objects with similar features, and a linear classifier reaches its classification decision through a linear combination of the features. The classification problem can be pictured as using a hyperplane to split a high-dimensional space: points on one side of the hyperplane are classified as "yes", points on the other side as "no".
Referring to Fig. 2, which shows a schematic diagram of training on samples with a linear classifier: the horizontal and vertical coordinates indicate that the sample space is two-dimensional (the abscissa the first dimension, the ordinate the second), ω·x + b = 0 represents a line in the space, b is the intercept of that line, and the expression

(ω·x + b) / ||ω||

gives the signed distance from a point to the line.
Further, when a linear classifier is used to classify the samples, many hyperplanes satisfy the requirement; see Fig. 3, which shows many hyperplanes separating the sample points. What is wanted, however, is the plane that classifies best, i.e., the one that maximizes the margin between the two classes of data points. This plane is also called the maximum-margin hyperplane.
A support vector machine can be used for this. The support vector machine (SVM) is a supervised learning method widely used in statistical classification and regression analysis. It belongs to the generalized linear classifiers. An SVM maps the vectors into a higher-dimensional space and constructs a maximum-margin hyperplane in that space. Two hyperplanes parallel to each other lie on either side of the hyperplane separating the data, and the separating hyperplane maximizes the distance between the two parallel hyperplanes: the larger the distance or gap between them, the smaller the total error of the classifier.
Referring to Fig. 4, which shows a diagram of the hyperplanes of a support vector machine: given samples belonging to two classes, training a support vector machine on them yields the maximum-margin hyperplane. The sample points lying on the parallel hyperplanes are called support vectors. If the training data are linearly separable, two such hyperplanes can be found with no sample points between them and with the distance between them maximized. The distance between these two hyperplanes is

2 / ||ω||

so one needs to maximize 2 / ||ω||, i.e., minimize

||ω||² / 2

subject to |ω·Xi + b| ≥ 1 for every sample, where Xi is a positive sample with yi = 1 if ω·Xi + b > 0, and yi = -1 otherwise.
More specifically, the SVM derivation uses the KKT (Karush-Kuhn-Tucker, generalized Lagrange) conditions to convert the inequality constraints into equality constraints. Define a function L:

L(ω, b) = ||ω||² / 2 − Σ(i=1..n) αi [ yi (ω·Xi + b) − 1 ]

and solve for the minimum of L, where αi ≥ 0 and yi (ω·Xi + b) − 1 ≥ 0, with αi = 0 whenever yi (ω·Xi + b) − 1 > 0. Setting the partial derivatives of L to zero gives

∂L/∂ω = ω − Σ(i=1..n) αi yi Xi = 0  ⇒  ω = Σ(i=1..n) αi yi Xi

∂L/∂b = Σ(i=1..n) αi yi = 0

The problem is then further transformed via the Wolfe dual into maximizing

Σ(i=1..n) αi − (1/2) Σ(i=1..n) Σ(j=1..n) αi αj yi yj Xi·Xj

subject to Σ(i=1..n) αi yi = 0 and αi ≥ 0.
In practice, the sample points may be linearly inseparable; see Fig. 5, which shows a schematic diagram of sample points in the linearly inseparable case. To handle linear inseparability, a penalty factor C is introduced, and the following is minimized:

||ω||² / 2 + C Σ(i=1..n) ξi

where ξi ≥ 0 are slack variables representing the degree of misclassification. The value of C is then trained from the searchers' selections.
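The soft-margin objective ||ω||²/2 + C·Σξi can be minimized in many ways; the patent does not name a solver, so the sketch below uses simple stochastic subgradient descent on the equivalent hinge-loss form. The learning rate, epoch count, and toy data are all illustrative assumptions.

```python
import random

def train_svm(samples, C, epochs=200, lr=0.01, seed=0):
    """samples: list of (x, y) with y in {+1, -1}; returns (w, b).

    Stochastic subgradient descent on ||w||^2/2 + C * sum of hinge losses,
    the soft-margin objective with penalty factor C."""
    samples = list(samples)          # avoid shuffling the caller's list
    rng = random.Random(seed)
    n = len(samples[0][0])
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        rng.shuffle(samples)
        for x, y in samples:
            margin = y * (sum(wi * xi for wi, xi in zip(w, x)) + b)
            if margin < 1:           # hinge active: subgradient is w - C*y*x
                w = [wi - lr * (wi - C * y * xi) for wi, xi in zip(w, x)]
                b += lr * C * y
            else:                    # only the regularizer contributes
                w = [wi - lr * wi for wi in w]
    return w, b

# Toy linearly separable data: the label is the sign of the first coordinate.
data = [([1.0, 0.2], 1), ([0.8, -0.1], 1), ([-1.0, 0.1], -1), ([-0.7, 0.3], -1)]
w, b = train_svm(data, C=1.0)
print(all((sum(wi * xi for wi, xi in zip(w, x)) + b > 0) == (y > 0) for x, y in data))
# → True
```

In the patent's setting, the trained w is the weight vector ω used to rank search results.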
The specific embodiment above, training with a support vector machine, is only for exemplary purposes and does not restrict the protection scope of the invention in any way. Those skilled in the art will understand that other linear classifiers may be used as needed, including but not limited to: linear discriminant analysis (LDA) classifiers, naive Bayes classifiers, logit-model classifiers, perceptron classifiers, etc.
According to an embodiment of the invention, a cross-validation model is used during training. Cross-validation is a statistical analysis technique for verifying classifier performance. Its basic idea is to partition the raw data into groups, one part serving as the training set and another as the validation set: the classifier is first trained on the training set, then the trained model is tested on the validation set, and the result serves as the performance index of the classifier. The purpose of using cross-validation is to obtain a reliable and stable model. In the examples of the invention that use cross-validation, the task is to find the best C, i.e., the one with the highest cross-validation accuracy.
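The selection of C by cross-validation can be sketched as follows. The toy threshold "model" below stands in for the SVM purely to illustrate the k-fold selection loop; every function and parameter name here is an assumption, not from the patent.

```python
def kfold_accuracy(samples, k, train_fn, predict_fn):
    """Mean held-out accuracy over k folds (fold i = every k-th sample)."""
    folds = [samples[i::k] for i in range(k)]
    accs = []
    for i in range(k):
        held_out = folds[i]
        train = [s for j, f in enumerate(folds) if j != i for s in f]
        model = train_fn(train)
        correct = sum(predict_fn(model, x) == y for x, y in held_out)
        accs.append(correct / len(held_out))
    return sum(accs) / k

def select_C(samples, candidates, k=5):
    """Pick the C with the highest cross-validation accuracy.
    Toy model: "training" just fixes C, prediction thresholds x[0] at C."""
    return max(candidates,
               key=lambda C: kfold_accuracy(
                   samples, k,
                   train_fn=lambda train: C,
                   predict_fn=lambda C_, x: 1 if x[0] > C_ else -1))

# Points with x[0] > 0.5 are positive, so a threshold near 0.5 should win.
samples = [([v / 10.0], 1 if v > 5 else -1) for v in range(10)]
print(select_C(samples, candidates=[0.0, 0.5, 0.9]))  # → 0.5
```

In the patent's setting, `train_fn` would fit the soft-margin SVM at the given C and `predict_fn` would apply it; the selection loop is unchanged.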
According to an embodiment of the invention, Hadoop is used to parallelize the training. Hadoop is a software framework that enables distributed processing of massive data; with it, users can develop distributed programs without understanding the low-level distributed details, making full use of the high-speed computation and storage of a cluster. Hadoop is efficient because it works in parallel, speeding up processing. As an illustration, suppose the sample size is 1,260,000 and the theoretical value space of C is (0, +∞); in the initial indexed search for C, the candidate values are

e^-15, e^-14, …, e^0, e^1, e^2, …, e^15

With ordinary (serial) computation this takes a great deal of time: in the example above, determining the cross-validation accuracy for a single C takes 44 minutes. The parallel distributed processing of Hadoop, however, greatly reduces the processing time.
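The grid search over C is embarrassingly parallel, since each candidate's cross-validation run is independent of the others; that independence is what Hadoop exploits. Below is a sketch of the same idea with a local thread pool standing in for the cluster; the evaluation function is a toy stand-in for the 44-minute cross-validation run, with a peak placed arbitrarily at C = e^4.

```python
import math
from concurrent.futures import ThreadPoolExecutor

def cv_accuracy(C):
    """Toy stand-in for one cross-validation run at a given C.
    Peaks at C = e^4 by construction (an arbitrary assumption)."""
    return 1.0 / (1.0 + abs(math.log(C) - 4.0))

def pick_best_C(candidates, workers=8):
    """Evaluate all candidate C values concurrently, return the best one."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        scores = list(pool.map(cv_accuracy, candidates))
    return max(zip(candidates, scores), key=lambda cs: cs[1])[0]

grid = [math.exp(k) for k in range(-15, 16)]  # e^-15 .. e^15, as in the patent
print(pick_best_C(grid) == math.exp(4))       # → True
```

On a real cluster each `cv_accuracy(C)` call would become one Hadoop job over the 1,260,000 samples, and the 31-point grid would finish in roughly the time of one serial run per wave of workers.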
The results obtained by training the people-search result-ranking method according to an embodiment of the invention are given below. Training found the best C = 57.58, with a cross-validation accuracy of 89.9196%. The samples are divided into two classes, the first called positive samples and the second negative samples. If the number of actual positive samples is A and the machine recognizes B of those A as positive, the recall is B/A; if, over all positive and negative samples, the machine recognizes D samples as positive, the precision is B/D. At C = 57.58, for positive samples the precision is 89.9367% and the recall 89.9044%; for negative samples, the precision is 89.9081% and the recall 89.9404%.
Referring to Fig. 6, which shows the cross-validation (CV) curve for the above results: the abscissa represents C and the ordinate the accuracy.
At C = 57.58, the weight values trained from the searchers' selections in one specific embodiment are listed below; the attribute-vector component corresponding to each weight is as shown in Table 1.
1 0.3153811807008091 17 0.2768917428245827
2 0.06523622819605589 18 -0.3416365115929635
3 0.2299912863139242 19 -0.3416365115929635
4 0.229791315599308 20 0.7972915157236754
5 0.2551160227868295 21 0.1174118153407883
6 0.005340509770334241 22 -0.2227287194591228
7 0.6739728169872867 23 0.09637465836733855
8 0.16873364133134452 24 0.6717565383233873
9 0.16490403978642 25 0.1100008536822412
10 -0.08916577303857522 26 0.3455855546464067
11 0.07249746228050874 27 -0.1188966807775003
12 0.02756607559398201 28 0.18539625204656
13 0.07479750396818062 29 0.5589180110643096
14 0.2768917428245827 30 0.4614391274706418
15 0.3319000560374825 31 0.2327547337047164
16 0.3996967990421043
The specific embodiment given above is only for exemplary purposes and does not restrict the protection scope of the invention in any way. Those skilled in the art will understand that various adaptive modifications can be made, within the spirit of the invention, to the elements, specific numbers, etc. of the above embodiments; parts of the embodiments can be merged, deleted, or modified; and various specific embodiments can be made.
Those skilled in the art will also understand that the search-result sorting method of the invention is not only applicable to searching for people but, within the spirit of the invention, is also applicable to sorting other search results.
The equipment for sorting search results according to an embodiment of the invention is described below with reference to Fig. 7, which shows an example block diagram of such equipment. As shown in Fig. 7, the equipment 700 for sorting search results comprises: a device 710 for building an attribute vector X = (x1, x2, …, xn) for each result, where n is a natural number; a device 720 for assigning a weight vector ω = (ω1, ω2, …, ωn); and a device 730 for sorting the results according to ω·X.
According to an embodiment of the invention, the attribute vector covers at least one of: hometown, primary school, middle school, university, major, employer, and location. The device for building the attribute vector X = (x1, x2, …, xn) for each result assigns each attribute a value between 0 and 1 according to the correlation between the result's attribute and the searcher's corresponding attribute.
According to an embodiment of the invention, the device for assigning the weight vector ω = (ω1, ω2, …, ωn) is configured to train the weight values from the searchers' selections.
According to an embodiment of the invention, the device for assigning the weight vector ω = (ω1, ω2, …, ωn) is configured to train using a linear classifier.
According to an embodiment of the invention, the device for assigning the weight vector ω = (ω1, ω2, …, ωn) is configured to train using a support vector machine.
According to an embodiment of the invention, the equipment for sorting search results further comprises a device for training the value of the penalty factor from the searchers' selections, the penalty factor having been introduced for linearly inseparable cases.
According to an embodiment of the invention, the device for assigning the weight vector ω = (ω1, ω2, …, ωn) is configured to use a cross-validation model during training.
According to an embodiment of the invention, the device for assigning the weight vector ω = (ω1, ω2, …, ωn) is configured to parallelize the training with Hadoop.
The specific configurations given above are only for exemplary purposes and do not restrict the protection scope of the invention in any way. Those skilled in the art will understand that various adaptive modifications can be made, within the spirit of the invention, to the elements, specific numbers, etc. of the above configurations; different configurations can be partly merged, deleted, or modified; and various configurations can be set up.
Those skilled in the art will also understand that the search-result sorting equipment of the invention is not only applicable to searching for people but, within the spirit of the invention, is also applicable to sorting other search results.
A computer device that can implement the invention is described below with reference to Fig. 8, which schematically shows a structural block diagram of a computing device according to an embodiment of the invention.
The computer system shown in Fig. 8 comprises a CPU (central processing unit) 801, RAM (random access memory) 802, ROM (read-only memory) 803, a system bus 804, a hard disk controller 805, a keyboard controller 806, a serial interface controller 807, a parallel interface controller 808, a display controller 809, a hard disk 810, a keyboard 811, a serial external device 812, a parallel external device 813, and a display 814. Of these components, the CPU 801, RAM 802, ROM 803, hard disk controller 805, keyboard controller 806, serial interface controller 807, parallel interface controller 808, and display controller 809 are connected to the system bus 804. The hard disk 810 is connected to the hard disk controller 805, the keyboard 811 to the keyboard controller 806, the serial external device 812 to the serial interface controller 807, the parallel external device 813 to the parallel interface controller 808, and the display 814 to the display controller 809.
The structural block diagram in Fig. 8 is shown only for the purpose of example and is not a limitation of the invention. In some cases, some of the devices can be added or removed as needed.
Furthermore, embodiments of the invention can be realized in hardware, software, or a combination of software and hardware. The hardware part can be realized with dedicated logic; the software part can be stored in a memory and executed by a suitable instruction execution system, for example a microprocessor or specially designed hardware. Those of ordinary skill in the art will appreciate that the above equipment and methods can be realized with computer-executable instructions and/or processor control code, such code being provided, for example, on a carrier medium such as a disk, CD, or DVD-ROM, on a programmable memory such as a read-only memory (firmware), or on a data carrier such as an optical or electronic signal carrier. The equipment of the invention and its modules can be realized by hardware circuits such as very-large-scale integrated circuits or gate arrays, semiconductors such as logic chips and transistors, or programmable hardware devices such as field-programmable gate arrays and programmable logic devices; they can also be realized by software executed by various types of processors, or by a combination of the above hardware circuits and software, for example firmware.
It should be noted that although several devices or sub-devices of the equipment are mentioned in the detailed description above, this division is not mandatory. In fact, according to embodiments of the invention, the features and functions of two or more devices described above can be embodied in a single device; conversely, the features and functions of one device described above can be further divided and embodied in a plurality of devices.
It should also be noted that although the operations of the method of the invention are described in a particular order in the drawings, this does not require or imply that these operations must be performed in that particular order, or that all of the operations shown must be performed, to achieve the desired result. On the contrary, the steps described in the flowchart may change their order of execution. Additionally or alternatively, some steps may be omitted, a plurality of steps may be merged into one step, and/or one step may be decomposed into a plurality of steps.
Although the invention has been described with reference to several embodiments, it should be understood that the invention is not limited to the disclosed embodiments. The invention is intended to cover the various modifications and equivalent arrangements included within the spirit and scope of the appended claims, the scope of which is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.

Claims (18)

1. the method that Search Results is sorted:
For each result, set up attribute vector X (x 1, x 2..., x n), n is natural number;
For each vector distributes weights omega (ω 1, ω 2..., ω n);
According to ω X, result is sorted.
2. method according to claim 1, wherein said attribute vector comprises at least one in local, primary school, middle school, university, specialty, unit, location.
3. method according to claim 2, wherein according to the correlativity of the attribute of result and searchers's same attribute, is the value between 0 to 1 by attribute vector value.
4. method according to claim 1, wherein trains the value of weight according to searchers's selection.
5. method according to claim 4, is wherein used linear classifier to train.
6. method according to claim 5, is wherein used support vector machine to train.
7. method according to claim 6, wherein introduces penalty factor, the value of training C according to searchers's selection for linearly inseparable.
8. method according to claim 4 has wherein been used cross validation model in training process.
9. method according to claim 4 wherein adopts Hadoop parallel training mode in training process.
10. Equipment for sorting search results, comprising:
a device for establishing, for each result, an attribute vector X(x₁, x₂, …, xₙ), where n is a natural number;
a device for assigning a weight vector ω(ω₁, ω₂, …, ωₙ) to each attribute vector; and
a device for sorting the results according to ω·X.
11. The equipment according to claim 10, wherein the attribute vector comprises at least one of: hometown, primary school, middle school, university, major, workplace, and current location.
12. The equipment according to claim 11, wherein the device for establishing an attribute vector X(x₁, x₂, …, xₙ) for each result sets each attribute value to a value between 0 and 1 according to the correlation between the attribute of the result and the same attribute of the searcher.
13. The equipment according to claim 10, wherein the device for assigning a weight vector ω(ω₁, ω₂, …, ωₙ) to each attribute vector is configured to train the weight values according to the searcher's selections.
14. The equipment according to claim 13, wherein the device for assigning a weight vector ω(ω₁, ω₂, …, ωₙ) is configured to use a linear classifier for the training.
15. The equipment according to claim 14, wherein the device for assigning a weight vector ω(ω₁, ω₂, …, ωₙ) is configured to use a support vector machine for the training.
16. The equipment according to claim 15, further comprising a device for training the value of a penalty factor C according to the searcher's selections, wherein the penalty factor C is introduced for the linearly inseparable case.
17. The equipment according to claim 13, wherein the device for assigning a weight vector ω(ω₁, ω₂, …, ωₙ) is configured to use a cross-validation model in the training process.
18. The equipment according to claim 13, wherein the device for assigning a weight vector ω(ω₁, ω₂, …, ωₙ) is configured to adopt a Hadoop parallel training mode in the training process.
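The ranking scheme of claims 1-3 can be sketched in a few lines of Python. The attribute names, the binary same-or-not correlation, and the weight values below are illustrative assumptions for the sketch, not details taken from the patent itself.

```python
# Sketch of claims 1-3: each result gets an attribute vector X whose entries
# lie in [0, 1] according to how the result's attributes correlate with the
# searcher's, and results are sorted by the weighted score omega . X.

ATTRS = ["hometown", "primary_school", "middle_school",
         "university", "major", "workplace", "location"]

def attribute_vector(result, searcher):
    """x_i = 1.0 if the result shares attribute i with the searcher, else 0.0.
    (A real system would use a graded correlation in [0, 1].)"""
    return [1.0 if result.get(a) is not None and result.get(a) == searcher.get(a)
            else 0.0
            for a in ATTRS]

def score(x, omega):
    """Weighted score omega . X used as the sort key."""
    return sum(w * xi for w, xi in zip(omega, x))

def rank(results, searcher, omega):
    """Sort results by descending omega . X."""
    return sorted(results,
                  key=lambda r: score(attribute_vector(r, searcher), omega),
                  reverse=True)

searcher = {"university": "U1", "location": "Beijing"}
results = [
    {"name": "A", "university": "U2", "location": "Beijing"},
    {"name": "B", "university": "U1", "location": "Beijing"},
    {"name": "C", "university": "U3", "location": "Shanghai"},
]
omega = [0.5, 0.3, 0.3, 1.0, 0.6, 0.8, 0.4]   # hypothetical trained weights
print([r["name"] for r in rank(results, searcher, omega)])  # ['B', 'A', 'C']
```

With these hypothetical weights, B (same university and city) outranks A (same city only), which outranks C (no shared attributes).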
CN201210279565.9A 2012-08-02 2012-08-02 Method and equipment of sorting search results Pending CN103577486A (en)
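The weight-training scheme of claims 4-7 (a linear classifier, specifically a support vector machine with a penalty factor C for the linearly inseparable case) can be sketched with a primal subgradient method over the searcher's selections. The toy click log, hyperparameters, and choice of optimizer here are illustrative assumptions; the patent does not specify a training algorithm.

```python
# Sketch of claims 4-7: the searcher's clicks (selected vs. skipped results)
# become labeled examples, and a soft-margin linear SVM trained by stochastic
# subgradient descent yields the weight vector omega. The penalty factor c
# trades margin width against misclassification of linearly inseparable data.
import random

def train_svm(xs, ys, c=1.0, lr=0.01, epochs=200, seed=0):
    """Primal soft-margin SVM: minimize ||w||^2 / 2 + c * sum of hinge losses."""
    rng = random.Random(seed)
    w = [0.0] * len(xs[0])
    b = 0.0
    idx = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(idx)
        for i in idx:
            margin = ys[i] * (sum(wj * xj for wj, xj in zip(w, xs[i])) + b)
            if margin < 1:   # inside the margin: hinge-loss subgradient step
                w = [wj - lr * (wj - c * ys[i] * xj)
                     for wj, xj in zip(w, xs[i])]
                b += lr * c * ys[i]
            else:            # correctly classified: regularization step only
                w = [wj * (1 - lr) for wj in w]
    return w, b

# Toy click log: +1 = result the searcher selected, -1 = result skipped.
# The first attribute is what actually drives the selections.
xs = [[1.0, 1.0], [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
ys = [1, 1, -1, -1]
w, b = train_svm(xs, ys, c=10.0)
# w[0] should come out much larger than w[1], i.e. the trained omega
# concentrates weight on the attribute the searcher's clicks depend on.
```

Claim 8's cross-validation would then select hyperparameters such as c, and claim 9's Hadoop mode would distribute this training over partitions of the click log.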

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210279565.9A CN103577486A (en) 2012-08-02 2012-08-02 Method and equipment of sorting search results

Publications (1)

Publication Number Publication Date
CN103577486A true CN103577486A (en) 2014-02-12

Family

ID=50049284

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210279565.9A Pending CN103577486A (en) 2012-08-02 2012-08-02 Method and equipment of sorting search results

Country Status (1)

Country Link
CN (1) CN103577486A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104331411A (en) * 2014-09-19 2015-02-04 华为技术有限公司 Item recommendation method and item recommendation device
CN104933149A (en) * 2015-06-23 2015-09-23 郑州悉知信息技术有限公司 Information searching method and information searching device
CN104991915A (en) * 2015-06-23 2015-10-21 郑州悉知信息技术有限公司 Information search method and apparatus

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101105795A (en) * 2006-10-27 2008-01-16 北京搜神网络技术有限责任公司 Network behavior based personalized recommendation method and system
CN102035891A (en) * 2010-12-17 2011-04-27 百度在线网络技术(北京)有限公司 Method and device for recommending friends in network friend making platform
CN102411754A (en) * 2011-11-29 2012-04-11 南京大学 Personalized recommendation method based on commodity property entropy
CN102419779A (en) * 2012-01-13 2012-04-18 青岛理工大学 Method and device for personalized searching of commodities sequenced based on attributes
CN102609523A (en) * 2012-02-10 2012-07-25 上海视畅信息科技有限公司 Collaborative filtering recommendation algorithm based on article sorting and user sorting

Similar Documents

Publication Publication Date Title
CN105210064B (en) Classifying resources using deep networks
CN105005589A (en) Text classification method and text classification device
Gao et al. Ar-tracker: Track the dynamics of mobile apps via user review mining
CN112883190A (en) Text classification method and device, electronic equipment and storage medium
CN102737126A (en) Classification rule mining method under cloud computing environment
CN102591940A (en) Map/Reduce-based quick support vector data description method and Map/Reduce-based quick support vector data description system
Fel et al. Xplique: A deep learning explainability toolbox
CN104519112A (en) Intelligent selecting framework for staged cloud manufacturing services
Ihde et al. A survey of big data, high performance computing, and machine learning benchmarks
CN103577486A (en) Method and equipment of sorting search results
WO2018010147A1 (en) User feed with professional and nonprofessional content
Li et al. Multi-fuzzy-objective graph pattern matching in big graph environments with reliability, trust and social relationship
Wang et al. Abnormal trajectory detection based on geospatial consistent modeling
CN110389932A (en) Electric power automatic document classifying method and device
Cunningham et al. Assessing network representations for identifying interdisciplinarity
CN110020214A (en) A kind of social networks streaming events detection system merging knowledge
Manaswini et al. Towards a novel strategic scheme for web crawler design using simulated annealing and semantic techniques
Tu An improving using MapReduce model in predicting learning ability of pupils based on Bayes classification algorithm.
CN114240560A (en) Product ranking method, device, equipment and storage medium based on multidimensional analysis
Berdeddouch et al. Recommender System for Most Relevant K Pick-Up Points
Kawan et al. Multiclass Resume Categorization Using Data Mining
Zhang et al. Learning geographical hierarchy features for social image location prediction
CN112148825B (en) User track data processing method and device, electronic equipment and storage medium
Riandari et al. Forecasting the Number of Students in Multiple Linear Regressions
Zhang et al. Research on application model of big data technology in the A1A2 system

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
EXSB Decision made by sipo to initiate substantive examination
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20140212