CN101751405A - Method and system for searching documents - Google Patents

Method and system for searching documents

Info

Publication number
CN101751405A
Authority
CN
China
Prior art keywords
label
document
term
weight
page
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN200810187106A
Other languages
Chinese (zh)
Inventor
杜磊
邓刚
高明
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to CN200810187106A
Publication of CN101751405A

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method and a system for searching documents. The method comprises: receiving a search term; querying labels related to the search term; searching for documents that each have at least one of those labels; ranking the documents; and sending the result of ranking the documents.

Description

Method and system for searching documents
Technical field
The present invention relates to information processing technology, and in particular to a method and system for searching documents.
Background art
With the rapid development of networks, more and more information is published on the Internet, which has made the Internet an important medium for publishing and searching information. Information processing on the Internet mainly involves three aspects: information publishing, information classification, and information search. Information publishers use the Internet or other related channels to publish information on information platforms according to certain rules, to post information freely, or to comment on published information. Information seekers search for relevant information on classified information platforms through the Internet or other related channels. Information classification means organizing published information so that it can be searched. Common search methods include searching by one or more keywords, searching again within the search results, and, for web page queries, querying directly by URL.
The search field has the following two problems. First, a document that meets the user's needs cannot be retrieved if the document itself does not contain the keywords of the user's query. When searching documents, current search engines only consider whether the document itself contains the keywords entered by the user. For example, an article may be about "China's innovation" while the words actually used in its content are "descendants of the Yellow Emperor" and "innovation and creation". A search for the term "China's innovation" with a current search engine will generally not find this article. Second, existing search engines cannot accurately rank the retrieved documents by their semantic relevance to the keywords entered by the user. Users need a search engine to find the documents that best match the semantics of the query keywords, but existing search engines have difficulty doing this. They rank documents either by the importance of the document or by the popularity of the document, and do not consider how well the document matches the user's query keywords semantically. Examples are ranking algorithms such as "PageRank", "most clicked", and "most keyword occurrences".
For example, consider a new article about "innovation and creation". Because few people have read the new article and few have cited it, current search engines rank it very low, so users have difficulty finding it, even though its content matches "innovation and creation" very well.
Therefore, existing search mechanisms need to be improved so that users are provided with documents that better meet their needs.
Summary of the invention
In view of the deficiencies of the prior art, the present invention provides a method for searching documents, comprising: receiving a search term; querying labels related to the search term; searching for documents having at least one of the labels; ranking the documents; and sending the result of ranking the documents.
The present invention also provides a system for searching documents, comprising: receiving means for receiving a search term; query means for querying labels related to the search term; search means for searching for documents having at least one of the labels; ranking means for ranking the documents; and sending means for sending the result of ranking the documents.
Brief description of the drawings
Fig. 1 is a flowchart of a method for searching documents according to one embodiment of the present invention.
Fig. 2 is a flowchart of a method for searching documents according to another embodiment of the present invention.
Fig. 3 is a flowchart of ranking documents according to another embodiment of the present invention.
Fig. 4 is a flowchart of updating the term/label relation table according to one embodiment of the present invention.
Fig. 5 is a block diagram of a system for searching documents according to one embodiment of the present invention.
Detailed description
The present invention is described below with reference to methods and apparatus according to embodiments of the invention. Each block of the flowcharts and/or block diagrams, and combinations of blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, special-purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, when executed by the computer or other programmable data processing apparatus, create means for implementing the functions/operations specified in the blocks of the flowcharts and/or block diagrams.
These computer program instructions may also be stored in a computer-readable medium that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable medium produce an article of manufacture including instruction means that implement the functions/operations specified in the blocks of the flowcharts and/or block diagrams.
The computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable data processing apparatus, producing a computer-implemented process, such that the instructions executed on the computer or other programmable apparatus provide processes for implementing the functions/operations specified in the blocks of the flowcharts and/or block diagrams.
Fig. 1 is a flowchart of a method 100 for searching documents according to one embodiment of the present invention. The method comprises the following steps.
At step S110, a search term (Docuterm) is received. A search term is a word or phrase that describes the subject matter or concepts of a document, and it can serve as a data name for later retrieval. The search term may be a keyword or subject heading, or another term derived from the content of a document. It may also be a word or phrase obtained by decomposing the user's query input. In addition, in order to search for documents related to a given document, search terms may be derived from the content of that given document. As those skilled in the art will appreciate, there are several ways to generate search terms for a given text fragment. First, the author of the document can enter the search terms that he or she considers closest to the text fragment. Second, search terms can be generated with any of a number of existing algorithms. For example, the tf-idf algorithm can be used to select keywords from a given text fragment.
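As a rough illustration of the tf-idf option mentioned above, the following sketch scores the tokens of a text fragment against a small corpus and keeps the highest-scoring tokens as search terms. The function name `tfidf_terms`, the whitespace tokenizer, the smoothed idf formula, and the sample corpus are all assumptions made for illustration; the patent does not specify any of them.

```python
import math
from collections import Counter


def tfidf_terms(fragment, corpus, top_k=3):
    """Return the top_k tokens of `fragment`, ranked by tf-idf against `corpus`."""
    tokens = fragment.lower().split()          # naive whitespace tokenizer (assumption)
    tf = Counter(tokens)
    n_docs = len(corpus)
    doc_tokens = [set(doc.lower().split()) for doc in corpus]
    scores = {}
    for token, count in tf.items():
        df = sum(1 for d in doc_tokens if token in d)
        # Smoothed idf; one of several common variants (assumption).
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        scores[token] = (count / len(tokens)) * idf
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda t: (-scores[t], t))[:top_k]


corpus = [
    "the search engine ranks the documents",
    "the user tags the documents",
    "innovation drives the economy",
]
terms = tfidf_terms("the innovation search engine rewards innovation", corpus, top_k=2)
```

Tokens that are frequent in the fragment but rare in the corpus score highest, so a common word such as "the" is not selected here.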
At step S120, labels related to the search term are queried. A label (tag) may comprise a word or phrase, or a group of words or phrases, attached to a document. Labels can be used to classify documents and so make them easier to query. Labels differ from conventional classification in that labels are built from the bottom up, whereas conventional classification is generally built from the top down. In conventional classification, the designers of the classification define the classification vocabulary in advance and then classify each document according to that vocabulary, so the classification resembles a catalog. With label-based classification, each user can attach arbitrary labels to a document according to his or her own understanding of the document and search needs, and can use labels without having to consider a catalog structure when classifying articles. The same document can be given any number of labels by any number of people, and the same label can be attached to different documents. Labels may stand in a parallel (flat) relationship to one another, but labels that frequently occur together can also be associated through correlation analysis, which yields a correlation-based classification. Because labels can be used by everyone and are distilled from collective wisdom, many labels classify documents accurately and help other users find documents. This is the main advantage of using labels.
To make better use of the classifying effect of labels and so search more effectively, the labels can be managed. For example, the labels on a document can be filtered according to the semantics of the label content and its relation to the semantics of the document to which the label is attached. Labels can also be filtered, or label weights can be set, according to user feedback on documents, so that the documents can be ranked.
According to another embodiment of the present invention, a term/label relation table may be established in advance and used to query the labels related to the search term. Term/label pairs may be established in the term/label relation table, and term/label/document combinations may also be established; a score is determined for each established term/label pair, and a weight is determined for each established term/label/document combination. In the present invention, a term/label pair comprises a search term and a label associated with that search term, so that the label can be found by the search term. A term/label/document combination comprises a term/label pair together with a document associated with that pair (or the document's identifier, URL link, or the like), where the document has been given the label and can be found by that label.
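One possible in-memory layout for such a term/label relation table is sketched below. The class name `RelationTable`, its method names, the default score and weight of 1.0, and the example labels and document identifiers are hypothetical; the patent describes the pairs, combinations, scores, and weights but not a concrete data structure.

```python
from dataclasses import dataclass, field


@dataclass
class RelationTable:
    """Hypothetical store for term/label pairs and term/label/document combinations."""
    # (term, label) -> score of the term/label pair
    pair_score: dict = field(default_factory=dict)
    # (term, label, doc_id) -> weight of the term/label/document combination
    combo_weight: dict = field(default_factory=dict)

    def add_pair(self, term, label, score=1.0):
        # Keep the existing score if the pair is already present.
        self.pair_score.setdefault((term, label), score)

    def add_combo(self, term, label, doc_id, weight=1.0):
        # A combination always implies the corresponding pair.
        self.add_pair(term, label)
        self.combo_weight.setdefault((term, label, doc_id), weight)

    def labels_for(self, term):
        """Labels that can be found by the given search term."""
        return sorted(l for (t, l) in self.pair_score if t == term)

    def docs_for(self, term, label):
        """Documents associated with the given term/label pair."""
        return sorted(d for (t, l, d) in self.combo_weight if (t, l) == (term, label))


table = RelationTable()
table.add_combo("OOM", "out-of-memory", "doc-A")
table.add_pair("OOM", "java-heap")
```

Keying the two dictionaries by tuples keeps the pair score and the combination weight separate, which matters later because the two values are updated and used independently.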
According to another embodiment of the present invention, querying the labels related to the search term in step S120 may be further configured as querying the labels related to the search term according to the term/label relation table. The method further comprises updating the term/label pairs in the term/label relation table according to feedback on the document. The at least one label may comprise a word or phrase, or a group of words or phrases, attached to a document in order to classify it. The word or phrase may be semantically related to the document to which the label is attached.
According to another embodiment of the present invention, the update proceeds as follows: if the term/label relation table does not contain a term/label pair between the search term and a label of the document, the corresponding term/label pair is established. Before the update, labels that are not used for classification, or labels that have no semantic relation to the document content, may first be filtered out. Then, for each remaining label of the document, if the term/label relation table does not contain a term/label pair between the search term and that label, the corresponding pair is established. Updating the term/label relation table according to feedback on the document in this way ensures that it contains the term/label pairs between the search term and the labels of the document, which widens the search scope in later searches.
At step S130, documents having at least one of the labels are searched for. At step S140, the documents are ranked. Any prior-art document ranking mechanism may be used, for example the ranking mechanisms adopted by popular Internet search engines. The documents may also be ranked according to predetermined weights of label/document combinations. Labels are used to classify documents, and a large amount of feedback on a document is reflected in the weight of the label/document combination, which indirectly reflects the value, relevance, validity, or degree of approval that users assign to the label's classification of the document. Because feedback on a document is reflected in the weight of the label/document combination, that weight reflects how much the association between the label and the document helps user queries. Likewise, feedback on a document (its value, relevance, validity, or degree of approval to the user for a particular search) is reflected in the weight of the term/document combination, which reflects how much the search term's classification of the document helps user queries. In this way, after the search scope has been widened, the documents that users are more likely to approve of (or find more meaningful) are recommended first through ranking.
According to one embodiment of the present invention, the above method may further comprise updating the weight of the label/document combination according to feedback on the document. The weight of the label/document combination may be updated according to the number of times feedback has been given on the document.
The documents may also be ranked according to predetermined scores of term/label pairs and predetermined weights of term/label/document combinations. The score of a term/label pair is determined from the feedback of a large number of users and indirectly reflects how much the association between the search term and the label helps user queries. Feedback on a document is reflected in the weight of the term/label/document combination, which reflects how much the association represented by that combination helps user queries.
Those skilled in the art will appreciate that a first kind of ranking can be performed on the documents according to predetermined weights of label/document combinations or weights of term/document combinations, and that a second kind of ranking can be performed according to predetermined scores of term/label pairs together with predetermined weights of term/label/document combinations or weights of term/document combinations. For the first kind of ranking, user feedback on a document is reflected directly in the weight of the label/document combination or the weight of the term/document combination. For the second kind of ranking, user feedback on a document is reflected in the score of the term/label pair and in the weight of the term/label/document combination or the weight of the term/document combination. A single piece of feedback on a document may reflect how much the label classification associated with the document helps the user's search. A large amount of feedback on a document can therefore reflect how much that label classification helps user searches in general.
At step S150, the result of ranking the documents is sent.
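Steps S110 to S150 can be sketched end to end as follows, using the first kind of ranking (by label/document weight). The function name `search`, the dictionary layouts, and the sample data are hypothetical; summing the weights of the matched labels is only one plausible way to combine them, since the patent does not fix a formula.

```python
def search(term, term_labels, doc_labels, label_doc_weight):
    """Return document ids ranked for `term` (steps S110 to S150)."""
    # S120: query the labels related to the received search term.
    labels = term_labels.get(term, set())
    # S130: find documents that carry at least one of those labels.
    hits = [d for d, tags in doc_labels.items() if tags & labels]

    def weight(doc_id):
        # Assumed scoring: sum of label/document weights over the matched labels.
        return sum(label_doc_weight.get((label, doc_id), 0.0) for label in labels)

    # S140: rank by weight, highest first; S150: the ranked list is returned to the caller.
    return sorted(hits, key=lambda d: (-weight(d), d))


term_labels = {"OOM": {"out-of-memory", "heap"}}
doc_labels = {
    "doc-1": {"heap", "tuning"},
    "doc-2": {"out-of-memory", "heap"},
    "doc-3": {"networking"},
}
label_doc_weight = {
    ("heap", "doc-1"): 1.0,
    ("heap", "doc-2"): 2.0,
    ("out-of-memory", "doc-2"): 3.0,
}
result = search("OOM", term_labels, doc_labels, label_doc_weight)
```

Here `doc-3` is never returned because it carries none of the labels related to the search term, and `doc-2` outranks `doc-1` because its matched labels carry more accumulated weight.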
According to another embodiment of the present invention, updating the term/label pairs in the term/label relation table is further configured as follows: according to feedback on the document, if the term/label relation table does not contain a term/label pair between the search term and a label of the document, the corresponding term/label pair is established. The method further comprises establishing the corresponding term/label/document combination, determining the score of the established term/label pair, and determining the weight of the established term/label/document combination.
According to one embodiment of the present invention, the above method may further comprise updating the score of the term/label pair and the weight of the term/label/document combination according to feedback on the document. The score of the term/label pair and the weight of the term/label/document combination may be updated according to the number of times feedback has been given on the document. The scores of the term/label pairs related to the labels attached to the document may also be updated according to the value of the user's feedback, for example whether the user considers the document very good, good, or poor for the search term or label used.
According to feedback on the document, the weight of the predetermined term/label/document combination related to the document may also be updated. The update may be based on the number of feedback events, a score (1 to 100 points), a rating (very good, good, average, or poor), or an assessment (relevant or irrelevant) that the user gives to the document with respect to the search term and/or label used to find it.
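The different feedback forms listed above could be normalized into a single adjustment before a weight is updated. The sketch below shows one such mapping; the name `feedback_delta`, the tuple encoding of feedback, and every numeric value are assumptions chosen for illustration and do not come from the patent.

```python
# Assumed increments for the four rating levels.
RATING_DELTA = {"very good": 2.0, "good": 1.0, "average": 0.0, "poor": -1.0}


def feedback_delta(feedback):
    """Map a (kind, value) feedback tuple to a signed weight adjustment."""
    kind, value = feedback
    if kind == "count":        # each feedback event counts once
        return float(value)
    if kind == "score":        # 1 to 100 points, centered on 50 (assumption)
        return (value - 50) / 50.0
    if kind == "rating":       # very good / good / average / poor
        return RATING_DELTA[value]
    if kind == "relevance":    # relevant / irrelevant, encoded as True / False
        return 1.0 if value else -1.0
    raise ValueError(f"unknown feedback kind: {kind!r}")


weight = 1.0
for fb in [("rating", "very good"), ("score", 75), ("relevance", False)]:
    weight += feedback_delta(fb)
```

Normalizing first keeps the update rule for the weight itself independent of how the feedback was collected.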
User feedback is usually based on the user's understanding of the document content. Therefore, the score of the term/label pair related to a label and the weight of the term/label/document combination can reflect the user's approval of the association between the label and the content and semantics of the document. Consequently, using the relation between search terms and labels (the KL relation) and the relation between KL relations and documents (the KL-D relation) captures users' search experience and the wisdom of all users, and helps a user find the documents that best match the semantics of the entered query. A label in a term/label pair with a high score, or in a term/label/document combination with a high weight, as determined by feedback, may have one or more of the following features: it is a keyword or group of keywords; it is associated with a document; and it is a classifying description based on the document content.
According to one embodiment of the present invention, the above method may further comprise searching for documents containing the search term. Ranking the documents is then further configured as ranking the documents according to predetermined weights of label/document combinations or weights of term/document combinations. Ranking the documents may also be further configured as ranking the documents according to predetermined scores of term/label pairs together with predetermined weights of term/label/document combinations or weights of term/document combinations. The weight of the term/document combination may also be updated according to feedback on the document.
Fig. 2 is a flowchart of a method for searching documents according to another embodiment of the present invention. At step S210, a search term is received. At step S220, labels related to the search term are queried. At step S230, documents that contain the search term or have at least one of the labels are searched for. Specifically, at step S230A, documents containing the search term are searched for: the user's query term is submitted to the search engine, and the search engine finds the set of all documents containing the query keyword (set A1). At step S230B, documents having at least one of the labels are searched for. This can be done by matching the labels against the labels attached to documents: using the labels obtained in step S220, the set of all documents marked with one of those labels is found (set A2). Finally, sets A1 and A2 are merged to obtain the set of all qualifying documents (set A).
At step S240, the documents are ranked. Any prior-art document ranking mechanism may be used, for example the ranking mechanisms adopted by popular Internet search engines. The documents may also be ranked according to the predetermined weights of label/document combinations and the weights of keyword/document combinations. A keyword/document combination comprises a search term and a document associated with that search term (or the document's identifier or URL link), where the document has been given the label and can be found by that search term. The predetermined weight of a label/document combination may be determined in advance from existing user feedback. At step S250, the result of ranking the documents is sent.
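The construction of sets A1, A2, and A in steps S230A and S230B can be sketched as below. The function name `search_documents`, the substring match used for keyword search, and the sample documents are assumptions; a real search engine would use an index rather than scanning text.

```python
def search_documents(term, related_labels, doc_text, doc_labels):
    """Return (A1, A2, A): keyword hits, label hits, and their union."""
    wanted = set(related_labels)
    # S230A: documents whose text contains the search term
    # (case-insensitive substring match, a simplification).
    a1 = {d for d, text in doc_text.items() if term.lower() in text.lower()}
    # S230B: documents marked with at least one of the related labels.
    a2 = {d for d, labels in doc_labels.items() if wanted & set(labels)}
    # Merge A1 and A2 to obtain all qualifying documents.
    return a1, a2, a1 | a2


doc_text = {
    "doc-1": "Diagnosing an OOM error in a Java server",
    "doc-2": "Tuning heap size for long-running services",
    "doc-3": "Configuring a network proxy",
}
doc_labels = {
    "doc-1": {"java"},
    "doc-2": {"out-of-memory"},
    "doc-3": {"networking"},
}
a1, a2, a = search_documents("OOM", ["out-of-memory"], doc_text, doc_labels)
```

In this sample, `doc-2` never mentions the search term in its text and is found only through its label, which is the case the label-based search is meant to cover.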
At step S260, feedback that the user gives on a document after browsing the ranking result is received. At step S270, the labels marked on the document that received the user feedback are queried. The labels may be filtered so as to select those that are used for classification. A label used for classification generally comprises a word or phrase, or a group of words or phrases, attached to a document in order to classify it. For example, the semantic relevance between the document and each label on the document can be computed and expressed as a numeric value. A threshold is set in advance; if the relevance score is below the threshold, the label is considered to have little or no semantic relevance to the document it marks, and labels with low semantic relevance are thereby filtered out.
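The threshold filter described above can be sketched as follows. The patent does not say how semantic relevance is computed, so the token-overlap measure in `label_relevance` is a deliberately crude stand-in; both function names, the threshold of 0.5, and the sample labels are assumptions.

```python
def label_relevance(label, document_text):
    """Fraction of the label's tokens that occur in the document (stand-in measure)."""
    label_tokens = set(label.lower().split())
    doc_tokens = set(document_text.lower().split())
    if not label_tokens:
        return 0.0
    return len(label_tokens & doc_tokens) / len(label_tokens)


def filter_labels(labels, document_text, threshold=0.5):
    """Keep only labels whose relevance score reaches the preset threshold."""
    return [l for l in labels if label_relevance(l, document_text) >= threshold]


text = "java heap memory error when the server starts"
kept = filter_labels(["java heap", "to read", "memory"], text)
```

A personal, non-classifying label such as "to read" scores zero against the document and is dropped, while the labels that describe the content are kept.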
At step S280, the term/label relation table is updated according to the user feedback. The table is updated according to feedback on the document so that it contains the term/label pairs between the search term and the labels of the document. For example, a term/label pair that is not yet in the table is established between the search term and a label of the document. This widens the search scope in later searches. Before the update, labels that are not used for classification, or labels that have no semantic relation to the document content, may first be filtered out. Then, for each label of the document that is used for classification, if the term/label relation table does not contain a term/label pair between the search term and that label, the corresponding pair is established.
Feedback on a document is reflected in the weight of the label/document combination, which accordingly reflects how much the association between the label and the document helps user queries. In this way, after the search scope has been widened, the documents are ranked according to the weights of label/document combinations, and the documents that users are more likely to approve of (the more meaningful ones) are recommended first.
According to one embodiment of the present invention, the above method may further comprise updating the weight of the label/document combination according to feedback on the document. The weight of the label/document combination may be updated according to the number of times feedback has been given on the document.
Fig. 3 is a flowchart of ranking documents according to another embodiment of the present invention. For a plurality of documents obtained by label search, at step S310 the term/label pairs and term/label/document combinations that match the entered search keyword and the retrieved documents are found. At step S320, the term/label relation table is queried to obtain the predetermined scores of the term/label pairs. At step S330, the term/label relation table is queried to obtain the predetermined weights of the term/label/document combinations. At step S340, a ranking score is computed for each document from the predetermined scores of the term/label pairs and the predetermined weights of the term/label/document combinations. At step S350, the retrieved documents are sorted by their ranking scores.
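Steps S310 to S350 can be sketched as below. The patent states that the ranking score depends on the pair score and the combination weight but gives no formula, so the sum of products used here is an assumption, as are the function names and the sample values.

```python
def ranking_score(term, doc_id, doc_labels, pair_score, combo_weight):
    """Assumed ranking score: sum over labels of pair score times combination weight."""
    total = 0.0
    for label in doc_labels[doc_id]:                   # S310: matching pairs and combinations
        s = pair_score.get((term, label))              # S320: term/label pair score
        w = combo_weight.get((term, label, doc_id))    # S330: term/label/document weight
        if s is not None and w is not None:
            total += s * w                             # S340: accumulate the ranking score
    return total


def rank(term, doc_ids, doc_labels, pair_score, combo_weight):
    """S350: sort documents by ranking score, highest first."""
    return sorted(
        doc_ids,
        key=lambda d: (-ranking_score(term, d, doc_labels, pair_score, combo_weight), d),
    )


doc_labels = {"A": {"heap", "jvm"}, "B": {"heap", "misc"}}
pair_score = {("OOM", "heap"): 2.0, ("OOM", "jvm"): 1.0}
combo_weight = {("OOM", "heap", "A"): 3.0, ("OOM", "jvm", "A"): 1.0, ("OOM", "heap", "B"): 1.0}
order = rank("OOM", ["B", "A"], doc_labels, pair_score, combo_weight)
```

With these values document A scores 2.0 × 3.0 + 1.0 × 1.0 = 7.0 and document B scores 2.0 × 1.0 = 2.0, because the label "misc" has no pair with the search term and contributes nothing.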
Fig. 4 is a flowchart of updating the term/label relation table according to one embodiment of the present invention. At step S410, user feedback on a document is received. At step S420, the term/label relation table is queried for the term/label pair corresponding to the search term and a label attached to the document. If the pair is not found, step S430 is executed; if it is found, step S440 is executed. At step S430, because the term/label relation table does not contain the term/label pair between the search term and the label of the document, the corresponding term/label pair is established and its score is determined, and the corresponding term/label/document combination is established and its weight is determined. At step S440, the score of the term/label pair is updated according to the feedback on the document. At step S450, the term/label pair from step S420 is used to check whether the term/label relation table contains the corresponding term/label/document combination. If it is not found, step S460 is executed; if it is found, step S470 is executed. At step S460, because the term/label relation table does not contain the corresponding term/label/document combination, the combination is established and its weight is determined. At step S470, the weight of the term/label/document combination is updated according to the feedback on the document.
The score of a term/label pair reflects the relevance between the search term and the label; the weight of a term/label/document combination reflects the relevance between the term/label pair and the document.
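The branching in Fig. 4 can be sketched as a single update function applied to every label of the document that received feedback. The function name `apply_feedback`, the initial value of 1.0, and the fixed increment `step` are assumptions; the patent only says that scores and weights are "determined" and "updated".

```python
def apply_feedback(term, doc_id, doc_labels, pair_score, combo_weight,
                   initial=1.0, step=1.0):
    """Update pair scores and combination weights after feedback on doc_id (S410)."""
    for label in doc_labels[doc_id]:
        pair = (term, label)
        combo = (term, label, doc_id)
        if pair not in pair_score:        # S420: pair not found
            pair_score[pair] = initial    # S430: establish pair and determine its score
            combo_weight[combo] = initial # S430: establish combination and its weight
            continue
        pair_score[pair] += step          # S440: update the pair score
        if combo not in combo_weight:     # S450: combination not found
            combo_weight[combo] = initial # S460: establish combination
        else:
            combo_weight[combo] += step   # S470: update combination weight


doc_labels = {"A": {"heap", "jvm"}, "B": {"heap"}}
pair_score, combo_weight = {}, {}
apply_feedback("OOM", "A", doc_labels, pair_score, combo_weight)  # creates pairs and combos
apply_feedback("OOM", "A", doc_labels, pair_score, combo_weight)  # updates both
apply_feedback("OOM", "B", doc_labels, pair_score, combo_weight)  # pair exists, combo is new
```

The third call exercises the S460 branch: the pair (OOM, heap) already exists from the feedback on document A, so its score rises again, while the combination for document B is created with the initial weight.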
When a user gives positive feedback on one document (document A) in the set of documents retrieved for a search term, the feedback indicates that the user considers the content of document A relevant to the submitted search term. Other documents whose content is similar to that of document A can then also be considered relevant to the submitted search term. Because labels used for classification reflect the document content, if another document shares one or more labels with document A, that document is similar in content to document A, and the more labels they share, the more similar they are. Therefore, not only the relevance weight of the combination of search term and document can be maintained, but also the relevance score between search term and label.
For example, suppose documents A, B, and C are retrieved for the search term OOM. Document A shares 5 labels with document B and 1 label with document C. The weight of the term/label/document combination is updated according to feedback on document A. This weight directly reflects the user's approval of the document as retrieved through the term/label pair.
When the score of the term/label pair is updated according to feedback on the document, the user's feedback is passed on indirectly to other documents that have the same label. The more labels two documents share, the greater their relevance and the more similar their content is likely to be. Through this feedback, when documents are later searched with the same keyword, the more labels a document shares with the document that received the feedback, the more its ranking score increases.
After the update based on feedback, when a user later searches with the search term OOM, the ranking score of document A increases the most, the ranking score of document B increases more than that of document C, and the ranking score of document C increases the least.
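The OOM example can be checked numerically with a small sketch. The label names, the starting value of 1.0 for every score and weight, the increment of 1.0 per feedback, and the sum-of-products ranking score are all assumptions; only the label overlap (A shares 5 labels with B and 1 with C) comes from the text.

```python
term = "OOM"
doc_labels = {
    "A": {"l1", "l2", "l3", "l4", "l5"},
    "B": {"l1", "l2", "l3", "l4", "l5"},   # shares 5 labels with A
    "C": {"l1", "x"},                      # shares 1 label with A
}
# Every pair score and combination weight starts at 1.0 (assumption).
pair_score = {(term, l): 1.0 for labels in doc_labels.values() for l in labels}
combo_weight = {(term, l, d): 1.0 for d, labels in doc_labels.items() for l in labels}


def score(doc):
    """Assumed ranking score: sum of pair score times combination weight."""
    return sum(pair_score[(term, l)] * combo_weight[(term, l, doc)]
               for l in doc_labels[doc])


before = {d: score(d) for d in doc_labels}

# Positive feedback on document A: raise the score of each of A's term/label pairs
# and the weight of each of A's term/label/document combinations.
for l in doc_labels["A"]:
    pair_score[(term, l)] += 1.0
    combo_weight[(term, l, "A")] += 1.0

after = {d: score(d) for d in doc_labels}
gain = {d: after[d] - before[d] for d in doc_labels}
```

Under these assumptions the gains are 15.0 for A, 5.0 for B, and 1.0 for C: B and C benefit only through the shared pair scores, in proportion to the number of labels they share with A.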
Specifically, according to one embodiment of the present invention, the degree of match between a document and a search term is computed from the relevance score between the search term (for example, a keyword) and each semantic label on the document, and from the relevance weight between the term/label pair and the document. When a user gives feedback on a document (document A), the relevance scores between the search term and all labels on document A are adjusted, and this adjustment affects the computed relevance between the search term and other documents that carry those semantic labels. The more labels another document shares with document A, the greater its computed relevance to the search term.
In this way, the information in a single piece of user feedback is used to the fullest. Not only is the relevance between the search term and the document that the user directly gave feedback on adjusted, but, through the term/label pairs, the relevance between the search term and other documents whose content is similar to that document is adjusted as well.
If only the weight of the combination of search term and document were updated, the relevance between the search term and other documents with similar content could not be captured. Therefore, updating both the score of the term/label pair and the weight of the term/label/document combination according to feedback is better than updating only the weight of the term/document combination, and also better than updating only the weight of the term/label/document combination.
Below, concrete examples describe in detail how the scores of term/label pairs and the weights of term/label/document combinations, or the weights of term/document combinations, are determined. The examples also describe in detail how documents are ranked according to the predetermined scores of term/label pairs and the predetermined weights of term/label/document combinations, or the weights of term/document combinations.
According to one embodiment of the invention, the relation between search terms and labels is built up and continually adjusted through the ongoing feedback of a large number of users. It is dynamic and is the product of collective wisdom, so it comes closer to matching semantics and search intent. When computing the relevance between a user's query term and a document, labels can serve as intermediaries: one relation is established between terms and labels, and another between term/label pairs and documents.
These two relations help users find more relevant documents. A search covers not only documents that contain the user's query term and documents labeled with the query term itself; the term/label relation established by the invention is also used to obtain labels related to the term, and documents carrying those labels are retrieved as well.
Through the two relations, a single piece of feedback on one document extends to other documents that share its labels, which makes user feedback more efficient. The user's feedback on a document found with a query term is converted into feedback on the relation between the query term and the labels, and into feedback on the relation between the term/label pairs and the document. As a result, the current feedback updates not only the degree of match between the query term and the document the user directly rated, but also the degree of match between the term and the other documents that share labels with it. This greatly improves the efficiency of user feedback.
With continued feedback, the relation store keeps expanding. The expansion happens spontaneously as users give feedback and thereby reflects collective wisdom. Positive feedback from users can be accepted to raise a document's rank, and negative feedback can be accepted to lower it. After a large amount of positive or negative feedback, the precision of document ranking can improve significantly.
Label-based ranking and text-based ranking can also be merged. If there is no label between a term and a document but a user has given positive or negative feedback on the document, a relation can be established directly between the term and the document and take part in the overall scoring and ranking. Various concrete computation methods are possible.
Suppose there are five documents A, B, C, D, and F. These documents contain the text terms Key1, Key2, and Key3 and carry the labels Tag1, Tag2, Tag3, and Tag4, as shown in the table below.
Web page | Contains term (With Text) | Labels (Tag)
Page A (URL A) | Key1 (OOM) | Tag1 (OOM), Tag2 (crash), Tag3 (memory)
Page B (URL B) | Key2 (Out of memory) | Tag1 (OOM), Tag3 (memory), Tag4 (Out of memory)
Page C (URL C) | Key3 (memory) | Tag1 (OOM), Tag4 (Out of memory)
Page D (URL D) | Key2 (Out of memory) | Tag4 (Out of memory)
Page F (URL F) | Key1 (OOM) | No Tag
A database can be used to store the term/label pairs and their scores, the term/label/document combinations and their weights, and the term/document combinations and their weights. The initialized (empty) database is as follows.
Term/label relation table 1
Term/label pair number (Pair Number) | Term → label (Docuterm to Tag) | Score (Score)
Term/label relation table 2
Term/label pair/document combination (Docuterm/Tag pair to page combination) | Weight (weight)
Term/label relation table 3
Term (Docuterm) | Document (document) | Weight (weight)
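As a rough sketch of how the three relation tables might be held in a database, the following uses an in-memory SQLite database. The table and column names (term_label_pair, pair_document, term_document, and so on) are illustrative assumptions; the text does not prescribe a schema or a particular database.

```python
import sqlite3

# Hypothetical schema for the three relation tables; every table and
# column name here is an assumption made for illustration only.
SCHEMA = """
CREATE TABLE term_label_pair (      -- relation table 1
    pair_id INTEGER PRIMARY KEY,
    term    TEXT NOT NULL,
    label   TEXT NOT NULL,
    score   INTEGER NOT NULL,
    UNIQUE (term, label)
);
CREATE TABLE pair_document (        -- relation table 2
    pair_id  INTEGER NOT NULL REFERENCES term_label_pair(pair_id),
    document TEXT NOT NULL,
    weight   INTEGER NOT NULL,
    PRIMARY KEY (pair_id, document)
);
CREATE TABLE term_document (        -- relation table 3
    term     TEXT NOT NULL,
    document TEXT NOT NULL,
    weight   INTEGER NOT NULL,
    PRIMARY KEY (term, document)
);
"""


def create_relation_db():
    """Return an in-memory database holding the three empty relation tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


conn = create_relation_db()
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name")]
row_counts = {
    name: conn.execute(f"SELECT COUNT(*) FROM {name}").fetchone()[0]
    for name in tables
}
```

All three tables start empty, matching the initialized state shown above; rows appear only once users begin to give feedback.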
(1) First search: for the search term Key1 (OOM) submitted by user 1, the search engine searches the documents (for example, web pages) and obtains the following search results.
Web page
Page A
Page F
……
User 1 browses Page A and Page F and submits positive feedback on Page A. Term/label pairs are then established according to user 1's positive feedback, and an initial score is set for each pair, for example the score f1(X) = X, where X is the number of positive feedbacks.
The user's feedback on Page A establishes the following relations between the term and the labels. The relation between terms and labels is built up and continually adjusted through the ongoing feedback of many users, so it comes closer to matching semantics and search intent.
Term/label relation table 1
Term/label pair number (Pair Number) | Term → label (Docuterm to Tag) | Score (Score)
Pair 1 | Key1 → Tag1 (OOM) | f1(1) = 1
Pair 2 | Key1 → Tag2 (crash) | f1(1) = 1
Pair 3 | Key1 → Tag3 (memory) | f1(1) = 1
Term/label/document combinations can also be established according to user 1's positive feedback, and an initial weight is set for each combination, for example the weight f2(X) = X + 1, where X is the number of positive feedbacks.
Term/label relation table 2
Term/label pair/document combination (Docuterm/Tag pair to page combination) | Weight (weight)
Pair 1 → Page A | f2(1) = 2
Pair 2 → Page A | f2(1) = 2
Pair 3 → Page A | f2(1) = 2
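The update triggered by one positive feedback can be sketched as follows. This is a minimal illustration, assuming the example functions f1(X) = X and f2(X) = X + 1 and keeping the feedback counts in plain dictionaries; the names positive_feedback, pair_count, and combo_count are hypothetical.

```python
# Labels attached to each page, taken from the example table.
PAGE_LABELS = {
    "Page A": ["OOM", "crash", "memory"],
    "Page B": ["OOM", "memory", "Out of memory"],
    "Page C": ["OOM", "Out of memory"],
    "Page D": ["Out of memory"],
    "Page F": [],
}


def f1(x):
    """Score of a term/label pair after x positive feedbacks: f1(X) = X."""
    return x


def f2(x):
    """Weight of a term/label/document combination: f2(X) = X + 1."""
    return x + 1


def positive_feedback(term, page, pair_count, combo_count):
    """Record one positive feedback on `page` for a search on `term`.

    pair_count[(term, label)]        counts feedback behind a pair score.
    combo_count[(term, label, page)] counts feedback behind a combination weight.
    Entries that do not exist yet are created on first feedback.
    """
    for label in PAGE_LABELS[page]:
        pair_count[(term, label)] = pair_count.get((term, label), 0) + 1
        key = (term, label, page)
        combo_count[key] = combo_count.get(key, 0) + 1


pair_count, combo_count = {}, {}
positive_feedback("OOM", "Page A", pair_count, combo_count)
pair_scores = {key: f1(n) for key, n in pair_count.items()}
combo_weights = {key: f2(n) for key, n in combo_count.items()}
```

After the single feedback on Page A, pair_scores holds three pairs with score 1 and combo_weights holds three combinations with weight 2, as in relation tables 1 and 2 above.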
Second search: user 2 submits the search term Key1 (OOM). Querying term/label relation table 1 for labels related to the term yields the following labels.
Tag1 (OOM)
Tag2 (crash)
Tag3 (memory)
Searching for documents that contain the term or carry at least one of these labels yields the following search results. Whereas originally only Page A and Page F could be found, after the first positive feedback Page B and Page C are found as well. With labels as intermediaries, more relevant documents can be found.
Page A
Page B
Page C
Page F
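The label lookup and the expanded search can be sketched as below. This is an illustration only: "contains the term" is simplified to an exact match against the single text term listed for each page, and the names PAGES, related_labels, and search are hypothetical.

```python
PAGES = {
    # page: (text term contained in the page, labels attached to the page)
    "Page A": ("OOM", {"OOM", "crash", "memory"}),
    "Page B": ("Out of memory", {"OOM", "memory", "Out of memory"}),
    "Page C": ("memory", {"OOM", "Out of memory"}),
    "Page D": ("Out of memory", {"Out of memory"}),
    "Page F": ("OOM", set()),
}

# Relation table 1 after the first positive feedback on Page A.
PAIR_SCORES = {("OOM", "OOM"): 1, ("OOM", "crash"): 1, ("OOM", "memory"): 1}


def related_labels(term, pair_scores):
    """Labels that have a term/label pair with the given search term."""
    return {label for (t, label) in pair_scores if t == term}


def search(term, pair_scores):
    """Pages that contain the term or carry at least one related label."""
    labels = related_labels(term, pair_scores)
    return sorted(
        page
        for page, (text, page_labels) in PAGES.items()
        if text == term or page_labels & labels
    )


results = search("OOM", PAIR_SCORES)
```

With the tables in this state, results contains Page A, Page B, Page C, and Page F. Page D is not found yet because its only label, Tag4, has no pair with the term.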
The documents are then ranked according to the predetermined scores of term/label pairs and the predetermined weights of term/label/document combinations.
As the ranking shows, one positive feedback raises the rank of Page A, and Page B, which shares more labels with Page A, also receives a high score and is ranked ahead of Page C. Through the association of labels, feedback on one document also acts as feedback on other documents carrying the same labels.
For example, the ranking score of a document can be computed by multiplying the score of each term/label pair related to the document by the weight of the corresponding term/label/document combination and summing the products. If a document has related term/label pairs with scores but no corresponding term/label/document combination weight, the sum of the pair scores (each multiplied by the default weight 1) is used as the document's ranking score.
The ranking score of a document can also be computed in other ways from the scores of the term/label pairs and the weights of the term/label/document combinations related to the document.
Page A: Score = 1×2 + 1×2 + 1×2 = 6
Page B: Score = 1 + 1 = 2
Page C: Score = 1
Page F: Score = 0
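The example scoring rule (sum of pair score times combination weight, with a default weight of 1) can be sketched as follows. The table contents are those after the first positive feedback on Page A; the function name ranking_score and the dictionary layout are assumptions for illustration.

```python
PAGE_LABELS = {
    "Page A": ["OOM", "crash", "memory"],
    "Page B": ["OOM", "memory", "Out of memory"],
    "Page C": ["OOM", "Out of memory"],
    "Page F": [],
}

# Relation table 1: (term, label) -> score.
PAIR_SCORES = {("OOM", "OOM"): 1, ("OOM", "crash"): 1, ("OOM", "memory"): 1}

# Relation table 2: (term, label, page) -> weight.
COMBO_WEIGHTS = {
    ("OOM", "OOM", "Page A"): 2,
    ("OOM", "crash", "Page A"): 2,
    ("OOM", "memory", "Page A"): 2,
}


def ranking_score(term, page):
    """Sum of pair score x combination weight over the labels on the page."""
    total = 0
    for label in PAGE_LABELS[page]:
        score = PAIR_SCORES.get((term, label))
        if score is None:
            continue  # no term/label pair yet, so the label contributes nothing
        weight = COMBO_WEIGHTS.get((term, label, page), 1)  # default weight 1
        total += score * weight
    return total


scores = {page: ranking_score("OOM", page) for page in PAGE_LABELS}
ranking = sorted(scores, key=scores.get, reverse=True)
```

Under these assumptions the sketch reproduces the scores listed above: 6 for Page A, 2 for Page B, 1 for Page C, and 0 for Page F.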
Through the adjustment of the two relations (the scores of term/label pairs and the weights of term/label/document combinations), the document that received feedback, and the documents that share more labels with it, gain more in rank.
If only the relation between terms and labels were computed, feedback on one document would give the same boost to every document sharing a label with it. If only the relation between term/label pairs and documents were computed, labels could not act as intermediaries.
Page F has Score = 0 because no labels are attached to it. For a document without attached labels (such as Page F), the ranking score can be computed from the weight of the term/document combination, and the document ranked accordingly.
Sorting by ranking score gives the following ranking result.
1. Page A
2. Page B
3. Page C
4. Page F
User 2 browses Page A, Page B, or Page C and gives one positive feedback on Page B. According to this positive feedback, the term/label relation tables can be updated (creating term/label pairs or updating their scores, and creating term/label/document combinations or updating their weights), as follows. With continued feedback the relation tables keep expanding; the expansion happens spontaneously as users give feedback and thereby reflects collective wisdom.
Create term/label pairs or update the scores of term/label pairs.
Term/label relation table 1
Term/label pair number (Pair Number) | Term → label (Docuterm to Tag) | Score (Score)
Pair 1 | Key1 → Tag1 (OOM) | f1(2) = 2
Pair 2 | Key1 → Tag2 (crash) | f1(1) = 1
Pair 3 | Key1 → Tag3 (memory) | f1(2) = 2
Pair 4 | Key1 → Tag4 (Out of memory) | f1(1) = 1
Create term/label/document combinations or update the weights of term/label/document combinations.
Term/label relation table 2
Term/label pair/document combination (Docuterm/Tag pair to page combination) | Weight (weight)
Pair 1 → Page A | f2(1) = 2
Pair 2 → Page A | f2(1) = 2
Pair 3 → Page A | f2(1) = 2
Pair 1 → Page B | f2(1) = 2
Pair 3 → Page B | f2(1) = 2
Pair 4 → Page B | f2(1) = 2
Third search: the search term Key1 (OOM) submitted by user 3 is received. Querying term/label relation table 1 for labels related to the term yields the following labels.
Tag1 (OOM)
Tag2 (crash)
Tag3 (memory)
Tag4 (Out of memory)
Searching for documents that contain the term or carry at least one of these labels yields the following search results. Whereas originally only Page A and Page F could be found, after two positive feedbacks Page B, Page C, and Page D are found as well. With labels as intermediaries, more relevant documents can be found.
Page A
Page B
Page C
Page D
Page F
The retrieved documents are ranked: a ranking score is computed for each document from the predetermined scores of term/label pairs and the predetermined weights of term/label/document combinations (as follows), and the documents are sorted by ranking score.
Page A: Score = 2×2 + 1×2 + 2×2 = 10
Page B: Score = 2×2 + 2×2 + 1×2 = 10
Page C: Score = 2 + 1 = 3
Page D: Score = 1
Page F: Score = 0
Ranking result
1. Page A, Page B
2. Page C
3. Page D
4. Page F
Note that after one positive feedback each on Page A and Page B, the two pages are ranked at the top with the same ranking score, which shows that the ranking is reasonable.
User 3 browses Page D and gives one positive feedback on Page D. According to this positive feedback, the term/label relation tables can be updated (creating term/label pairs or updating their scores, and creating term/label/document combinations or updating their weights), as follows.
Term/label relation table 1
Term/label pair number (Pair Number) | Term → label (Docuterm to Tag) | Score (Score)
Pair 1 | Key1 → Tag1 (OOM) | f1(2) = 2
Pair 2 | Key1 → Tag2 (crash) | f1(1) = 1
Pair 3 | Key1 → Tag3 (memory) | f1(2) = 2
Pair 4 | Key1 → Tag4 (Out of memory) | f1(2) = 2
Term/label relation table 2
Term/label pair/document combination (Docuterm/Tag pair to page combination) | Weight (weight)
Pair 1 → Page A | f2(1) = 2
Pair 2 → Page A | f2(1) = 2
Pair 3 → Page A | f2(1) = 2
Pair 1 → Page B | f2(1) = 2
Pair 3 → Page B | f2(1) = 2
Pair 4 → Page B | f2(1) = 2
Pair 4 → Page D | f2(1) = 2
Fourth search: the search term Key1 (OOM) submitted by user 4 is received. Querying term/label relation table 1 for labels related to the term yields the following labels.
Tag1 (OOM)
Tag2 (crash)
Tag3 (memory)
Tag4 (Out of memory)
Searching for documents that contain the term or carry at least one of these labels yields the following search results.
Page A
Page B
Page C
Page D
Page F
The retrieved documents are ranked: a ranking score is computed for each document from the predetermined scores of term/label pairs and the predetermined weights of term/label/document combinations (as follows), and the documents are sorted by ranking score.
The positive feedback on Page D also raises the ranking scores of Page B and Page C, because Page B and Page C carry the label that is on Page D (Tag4); through the term/label relation, the feedback has a positive effect on Page B and Page C.
Page A: Score = 10
Page B: Score = 12
Page C: Score = 4
Page D: Score = 4
Page F: Score = 0
Ranking result
1. Page B
2. Page A
3. Page C, Page D
4. Page F
User 4 browses Page D and gives one positive feedback on Page D. According to this positive feedback, the term/label relation tables can be updated (creating term/label pairs or updating their scores, and creating term/label/document combinations or updating their weights), as follows.
Term/label relation table 1
Term/label pair number (Pair Number) | Term → label (Docuterm to Tag) | Score (Score)
Pair 1 | Key1 → Tag1 (OOM) | f1(2) = 2
Pair 2 | Key1 → Tag2 (crash) | f1(1) = 1
Pair 3 | Key1 → Tag3 (memory) | f1(2) = 2
Pair 4 | Key1 → Tag4 (Out of memory) | f1(3) = 3
Term/label relation table 2
Term/label pair/document combination (Docuterm/Tag pair to page combination) | Weight (weight)
Pair 1 → Page A | f2(1) = 2
Pair 2 → Page A | f2(1) = 2
Pair 3 → Page A | f2(1) = 2
Pair 1 → Page B | f2(1) = 2
Pair 3 → Page B | f2(1) = 2
Pair 4 → Page B | f2(1) = 2
Pair 4 → Page D | f2(2) = 3
Fifth search: the search term Key1 (OOM) submitted by user 4 is received. Querying term/label relation table 1 for labels related to the term yields the following labels.
Tag1 (OOM)
Tag2 (crash)
Tag3 (memory)
Tag4 (Out of memory)
Searching for documents that contain the term or carry at least one of these labels yields the following search results.
Page A
Page B
Page C
Page D
Page F
The retrieved documents are ranked: a ranking score is computed for each document from the predetermined scores of term/label pairs and the predetermined weights of term/label/document combinations (as follows), and the documents are sorted by ranking score.
Page A: Score = 10
Page B: Score = 14
Page C: Score = 5
Page D: Score = 9
Page F: Score = 0
Ranking result
1. Page B
2. Page A
3. Page D
4. Page C
5. Page F
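The whole sequence of feedback so far (one positive feedback on Page A, one on Page B, and two on Page D) can be replayed in a few lines. This is a sketch under the example functions f1(X) = X and f2(X) = X + 1; the function name replay and the use of Counter are illustrative choices, not part of the described method.

```python
from collections import Counter

PAGE_LABELS = {
    "Page A": ["OOM", "crash", "memory"],
    "Page B": ["OOM", "memory", "Out of memory"],
    "Page C": ["OOM", "Out of memory"],
    "Page D": ["Out of memory"],
    "Page F": [],
}


def replay(term, feedback_pages):
    """Apply positive feedbacks in order and return each page's ranking score."""
    pair_count = Counter()   # (term, label) -> number of positive feedbacks
    combo_count = Counter()  # (term, label, page) -> number of positive feedbacks
    for page in feedback_pages:
        for label in PAGE_LABELS[page]:
            pair_count[(term, label)] += 1
            combo_count[(term, label, page)] += 1

    scores = {}
    for page, labels in PAGE_LABELS.items():
        scores[page] = sum(
            pair_count[(term, label)]                 # f1(X) = X
            * (combo_count[(term, label, page)] + 1)  # f2(X) = X + 1
            for label in labels
        )
    return scores


scores = replay("OOM", ["Page A", "Page B", "Page D", "Page D"])
ranking = sorted(scores, key=scores.get, reverse=True)
```

Because f2(0) = 1 coincides with the default weight, a combination that has never received feedback needs no special case here. The replay reproduces the scores listed above (10, 14, 5, 9, 0) and the order Page B, Page A, Page D, Page C, Page F.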
As can be seen, positive feedback on Page D makes the score of Page D grow faster; the exact increase depends on the concrete ranking algorithm. This is only one concrete example.
Suppose 100 positive feedbacks are received for Page C. The term/label relation tables are updated as follows.
Term/label relation table 1
Term/label pair number (Pair Number) | Term → label (Docuterm to Tag) | Score (Score)
Pair 1 | Key1 → Tag1 (OOM) | f1(102) = 102
Pair 2 | Key1 → Tag2 (crash) | f1(1) = 1
Pair 3 | Key1 → Tag3 (memory) | f1(2) = 2
Pair 4 | Key1 → Tag4 (Out of memory) | f1(103) = 103
Term/label relation table 2
Term/label pair/document combination (Docuterm/Tag pair to page combination) | Weight (weight)
Pair 1 → Page A | f2(1) = 2
Pair 2 → Page A | f2(1) = 2
Pair 3 → Page A | f2(1) = 2
Pair 1 → Page B | f2(1) = 2
Pair 3 → Page B | f2(1) = 2
Pair 4 → Page B | f2(1) = 2
Pair 4 → Page D | f2(2) = 3
Pair 1 → Page C | f2(100) = 101
Pair 4 → Page C | f2(100) = 101
In a subsequent search, the search term Key1 (OOM) submitted by user 6 is received. Querying term/label relation table 1 for labels related to the term yields the following labels.
Tag1 (OOM)
Tag2 (crash)
Tag3 (memory)
Tag4 (Out of memory)
Searching for documents that contain the term or carry at least one of these labels yields the following search results.
Page A
Page B
Page C
Page D
Page F
The retrieved documents are ranked: a ranking score is computed for each document from the predetermined scores of term/label pairs and the predetermined weights of term/label/document combinations (as follows), and the documents are sorted by ranking score.
Page A: Score = 210
Page B: Score = 414
Page C: Score = 20705
Page D: Score = 309
Page F: Score = 0
Ranking result
1. Page C
2. Page B
3. Page D
4. Page A
5. Page F
Suppose 200 positive feedbacks are received for Page D. The term/label relation tables are updated as follows.
Term/label relation table 1
Term/label pair number (Pair Number) | Term → label (Docuterm to Tag) | Score (Score)
Pair 1 | Key1 → Tag1 (OOM) | f1(102) = 102
Pair 2 | Key1 → Tag2 (crash) | f1(1) = 1
Pair 3 | Key1 → Tag3 (memory) | f1(2) = 2
Pair 4 | Key1 → Tag4 (Out of memory) | f1(303) = 303
Term/label relation table 2
Term/label pair/document combination (Docuterm/Tag pair to page combination) | Weight (weight)
Pair 1 → Page A | f2(1) = 2
Pair 2 → Page A | f2(1) = 2
Pair 3 → Page A | f2(1) = 2
Pair 1 → Page B | f2(1) = 2
Pair 3 → Page B | f2(1) = 2
Pair 4 → Page B | f2(1) = 2
Pair 4 → Page D | f2(202) = 203
Pair 1 → Page C | f2(100) = 101
Pair 4 → Page C | f2(100) = 101
In a subsequent search, the search term Key1 (OOM) submitted by user 7 is received. Querying term/label relation table 1 for labels related to the term yields the following labels.
Tag1 (OOM)
Tag2 (crash)
Tag3 (memory)
Tag4 (Out of memory)
Searching for documents that contain the term or carry at least one of these labels yields the following search results.
Page A
Page B
Page C
Page D
Page F
The retrieved documents are ranked: a ranking score is computed for each document from the predetermined scores of term/label pairs and the predetermined weights of term/label/document combinations (as follows), and the documents are sorted by ranking score. As can be seen, after a large amount of positive feedback, the search results are expanded according to the relations among document labels, and the ranking reflects users' evaluation of the documents. Embodiments of the invention rely on a large amount of user feedback to provide the documents that best meet users' needs.
Page A: Score = 210
Page B: Score = 814
Page C: Score = 102×101 + 303×101 = 40905
Page D: Score = 303×203 = 61509
Page F: Score = 0
Ranking result
1. Page D
2. Page C
3. Page B
4. Page A
5. Page F
Those skilled in the art will appreciate that the scores of term/label pairs and the weights of term/label/document combinations can also be reduced according to users' negative feedback (negative evaluation), thereby affecting the ranking of the corresponding documents. For example, suppose 10 negative feedbacks are received for Page D. The term/label relation tables are updated as follows.
Term/label relation table 1
Term/label pair number (Pair Number) | Term → label (Docuterm to Tag) | Score (Score)
Pair 1 | Key1 → Tag1 (OOM) | f1(102) = 102
Pair 2 | Key1 → Tag2 (crash) | f1(1) = 1
Pair 3 | Key1 → Tag3 (memory) | f1(2) = 2
Pair 4 | Key1 → Tag4 (Out of memory) | f1(303) - 10 = 293
Term/label relation table 2
Term/label pair/document combination (Docuterm/Tag pair to page combination) | Weight (weight)
Pair 1 → Page A | f2(1) = 2
Pair 2 → Page A | f2(1) = 2
Pair 3 → Page A | f2(1) = 2
Pair 1 → Page B | f2(1) = 2
Pair 3 → Page B | f2(1) = 2
Pair 4 → Page B | f2(1) = 2
Pair 4 → Page D | f2(202) - 10 = 193
Pair 1 → Page C | f2(100) = 101
Pair 4 → Page C | f2(100) = 101
The search term Key1 (OOM) submitted by user 8 is received. Querying term/label relation table 1 for labels related to the term yields the following labels.
Tag1 (OOM)
Tag2 (crash)
Tag3 (memory)
Tag4 (Out of memory)
Searching for documents that contain the term or carry at least one of these labels yields the following search results.
Page A
Page B
Page C
Page D
Page F
The retrieved documents are ranked: a ranking score is computed for each document from the predetermined scores of term/label pairs and the predetermined weights of term/label/document combinations (as follows), and the documents are sorted by ranking score.
Page A: Score = 102×2 + 1×2 + 2×2 = 210
Page B: Score = 102×2 + 2×2 + 293×2 = 794
Page C: Score = 102×101 + 293×101 = 39895
Page D: Score = 293×193 = 56549 (previously 61509)
As can be seen, the score of Page D falls the fastest, precisely because the two relations act together.
Ranking result
1. Page D
2. Page C
3. Page B
4. Page A
5. Page F
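The effect of negative feedback can be sketched by starting from the relation tables as they stand before the 10 negative feedbacks and subtracting the feedback count from the affected pair scores and combination weights. The subtraction rule follows the example tables above; the function names and dictionary layout are assumptions, and the keys omit the search term because only Key1 (OOM) is involved.

```python
PAGE_LABELS = {
    "Page A": ["OOM", "crash", "memory"],
    "Page B": ["OOM", "memory", "Out of memory"],
    "Page C": ["OOM", "Out of memory"],
    "Page D": ["Out of memory"],
    "Page F": [],
}

# Relation table 1 for the term Key1 (OOM) before the negative feedback.
pair_scores = {"OOM": 102, "crash": 1, "memory": 2, "Out of memory": 303}

# Relation table 2 before the negative feedback: (label, page) -> weight.
combo_weights = {
    ("OOM", "Page A"): 2, ("crash", "Page A"): 2, ("memory", "Page A"): 2,
    ("OOM", "Page B"): 2, ("memory", "Page B"): 2, ("Out of memory", "Page B"): 2,
    ("Out of memory", "Page D"): 203,
    ("OOM", "Page C"): 101, ("Out of memory", "Page C"): 101,
}


def negative_feedback(page, times):
    """Lower the pair scores and combination weights tied to the page's labels."""
    for label in PAGE_LABELS[page]:
        if label in pair_scores:
            pair_scores[label] -= times
        if (label, page) in combo_weights:
            combo_weights[(label, page)] -= times


def ranking_score(page):
    """Sum of pair score x combination weight (default weight 1)."""
    return sum(
        pair_scores.get(label, 0) * combo_weights.get((label, page), 1)
        for label in PAGE_LABELS[page]
    )


before = {page: ranking_score(page) for page in PAGE_LABELS}
negative_feedback("Page D", 10)
after = {page: ranking_score(page) for page in PAGE_LABELS}
drop = {page: before[page] - after[page] for page in PAGE_LABELS}
```

Page D loses the most because both its pair score and its combination weight shrink, while pages that merely share Tag4 lose only through the pair score.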
Fig. 5 shows a block diagram of a system for searching documents according to an embodiment of the invention. It shows a system 500 for searching documents, which comprises the following devices. A receiving device 510 receives a search term used for searching. A query device 520 queries labels related to the search term. A search device 530 searches for documents having at least one of the labels. A ranking device 540 ranks the documents. A sending device 550 sends the result of ranking the documents.
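The division of work among the devices of system 500 can be pictured with a small class in which each method stands in for one device. This is a toy sketch only: the class name SearchSystem, its method names, and the in-memory dictionaries are hypothetical, and the ranking rule is the example rule used in the worked example.

```python
from dataclasses import dataclass, field


@dataclass
class SearchSystem:
    """Toy stand-in for system 500; all names here are hypothetical."""
    page_labels: dict                                  # page -> set of labels
    pair_scores: dict = field(default_factory=dict)    # (term, label) -> score
    combo_weights: dict = field(default_factory=dict)  # (term, label, page) -> weight

    def query_labels(self, term):
        """Query device 520: labels related to the search term."""
        return {label for (t, label) in self.pair_scores if t == term}

    def search(self, labels):
        """Search device 530: documents having at least one of the labels."""
        return [page for page, own in self.page_labels.items() if own & labels]

    def rank(self, term, pages):
        """Ranking device 540: sort by summed pair score x combination weight."""
        def score(page):
            return sum(
                self.pair_scores.get((term, label), 0)
                * self.combo_weights.get((term, label, page), 1)
                for label in self.page_labels[page]
            )
        return sorted(pages, key=score, reverse=True)

    def handle(self, term):
        """Receiving device 510 takes the term; sending device 550 returns the result."""
        return self.rank(term, self.search(self.query_labels(term)))


system = SearchSystem(
    page_labels={
        "Page A": {"OOM", "crash", "memory"},
        "Page B": {"OOM", "memory", "Out of memory"},
        "Page C": {"OOM", "Out of memory"},
        "Page D": {"Out of memory"},
        "Page F": set(),
    },
    pair_scores={("OOM", "OOM"): 1, ("OOM", "crash"): 1, ("OOM", "memory"): 1},
    combo_weights={("OOM", name, "Page A"): 2 for name in ("OOM", "crash", "memory")},
)
result = system.handle("OOM")
```

This sketch searches by label only, so an unlabeled page such as Page F is not returned; an embodiment that also searches for documents containing the term itself would add that step to the search device.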
According to one embodiment of the invention, the system 500 may further comprise: a relation store for storing a term/label relation table; and an update device for updating the term/label pairs in the term/label relation table. The query device may be further configured to query the labels related to the search term according to the term/label relation table; the update device may be further configured to update the term/label pairs in the term/label relation table according to feedback on the documents.
According to one embodiment of the invention, the ranking device 540 of the system 500 may be further configured to rank the documents according to predetermined weights of label/document combinations. The system 500 may further comprise an update device for updating the weights of the label/document combinations according to the number of feedbacks on the documents.
According to one embodiment of the invention, the ranking device 540 of the system 500 may be further configured to rank the documents according to predetermined scores of term/label pairs and predetermined weights of term/label/document combinations.
According to one embodiment of the invention, the at least one label comprises a word or phrase, or a group of words or phrases, attached to a document to classify the document.
According to one embodiment of the invention, in the system 500, the update device may be further configured to: according to feedback on a document, if the term/label relation table does not contain a term/label pair between the search term and a label carried by the document, establish the corresponding term/label pair and determine the score of the established term/label pair; and establish the corresponding term/label/document combination and determine the weight of the established term/label/document combination.
According to one embodiment of the invention, in the system 500, the update device may be further configured to update, according to feedback on the documents, the scores of the term/label pairs and the weights of the term/label/document combinations.
The invention also provides a storage medium or signal carrier containing instructions for carrying out the method according to the invention.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to embodiments of the invention. In this regard, each block in a flowchart or block diagram may represent a module, program segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may in fact be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending on the functionality involved. It should also be noted that each block of the block diagrams and/or flowcharts, and combinations of blocks in the block diagrams and/or flowcharts, can be implemented by special-purpose hardware-based systems that perform the specified functions or operations, or by combinations of special-purpose hardware and computer instructions.
Those of ordinary skill in the art will appreciate that the invention may be embodied as a system, method, or computer program product. Accordingly, the invention may take the form of an entirely hardware embodiment, an entirely software embodiment (including firmware, resident software, microcode, etc.), or an embodiment combining software and hardware aspects, generally referred to herein as a "circuit", "module", or "system". Furthermore, the invention may take the form of a computer program product embodied in any tangible medium of expression having computer-usable program code embodied in the medium.
Any combination of one or more computer-usable or computer-readable media may be used. The computer-usable or computer-readable medium may be, for example but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, device, or propagation medium. More specific examples (a non-exhaustive list) of the computer-readable medium include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a transmission medium such as those supporting the Internet or an intranet, or a magnetic storage device. Note that the computer-usable or computer-readable medium could even be paper or another suitable medium on which the program is printed, because the program can be captured electronically, for instance by optically scanning the paper or other medium, then compiled, interpreted, or otherwise processed in a suitable manner, and stored in a computer memory if necessary. In the context of this document, a computer-usable or computer-readable medium may be any medium that can contain, store, communicate, propagate, or transport the program for use by or in connection with an instruction execution system, apparatus, or device. The computer-usable medium may include a propagated data signal, either in baseband or as part of a carrier wave, that embodies the computer-usable program code. The computer-usable program code may be transmitted using any appropriate medium, including but not limited to wireless, wireline, optical fiber cable, RF, and so on.
Computer program code for carrying out operations of the invention may be written in any combination of one or more programming languages, including object-oriented programming languages such as Java, Smalltalk, and C++, and conventional procedural programming languages such as the "C" programming language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on the remote computer or server. In the latter scenario, the remote computer may be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
The invention has been described in detail above in conjunction with preferred embodiments, but it should be understood that the above embodiments are intended only to illustrate and not to limit the invention. Those skilled in the art may modify the illustrated embodiments without departing from the scope and spirit of the invention.

Claims (20)

1. A method for searching documents, comprising:
receiving a search term used for searching;
querying labels related to the search term;
searching for documents having at least one of the labels;
ranking the documents; and
sending the result of ranking the documents.
2. method according to claim 1, wherein, according to term/label relation table, inquire about the label relevant with described term, described method comprises that further according to the feedback to described document, the term/label that upgrades in described term/label relation table is right.
3. method according to claim 1 and 2 wherein, is carried out ranking compositor according to the weight of predetermined label/document combination to described document.
4. method according to claim 1 and 2 wherein, is carried out ranking compositor according to the weight of the right score value of predetermined term/label and predetermined term/label/document combination to described document.
5. method according to claim 1 and 2, wherein, described at least one label comprises and is attached to of being used on the document document is classified or one group of vocabulary or phrase.
6. method according to claim 4, wherein further comprise: according to feedback described document, if described term/label relation table does not comprise that the term/label between the label that term and the document comprise is right, it is right then to set up corresponding term/label, and definite right score value of being set up of term/label; Set up corresponding term/label/document combination, and the weight of definite term/label of being set up/document combination.
7. method according to claim 3 wherein further comprises, according to the feedback to described document, upgrades the weight of described label/document combination.
8. method according to claim 7 wherein, according to the feedback number of times to described document, is upgraded the weight of described label/document combination.
9. method according to claim 4 wherein further comprises, according to the feedback to described document, upgrades the weight of the right score value of described term/label and described term/label/document combination.
10. method according to claim 9 wherein, according to the feedback number of times to described document, is upgraded the weight of the right score value of described term/label and described term/label/document combination.
11. The method according to claim 1, further comprising: searching for documents containing the search term; wherein ranking the documents is further configured to rank the documents according to predetermined weights of label/document combinations or weights of search term/document combinations.
12. The method according to claim 11, wherein ranking the documents is further configured to rank the documents according to predetermined scores of search term/label pairs and predetermined weights of search term/label/document combinations, or according to weights of search term/document combinations.
13. The method according to claim 11, further comprising: updating the weights of the label/document combinations and the weights of the search term/document combinations according to feedback on the documents.
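The ranking described in claim 4 can be illustrated with a minimal Python sketch. The claims do not state how scores and weights are combined, so the formula here (summing score times weight over a document's matching labels) is an assumption, as are the table names, function names, and sample values.

```python
from collections import defaultdict

# Hypothetical in-memory tables; names and values are illustrative only.
# search term/label pair -> score
pair_score = {
    ("java", "programming"): 0.9,
    ("java", "coffee"): 0.2,
}
# search term/label/document combination -> weight
combo_weight = {
    ("java", "programming", "doc1"): 0.8,
    ("java", "programming", "doc2"): 0.5,
    ("java", "coffee", "doc3"): 0.7,
}


def related_labels(term):
    """Return {label: score} for every label paired with the search term."""
    return {label: s for (t, label), s in pair_score.items() if t == term}


def rank_documents(term):
    """Rank the documents that carry at least one label related to the term.

    Assumption: a document's rank value is the sum, over its matching
    labels, of (pair score) * (combination weight).
    """
    totals = defaultdict(float)
    labels = related_labels(term)
    for (t, label, doc), weight in combo_weight.items():
        if t == term and label in labels:
            totals[doc] += labels[label] * weight
    # Highest rank value first.
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)
```

Calling `rank_documents("java")` with the sample tables orders the documents `doc1`, `doc2`, `doc3`; a search term with no related labels yields an empty list.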
14. A system for searching documents, comprising:
a receiving device for receiving a search term used for searching;
a query device for querying labels related to the search term;
a search device for searching for documents having at least one of the labels;
a ranking device for ranking the documents; and
a sending device for sending the result of ranking the documents.
15. The system according to claim 14, further comprising: a relation repository for storing a search term/label relation table; and an updating device for updating the search term/label pairs in the search term/label relation table; wherein the query device is further configured to query the labels related to the search term according to the search term/label relation table; and the updating device is further configured to update the search term/label pairs in the search term/label relation table according to feedback on the documents.
16. The system according to claim 14 or 15, wherein the ranking device is further configured to rank the documents according to predetermined weights of label/document combinations; the system further comprising an updating device for updating the weights of the label/document combinations according to the number of times feedback is received on the documents.
17. The system according to claim 14 or 15, wherein the ranking device is further configured to rank the documents according to predetermined scores of search term/label pairs and predetermined weights of search term/label/document combinations.
18. The system according to claim 14 or 15, wherein the at least one label comprises one word or phrase, or a group of words or phrases, attached to a document and used to classify the document.
19. The system according to claim 17, wherein the updating device is further configured to: according to feedback on a document, if the search term/label relation table does not contain a search term/label pair between the search term and a label of the document, establish the corresponding search term/label pair and determine the score of the established search term/label pair; and establish the corresponding search term/label/document combination and determine the weight of the established search term/label/document combination.
20. The system according to claim 17, wherein the updating device is further configured to update the scores of the search term/label pairs and the weights of the search term/label/document combinations according to feedback on the documents.
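The feedback-driven updating of claims 19 and 20 might look roughly like the following Python sketch. The claims only say that a score and a weight are determined or updated; the starting value, the fixed per-feedback increment, and all names below are assumptions made for illustration.

```python
INITIAL_VALUE = 1.0  # assumed starting score/weight for a new entry
STEP = 0.1           # assumed increment applied per feedback event


def record_feedback(term, doc, doc_labels, pair_score, combo_weight):
    """Apply one feedback event for `doc`, reached via search term `term`.

    pair_score:   dict mapping (term, label) -> score
    combo_weight: dict mapping (term, label, doc) -> weight
    Missing entries are established; existing ones are increased, so the
    values grow with the number of feedback events.
    """
    for label in doc_labels:
        pair = (term, label)
        combo = (term, label, doc)
        if pair not in pair_score:
            pair_score[pair] = INITIAL_VALUE      # establish the pair
        else:
            pair_score[pair] += STEP              # update existing score
        if combo not in combo_weight:
            combo_weight[combo] = INITIAL_VALUE   # establish the combination
        else:
            combo_weight[combo] += STEP           # update existing weight
```

With empty tables, a first call creates the pair and the combination at the initial value, and each repeated call for the same search term and document raises both by one step. An additive step is only one option; any update that depends on the feedback count would fit the claim wording equally well.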

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200810187106A CN101751405A (en) 2008-12-12 2008-12-12 Method and system for searching documents

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN200810187106A CN101751405A (en) 2008-12-12 2008-12-12 Method and system for searching documents

Publications (1)

Publication Number Publication Date
CN101751405A true CN101751405A (en) 2010-06-23

Family

ID=42478396

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200810187106A Pending CN101751405A (en) 2008-12-12 2008-12-12 Method and system for searching documents

Country Status (1)

Country Link
CN (1) CN101751405A (en)

Cited By (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103324640A (en) * 2012-03-23 2013-09-25 日电(中国)有限公司 Method and device for determining search result file, as well as equipment
CN103324640B (en) * 2012-03-23 2016-06-08 日电(中国)有限公司 Method, device and equipment for determining search result document
CN103714088A (en) * 2012-10-09 2014-04-09 深圳市世纪光速信息技术有限公司 Method for acquiring search terms, server and method and system for recommending search terms
WO2014056337A1 (en) * 2012-10-09 2014-04-17 腾讯科技(深圳)有限公司 Search word acquisition method, server and search word recommendation system
CN103870460A (en) * 2012-12-10 2014-06-18 腾讯科技(深圳)有限公司 Good number searching method and system
CN103870460B (en) * 2012-12-10 2018-11-06 腾讯科技(深圳)有限公司 Good number searching method and system
CN107818092A (en) * 2016-09-12 2018-03-20 百度在线网络技术(北京)有限公司 Document processing method and device
CN108829800A (en) * 2018-05-29 2018-11-16 努比亚技术有限公司 Search data processing method, device and computer readable storage medium
CN108829800B (en) * 2018-05-29 2021-11-16 努比亚技术有限公司 Search data processing method and device and computer readable storage medium
CN109947949A (en) * 2019-03-12 2019-06-28 国家电网有限公司 Knowledge information intelligent management, device and server

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C02 Deemed withdrawal of patent application after publication (patent law 2001)
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20100623