CN102779193B - Self-adaptive personalized information retrieval system and method - Google Patents

Publication number
CN102779193B
CN102779193B CN201210244519.5A CN201210244519A CN102779193B CN 102779193 B CN102779193 B CN 102779193B CN 201210244519 A CN201210244519 A CN 201210244519A CN 102779193 B CN102779193 B CN 102779193B
Authority
CN
China
Prior art keywords
current queries
history
historical query
word
click
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201210244519.5A
Other languages
Chinese (zh)
Other versions
CN102779193A (en
Inventor
杨沐昀
王晓春
李生
齐浩亮
赵铁军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Harbin University of technology high tech Development Corporation
Original Assignee
Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Harbin Institute of Technology filed Critical Harbin Institute of Technology
Priority to CN201210244519.5A priority Critical patent/CN102779193B/en
Publication of CN102779193A publication Critical patent/CN102779193A/en
Application granted granted Critical
Publication of CN102779193B publication Critical patent/CN102779193B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a self-adaptive personalized information retrieval system and method. To capture in a timely manner the dynamic retrieval needs of irregularly distributed users, the retrieval model is updated promptly as the user interacts with the search engine. The system comprises a data input subsystem, a parameter training and prediction subsystem, a retrieval execution subsystem, and a data output subsystem. The data input subsystem combines the current query information with historical query information and historical click information to form a feature matrix, and obtains the parameter prediction model to be trained according to the feature matrix. The parameter training and prediction subsystem trains and applies the parameter prediction model according to the feature matrix to obtain the predicted parameters. The retrieval execution subsystem uses the predicted parameters to organize the current query and the historical queries, and combines the user model with the query model to form a personalized query model. The data output subsystem searches the documents to be retrieved for documents that match the personalized query as the preliminary retrieval result, sorts the preliminary retrieval result by relevance, and outputs the sorted result as the final retrieval result.

Description

Self-adaptive personalized information retrieval system and method
Technical field
The present invention relates to computer information retrieval technology.
Background art
The vastness of online information and the rapid development of related technologies have led people to use search engines more and more frequently. According to statistics from the China Internet Network Information Center (CNNIC), the search engine has become the most common tool for helping people retrieve Web information.
In recent years, to improve retrieval precision, make retrieval more convenient, and improve the user's search experience, the information retrieval field has produced many excellent retrieval models with good results. One major improvement is the user interest model, whose purpose is to ensure the relevance of documents to the user's interests while also ensuring the content relevance between query and document. User interest is divided by time span into long-term interest and short-term interest. Short-term interest derives from the search history of a single query session. In personalized retrieval research based on short-term interest, Cao et al. (2008; 2009) treated the queries and clicks in a query session as sequential data and trained an HMM, an improved HMM (vlHMM), and a CRF model to predict query intent. Zhu and Mishne (2009) clustered users' query sessions and then aggregated the importance derived from entire query sessions into a global importance, proposing the ClickRank model for measuring the importance of a web page or website. Besides these methods that model the query session directly, some researchers use the query session as a feature in a ranking model: Xiang et al. (2010) added several query reformulation relations to RankSVM as features. Conventional retrieval models can also be applied to the study of short-term interest: Chen et al. (2009) incorporated the similarity between the current query and the summaries of clicked documents into a conventional language model.
By contrast, most personalized retrieval models that cover long-term interest are based on conventional IR models. Tan (2006) proposed several methods, built on the language model, for computing the historical information relevant to the current query, and this retrieval model benefits both new and old queries. Dou et al. (2007) carried out similar experiments on the vector space model and the language model respectively. Ahn et al. (2008) linked multiple query sessions together by task and built a personalized retrieval system that reflects the user's long-term interest on the basis of the BM25 probabilistic model.
The personalized retrieval models based on user interest described above share a significant shortcoming: once training is complete, the internal parameters of the model are fixed values. In practice, information needs differ from one retrieval situation to another, and handling every user search in a uniform way inevitably lacks flexibility. Take a personalized retrieval model based on query expansion, in which the user model is combined with the current query model: in earlier research the weights of the two parts were usually set to constants. But if the current query is very short, the user's query intent is not expressed clearly or completely enough, so the role of the user model should be strengthened and the importance of the current query model reduced. Conversely, if the current query is long and the query intent is clearly expressed, the user model matters less. An adaptive, dynamic retrieval model that can further improve the user's personalized retrieval experience is therefore a key capability that current retrieval systems lack.
An ideal dynamic personalized retrieval model should be grounded in real retrieval use and take the following aspects into account in its design and implementation:
1. User distribution
In the real world, users are randomly distributed, whereas earlier studies often made assumptions about the user distribution. Radlinski (2007) assumed that users are selected at random from a population of fixed size; a year later, users were assumed always to belong to one fixed, determined population. Existing research confirms that user behavior is irregular (Agichtein et al., 2006), so assumptions about the user distribution should be avoided as far as possible.
2. User interest
User interest also changes. Belkin (1997) found early on that a user's retrieval needs can change while the user is searching for information, and Sofia Stamou (2009) likewise holds that user interest changes over time.
3. Query capability
The process in which a user interacts with a search engine is also a process of learning to use the search engine (Shen et al., 2005). The user submits a new query according to the quality of, and satisfaction with, the returned results. That is, the interaction with the search engine influences the query the user submits next. As the user's search experience grows, the user's ability to formulate queries also improves. The importance of each historical query therefore changes over time, and newer queries are more important (Tan et al., 2006; Dou et al., 2007).
The prior art does not draw on rich user behavior features to set the parameters of the retrieval model. In fact, the user's search behavior itself provides important interest information, and using this information as a basis can greatly improve the soundness of the assigned weights. For example, if the current query is short, it provides little information, and the weight of the historical information should be increased. Conversely, if the user has little historical information, the weight of the current query should be increased. The parameter training and prediction subsystem of the invention assigns weights dynamically on the basis of the interest information provided by user behavior itself, which greatly improves the soundness of the weight allocation.
Three, the invention uses a machine learning algorithm to complete the prediction automatically.
For example, if the current query is short, the weight of the historical information should be increased; if the user has little historical information, the weight of the current query should be increased. But when the current query is short and the historical information is also scarce, assigning the weights becomes complicated. The adaptive personalized retrieval model uses a machine learning algorithm to solve the problem that model parameters are hard to determine, which ensures the accuracy of the predicted weights to a certain extent.
Four, the invention takes the temporal order of queries into account.
The user's query history is ordered by time, and new queries are more important than old ones, so the weights of historical queries are decayed according to their time distance from the current query.
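The time-based weight decay described above can be sketched as follows. This is only a minimal illustration under stated assumptions: the source does not give the decay function, so the exponential form, the function name `decay_weights`, and the parameter name `decay` are hypothetical. According to the description, the parameter that controls the decay amplitude is produced by the parameter prediction model rather than fixed by hand.

```python
from typing import List


def decay_weights(num_history: int, decay: float) -> List[float]:
    """Return normalized weights for historical queries, oldest first.

    Assumption (not from the source): a historical query at distance d from
    the current query gets raw weight decay ** d, where d = 1 for the most
    recent historical query. `decay` must lie in (0, 1].
    """
    if not 0.0 < decay <= 1.0:
        raise ValueError("decay must be in (0, 1]")
    if num_history == 0:
        return []
    # Index 0 is the oldest query, so it has the largest distance.
    raw = [decay ** (num_history - i) for i in range(num_history)]
    total = sum(raw)
    return [w / total for w in raw]


# Three historical queries Q1, Q2, Q3 (oldest first) before the current query.
weights = decay_weights(3, 0.5)
equal_weights = decay_weights(3, 1.0)  # decay = 1.0 means no time preference
```

With `decay` below 1 the most recent historical query receives the largest weight; with `decay` equal to 1 all historical queries are weighted equally, which corresponds to ignoring differences in importance over time.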
Five, the invention solves the problem of how to organize the relation among the current query, the historical queries, and the historical clicks in personalized retrieval modeling.
Six, the invention strengthens the processing of personalized information by exploring how to mine the current user's historical information and current query information to further improve the retrieval effectiveness for the current query.
Seven, the invention makes no assumption about the user distribution. This avoids the situation in which the real user distribution is inconsistent with the assumed one and retrieval effectiveness suffers.
Summary of the invention
To capture in a timely manner the dynamic retrieval needs of irregularly distributed users, and to update the retrieval model promptly as the user interacts with the search engine, the invention provides a self-adaptive personalized information retrieval system and method.
The self-adaptive personalized information retrieval system of the invention comprises:
a data input subsystem for constructing a feature matrix from the current query information in combination with historical query information and historical click information, and for obtaining the parameter prediction model to be trained according to the feature matrix;
a parameter training and prediction subsystem for training and applying the parameter prediction model according to the feature matrix to obtain the predicted parameters;
a retrieval execution subsystem for organizing the current query, the historical queries, and the historical clicks with the predicted parameters, and for combining the user model with the query model to form a personalized query model;
a data output subsystem for finding, among the documents to be retrieved, the documents that match the personalized query as the preliminary retrieval result, for sorting the preliminary retrieval result by relevance, and for outputting the sorted result as the final retrieval result.
The data input subsystem comprises:
a module for generating user behavior features from the current query information, and
a module for constructing the feature matrix from all of the obtained user behavior features.
The parameter training and prediction subsystem comprises:
a data input module for receiving the data to be processed;
a module for computing the historical queries and historical clicks corresponding to each query and organizing them into the required data format;
a module for constructing the feature matrix;
a module for finding the optimal parameters of the current query by traversal search, the step size of the traversal being 0.1;
a module for using an SVM regression model to establish the mapping between user features and optimal parameters.
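The traversal search with a step size of 0.1 can be sketched as below. The sketch assumes a single weight in the range [0, 1] and a caller-supplied function `evaluate` that returns a retrieval quality score (for example, average precision) for a given weight. The range, the function names, and the toy quality curve are illustrative assumptions, not details from the source. The optimal values found this way would then serve as the targets that the regression model learns to predict from user features; that regression step is not shown here.

```python
from typing import Callable, Tuple


def grid_search_weight(evaluate: Callable[[float], float],
                       step: float = 0.1) -> Tuple[float, float]:
    """Try every weight in [0, 1] at the given step and keep the best one."""
    n_steps = round(1.0 / step)
    best_weight, best_score = 0.0, float("-inf")
    for k in range(n_steps + 1):
        weight = k / n_steps  # computed by division to avoid cumulative float drift
        score = evaluate(weight)
        if score > best_score:
            best_weight, best_score = weight, score
    return best_weight, best_score


def toy_quality(weight: float) -> float:
    """Hypothetical quality curve that peaks at weight 0.7 (illustration only)."""
    return -(weight - 0.7) ** 2


best_weight, best_score = grid_search_weight(toy_quality)
```

In a real setting `evaluate` would run the retrieval model with the candidate weight on a query whose relevant documents are known, so the search is only possible on training data.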
The self-adaptive personalized information retrieval method of the invention comprises:
a step of constructing a feature matrix from the current query information in combination with historical query information and historical click information;
a step of obtaining the parameter prediction model to be trained according to the feature matrix;
a step of training and applying the parameter prediction model according to the feature matrix to obtain the predicted parameters;
a step of organizing the current query, the historical queries, and the historical clicks with the predicted parameters, and combining the user model with the query model to form a personalized query model;
a step of finding, among the documents to be retrieved, the documents that match the personalized query model as the preliminary retrieval result, sorting the preliminary retrieval result by relevance, and outputting the sorted result as the final retrieval result.
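The final matching and sorting step can be illustrated with a small sketch. The source does not state how relevance is computed, so the scoring function here (negative cross-entropy between the personalized query model and an additively smoothed document model) is an assumption, as are the names `score_document` and `retrieve` and the toy documents.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple


def score_document(query_model: Dict[str, float], doc_tokens: List[str],
                   vocab_size: int, smoothing: float = 1.0) -> float:
    """Negative cross-entropy of a smoothed document model under the query model."""
    counts = Counter(doc_tokens)
    denom = len(doc_tokens) + smoothing * vocab_size
    return sum(p * math.log((counts[w] + smoothing) / denom)
               for w, p in query_model.items())


def retrieve(query_model: Dict[str, float],
             docs: Dict[str, List[str]]) -> List[Tuple[str, float]]:
    """Select matching documents, then sort them by relevance score."""
    vocab = set(query_model)
    for tokens in docs.values():
        vocab.update(tokens)
    # Preliminary result: documents sharing at least one word with the query model.
    candidates = {doc_id: tokens for doc_id, tokens in docs.items()
                  if any(w in query_model for w in tokens)}
    scored = [(doc_id, score_document(query_model, tokens, len(vocab)))
              for doc_id, tokens in candidates.items()]
    # Final result: candidates ordered from most to least relevant.
    return sorted(scored, key=lambda item: item[1], reverse=True)


model = {"jaguar": 0.5, "car": 0.3, "price": 0.2}
docs = {
    "d1": ["jaguar", "car", "price"],
    "d2": ["jaguar", "cat", "habitat"],
    "d3": ["weather", "forecast"],
}
ranking = retrieve(model, docs)
```

Here `d3` shares no word with the query model and is dropped at the preliminary stage, while `d1` ranks above `d2` because it covers more of the probability mass of the query model.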
The step of constructing the feature matrix from the current query information in combination with historical query information and historical click information comprises:
a step of generating user behavior features from the current query information, and
a step of constructing the feature matrix from all of the obtained user behavior features.
The step of training and applying the parameter prediction model according to the feature matrix to obtain the predicted parameters further comprises:
a step of receiving the data to be processed;
a step of computing the historical queries and historical clicks corresponding to each query and organizing them into the required data format;
a step of constructing the feature matrix;
a step of finding the optimal parameters of the current query by traversal search, the step size of the traversal being 0.1;
a step of using an SVM regression model to establish the mapping between user features and optimal parameters.
In the technical solution of the invention, the user behavior features comprise:
historical click features, representing the web documents the user viewed within a query session, that is, the web documents the user viewed within a very short time;
historical query features, representing the queries the user submitted to the retrieval system within a query session, that is, the queries submitted within a very short time;
current query features, representing the current query;
features between the current query and the historical queries, representing the relation between the current query and the historical queries;
features between the current query and the historical clicks, representing the relation between the current query and the historical clicks.
The specific content of these five classes of features is as follows:
The historical click features comprise: the total number of historical clicks; the total length of historical clicks; the mean historical click length (the mean of the total click length corresponding to each query); the average length of each click; the total length of the previous historical clicks; the number of documents clicked last time; and the mean length of the documents clicked last time;
The historical query features comprise: the total length of the historical queries, the average length of the historical queries, and the total number of historical queries;
The current query features comprise: the current query length;
The features between the current query and the historical queries comprise: comparing the current query words with the previous historical query, the recurrence probability of the newly added words in the previous historical click; comparing the current query with the previous query, the number of newly added words; comparing the current query words with the previous historical query, the percentage of the current query length accounted for by co-occurring words; the mean similarity between the current query and the historical queries; the maximum similarity between the current query and the historical queries; the similarity between the current query words and the previous historical query; comparing the current query with the previous historical query, the recurrence probability of the newly added words in the current query, the number of newly added words, and the total number of occurrences of the newly added words; comparing the current query words with the previous historical query, the recurrence probability of the deleted words in the previous historical query, the number of deleted words in the previous historical query, and the total number of occurrences of the deleted words in the previous historical query; comparing the current query with the previous historical query, the recurrence probability of the co-occurring words in the previous historical query, the number of co-occurring words in the previous historical query, and the total number of occurrences of the co-occurring words in the previous historical query;
The features between the current query and the historical clicks comprise: the mean similarity between the current query words and all historical clicks; the maximum similarity between the current query words and all historical clicks; the similarity between the current query words and the previous historical click; the number of words newly added in the current query relative to the previous historical click; the total number of occurrences of the newly added words in the previous historical click; comparing the current query words with the previous historical query, the recurrence probability of the deleted words in the previous historical click, the number of deleted words, the number of deleted words in the previous historical click, and the total number of occurrences of the deleted words in the previous historical click; comparing the current query words with the previous historical click, the recurrence probability of the co-occurring words in the previous historical click, the number of co-occurring words, the number of co-occurring words in the previous historical click, and the total number of occurrences of the co-occurring words in the previous historical click.
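A few of the features that compare the current query with the previous historical query can be computed as in the sketch below. It covers only a small subset of the listed features. The function names and dictionary keys are hypothetical, and because the source does not say which similarity measure is used, Jaccard similarity over word sets stands in as an assumption.

```python
from typing import Dict, List


def jaccard_similarity(a: List[str], b: List[str]) -> float:
    """Word-set overlap; an assumed stand-in for the unspecified similarity."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def query_change_features(current: List[str],
                          previous: List[str]) -> Dict[str, float]:
    """Compare the current query with the previous historical query."""
    cur_set, prev_set = set(current), set(previous)
    added = cur_set - prev_set      # newly added words
    deleted = prev_set - cur_set    # deleted words
    shared = cur_set & prev_set     # co-occurring words
    shared_tokens = sum(1 for w in current if w in shared)
    return {
        "current_query_length": float(len(current)),
        "num_added_words": float(len(added)),
        "num_deleted_words": float(len(deleted)),
        "num_shared_words": float(len(shared)),
        # Share of the current query length taken up by co-occurring words.
        "shared_fraction_of_current": shared_tokens / len(current) if current else 0.0,
        "similarity_to_previous": jaccard_similarity(current, previous),
    }


features = query_change_features(
    current=["python", "svm", "regression"],
    previous=["svm", "tutorial"],
)
```

One such dictionary per query, flattened into a fixed feature order, would form one row of the feature matrix.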
The user behavior features corresponding to each query are not necessarily the same, so the parameters in the corresponding query model are not necessarily the same either. The method of the invention, which allocates parameters dynamically for the specific retrieval context of each query, therefore comes closer to the way users actually search.
During actual retrieval, the feature weights obtained in training are invoked to predict the optimal parameters that the retrieval model should use. The invention uses the five classes of features involved in the retrieval information to jointly determine which of the three parts (the current query, the historical queries, and the historical clicks) expresses the user's search intent more accurately and how much each contributes to the current retrieval task, and then dynamically assigns weights to the three parts so as to obtain the optimal parameters.
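The dynamic weighting of the three parts can be sketched as a weighted mixture of word distributions. The source does not give the combination formula, so the linear interpolation of unigram models below is an assumption, and the names `unigram_model` and `personalized_query_model` are hypothetical. In the invention the three weights would come from the parameter prediction model; here they are passed in by hand.

```python
from collections import Counter
from typing import Dict, List


def unigram_model(tokens: List[str]) -> Dict[str, float]:
    """Maximum-likelihood word distribution of a token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}


def personalized_query_model(current: List[str],
                             history_queries: List[str],
                             history_clicks: List[str],
                             w_query: float, w_history: float,
                             w_clicks: float) -> Dict[str, float]:
    """Mix the three parts; weights are renormalized over non-empty parts."""
    parts = [
        (w_query, unigram_model(current)),
        (w_history, unigram_model(history_queries)),
        (w_clicks, unigram_model(history_clicks)),
    ]
    parts = [(w, m) for w, m in parts if w > 0 and m]
    total = sum(w for w, _ in parts)
    mixed: Dict[str, float] = {}
    for weight, model in parts:
        for word, prob in model.items():
            mixed[word] = mixed.get(word, 0.0) + (weight / total) * prob
    return mixed


mixed = personalized_query_model(
    current=["jaguar", "price"],
    history_queries=["jaguar", "car"],
    history_clicks=["jaguar", "car", "dealer", "car"],
    w_query=0.5, w_history=0.3, w_clicks=0.2,
)
```

Renormalizing over the non-empty parts means that a first query in a session, which has no history, falls back to the current query alone, matching the case in which the weight of historical information should be low.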
In summary, the parameters of the adaptive personalized retrieval model of the invention are all predicted for the current query model from each user's interaction behavior, and a machine learning algorithm is used in the prediction. Such a retrieval model can handle its parameters flexibly and therefore offers higher flexibility and retrieval accuracy.
The adaptive retrieval model of the invention continuously updates itself as the number of interactions between the user and the retrieval system grows. Historical information is weighted dynamically according to its time interval from the current moment, and the parameter that determines the decay amplitude is produced by the parameter prediction model. To compare the invention with mainstream techniques, the data of (Shen et al., 2005) were used, and the experimental setup is consistent with that paper. For the special case in which the importance of historical information does not vary with the time interval, the invention also compares the dynamic retrieval model with the fixed-coefficient retrieval model. Overall, as historical information accumulates, the retrieval effectiveness of the personalized retrieval models keeps improving and the gap between the models becomes more pronounced; see the following table:
For the fourth query Q4 submitted by the user in a query session, with the first query Q1, the second query Q2, and the third query Q3 likewise used as historical information, even when differences in the importance of the historical information are not considered, the proposed method under this condition (the AdaptiveEW result) achieves a relative improvement of 38.18% in MAP and 17.74% in PR@20 over the traditional model (BayesInt). When differences in importance among the historical information are considered, the proposed AdaptiveDW model improves on the BatchUp model by 27.54% in MAP and 15.94% in PR@20. The data show that the retrieval effectiveness of the self-adaptive personalized retrieval model proposed by the invention (AdaptiveDW) exceeds that of the best personalized retrieval model among current mainstream methods (the BatchUp mode).
In summary, the self-adaptive personalized retrieval model of the invention uses a separate parameter prediction model to produce the weights, balancing the flexibility and the soundness of weight allocation. On the same data set, the adaptive dynamic personalized retrieval model outperforms mainstream techniques in retrieval effectiveness, which confirms the validity of the technique proposed in this invention.
The specific effects of the invention are:
One, the invention is effective for both new and old queries submitted by the user.
An old query is one that has occurred in the user's search history; a new query is one the user submits for the first time. For an old query, historical information is available for reference, so the weight of historical information in a personalized retrieval model is increased and is usually set to a constant close to 1. For a new query, no history is available for reference, so the weight of historical information is reduced and is usually set to a constant close to 0. Unlike the prior art, the adaptive retrieval model of the invention does not need to first classify a query as new or old; it sets the parameters of the retrieval model flexibly and directly from user behavior features. The invention is therefore applicable to all types of user behavior features.
Two, the invention assigns weights dynamically according to user interaction behavior.
Brief description of the drawings
Fig. 1 is a schematic diagram of the principle of the invention. Fig. 2 is the information processing flow chart of the parameter prediction subsystem.
Embodiments
Embodiment one: the self-adaptive personalized information retrieval system of this embodiment comprises:
a data input subsystem for constructing a feature matrix from the current query information in combination with historical query information and historical click information, and for obtaining the parameter prediction model to be trained according to the feature matrix;
a parameter training and prediction subsystem for training and applying the parameter prediction model according to the feature matrix to obtain the predicted parameters;
a retrieval execution subsystem for organizing the current query, the historical queries, and the historical clicks with the predicted parameters, and for combining the user model with the query model to form a personalized query model;
a data output subsystem for finding, among the documents to be retrieved, the documents that match the personalized query as the preliminary retrieval result, for sorting the preliminary retrieval result by relevance, and for outputting the sorted result as the final retrieval result.
Embodiment two: this embodiment further defines the data input subsystem of the self-adaptive personalized information retrieval system described in embodiment one. The data input subsystem in this embodiment comprises:
a module for generating user behavior features from the current query information, and
a module for constructing the feature matrix from all of the obtained user behavior features.
Embodiment three: this embodiment further defines the parameter training and prediction subsystem of the self-adaptive personalized information retrieval system described in embodiment one. The parameter training and prediction subsystem in this embodiment comprises:
a data input module for receiving the data to be processed;
a module for computing the historical queries and historical clicks corresponding to each query and organizing them into the required data format;
a module for constructing the feature matrix;
a module for finding the optimal parameters of the current query by traversal search, the step size of the traversal being 0.1;
a module for using an SVM regression model to establish the mapping between user features and optimal parameters.
Embodiment four: this embodiment further describes the user behavior features in the self-adaptive personalized information retrieval system described in embodiment one. The user behavior features comprise:
historical click features, representing the web documents the user viewed within a query session, that is, the historical clicks the user viewed within a very short time;
historical query features, representing the queries the user submitted to the retrieval system within a query session, that is, the historical queries submitted within a very short time;
current query features, representing the current query;
features between the current query and the historical queries, representing the relation between the current query and the historical queries;
features between the current query and the historical clicks, representing the relation between the current query and the historical clicks.
Embodiment five: this embodiment further describes the self-adaptive personalized information retrieval system described in embodiment four, wherein:
The historical click features comprise: the total number of historical clicks; the total length of historical clicks (measured in single words/terms); the mean historical click length (the mean of the total click length corresponding to each query); the average length of each click; the total length of the previous historical clicks; the number of documents clicked last time; and the mean length of the documents clicked last time;
The historical query features comprise: the total length of the historical queries, the average length of the historical queries, and the total number of historical queries;
The current query features comprise: the current query length;
The features between the current query and the historical queries comprise: comparing the current query words with the previous historical query, the recurrence probability of the newly added words in the previous historical click; comparing the current query with the previous query, the number of newly added words; comparing the current query words with the previous historical query, the percentage of the current query length accounted for by co-occurring words; the mean similarity between the current query and the historical queries; the maximum similarity between the current query and the historical queries; the similarity between the current query words and the previous historical query; comparing the current query with the previous historical query, the recurrence probability of the newly added words in the current query, the number of newly added words, and the total number of occurrences of the newly added words; comparing the current query words with the previous historical query, the recurrence probability of the deleted words in the previous historical query, the number of deleted words in the previous historical query, and the total number of occurrences of the deleted words in the previous historical query; comparing the current query with the previous historical query, the recurrence probability of the co-occurring words in the previous historical query, the number of co-occurring words in the previous historical query, and the total number of occurrences of the co-occurring words in the previous historical query;
The features between the current query and the historical clicks comprise: the mean similarity between the current query words and all historical clicks; the maximum similarity between the current query words and all historical clicks; the similarity between the current query words and the previous historical click; the number of words newly added in the current query relative to the previous historical click; the total number of occurrences of the newly added words in the previous historical click; comparing the current query words with the previous historical query, the recurrence probability of the deleted words in the previous historical click, the number of deleted words, the number of deleted words in the previous historical click, and the total number of occurrences of the deleted words in the previous historical click; comparing the current query words with the previous historical click, the recurrence probability of the co-occurring words in the previous historical click, the number of co-occurring words, the number of co-occurring words in the previous historical click, and the total number of occurrences of the co-occurring words in the previous historical click.
Embodiment six: the self-adaptive personalized information retrieval method of this embodiment comprises:
a step of constructing a feature matrix from the current query information in combination with historical query information and historical click information;
a step of obtaining the parameter prediction model to be trained according to the feature matrix;
a step of training and applying the parameter prediction model according to the feature matrix to obtain the predicted parameters;
a step of organizing the current query, the historical queries, and the historical clicks with the predicted parameters, and combining the user model with the query model to form a personalized query model;
a step of finding, among the documents to be retrieved, the documents that match the personalized query model as the preliminary retrieval result, sorting the preliminary retrieval result by relevance, and outputting the sorted result as the final retrieval result.
Embodiment seven: this embodiment further defines, in the self-adaptive personalized information retrieval method described in embodiment six, the step of constructing the feature matrix from the current query information in combination with historical query information and historical click information. This step further comprises:
a step of generating user behavior features from the current query information, and
a step of constructing the feature matrix from all of the obtained user behavior features.
Embodiment eight: this embodiment further limits, in the adaptive personalized information retrieval method of embodiment six, the step of training and applying the parameter prediction model according to the feature matrix to obtain the predicted parameters. This step further comprises:
a step of receiving the data to be processed;
a step of computing the historical queries and historical clicks corresponding to each query and organizing them into the required data format;
a step of constructing the feature matrix;
a step of finding the optimal parameters for the current query by exhaustive (traversal) search, the step size of the traversal being 0.1;
a step of using an SVM regression model to establish the mapping between user features and optimal parameters.
Embodiment nine: this embodiment further limits the user behavior features in the adaptive personalized information retrieval method of embodiment six. The user behavior features comprise:
historical click features, representing the web documents the user has viewed within one query session, that is, the historical clicks the user viewed within a very short time;
historical query features, representing the historical queries the user has submitted to the retrieval system within one query session, that is, the historical queries the user submitted within a very short time;
current query features, representing the current query;
features between the current query and the historical queries, representing the relation between the current query and the historical queries;
features between the current query and the historical clicks, representing the relation between the current query and the historical clicks.
Embodiment ten: this embodiment further describes the five categories of technical features of embodiment nine:
The historical click features comprise: the total number of historical clicks; the total length of the historical clicks (measured in single words/terms); the mean historical click length (the mean of the total click length for each query); the average length per click; the total length of the previous historical click; the number of documents clicked last time; and the mean length of the documents clicked last time.
The historical query features comprise: the total length of the historical queries, the average length of the historical queries, and the total number of historical queries.
The current query features, representing the current query, comprise: the length of the current query.
The features between the current query and the historical queries comprise: the recurrence probability, in the previous historical click, of the terms added to the current query relative to the previous historical query; the number of terms added to the current query relative to the previous query; the percentage of the current query length accounted for by terms that co-occur in the current query and the previous historical query; the mean similarity between the current query and the historical queries; the maximum similarity between the current query and the historical queries; the similarity between the current query terms and the previous historical query; the recurrence probability, in the current query, of the terms added relative to the previous historical query; the number of added terms; the total number of occurrences of the added terms; the recurrence probability, in the previous historical query, of the terms deleted from the current query relative to the previous historical query; the number of deleted terms in the previous historical query; the total number of occurrences of the deleted terms in the previous historical query; the recurrence probability, in the previous historical query, of the terms that co-occur in the current query and the previous historical query; the number of co-occurring terms in the previous historical query; and the total number of occurrences of the co-occurring terms in the previous historical query.
The features between the current query and the historical clicks comprise: the mean similarity between the current query terms and all historical clicks; the maximum similarity between the current query terms and all historical clicks; the similarity between the current query terms and the previous historical click; the number of added terms between the current query and the previous historical click; the total number of occurrences of the added terms in the previous historical click; the recurrence probability, in the previous historical click, of the terms deleted from the current query relative to the previous historical query; the number of deleted terms; the number of deleted terms in the previous historical click; the total number of occurrences of the deleted terms in the previous historical click; the recurrence probability, in the previous historical click, of the terms that co-occur in the current query and the previous historical click; the number of co-occurring terms; the number of co-occurring terms in the previous historical click; and the total number of occurrences of the co-occurring terms in the previous historical click.
The input data of the invention are the consecutive query actions, ordered by time of occurrence, that each user performs to satisfy one search need. They include each query the user submits to the retrieval system, the documents returned by the retrieval system (including title and abstract), and the numbers of the documents the user viewed.
Taking the file query_history.topic2 as an example, the data format is:
The retrieval results for the query string "acquisition u.s.foreign company" are recorded between <retrieval result> and </retrieval result>. The order in which the document numbers appear reflects the ranking of the documents in the results returned by the retrieval system. The click set records the numbers of the documents the user clicked to view.
The step of constructing the feature matrix from the current query information in combination with historical query information and historical click information is as follows:
After the data are input, feature extraction is performed. The relations among the current query, the historical queries, and the historical clicks within a query session are analyzed, and 39 search behavior features in five categories are finally extracted for each user at the time each query is submitted:
Historical click features, representing the web documents the user has viewed within one query session, comprising:
Total number of historical clicks
Total length of the historical clicks
Mean historical click length (the mean of the total click length for each query)
Average length per click
Total length of the previous historical click
Number of documents clicked last time
Mean length of the documents clicked last time
Historical query features, representing the queries the user has submitted to the retrieval system within one query session, comprising:
Total length of the historical queries
Mean length of the historical queries
Number of historical queries
Current query features, representing the current query, comprising:
Length of the current query
Features between the current query and the historical clicks, representing the relation between the current query and the historical clicks, comprising:
Mean similarity between the current query terms and all historical clicks
Maximum similarity between the current query terms and all historical clicks
Similarity between the current query terms and the previous historical click
Number of added terms between the current query and the previous historical click
Total number of occurrences of the added terms in the previous historical click
Recurrence probability, in the previous historical click, of the terms deleted from the current query relative to the previous historical query
Number of deleted terms
Number of deleted terms in the previous historical click
Total number of occurrences of the deleted terms in the previous historical click
Recurrence probability, in the previous historical click, of the terms that co-occur in the current query and the previous historical click
Number of co-occurring terms
Number of co-occurring terms in the previous historical click
Total number of occurrences of the co-occurring terms in the previous historical click
Features between the current query and the historical queries, representing the relation between the current query and the historical queries, comprising:
Recurrence probability, in the previous historical click, of the terms added to the current query relative to the previous historical query
Number of terms added to the current query relative to the previous query
Percentage of the current query length accounted for by terms that co-occur in the current query and the previous historical query
Mean similarity between the current query and the historical queries
Maximum similarity between the current query and the historical queries
Similarity between the current query terms and the previous historical query
Recurrence probability, in the current query, of the terms added relative to the previous historical query
Number of added terms
Total number of occurrences of the added terms
Recurrence probability, in the previous historical query, of the terms deleted from the current query relative to the previous historical query
Number of deleted terms in the previous historical query
Total number of occurrences of the deleted terms in the previous historical query
Recurrence probability, in the previous historical query, of the terms that co-occur in the current query and the previous historical query
Number of co-occurring terms in the previous historical query
Total number of occurrences of the co-occurring terms in the previous historical query
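As an illustration of the term-comparison features listed above, the following minimal sketch counts the added, deleted, and co-occurring terms between the current query and the previous historical query. The function name, the dictionary keys, and the whitespace tokenization are assumptions made for this example; the patent does not specify how queries are tokenized or how the features are named.

```python
from collections import Counter


def term_change_features(current_query, previous_query):
    """Compare two queries and count added, deleted, and co-occurring terms.

    Tokenization by lowercase whitespace splitting is an assumption.
    """
    cur = current_query.lower().split()
    prev = previous_query.lower().split()
    cur_set, prev_set = set(cur), set(prev)

    added = cur_set - prev_set      # terms only in the current query
    deleted = prev_set - cur_set    # terms only in the previous query
    shared = cur_set & prev_set     # terms present in both queries
    prev_counts = Counter(prev)

    return {
        "current_query_length": len(cur),
        "num_added_terms": len(added),
        "num_deleted_terms": len(deleted),
        "num_cooccurring_terms": len(shared),
        # Total occurrences of the shared terms in the previous query.
        "cooccurring_occurrences_in_previous": sum(prev_counts[t] for t in shared),
        # Share of the current query length covered by shared terms.
        "cooccurring_fraction_of_current": (
            sum(1 for t in cur if t in shared) / len(cur) if cur else 0.0
        ),
    }


features = term_change_features(
    "acquisition u.s. foreign company", "foreign company takeover"
)
```

The same set operations can be applied to the text of a clicked document instead of the previous query to obtain the corresponding click-side counts.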
In addition, the optimal weight value of each query is calculated. The 39 features and the optimal weight value together make up the training data of the parameter prediction model. In the training data, the lines beginning with @ give the file name of the training data, the name of each feature, and the attribute (type) description of the corresponding feature. The part below @DATA is the feature matrix (this format can be used directly as input to existing SVM regression toolkits).
Taking q2 as an example, the corresponding training data is:
@RELATION q2.arff
@ATTRIBUTE cqlenth numeric
@ATTRIBUTE class numeric
@DATA
3,2,20,20,10,20,0,2,0.0869565217391304,0.0869565217391304,0.0869565217391304,0,0,0,2,2,1,0.4,0.4,0.4,0.333333333333333,0,0.5,0.4
4,3,2,2,0.666666666666667,2,1,2,0,0,0,0,0,0,2,2,1,0.333333333333333,0.333333333333333,0.333333333333333,0.25,0,0.5,0.4
The first line of the above training data gives the file name "q2.arff", with the keyword "RELATION". The second line describes the first feature, "length of the current query", with the keyword "ATTRIBUTE". The remaining features are described in the same way, for 39 feature descriptions in total.
The next line declares the type of the optimal parameter, a decimal between 0 and 1, again with the keyword "ATTRIBUTE". The feature matrix follows @DATA; that is, the feature matrix is the content of the training data file other than the lines beginning with @. It consists of the 39 user behavior features described above and the corresponding optimal parameter. Each row has 40 data items: the first 39 are feature values and the 40th is the optimal parameter. Each training example is thus represented by one 40-item vector, which forms one row of the feature matrix, and the number of training examples determines the number of rows. Data items are separated by commas.
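The file layout described above can be sketched as a small writer. This is only an illustration under stated assumptions: the function name `build_arff` and the placeholder feature names are hypothetical, and a target of `None` is written as "?" to mark a value to be predicted, following the test data convention the description gives.

```python
def build_arff(relation, feature_names, rows):
    """Return ARFF-style text: '@' header lines, then comma-separated rows.

    rows is a list of (feature_values, target) pairs; a target of None
    is written as '?' (a value to be predicted).
    """
    lines = ["@RELATION " + relation]
    lines += ["@ATTRIBUTE %s numeric" % name for name in feature_names]
    lines.append("@ATTRIBUTE class numeric")  # the optimal parameter column
    lines.append("@DATA")
    for features, target in rows:
        if len(features) != len(feature_names):
            raise ValueError("feature count does not match the header")
        label = "?" if target is None else repr(float(target))
        lines.append(",".join(str(v) for v in features) + "," + label)
    return "\n".join(lines) + "\n"


# Hypothetical placeholder names for the 39 features.
names = ["f%d" % i for i in range(1, 40)]
text = build_arff("q2.arff", names, [([1] * 39, 0.4), ([2] * 39, None)])
```

Each data row produced this way has 40 comma-separated items, 39 feature values followed by the target.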
Based on the above training data, the machine learning method SVM regression (SVM-Regression) is used to train the parameter prediction model. This model represents the functional relationship between the optimal weight and the features.
The search target is the maximum MAP (mean average precision) value of each query, and the step size of the traversal is 0.1. Support Vector Regression (SVR) (Chang and Lin, 2001) is used for training to determine the functional relationship between the optimal weight of each query and the 39 features, which yields the trained parameter prediction model.
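The traversal search for the optimal weights can be sketched as a plain grid search. In this sketch `evaluate` is a hypothetical callback standing in for "run retrieval with these weights and return the evaluation score of the query"; searching two weights jointly and including the endpoints 0 and 1 in the grid are assumptions, since the description only fixes the step size of 0.1. The SVR training itself is not shown.

```python
def best_weights(evaluate, step=0.1):
    """Exhaustively search (alpha, beta) on a grid and keep the best pair.

    evaluate(alpha, beta) must return a score where larger is better.
    Returns (alpha, beta, score).
    """
    n = int(round(1.0 / step))
    # Rounding removes floating-point drift such as 0.30000000000000004.
    grid = [round(i * step, 10) for i in range(n + 1)]
    best = (None, None, float("-inf"))
    for alpha in grid:
        for beta in grid:
            score = evaluate(alpha, beta)
            if score > best[2]:
                best = (alpha, beta, score)
    return best


# Toy scoring function with a known optimum, used only for illustration.
alpha, beta, score = best_weights(lambda a, b: -((a - 0.7) ** 2 + (b - 0.2) ** 2))
```

The pair found for each training query would then serve as the regression target paired with that query's 39 feature values.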
When the parameter prediction model is applied for prediction, the 39 feature values of each test query are input and the model produces the corresponding weight value. The current query, the historical queries, and the historical clicks are combined in this way. The test data format is essentially the same as the training data format; the difference is that the last column of each feature vector in the test data is "?", which denotes the value to be predicted. The test data format is:
The main task of the retrieval execution subsystem is to take the TREC AP88-90 documents as the documents to be retrieved, build an index with Lemur, and then complete the retrieval task within the conventional language modeling framework.
Based on the prediction results produced by applying the parameter prediction model in the previous step, the current query and the historical information are organized to form the personalized query model.
If the current query is the k-th query Q_k in the query session, the user interest represented by the short-term historical queries is reflected in the mean occurrence probability of the terms in the historical queries Q_i (1 ≤ i ≤ k-1). Similarly, the user's short-term interest is also reflected in the mean occurrence probability of the terms in the historical clicks C_i (1 ≤ i ≤ k-1). The query history consists of the historical queries H_Q and the historical clicks H_C. A query term is denoted by ω.
A) Compute the current query model
$$p(\omega \mid Q_i) = \frac{c(\omega, Q_i)}{|Q_i|} \qquad (1)$$
The parameters in the formula are as follows: ω denotes a term, Q_i denotes a query, p denotes a probability, and i denotes the i-th query. The current query model is determined by the number of occurrences of each term in the current query and by the length of the current query. p(ω | Q_i) is the probability of each term ω in query Q_i. c(ω, Q_i) is the number of times the term ω occurs in query Q_i. |Q_i| is the length of query Q_i, that is, the number of terms it contains. In other words, the probability of a term in the query string is the number of times the term occurs in the query divided by the total number of terms in the query.
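Formula (1) is a maximum-likelihood unigram estimate and can be sketched in a few lines. The function name `query_model` and the representation of a query as a list of terms are assumptions for this example.

```python
from collections import Counter


def query_model(query_terms):
    """Formula (1): p(w | Q) = c(w, Q) / |Q| for every term w in the query."""
    total = len(query_terms)
    if total == 0:
        return {}
    counts = Counter(query_terms)
    return {term: count / total for term, count in counts.items()}


model = query_model(["acquisition", "u.s.", "foreign", "company"])
# Each of the four distinct terms receives probability 1/4.
```

The probabilities of one query always sum to 1, because every term occurrence is counted exactly once in the numerator.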
B) Compute the historical query model
$$p(\omega \mid H_Q) = \frac{1}{k-1} \sum_{i=1}^{k-1} p(\omega \mid Q_i) \qquad (2)$$
The parameters in the formula are as follows: ω denotes a term, Q_i denotes a query, p denotes a probability, H_Q denotes all historical queries, and i denotes the i-th query. The historical query model p(ω | H_Q) is obtained by summing the individual historical query models p(ω | Q_i) and averaging. For the current query Q_k, the historical queries are Q_1, Q_2, ..., Q_{k-1}. The individual historical query models p(ω | Q_i) are summed and the sum is divided by the number of historical queries, k-1. Each individual model p(ω | Q_i) is computed according to formula (1). In other words, to compute the probability of a single term ω over the whole query history H_Q, first compute for each historical query the number of occurrences of the term divided by the total number of terms in that query, then sum the k-1 probabilities, and finally divide by k-1.
C) Compute the historical click model
$$p(\omega \mid H_C) = \frac{1}{k-1} \sum_{i=1}^{k-1} p(\omega \mid C_i) \qquad (3)$$
The parameters in the formula are as follows: ω denotes a term, C_i denotes a web document the user has viewed, p denotes a probability, H_C denotes all web documents the user has viewed, and i denotes the i-th click. As with the historical query model, the historical click model p(ω | H_C) is obtained by summing the individual historical click models p(ω | C_i) and averaging. For the current query Q_k, the historical clicks are C_1, C_2, ..., C_{k-1}. The individual historical click models p(ω | C_i) are summed and the sum is divided by the number of historical clicks, k-1. Each individual historical click model is computed according to formula (1), applied to C_i.
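Formulas (2) and (3) have the same shape: an equally weighted average of per-item unigram models. A minimal sketch, assuming each historical query or clicked document is given as a list of terms; the function name `history_model` is hypothetical, and treating an empty item as contributing zero mass while still counting toward k-1 is an assumption.

```python
from collections import Counter, defaultdict


def history_model(items):
    """Formulas (2)/(3): average the per-item models p(w | item).

    items is a list of term lists, one per historical query or click.
    """
    if not items:
        return {}
    accumulated = defaultdict(float)
    for terms in items:
        if not terms:
            continue  # assumption: an empty item adds no probability mass
        length = len(terms)
        for term, count in Counter(terms).items():
            accumulated[term] += count / length  # formula (1) for this item
    return {term: mass / len(items) for term, mass in accumulated.items()}


h_q = history_model([["foreign", "company"], ["foreign", "foreign"]])
# "foreign" averages 0.5 and 1.0; "company" averages 0.5 and 0.0.
```

Passing the term lists of the clicked documents instead of the historical queries gives the historical click model of formula (3).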
D) Extract the current query features
These mainly comprise the length of the current query.
E) Extract the historical query features
These mainly comprise the number, total length, and average length of the historical queries.
F) Extract the features between the current query and the historical queries
These mainly comprise the similarity between the current query and the previous query, the similarity between the current query and all historical queries, the numbers of added and deleted terms, and their proportion in the current query or the historical queries.
G) Extract the features between the current query and the historical clicks
These mainly comprise the similarity between the current query and all historical clicks and the previous historical click, the numbers of added and deleted terms, and their proportion in the current query and the historical clicks.
H) Obtain the parameters with the parameter prediction model
The user features are the input of the parameter prediction system, which outputs the best parameters for the current query.
I) Organize the current query model, the historical query model, and the historical click model according to the predicted parameters
The parameter β_k, which takes values between 0 and 1, determines how weight is allocated between the historical queries and the historical clicks: the larger β_k is, the more important the historical clicks are. When β_k = 1, the user interest model is represented entirely by the historical clicks. Likewise, the larger α_k is, the more important the current query is.
The adaptive personalized retrieval model was tried in two variants. The first is the retrieval model in which all history items are equally important (AdaptiveEW), formalized in formula (4). The second is the retrieval model in which the importance of a history item grows as its time distance from the current query shrinks (AdaptiveDW), formalized in formula (5). Here Q_k denotes the current query, H_C denotes the historical clicks that precede the current query in the current query session, and H_Q denotes the historical queries in the current query session. The parameters α_k, β_k, m_k, and n_k are weights, each of which may be any decimal between 0 and 1.
The query model p(ω | θ_k) of the adaptive personalized retrieval model AdaptiveEW comprises two parts: the current query model p(ω | Q_k) and the history model. The weight of the current query model is α_k and the weight of the history model is 1-α_k. The current query model gives the occurrence probability of the current query term ω and is computed according to formula (1). The historical information consists of the historical click model p(ω | H_C) and the historical query model p(ω | H_Q). The historical query model is computed according to formula (2) and the historical click model according to formula (3). All historical queries have equal weight, and all historical clicks have equal weight. The weight of the historical query model is 1-β_k and the weight of the historical click model is β_k, as shown in formula (4).
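$$p(\omega \mid \theta_k) = \alpha_k\, p(\omega \mid Q_k) + (1-\alpha_k)\left[\beta_k\, p(\omega \mid H_C) + (1-\beta_k)\, p(\omega \mid H_Q)\right] \qquad (4)$$
In formula (4), α_k is the weight of the current query model p(ω | Q_k), β_k is the weight of the historical click model p(ω | H_C) within the history model, and 1-β_k is the weight of the historical query model p(ω | H_Q).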
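The interpolation in formula (4) can be sketched directly on term-probability dictionaries. The function name `adaptive_ew` and the dictionary representation of the three models are assumptions for this example; terms missing from a model are treated as having probability 0.

```python
def adaptive_ew(p_query, p_clicks, p_queries, alpha, beta):
    """Formula (4): mix the current query, click, and query-history models.

    p_query   -- p(w | Q_k), the current query model
    p_clicks  -- p(w | H_C), the historical click model
    p_queries -- p(w | H_Q), the historical query model
    """
    vocabulary = set(p_query) | set(p_clicks) | set(p_queries)
    return {
        w: alpha * p_query.get(w, 0.0)
        + (1 - alpha) * (
            beta * p_clicks.get(w, 0.0) + (1 - beta) * p_queries.get(w, 0.0)
        )
        for w in vocabulary
    }


theta = adaptive_ew(
    {"a": 1.0}, {"b": 1.0}, {"a": 0.5, "c": 0.5}, alpha=0.6, beta=0.5
)
```

Because each input model sums to 1 and the weights are convex, the mixed model also sums to 1.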
The above is the adaptive retrieval model AdaptiveEW, in which all historical information has equal weight. The other adaptive retrieval model holds that the importance of a piece of historical information depends on its time distance from the current query. The query model p(ω | ψ_k) of this adaptive retrieval model (AdaptiveDW) comprises two parts: the query model p(ω | θ_k) and the historical click model p(ω | H_C). The weight of the historical click model p(ω | H_C) is m_k and the weight of the query model p(ω | θ_k) is 1-m_k. The query model p(ω | θ_k) in turn consists of the current query model p(ω | Q_k) and the query model of the previous moment p(ω | θ_{k-1}). The weight of the current query model is n_k and the weight of the query model of the previous moment is 1-n_k. The historical query model is computed according to formula (2) and the historical click model according to formula (3). In AdaptiveDW the weight of older query models decays as time passes, so a newer historical query has a larger weight than an older one; the formalization is shown in formula (5).
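$$p(\omega \mid \theta_k) = n_k\, p(\omega \mid Q_k) + (1-n_k)\, p(\omega \mid \theta_{k-1}) \qquad (5)$$
In formula (5), n_k is the weight of the current query model p(ω | Q_k) and 1-n_k is the weight of the query model of the previous moment p(ω | θ_{k-1}).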
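The recursion in formula (5), together with the mixing of the query model and the historical click model described for AdaptiveDW (weights 1-m_k and m_k), can be sketched as follows. The function names are hypothetical, and initializing the recursion with θ_1 equal to the first query model is an assumption, since the description does not state the base case.

```python
def decayed_query_model(query_models, n_weights):
    """Formula (5): theta_k = n_k * p(w|Q_k) + (1 - n_k) * theta_{k-1}.

    query_models -- list of p(w | Q_i) dictionaries, oldest first
    n_weights    -- list of n_i values, one per query (n_1 is unused)
    """
    theta = {}
    for index, (p_query, n_k) in enumerate(zip(query_models, n_weights)):
        if index == 0:
            theta = dict(p_query)  # assumption: theta_1 is the first query model
            continue
        vocabulary = set(theta) | set(p_query)
        theta = {
            w: n_k * p_query.get(w, 0.0) + (1 - n_k) * theta.get(w, 0.0)
            for w in vocabulary
        }
    return theta


def adaptive_dw(theta, p_clicks, m_k):
    """Mix the click model (weight m_k) with the query model (weight 1 - m_k)."""
    vocabulary = set(theta) | set(p_clicks)
    return {
        w: m_k * p_clicks.get(w, 0.0) + (1 - m_k) * theta.get(w, 0.0)
        for w in vocabulary
    }


theta_3 = decayed_query_model([{"a": 1.0}, {"b": 1.0}, {"c": 1.0}], [1.0, 0.5, 0.5])
psi_3 = adaptive_dw(theta_3, {"d": 1.0}, m_k=0.2)
```

With a constant n_k of 0.5, the oldest query "a" ends with a quarter of the mass while the newest query "c" keeps half, which is the decay behaviour the description attributes to AdaptiveDW.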
J) Start the retrieval process
Among the documents to be retrieved, the retrieval results that match the personalized query are found and sorted in descending order of relevance probability. Each query returns 1000 documents.
After the personalized query is submitted to the retrieval system, the retrieval system returns the retrieval results. The data format of the personalized retrieval results is:
The first column is the query number, the second column is the document number, the third column is the rank, and the fourth column is the language model score. This completes the implementation of the adaptive personalized retrieval model.
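The ranking and four-column output described above can be sketched as follows. The function name `format_results`, the space separator, ranks starting at 1, the six-decimal score format, and the sample document numbers are all assumptions for this example; the description only fixes the column order, the descending sort, and the limit of 1000 documents per query.

```python
def format_results(query_id, scored_docs, limit=1000):
    """Sort documents by score (descending) and emit four-column lines.

    scored_docs is a list of (document_number, score) pairs.
    Columns: query number, document number, rank, language model score.
    """
    ranked = sorted(scored_docs, key=lambda item: item[1], reverse=True)[:limit]
    return [
        "%s %s %d %.6f" % (query_id, doc_id, rank, score)
        for rank, (doc_id, score) in enumerate(ranked, start=1)
    ]


# Hypothetical document numbers and scores, for illustration only.
lines = format_results("q2", [("DOC-A", -5.2), ("DOC-B", -3.1)])
```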

Claims (8)

1. An adaptive personalized information retrieval system, characterized in that the system comprises:
a data input subsystem for constructing a feature matrix from the current query information in combination with historical query information and historical click information, and for obtaining, from the feature matrix, the parameter prediction model to be trained;
a parameter training and prediction subsystem for training and applying the parameter prediction model according to the feature matrix to obtain the predicted parameters;
a retrieval execution subsystem for organizing the current query, the historical queries, and the historical clicks with the predicted parameters, and for combining the user model and the query model to form a personalized query model;
a data output subsystem for finding, among the documents to be retrieved, the documents that match the personalized query as preliminary retrieval results, for ranking the preliminary retrieval results by relevance, and for outputting the ranked results as the final retrieval results;
wherein the parameter training and prediction subsystem comprises:
a data input module for receiving the data to be processed;
a module for computing the historical queries and historical clicks corresponding to each query and organizing them into the required data format;
a module for constructing the feature matrix;
a module for finding the optimal parameters for the current query by exhaustive (traversal) search, the step size of the traversal being 0.1;
a module for using an SVM regression model to establish the mapping between user features and optimal parameters.
2. The adaptive personalized information retrieval system according to claim 1, characterized in that the data input subsystem comprises:
a module for generating user behavior features from the current query information, and
a module for constructing the feature matrix from all of the user behavior features obtained.
3. The adaptive personalized information retrieval system according to claim 2, characterized in that the user behavior features comprise:
historical click features, representing the web documents the user has viewed within one query session;
historical query features, representing the queries the user has submitted to the retrieval system within one query session;
current query features, representing the current query;
features between the current query and the historical queries, representing the relation between the current query and the historical queries;
features between the current query and the historical clicks, representing the relation between the current query and the historical clicks.
4. The adaptive personalized information retrieval system according to claim 3, characterized in that:
the historical click features comprise: the total number of historical clicks; the total length of the historical clicks; the mean historical click length; the average length per click; the total length of the previous historical click; the number of documents clicked last time; and the mean length of the documents clicked last time;
the historical query features comprise: the total length of the historical queries, the average length of the historical queries, and the total number of historical queries;
the current query features, representing the current query, comprise: the length of the current query;
the features between the current query and the historical queries comprise: the recurrence probability, in the previous historical click, of the terms added to the current query relative to the previous historical query; the number of terms added to the current query relative to the previous query; the percentage of the current query length accounted for by terms that co-occur in the current query and the previous historical query; the mean similarity between the current query and the historical queries; the maximum similarity between the current query and the historical queries; the similarity between the current query terms and the previous historical query; the recurrence probability, in the current query, of the terms added relative to the previous historical query; the number of added terms; the total number of occurrences of the added terms; the recurrence probability, in the previous historical query, of the terms deleted from the current query relative to the previous historical query; the number of deleted terms in the previous historical query; the total number of occurrences of the deleted terms in the previous historical query; the recurrence probability, in the previous historical query, of the terms that co-occur in the current query and the previous historical query; the number of co-occurring terms in the previous historical query; and the total number of occurrences of the co-occurring terms in the previous historical query;
the features between the current query and the historical clicks comprise: the mean similarity between the current query terms and all historical clicks; the maximum similarity between the current query terms and all historical clicks; the similarity between the current query terms and the previous historical click; the number of added terms between the current query and the previous historical click; the total number of occurrences of the added terms in the previous historical click; the recurrence probability, in the previous historical click, of the terms deleted from the current query relative to the previous historical query; the number of deleted terms; the number of deleted terms in the previous historical click; the total number of occurrences of the deleted terms in the previous historical click; the recurrence probability, in the previous historical click, of the terms that co-occur in the current query and the previous historical click; the number of co-occurring terms; the number of co-occurring terms in the previous historical click; and the total number of occurrences of the co-occurring terms in the previous historical click.
5. An adaptive personalized information retrieval method, characterized in that the method comprises:
a step of constructing a feature matrix from the current query information in combination with historical query information and historical click information;
a step of obtaining, from the feature matrix, the parameter prediction model to be trained;
a step of training and applying the parameter prediction model according to the feature matrix to obtain the predicted parameters;
a step of organizing the current query, the historical queries, and the historical clicks with the predicted parameters, and combining the user model and the query model to form a personalized query model;
a step of finding, among the documents to be retrieved, the documents that match the personalized query model as preliminary retrieval results, ranking the preliminary retrieval results by relevance, and outputting the ranked results as the final retrieval result data;
wherein the step of training and applying the parameter prediction model according to the feature matrix to obtain the predicted parameters further comprises:
a step of receiving the data to be processed;
a step of computing the historical queries and historical clicks corresponding to each query and organizing them into the required data format;
a step of constructing the feature matrix;
a step of finding the optimal parameters for the current query by exhaustive (traversal) search, the step size of the traversal being 0.1;
a step of using an SVM regression model to establish the mapping between user features and optimal parameters.
6. The adaptive personalized information retrieval method according to claim 5, characterized in that the step of constructing a feature matrix from the current query information in combination with historical query information and historical click information comprises:
a step of generating user behavior features from the current query information, and
a step of constructing the feature matrix from all of the user behavior features obtained.
7. The adaptive personalized information retrieval method according to claim 6, characterized in that the user behavior features comprise:
historical click features, representing the web documents the user has viewed within one query session;
historical query features, representing the queries the user has submitted to the retrieval system within one query session;
current query features, representing the current query;
features between the current query and the historical queries, representing the relation between the current query and the historical queries;
features between the current query and the historical clicks, representing the relation between the current query and the historical clicks.
8. self-adaptation Personalized search according to claim 7, is characterized in that,
Described history is clicked category feature and is comprised: history click total degree, and history clicks total length, and history clicks length mean value, clicks average length at every turn, and a upper history clicks total length, last click number of documents, the last mean value clicking Document Length;
Described historical query category feature comprises: historical query total length, the average length of historical query and historical query total quantity;
The current queries category feature of described expression current queries comprises: current queries length;
Feature between described current queries and historical query comprises: current queries word is compared with a upper historical query, the recurrence probability that new epexegesis and a upper history are clicked, current queries and a upper inquiry are compared, the quantity of new epexegesis, current queries word is compared with a upper historical query, co-occurrence word accounts for the number percent of current queries length, the similarity average of current queries and historical query, the similarity maximal value of current queries and historical query, the similarity of current queries word and a upper historical query, current queries is compared with a upper historical query, the recurrence probability of new epexegesis and current queries, new epexegesis quantity, the number of times summation that new epexegesis occurs, current queries word is compared with a upper historical query, delete the recurrence probability of word and a upper historical query, the quantity of word is deleted in a upper historical query, the number of times summation that word occurs is deleted in a upper historical query, current queries is compared with a upper historical query, the recurrence probability of co-occurrence word and a upper historical query, the quantity of co-occurrence word in a upper historical query, the number of times summation that in a upper historical query, co-occurrence word occurs,
The features between the current query and the history clicks comprise: the mean similarity between the current query words and all history clicks; the maximum similarity between the current query words and all history clicks; the similarity between the current query words and the previous history click; the number of words newly added in the current query relative to the previous history click; the sum of the occurrence counts of the newly added words in the previous history click; for the words deleted from the current query relative to the previous historical query, their recurrence probability in the previous history click, the number of deleted words, the number of deleted words found in the previous history click, and the sum of their occurrence counts in the previous history click; and, for the words that co-occur in the current query and the previous history click, their recurrence probability in the previous history click, the number of co-occurring words, the number of co-occurring words in the previous history click, and the sum of their occurrence counts in the previous history click.
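By way of illustration only, a few of the features between the current query and the historical queries could be computed as sketched below. The sketch is not part of the claims. The function names (`tokenize`, `cosine_similarity`, `query_history_features`), the whitespace tokenizer, and the use of bag-of-words cosine similarity are all assumptions made for the example; the claims name the features but do not fix a word segmenter or a similarity measure.

```python
from collections import Counter
from math import sqrt


def tokenize(text):
    """Lowercase and split on whitespace (assumed stand-in for a real word segmenter)."""
    return text.lower().split()


def cosine_similarity(a_tokens, b_tokens):
    """Bag-of-words cosine similarity (an assumed choice of similarity measure)."""
    a, b = Counter(a_tokens), Counter(b_tokens)
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def query_history_features(current_query, historical_queries):
    """Compute a subset of the current-query vs. historical-query features.

    `historical_queries` is assumed to be ordered oldest first, so the last
    element is the previous historical query.
    """
    cur = tokenize(current_query)
    history = [tokenize(q) for q in historical_queries]
    prev = history[-1] if history else []
    cur_set, prev_set = set(cur), set(prev)

    added = cur_set - prev_set     # words newly added in the current query
    deleted = prev_set - cur_set   # words deleted relative to the previous query
    shared = cur_set & prev_set    # words co-occurring in both queries

    sims = [cosine_similarity(cur, h) for h in history]
    prev_counts = Counter(prev)
    shared_in_current = sum(1 for w in cur if w in shared)

    return {
        "current_query_length": len(cur),
        "historical_query_count": len(history),
        "historical_query_total_length": sum(len(h) for h in history),
        "added_word_count": len(added),
        "deleted_word_count": len(deleted),
        "cooccurring_word_count": len(shared),
        "cooccurring_fraction_of_current": shared_in_current / len(cur) if cur else 0.0,
        "cooccurring_occurrences_in_previous": sum(prev_counts[w] for w in shared),
        "similarity_mean": sum(sims) / len(sims) if sims else 0.0,
        "similarity_max": max(sims, default=0.0),
        "similarity_to_previous": sims[-1] if sims else 0.0,
    }
```

For example, `query_history_features("cheap flights paris", ["hotels paris", "cheap hotels paris"])` treats "flights" as the newly added word, "hotels" as the deleted word, and "cheap" and "paris" as the co-occurring words. The click features could be computed in the same manner by substituting the text of clicked documents for the historical queries, which is again an assumption about the representation.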
CN201210244519.5A 2012-07-16 2012-07-16 Self-adaptive personalized information retrieval system and method Active CN102779193B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201210244519.5A CN102779193B (en) 2012-07-16 2012-07-16 Self-adaptive personalized information retrieval system and method

Publications (2)

Publication Number Publication Date
CN102779193A CN102779193A (en) 2012-11-14
CN102779193B CN102779193B (en) 2015-05-13

Family

ID=47124105

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201210244519.5A Active CN102779193B (en) 2012-07-16 2012-07-16 Self-adaptive personalized information retrieval system and method

Country Status (1)

Country Link
CN (1) CN102779193B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101127043A (en) * 2007-08-03 2008-02-20 哈尔滨工程大学 Lightweight individualized search engine and its searching method
CN102346899A (en) * 2011-10-08 2012-02-08 亿赞普(北京)科技有限公司 Method and device for predicting advertisement click rate based on user behaviors

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1858733B (en) * 2005-11-01 2012-04-04 华为技术有限公司 Information searching system and searching method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20200330

Address after: No. 118 West Dazhi Street, Nangang District, Harbin, Heilongjiang 150001

Patentee after: Harbin University of technology high tech Development Corporation

Address before: No. 92 West Dazhi Street, Nangang District, Harbin 150001

Patentee before: HARBIN INSTITUTE OF TECHNOLOGY