CN111125538B - Searching method for enhancing personalized retrieval effect by utilizing entity information - Google Patents

Searching method for enhancing personalized retrieval effect by utilizing entity information Download PDF

Info

Publication number
CN111125538B
CN111125538B CN201911413378.3A CN201911413378A CN111125538B CN 111125538 B CN111125538 B CN 111125538B CN 201911413378 A CN201911413378 A CN 201911413378A CN 111125538 B CN111125538 B CN 111125538B
Authority
CN
China
Prior art keywords
entity
user
query
history
vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201911413378.3A
Other languages
Chinese (zh)
Other versions
CN111125538A (en
Inventor
窦志成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Renmin University of China
Original Assignee
Renmin University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Renmin University of China filed Critical Renmin University of China
Priority to CN201911413378.3A priority Critical patent/CN111125538B/en
Publication of CN111125538A publication Critical patent/CN111125538A/en
Application granted granted Critical
Publication of CN111125538B publication Critical patent/CN111125538B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/22Matching criteria, e.g. proximity measures
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a searching method for enhancing personalized retrieval effect by utilizing entity information, which comprises the following steps of 1, personalizing entity links, wherein the personalized entity links utilize historic promotion to query entity link effect and utilize entity enhancement models to carry out user intention modeling; step 2, constructing a user preference portrait, wherein the user preference portrait is constructed based on predicted intention, and a fine user preference portrait enhanced by a memory neural network construction entity by utilizing historical entity information; and step 3, obtaining personalized relevance according to the user intention model and the fine user preference portrait model and sequencing.

Description

Searching method for enhancing personalized retrieval effect by utilizing entity information
Technical Field
The present invention relates to a search method, and more particularly, to a search method for enhancing personalized search results using entity information.
Background
Personalized search has been widely focused, and aims to assist in judging the intention and preference of a user in current query by using the historical behaviors of the user, so that different search result sequences are returned to different users, and user experience is improved. Because of ambiguity and the general shortness and shortness of queries, the queries issued by users often cannot fully express their true intent, but even the same intent, different users may have different preferences, and thus personalization of search results is necessary.
In the prior art, many features such as document topics or sub-topics and the number of clicks of a user are extracted from the user history to calculate relevance for the current candidate document. Deep learning is then also introduced into the personalized search. In addition, the hierarchical recurrent neural network is utilized to dynamically learn the expression of the user portraits from the user history, thereby predicting the relevance of the current document and the user preference portraits. The use of the antagonistic neural network further enhances the effect of the depth model in personalized searches.
Existing personalized search methods are mainly based on historical search records of users to learn the relevance between documents and the user's current queries and user portraits, but may ignore the links between things that exist in the real world but are not reflected in these search records, thereby affecting the learning of relevance matches. Many search models utilize relationships and semantic information existing among entities to improve matching accuracy by introducing a knowledge base. But there is a lack of related methods for introducing knowledge of entities in the field of personalized searches.
In addition to better learning relevance with entity ties, the introduction of entities can better meet some of the demand characteristics of personalized searches. User intent can be better expressed, for example, with explicit entities, especially for ambiguous queries. Meanwhile, historical search information of the user in the personalized search task is also helpful for judging entity links, and further helps to infer and express the intention of the user. Secondly, clicking on the entity contained in the web page by the user may reflect the user's specific preference information more than text information in the entire web page, as the text information of the entire web page is more redundant. The user preference portrait can be better constructed by utilizing the entity information, so that the personalized relevance of the document can be better calculated.
Disclosure of Invention
The invention provides a searching method for enhancing personalized retrieval effect by utilizing entity information, which comprises the steps of firstly carrying out personalized entity linking on inquiry, utilizing history to promote the effect of the entity linking of the inquiry, utilizing the entity to enhance the model to model and express the user intention, then constructing a user portrait more accurately based on predicted intention, constructing a fine user preference portrait enhanced by the entity by utilizing the history entity information through a memory neural network, and finally utilizing the predicted user intention and the user portrait to calculate the personalized relevance of a document and arrange the personalized relevance of the document, thereby promoting user experience. After the sorting is finished, the invention provides that the entity link results of the previous query are adjusted by using the click feedback of the user and the current query, so that the understanding of the historical search intention and preference by a model is further optimized to act on the individuation of the subsequent query results.
Drawings
FIG. 1 is an overall flow chart of the present invention
FIG. 2 is a diagram of a personalized entity linkage structure of the present invention;
FIG. 3 is a diagram of a user portrayal construction.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. In addition, the technical features of the embodiments of the present invention described below may be combined with each other as long as they do not collide with each other.
Knowledge bases have been widely studied in recent years and often introduced into search models as external knowledge to improve the accuracy of matching between queries and documents by their nature of storing a large amount of contact and semantic information between entities in the real world. For example, the document "Chen Kaige" is not highly relevant to the director of the query "bawang-Ji" from the perspective of text semantic similarity alone, but the document is highly consistent with the query intent based on real world context, and the problem of utilizing the relationship between entities can be well solved. But research on the introduction of external knowledge is still relatively lacking in the field of personalized searches.
In addition, the invention utilizes the entity to better predict the search intention of the user in the personalized search and construct the user preference portrait. For example, for a query of "cherry reviews", it is difficult to determine whether the user's search intent is a cherry keyboard or a cherry keyboard due to semantic ambiguity. But the entity is introduced and personalized entity links, and according to the related history query 'famous cherry blossom spots in Asia', the user search intention can be predicted to be cherry blossom. If the user intent is explicitly expressed as a cherry blossom entity, then the web page document describing the cherry blossom may be ranked ahead to meet the user's needs. Further based on predicted user intent, and utilizing entity information in the historical search, finer representations of preferences of the user may be constructed. For example, based on predicting the linked sakura entity, the historical query including the sakura entity can be searched, and further according to the entities such as "japan", "hokkaido" and the like included in the user click document under the query, the finer preference of the user is known as "sakura landscape", and then the sakura tourist attraction in japan can be recommended to the search result front, thereby further improving the user experience.
As shown in FIG. 1, the method comprises the steps of firstly carrying out personalized entity linking on the query, utilizing the history to promote the effect of the entity linking on the query, utilizing the entity to enhance the modeling and representation of the user intention, then constructing the user portrait more accurately based on the predicted intention, constructing the fine user preference portrait enhanced by the entity through the memory neural network by utilizing the history entity information, and finally utilizing the predicted user intention and the user portrait to calculate the personalized relevance of the document and arrange the personalized relevance, thereby promoting the user experience. After the sorting is finished, the invention provides that the entity link results of the previous query are adjusted by using the click feedback of the user and the current query, so that the understanding of the historical search intention and preference by a model is further optimized to act on the individuation of the subsequent query results.
The personalisation entity links are shown in figure 2. The user history consists of a series of search sessions:
Figure BDA0002350559940000031
a session consists of a series of queries and corresponding sets of document candidates: />
Figure BDA0002350559940000032
Where h is the id, x that identifies the session h Is the number of queries within the session. When the user issues the t-th query q in the current session t After that, there is a need to search interest in the candidate document set according to the user history +.>
Figure BDA0002350559940000041
Personalized ranking to fit user at q t The following search intent. The process is repeated until the current session ends.
Dividing the user history into a short-term history and a long-term history is effective in personalized searches. Short term history is defined as historical search records in the current session
Figure BDA0002350559940000042
The long-term history is defined as search record +.>
Figure BDA0002350559940000043
If the query q contains x text segments related to the entity, the candidate entity set of the query is defined as:
Figure BDA0002350559940000047
wherein n is i The ids of candidate entities related to the ith text fragment in the query are identified. The query entity vector is expressed as:
Figure BDA0002350559940000044
wherein p is i,j For entity e i,j E i,j Is a pre-trained entity vector, and then trained.
The entity vector of a document is expressed as:
Figure BDA0002350559940000045
wherein c i Is the frequency of occurrence of an entity in a document. Likewise, text vector representations of documents and queries are defined as:
Figure BDA0002350559940000046
w i is a word vector pre-trained with glove.
As described above, the invention firstly utilizes the user history information to carry out personalized entity linking on the query, namely, calculates the linking probability of each candidate entity, so that the user intention is clearer and clearer, and the invention is used for the subsequent construction of the user portrait and the personalized relevance calculation of the document.
The calculation of the entity link probability is mainly divided into two parts: link relevance between entities and queries, entity link relevance determined based on user history:
Figure BDA0002350559940000051
wherein MLP stands for full connectivity layer.
Link relevance between an entity and a query includes vector similarity and statistical features:
Figure BDA0002350559940000052
wherein l i,j Representing statistical features such as popularity of the candidate entity.
The entity link correlation calculation based on the user history comprises the following steps: modeling a historical search sequence of a user to infer an implicit intention under a current query so as to provide basis for a current entity link; searching related queries in the user history, and providing basis for entity links of the current query by utilizing historical entity information in the queries.
Sequence history modeling first models a sequence of user historical query actions using the LSTM layer and uses the attention mechanism based on the current query to give higher attention to relevant historical actions to infer the current query intent. Firstly, splicing a query text vector in a history search behavior and a text vector of a corresponding click document for short-term history to be used as an input of an LSTM layer, so that short-term user intention t can be obtained s
Figure BDA0002350559940000053
/>
Figure BDA0002350559940000054
Wherein the method comprises the steps of
Figure BDA0002350559940000055
Figure BDA0002350559940000056
Is the average of the text vectors of the corresponding click documents. The above equation is similarly based on long term historyIs->
Figure BDA0002350559940000057
And->
Figure BDA0002350559940000058
Replaced by->
Figure BDA0002350559940000059
The average value of the text vector of the corresponding click document can be calculated to obtain the long-term user intention t l
Historical entity information modeling utilizes LSTM and intent mechanisms to give higher weight to queries in the historical queries that are related to the current query, and then utilizes entity information in these queries as related historical entity information. Thus, the text vector of the history query is taken as the input of the LSTM layer, and the short-term related entity vector e on the short-term history s Is calculated as follows:
Figure BDA00023505599400000510
Figure BDA00023505599400000511
wherein the method comprises the steps of
Figure BDA00023505599400000512
The same applies to Q in the above equation based on long term history s And->
Figure BDA00023505599400000513
Replaced by Q l And->
Figure BDA00023505599400000514
Can obtain long-term related entity vector e l
The entity link correlation based on the personalized history is as follows:
Figure BDA0002350559940000061
wherein g (x, y) =tanh (x T *MLP(y))。
Based on the predicted user intent, the user's preferences under that intent can be better modeled. And meanwhile, the model can learn finer preferences of the user by further utilizing entity information in the search history. Because of the memory neural network's better ability to store long sequences of information, the present invention uses a kernel-value (key value pair) memory neural network to store user history information to model user portraits, as shown in FIG. 3, which is a user preference portraits construction.
The physical memory neural network is utilized to construct a physical enhanced user representation. The key value is the entity vector of the history query, and the value is the average value of the entity vectors of the documents clicked by the user under the corresponding history query. In this way the fine preferences that the user embodies under historical query intent can be preserved. Thus, there are short-term histories:
Figure BDA0002350559940000062
wherein the method comprises the steps of
Figure BDA0002350559940000063
Figure BDA0002350559940000064
Is the entity vector mean of the corresponding click document.
The entity vector of the current query is then used as the predicted user intent vector to construct the user preference portrait because the predicted entity link probability reflects the user's intent. Based on the entity vector, a short-term entity representation is read from the short-term entity memory neural network through an attention mechanism
Figure BDA0002350559940000065
The following are provided:
Figure BDA0002350559940000066
because most of the entities directly related to the current query are retrieved from the memory neural network by using only the entity vector of the current query, the invention then splices the entity vector of the current query and the read user image as a new user intention vector for secondary reading. In this way, entities that are also related to user preferences can be further retrieved from the memory neural network, so that the constructed user representation encompasses a wider range of interests of the user. Thus, there are:
Figure BDA0002350559940000067
in the same way, based on long-term history, the keys of the neural network are memorized
Figure BDA0002350559940000068
Sum->
Figure BDA0002350559940000069
Replaced by->
Figure BDA0002350559940000071
And the entity vector average value of the corresponding click document, and performing secondary reading to obtain a long-term entity portrait +.>
Figure BDA0002350559940000072
The text memory neural network then constructs a user interest portrait based on the original text information. Wherein the key value is the text vector of the historical query, and the value is the text vector average value of the corresponding click document. Thus, there are short-term histories:
Figure BDA0002350559940000073
wherein the method comprises the steps of
Figure BDA0002350559940000074
Figure BDA0002350559940000075
Is the text vector mean of the corresponding click document.
Since the original query text may not fully reflect the user's query intent, the present invention queries the original text vector
Figure BDA0002350559940000076
With implicit user intention vector t modeled with LSTM s The user text preference portraits are read using an attention mechanism with stitching as user intent vectors. Since the association between words is not as strong as the association between entities, it is read only once here. Thus constructing a short-term user text representation based on short-term history>
Figure BDA0002350559940000077
Figure BDA0002350559940000078
Similarly, the keys of the neural network may be memorized based on long-term history
Figure BDA0002350559940000079
Sum->
Figure BDA00023505599400000710
Replaced by->
Figure BDA00023505599400000711
And the text vector mean of the corresponding click document can construct a long-term user text portrait +.>
Figure BDA00023505599400000712
Using the predicted user intent and the constructed user portraits, a personalized relevance score can be calculated for the document and personalized ranking can be performed accordingly.
Given user history
Figure BDA00023505599400000713
The relevance score for candidate document d under query q may be calculated as:
Figure BDA00023505599400000714
wherein the method comprises the steps of
Figure BDA00023505599400000715
And->
Figure BDA00023505599400000716
Representing predicted user intent and user preference vectors, respectively.
The user intent correlation calculates the correlation between the document and the user intent vector:
Figure BDA00023505599400000717
wherein g (x, y) =tanh (xt×mlp (y))
Figure BDA00023505599400000718
The user preference correlation calculates the correlation between the document and the user preference portrait:
Figure BDA00023505599400000719
query relevance is focused on matching between documents and current queries, including vector similarity and traditional click features. Meanwhile, in order to further explore personalized matching between the entities of the query link and the documents, the invention introduces the interactive matching characteristics between the entities, so that the method comprises the following steps:
Figure BDA0002350559940000081
wherein f d Representing a traditional click feature such as the number of clicks a user historically has on the url under the same query.
For entity interaction matching feature f m The invention proposes two interactive matching components with entities, PEDRM and PCERM. To simplify the symbolic representation, all candidate entities in the current query are integrated into one list, so there are:
Figure BDA0002350559940000082
Figure BDA0002350559940000083
hereinafter, e q And e d Will be used to represent the entity encoding vectors in the query and document, respectively.
PEDRM is a matching group price that incorporates personalized information. EDRM first builds a text and entity monitoring matrix between queries and documents, and then extracts matching features using a Gaussian kernel pooling layer:
Figure BDA0002350559940000084
Figure BDA0002350559940000085
wherein the method comprises the steps of
Figure BDA0002350559940000086
Representing the splicing operation, M e,e M is the interaction matrix between the query entity and the document entity e,w M is a matrix of interactions between the querying entity and the document text w,e To query the interaction matrix between text and document entities, M w,w For query text and document textInteraction matrix between.
In PEDRM, the invention fuses personalized information in the interaction matrix. When calculating the interaction matrix with the entity in the query, the predicted entity probability is taken as the weight of the entity interaction to reflect the relevance to the personalized intent of the user. Meanwhile, the interaction matching matrix R of the relation between the entities and the query vector is added to further extract matching characteristics:
Figure BDA0002350559940000087
Figure BDA0002350559940000088
wherein the relationship between the query and the document entity may be characterized as
Figure BDA0002350559940000089
The interaction matrix of entity relationships is added because the matches between entity vectors do not necessarily reflect the degree of matching between the query and the document entirely. For example, the queries "Obama's life", "Michelle" and "U.S. A" are all related to the entity "Obama", but only the relationship "islife" exists between "Michelle" and "Obama", which meets the requirements of the query. Thus for the interaction feature f m The calculation of (1) is as follows:
Figure BDA0002350559940000091
PCERM is a relatively simple interaction matching component that uses only one 3-channel CNN to extract personalized matching interaction features:
Figure BDA0002350559940000092
f m =MLP(Flat(Relu(C))),
wherein the method comprises the steps of
Figure BDA0002350559940000093
Representing a stitching operation in a first dimension, W CNN And b CNN A, b are parameters of convolution kernel in CNN, a, b are convolution kernel size, flat represents smoothing operation, and matrix is flattened into vector.
After the personalized relevance calculation of the documents is completed and the ranking is performed according to the personalized relevance calculation, the invention provides that the click feedback of the user under the current query and the current query is utilized to adjust the entity link probability of other historical queries in the current session, because the intention of the user in the same session is relatively consistent. For example, with the current query and the user clicking on the entity "software" in the document, the ambiguous historical query "Java" in the session may be considered to refer to the entity as "Java language". The adjusted link result can enable the user portrait constructed when the follow-up inquiry is personalized to be more accurate, and when the user inquires 'which IDE to choose' later, the 'eclipse' webpage suitable for java development can be arranged in front.
But once the entity link probabilities can be adjusted, the situation becomes more complex. Because there is a variation in the link probability of one entity, the link probabilities of other text segment associations can also need to be changed due to the consistent relationship between conversational intents. The idea of the invention is therefore to pick the entity with the highest probability of linking as a reliable link and then use this entity to adjust the probability of candidate entities associated with other text segments.
Specifically, after personalized ranking is completed, entity information in the user click document is utilized to adjust entity probability in the current query:
Figure BDA0002350559940000094
wherein the method comprises the steps of
Figure BDA0002350559940000095
To click on the entity vector mean of the document, the superscript t identifies that the query is the t-th query in the current session again. Next find candidate entities under the current queryEpsilon in the body t Entity with highest probability of linking +.>
Figure BDA0002350559940000101
If p < = δ (here δ is set to 0.5), the adjustment process ends, followed by processing the user's t+1th query.
If p > delta, assume that the reliably linked entity is associated with the a-th text segment in the query, and take the entity information of the text segment:
Figure BDA0002350559940000102
to adjust the probability epsilon of other candidate entities in the session k 1 < = k < = t-1. According to the entity vector similarity and the query text similarity, the link probability of other entities is adjusted as follows:
Figure BDA0002350559940000103
next find the candidate entity with the highest link probability in the whole session
Figure BDA0002350559940000104
If p > delta and the text segment associated with the selected entity has not been selected before, repeating the steps to adjust the link probabilities of other candidate entities using the text segment entity information associated with the selected entity; otherwise, the adjustment process is ended, and the t+1st query of the user is processed next. In summary, when the link probability of no entity in the session is greater than the threshold δ or all text segments associated with the entity have been selected, the adjustment process ends. In the present invention, the input of the training model is in units of sessions, so that the parameters w, w used in the process are adjusted while minimizing the loss value of the whole session 1 ,w 2 Would be optimized by training.
The invention trains the model by taking the conversation as a unit, adopts the pass function of the pair-wise, so the invention has the following steps:
Figure BDA0002350559940000105
/>
where s is the query session of user u,
Figure BDA0002350559940000106
for the user's search history prior to query q, d + Representing candidate document set under query q>
Figure BDA0002350559940000107
In (d) positive example document - Representing a negative example document.
In order to better model the user intention and the user portrait, the invention provides the effect of enhancing the personalized search by utilizing the entity information in the knowledge base. The invention firstly carries out personalized entity link to eliminate the ambiguity of inquiry, so that the model can better learn and express the intention of the user. According to the predicted user intention, the invention utilizes the entity information in the history search record and constructs the user portrait through the memory network, thereby better modeling the personalized preference of the user. After the personalized score calculation and the sorting of the documents are completed, the invention utilizes the click feedback of the current query and the user to adjust the entity link result of the historical query so as to better analyze the history of the user, which is helpful for further simulating the interests of the user. The invention effectively enhances the personalized searching effect by utilizing the entity information, and can greatly improve the user experience.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (4)

1. The searching method for enhancing the personalized retrieval effect by utilizing the entity information is characterized by comprising the following steps of step 1, personalizing entity links, wherein the personalized entity links utilize historic promotion to query entity link effects and utilize entity enhancement models to model user intention; step 2, constructing a user preference portrait, wherein the user preference portrait is constructed based on predicted intention, and a fine user preference portrait enhanced by a memory neural network construction entity by utilizing historical entity information; step 3, personalized relevance is obtained and ordered according to the user intention model and the fine user preference portrait model;
the personalized entity links are specifically formed by a series of search sessions with user history
Figure FDA0004168676840000011
The S is 1 ...S m The m is the number of sessions, the mth session is the current session, the session S h Consists of a series of queries and corresponding sets of document candidates: />
Figure FDA0004168676840000012
Where h is the id, x that identifies the session h Is the number of queries in the session when the user issues the t-th query q in the current session t Thereafter, search interest is ++A for candidate document sets based on the user history>
Figure FDA0004168676840000013
Personalized ranking to fit user at q t Search intention under d |Dt| Is the D t Candidate documents; the user history is then divided into a short-term history, which is a history search record in the current session, and a long-term history>
Figure FDA0004168676840000014
The s identifies a short term history; the long-term history is in history sessionSearching for records
Figure FDA0004168676840000015
Where l identifies a long-term history, m-1 represents m-1 sessions preceding the current session as long-term histories,
the query q contains x text fragments related to the entity, and the candidate entity set of the query is:
Figure FDA0004168676840000016
wherein n is i Is an id identifying a candidate entity associated with the ith text fragment in the query, then the query entity vector is expressed as:
Figure FDA0004168676840000017
wherein p is i,j For entity e i,j E i,j For pre-trained entity vectors, then training, the entity vectors of the document are expressed as:
Figure FDA0004168676840000021
wherein c i Is the frequency with which entities appear in a document;
the text vector of the document and the text vector of the query are respectively:
Figure FDA0004168676840000022
wherein w is i To use the glove pre-trained word vector,
then, sequence history modeling is carried out, firstly, LSTM layer is utilized to model the sequence of user history inquiry behaviors, and the attention mechanism based on the current inquiry is utilized to give higher attention to related history behaviorsForce, for short-term history, the query text vector in the history search behavior and the text vector of the corresponding click document are spliced to be used as the input of the LSTM layer, so that the short-term user intention t can be obtained s
Figure FDA0004168676840000023
Figure FDA0004168676840000024
Wherein the method comprises the steps of
Figure FDA0004168676840000025
For the average value of the text vectors of the corresponding click document, +.>
Figure FDA0004168676840000026
For output at LSTM layer i instant, alpha i Attention weight output for each moment, +.>
Figure FDA0004168676840000027
To normalize the probability function, the MLP represents the fully connected layer, based on long term history, long term user intent t l By the formula described above>
Figure FDA0004168676840000028
And->
Figure FDA0004168676840000029
Replaced by->
Figure FDA00041686768400000210
Calculating the average value of the text vectors of the corresponding click documents;
then modeling the historical entity information, giving higher weight to the query related to the current query in the historical queries by using LSTM and attribute mechanism, and then usingThe entity information in the queries is used as relevant historical entity information, the text vector of the historical queries is used as the input of an LSTM layer, and the short-term relevant entity vector e in short-term history is used as the input of the LSTM layer s Is as follows:
Figure FDA00041686768400000211
Figure FDA00041686768400000212
wherein the method comprises the steps of
Figure FDA00041686768400000213
Based on long-term history, Q in the formula s And->
Figure FDA00041686768400000214
Replaced by Q l And->
Figure FDA00041686768400000215
Obtaining long-term related entity vector e l
The entity link correlation based on the personalized history is as follows:
Figure FDA0004168676840000031
wherein g (x, y) =tanh (x T *MLP(y))。
2. The method of claim 1, wherein the user preference profile is constructed in a manner that utilizes an entity memory neural network to construct an entity-enhanced user profile having, over a short period of history:
Figure FDA0004168676840000032
wherein the method comprises the steps of
Figure FDA0004168676840000033
The entity vector average value of the corresponding click document;
then constructing a user preference portrait by taking the entity vector of the current query as the predicted user intention vector, and reading the short-term entity portrait once from the short-term entity memory neural network through an attention mechanism
Figure FDA0004168676840000034
The method comprises the following steps:
Figure FDA0004168676840000035
wherein beta is i Attention weight for memorizing the ith value in neural network, P e Is a set matrix variable which is used to determine the matrix,
Figure FDA0004168676840000036
is the entity vector of the current query; splicing the entity vector of the current query and the read user image to be used as a new user intention vector, and carrying out secondary reading:
Figure FDA0004168676840000037
/>
wherein W is e For the matrix variables set, β' i Attention weight for the ith value of the memory neural network; and will memorize the key of the neural network
Figure FDA0004168676840000038
Sum->
Figure FDA0004168676840000039
Replaced by->
Figure FDA00041686768400000310
And the entity vector average value of the corresponding click document, and performing secondary reading to obtain a long-term entity portrait +.>
Figure FDA00041686768400000311
Then, the text memory neural network is utilized to construct the user interest portraits based on the original text information, wherein the user interest portraits have the following short-term histories:
Figure FDA00041686768400000312
wherein the method comprises the steps of
Figure FDA00041686768400000313
The text vector average value of the corresponding click document;
will query the original text vector
Figure FDA00041686768400000314
With implicit user intention vector t modeled with LSTM s Splicing is used as a user intention vector, a user text preference portrait is read by using an attention mechanism, only one reading is performed, and a short-term user text portrait is constructed based on short-term history>
Figure FDA00041686768400000315
q′=[t s ,q],
Figure FDA0004168676840000041
And based on long-term history, memorizing the keys of the neural network
Figure FDA0004168676840000042
Sum->
Figure FDA0004168676840000043
Replaced by->
Figure FDA0004168676840000044
And constructing a long-term user text portrait +.>
Figure FDA0004168676840000045
3. The method of claim 2, wherein personalized relevance is derived and ranked according to the user intent model and the refined user preference portrait model in a particular manner for a given user history
Figure FDA0004168676840000046
Candidate document d under query q, user intent +.>
Figure FDA0004168676840000047
Is>
Figure FDA0004168676840000048
First, a user intent correlation, which is a correlation between a document and a user intent vector, is calculated
Figure FDA0004168676840000049
Wherein g (x, y) =tanh (x T *MLP(y)),
Figure FDA00041686768400000410
Figure FDA00041686768400000411
Then calculating a user preference correlation, wherein the user preference correlation is the correlation between the document and the user preference portrait, and the user preference correlation is that
Figure FDA00041686768400000412
Then calculate the query relevance, which is the match between the document and the current query, which is f (d, g) = [ g (d, q), MLP (f d ),f m ],
Figure FDA00041686768400000413
Wherein f d Representing a traditional click feature such as the number of clicks a user historically has on url under the same query, f m The relevance score of the candidate document d under the query g can be calculated as:
Figure FDA00041686768400000414
4. the method of claim 1, wherein the personalized relevance calculation of documents is completed and ranked followed by entity link adjustment that adjusts other historical queries in the current session using the current query and click feedback of the user under the current query.
CN201911413378.3A 2019-12-31 2019-12-31 Searching method for enhancing personalized retrieval effect by utilizing entity information Active CN111125538B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911413378.3A CN111125538B (en) 2019-12-31 2019-12-31 Searching method for enhancing personalized retrieval effect by utilizing entity information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911413378.3A CN111125538B (en) 2019-12-31 2019-12-31 Searching method for enhancing personalized retrieval effect by utilizing entity information

Publications (2)

Publication Number Publication Date
CN111125538A CN111125538A (en) 2020-05-08
CN111125538B true CN111125538B (en) 2023-05-23

Family

ID=70506503

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911413378.3A Active CN111125538B (en) 2019-12-31 2019-12-31 Searching method for enhancing personalized retrieval effect by utilizing entity information

Country Status (1)

Country Link
CN (1) CN111125538B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112163147A (en) * 2020-06-09 2021-01-01 中森云链(成都)科技有限责任公司 Recommendation method for website session scene
CN112069399B (en) * 2020-08-25 2023-06-02 中国人民大学 Personalized search system based on interaction matching
CN112182154B (en) * 2020-09-25 2023-10-10 中国人民大学 Personalized search model for eliminating keyword ambiguity by using personal word vector
CN112782982A (en) * 2020-12-31 2021-05-11 海南大学 Intent-driven essential computation-oriented programmable intelligent control method and system
US11947548B2 (en) 2021-11-29 2024-04-02 Walmart Apollo, Llc Systems and methods for providing search results based on a primary intent

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104462325A (en) * 2014-12-02 2015-03-25 百度在线网络技术(北京)有限公司 Search recommendation method and device
CN107943919A (en) * 2017-11-21 2018-04-20 华中科技大学 A kind of enquiry expanding method of session-oriented formula entity search

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8868548B2 (en) * 2010-07-22 2014-10-21 Google Inc. Determining user intent from query patterns
US10268646B2 (en) * 2017-06-06 2019-04-23 Facebook, Inc. Tensor-based deep relevance model for search on online social networks

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104462325A (en) * 2014-12-02 2015-03-25 百度在线网络技术(北京)有限公司 Search recommendation method and device
CN107943919A (en) * 2017-11-21 2018-04-20 华中科技大学 A kind of enquiry expanding method of session-oriented formula entity search

Also Published As

Publication number Publication date
CN111125538A (en) 2020-05-08

Similar Documents

Publication Publication Date Title
CN111125538B (en) Searching method for enhancing personalized retrieval effect by utilizing entity information
Wu et al. Session-based recommendation with graph neural networks
CN110046304B (en) User recommendation method and device
CN109829104B (en) Semantic similarity based pseudo-correlation feedback model information retrieval method and system
US10713317B2 (en) Conversational agent for search
CN108829822B (en) Media content recommendation method and device, storage medium and electronic device
CN110442777B (en) BERT-based pseudo-correlation feedback model information retrieval method and system
WO2021128729A1 (en) Search result determination method, device, apparatus, and computer storage medium
US20190205761A1 (en) System and method for dynamic online search result generation
CN108932342A (en) A kind of method of semantic matches, the learning method of model and server
WO2015191652A1 (en) Modeling interestingness with deep neural networks
CN106095845B (en) Text classification method and device
CN111310023B (en) Personalized search method and system based on memory network
CN106708929B (en) Video program searching method and device
CN112182154B (en) Personalized search model for eliminating keyword ambiguity by using personal word vector
CN105975639B (en) Search result ordering method and device
CN111753167B (en) Search processing method, device, computer equipment and medium
CN115048586B (en) Multi-feature-fused news recommendation method and system
CN112507091A (en) Method, device, equipment and storage medium for retrieving information
CN110147494A (en) Information search method, device, storage medium and electronic equipment
CN104915399A (en) Recommended data processing method based on news headline and recommended data processing method system based on news headline
CN112182373A (en) Context expression learning-based personalized search method
CN105677828A (en) User information processing method based on big data
CN105677825A (en) Analysis method for client browsing operation
CN114722287A (en) Long-term and short-term recommendation method integrated into hierarchical structure

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant