CN111125538A - Searching method for enhancing personalized retrieval effect by using entity information - Google Patents

Info

Publication number
CN111125538A
Authority
CN
China
Prior art keywords
entity
user
query
vector
history
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911413378.3A
Other languages
Chinese (zh)
Other versions
CN111125538B (en
Inventor
窦志成
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Renmin University of China
Original Assignee
Renmin University of China
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Renmin University of China
Priority to CN201911413378.3A
Publication of CN111125538A
Application granted
Publication of CN111125538B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/22 Matching criteria, e.g. proximity measures
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a search method that uses entity information to enhance personalized retrieval, comprising the following steps: step 1, personalized entity linking, in which the user history is used to improve entity linking for the query and the entities are used to enhance the modeling of user intent; step 2, constructing a user preference portrait based on the predicted intent, in which a memory neural network uses historical entity information to build an entity-enhanced, fine-grained user preference portrait; and step 3, computing personalized relevance from the user intent model and the fine-grained user preference portrait model, and ranking the documents accordingly.

Description

Searching method for enhancing personalized retrieval effect by using entity information
Technical Field
The invention relates to a searching method, in particular to a searching method for enhancing personalized retrieval effect by utilizing entity information.
Background
Personalized search has attracted wide attention. It uses a user's historical behavior to help determine the user's current query intent and preferences, so that different users receive different rankings of search results and the user experience improves. Because queries are often ambiguous and usually short, the query a user issues frequently fails to express the real intent, and different users may have different preferences even when they share the same intent. Personalizing search results is therefore necessary.
In the prior art, many methods extract document topics or sub-topics from the user history and compute the relevance of the current candidate documents from features such as the user's click counts. Deep learning was later introduced into personalized search as well. In addition, hierarchical recurrent neural networks have been used to dynamically learn a representation of the user portrait from the user history and thereby predict the relevance between the current document and the user preference portrait. Adversarial neural networks have been used to further enhance the effectiveness of deep models in personalized search.
Existing personalized search methods mainly learn the relevance between documents, the current query, and the user portrait from the user's historical search records. They may ignore relations between things that exist in the real world but are not reflected in the search records, which impairs the learning of relevance matching. Many search models improve matching accuracy by introducing a knowledge base and exploiting the relations between entities and their semantic information, but the field of personalized search lacks methods for introducing entity knowledge.
Beyond using the relations between entities to learn relevance better, introducing entities also fits several desirable properties of personalized search. First, user intent, especially for ambiguous queries, can be expressed more precisely with explicit entities. At the same time, the user's historical search information in the personalized search task helps decide entity links, which in turn helps infer and express the user intent. Second, compared with the text of a whole web page, which is largely redundant, the entities contained in the web pages a user clicked reflect the user's specific preferences better. Using entity information, a better user preference portrait can be constructed, so that the personalized relevance of documents can be computed more accurately.
Disclosure of Invention
The invention provides a search method that uses entity information to enhance personalized retrieval. The method first performs personalized entity linking on the query, using the history to improve entity linking for the query while using the entities to enhance the modeling and representation of user intent. It then builds the user portrait more accurately based on the predicted intent, using historical entity information and a memory neural network to construct an entity-enhanced, fine-grained user preference portrait. Finally, it uses the predicted user intent and the user portrait to compute the personalized relevance of documents and ranks them accordingly, thereby improving the user experience. After ranking, the invention further uses the user's click feedback and the current query to adjust the entity linking results of previous queries, which refines the model's understanding of historical search intents and preferences for personalizing subsequent query results.
Drawings
FIG. 1 is an overall flow chart of the present invention;
FIG. 2 is a structural diagram of personalized entity linking according to the present invention;
FIG. 3 is a structural diagram of user portrait construction.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Knowledge bases have been studied extensively in recent years. Because they store a large amount of relational and semantic information about real-world entities, they are often introduced into search models as external knowledge to improve the accuracy of matching between queries and documents. For example, judged only by the semantic similarity of the text, the document "Chen Kaige" has low relevance to the query "director of Farewell My Concubine (Bawang Bieji)"; given the real-world background, however, the document closely matches the query intent, and using the relations between entities handles such cases well. Research on introducing external knowledge into personalized search is nevertheless relatively lacking.
In addition, the invention uses entities to better predict the user's search intent in personalized search and to construct the user preference portrait. For example, for the query "cherry reviews", semantic ambiguity makes it difficult to determine whether the user is searching for cherry blossoms or Cherry keyboards. Once entities are introduced and personalized entity linking is performed, the related historical query "facial cherry blossoms in Asia" allows the search intent to be predicted as cherry blossoms. By explicitly expressing the user's intent as the cherry blossom entity, web documents describing cherry blossoms can be ranked higher to meet the user's needs. Based on the predicted user intent, the entity information in the search history can then be used to construct a more fine-grained preference portrait of the user. For example, given the predicted link to the cherry blossom entity, historical queries containing that entity can be retrieved, and the entities contained in the documents the user clicked under those queries, such as "Japan" and "Hokkaido", reveal that the user's more detailed preference is "cherry blossom scenery in Japan". Cherry blossom tourist attractions in Japan can then be placed near the top of the search results, which further improves the user experience.
As shown in FIG. 1, the method first performs personalized entity linking on the query, using the history to improve entity linking for the query while using the entities to enhance the modeling and representation of user intent. It then builds the user portrait more accurately based on the predicted intent, using historical entity information and a memory neural network to construct an entity-enhanced, fine-grained user preference portrait. Finally, it uses the predicted user intent and the user portrait to compute the personalized relevance of documents and ranks them accordingly, thereby improving the user experience. After ranking, the invention further uses the user's click feedback and the current query to adjust the entity linking results of previous queries, which refines the model's understanding of historical search intents and preferences for personalizing subsequent query results.
Personalized entity linking is shown in FIG. 2. The user history consists of a series of search sessions:
Figure BDA0002350559940000031
a session consists of a series of queries and a corresponding candidate set of documents:
Figure BDA0002350559940000032
where h is the id identifying the session and x_h is the number of queries within the session. After the user issues the t-th query q_t in the current session, the candidate document set
Figure BDA0002350559940000041
must be ranked in a personalized way according to the user's historical interests, so that the ranking matches the user's search intent under q_t. This process is repeated until the current session is over.
Dividing the user history into a short-term history and a long-term history is very effective in personalized search. Short-term history is defined as historical search records in the current session
Figure BDA0002350559940000042
Long-term history is then defined as the search record in a history session
Figure BDA0002350559940000043
If a query q contains x text fragments associated with entities, the set of candidate entities for the query is defined as:
Figure BDA0002350559940000047
where n_i is the id identifying the candidate entities associated with the i-th text fragment in the query. The query entity vector is then expressed as:
Figure BDA0002350559940000044
where p_{i,j} is the link probability of entity e_{i,j}, and e_{i,j} is a pre-trained entity vector that is further trained with the model.
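To make the definition above concrete, the following minimal sketch computes a query entity vector as the link-probability-weighted sum of candidate entity vectors. The exact formula appears only in the omitted figure, so the weighted-sum form, the function name `query_entity_vector`, and the toy two-dimensional vectors are all assumptions made for illustration.

```python
from typing import List, Sequence

Vector = List[float]


def query_entity_vector(
    candidates: Sequence[Sequence[Vector]],
    probs: Sequence[Sequence[float]],
) -> Vector:
    """Sum p[i][j] * e[i][j] over text fragments i and candidates j.

    candidates[i][j] is the vector of the j-th candidate entity of the
    i-th text fragment; probs[i][j] is its link probability.
    Assumed form: the source gives the formula only as a figure.
    """
    dim = len(candidates[0][0])
    out = [0.0] * dim
    for frag_vecs, frag_probs in zip(candidates, probs):
        for vec, p in zip(frag_vecs, frag_probs):
            for k in range(dim):
                out[k] += p * vec[k]
    return out


# Toy example: one ambiguous fragment ("cherry") with two candidate entities.
cherry_blossom = [1.0, 0.0]
cherry_keyboard = [0.0, 1.0]
vec = query_entity_vector([[cherry_blossom, cherry_keyboard]], [[0.8, 0.2]])
print(vec)
```

With a higher link probability for the cherry blossom candidate, the resulting vector lies closer to that entity, which is how the link probabilities can express the predicted intent.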
The entity vector of the document is represented as:
Figure BDA0002350559940000045
where c_i is the frequency with which the entity occurs in the document. Likewise, the text vector representations of documents and queries are defined as:
Figure BDA0002350559940000046
where w_i is a word vector pre-trained with GloVe.
As mentioned above, the invention first uses the user's history to perform personalized entity linking on the query, that is, it computes the link probability of each candidate entity. This makes the user intent clearer, and the result is used in the subsequent construction of the user portrait and in the personalized relevance calculation for documents.
The calculation of the entity link probability is divided into two parts: the link relevance between the entity and the query, and the entity link relevance determined from the user history:
Figure BDA0002350559940000051
where MLP denotes a fully connected layer.
The link relevance between the entity and the query includes vector similarity and statistical features:
Figure BDA0002350559940000052
where l_{i,j} represents statistical features such as the popularity of the candidate entity.
The history-based entity link relevance is computed in two ways: (1) modeling the user's historical search sequence to infer the implicit intent behind the current query, which provides evidence for linking the current query's entities; and (2) finding related queries in the user history and using the historical entity information in those queries as evidence for linking the entities of the current query.
Sequence history modeling first models the sequence of the user's historical query behaviors with an LSTM layer, and then uses an attention mechanism based on the current query to assign higher attention to relevant historical behaviors in order to infer the current query intent. For the short-term history, the query text vector of each historical search behavior is concatenated with the text vector of the corresponding clicked documents as the input of the LSTM layer, which yields the short-term user intent t_s:
Figure BDA0002350559940000053
Figure BDA0002350559940000054
where
Figure BDA0002350559940000055
Figure BDA0002350559940000056
is the average of the text vectors of the corresponding clicked documents. Similarly, for the long-term history, replacing
Figure BDA0002350559940000057
and
Figure BDA0002350559940000058
in the above equations with
Figure BDA0002350559940000059
and the average of the text vectors of the corresponding clicked documents yields the long-term user intent t_l.
The modeling of historical entity information uses an LSTM and an attention mechanism to give higher weight to the historical queries that are related to the current query, and then uses the entity information in those queries as the related historical entity information. The text vectors of the historical queries are therefore used as the input of the LSTM layer, and the short-term related entity vector e_s over the short-term history is calculated as follows:
Figure BDA00023505599400000510
Figure BDA00023505599400000511
where
Figure BDA00023505599400000512
Similarly, for the long-term history, replacing Q_s and
Figure BDA00023505599400000513
in the above equations with Q_l and
Figure BDA00023505599400000514
yields the long-term related entity vector e_l.
The entity link relevance based on the personalized history is as follows:
Figure BDA0002350559940000061
where g(x, y) = tanh(x^T · MLP(y)).
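The matching function g can be illustrated with a few lines of plain Python. This is only a sketch: it takes MLP to be a single fully connected layer without activation (the text says MLP stands for a fully connected layer, but the layer's shape and activation are not given), and the names `dense`, `weights`, and `bias` are hypothetical.

```python
import math
from typing import List

Vector = List[float]
Matrix = List[Vector]


def dense(weights: Matrix, bias: Vector, y: Vector) -> Vector:
    """One fully connected layer, W y + b (no activation: an assumption)."""
    return [sum(w * v for w, v in zip(row, y)) + b for row, b in zip(weights, bias)]


def g(x: Vector, y: Vector, weights: Matrix, bias: Vector) -> float:
    """g(x, y) = tanh(x^T MLP(y)), with MLP taken as a single dense layer."""
    projected = dense(weights, bias, y)
    return math.tanh(sum(a * b for a, b in zip(x, projected)))


# With identity weights and zero bias, g reduces to tanh of a dot product.
identity = [[1.0, 0.0], [0.0, 1.0]]
score = g([1.0, 0.0], [0.5, 2.0], identity, [0.0, 0.0])
print(score)  # equals math.tanh(0.5)
```

Because of the tanh, the score is bounded in (-1, 1), so relevance terms computed this way stay on a comparable scale.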
Based on the predicted user intent, the user's preferences under that intent can be modeled better. By further using the entity information in the search history, the model can learn more detailed user preferences. Because memory neural networks store long-sequence information well, the invention uses a key-value memory neural network to store the user's history and model the user portrait. The construction of the user preference portrait is shown in FIG. 3.
An entity memory neural network is used to construct the entity-enhanced user portrait. The key is the entity vector of a historical query, and the value is the mean of the entity vectors of the documents the user clicked under that historical query. In this way, the fine-grained preferences the user showed under historical query intents are retained. Thus, over the short-term history:
Figure BDA0002350559940000062
where
Figure BDA0002350559940000063
Figure BDA0002350559940000064
is the mean of the entity vectors of the corresponding clicked documents.
The entity vector of the current query is then used as the predicted user intent vector to construct the user preference portrait, because the predicted entity link probabilities reflect the user's intent. Based on this entity vector, a short-term entity portrait is read from the short-term entity memory neural network through an attention mechanism
Figure BDA0002350559940000065
as follows:
Figure BDA0002350559940000066
Reading with only the entity vector of the current query retrieves mostly the entities directly related to the current query. The invention therefore concatenates the entity vector of the current query with the user portrait that was just read, and uses the result as a new user intent vector for a second read. In this way, further entities related to the user's preferences can be retrieved from the memory neural network, so that the constructed user portrait covers a wider range of the user's interests. Therefore:
Figure BDA0002350559940000067
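The two-pass read described above can be sketched with a small key-value memory in plain Python. The attention form (softmax over dot products) and the way the concatenated vector is mapped back to the key dimension are assumptions; the source shows these formulas only as figures. In particular, averaging the two halves of the concatenation is a hypothetical stand-in for a learned projection layer, and the sketch assumes keys and values share the same dimension.

```python
import math
from typing import List, Sequence

Vector = List[float]


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def read_memory(query: Vector, keys: Sequence[Vector], values: Sequence[Vector]) -> Vector:
    """Attention read: weight each value by softmax(query . key)."""
    weights = softmax([sum(q * k for q, k in zip(query, key)) for key in keys])
    dim = len(values[0])
    return [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]


def two_hop_read(query: Vector, keys: Sequence[Vector], values: Sequence[Vector]) -> Vector:
    """Read once with the query, then again with [query; first read].

    The concatenation is reduced to the key dimension by averaging its two
    halves, a hypothetical placeholder for a trained projection.
    """
    first = read_memory(query, keys, values)
    concat = list(query) + first
    half = len(query)
    second_query = [(concat[d] + concat[d + half]) / 2 for d in range(half)]
    return read_memory(second_query, keys, values)


# Keys: entity vectors of two historical queries; values: mean entity
# vectors of the documents clicked under each of them (toy numbers).
keys = [[1.0, 0.0], [0.0, 1.0]]
values = [[0.0, 1.0], [1.0, 0.0]]
portrait = two_hop_read([5.0, 0.0], keys, values)
print(portrait)
```

In this toy run the second query mixes in what the first read returned, so the second read spreads some attention to the other memory slot, which mirrors the stated goal of covering wider user interests.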
In the same way, for the long-term history, replacing the keys of the memory neural network
Figure BDA0002350559940000068
and the values
Figure BDA0002350559940000069
with
Figure BDA0002350559940000071
and the entity vector mean of the corresponding clicked documents, and reading twice, yields the long-term entity portrait
Figure BDA0002350559940000072
The text memory neural network constructs the user interest portrait from the original text information. The key is the text vector of a historical query, and the value is the mean of the text vectors of the corresponding clicked documents. Thus, over the short-term history:
Figure BDA0002350559940000073
where
Figure BDA0002350559940000074
Figure BDA0002350559940000075
is the mean of the text vectors of the corresponding clicked documents.
Because the original query text may not fully reflect the user's query intent, the invention concatenates the original query text vector
Figure BDA0002350559940000076
with the implicit user intent vector t_s modeled by the LSTM to form the user intent vector, and reads the user's text preference portrait with an attention mechanism. Because the association between words is weaker than the association between entities, the memory is read only once here. This builds the short-term user text portrait based on the short-term history
Figure BDA0002350559940000077
Figure BDA0002350559940000078
Similarly, for the long-term history, replacing the keys of the memory neural network
Figure BDA0002350559940000079
and the values
Figure BDA00023505599400000710
with
Figure BDA00023505599400000711
and the text vector mean of the corresponding clicked documents yields the long-term user text portrait
Figure BDA00023505599400000712
With the predicted user intent and the constructed user representation, personalized relevance scores can be calculated for the documents and personalized ranking can be performed accordingly.
Given a user history
Figure BDA00023505599400000713
The relevance score for candidate document d under query q may be calculated as:
Figure BDA00023505599400000714
where
Figure BDA00023505599400000715
and
Figure BDA00023505599400000716
represent the predicted user intent vector and the user preference vector, respectively.
User intent relevance measures the relevance between the document and the user intent vector:
Figure BDA00023505599400000717
where g(x, y) = tanh(x^T · MLP(y))
Figure BDA00023505599400000718
User preference relevance measures the relevance between the document and the user preference portrait:
Figure BDA00023505599400000719
Query relevance concerns the matching between the document and the current query, including vector similarity and traditional click features. In addition, to further explore the personalized matching between the entities linked from the query and the entities of the document, the invention introduces interaction matching features between entities, so that:
Figure BDA0002350559940000081
where f_d represents traditional click features, such as the number of times the user has historically clicked the URL under the same query.
For the entity interaction matching features f_m, the invention proposes two entity interaction matching components, PEDRM and PCERM. To simplify the notation, all candidate entities in the current query are merged into one list, so that:
Figure BDA0002350559940000082
Figure BDA0002350559940000083
Hereinafter, e_q and e_d denote the entity encoding vectors in queries and documents, respectively.
PEDRM is a matching component that incorporates personalized information into EDRM. EDRM first constructs text and entity interaction matrices between the query and the document, and then extracts matching features with a Gaussian kernel pooling layer:
Figure BDA0002350559940000084
Figure BDA0002350559940000085
wherein
Figure BDA0002350559940000086
denotes the concatenation operation, M_{e,e} is the interaction matrix between query entities and document entities, M_{e,w} is the interaction matrix between query entities and document text, M_{w,e} is the interaction matrix between query text and document entities, and M_{w,w} is the interaction matrix between query text and document text.
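The Gaussian kernel pooling step can be sketched as follows. The sketch follows the widely used kernel pooling formulation from kernel-based neural ranking models (soft counts per kernel, then a log-sum over query rows); whether the patent uses exactly this form cannot be confirmed from the text, and the kernel means `mus`, the width `sigma`, and the `1e-10` floor are assumed values.

```python
import math
from typing import List, Sequence

Matrix = List[List[float]]


def kernel_pooling(sim: Matrix, mus: Sequence[float], sigma: float = 0.1) -> List[float]:
    """Turn an interaction matrix into one feature per Gaussian kernel.

    For kernel k with mean mu_k: sum over query rows of
    log(sum over columns of exp(-(s - mu_k)^2 / (2 sigma^2))).
    A small floor avoids log(0) for rows with no similarity near mu_k.
    """
    features = []
    for mu in mus:
        total = 0.0
        for row in sim:
            soft_count = sum(
                math.exp(-((s - mu) ** 2) / (2 * sigma ** 2)) for s in row
            )
            total += math.log(max(soft_count, 1e-10))
        features.append(total)
    return features


# Toy interaction matrix: rows are query items, columns are document items.
features = kernel_pooling([[1.0, 0.0], [0.0, 1.0]], mus=[1.0, 0.0])
print(features)
```

Each kernel counts how many similarity values fall near its mean, so a kernel centered at 1.0 behaves like a soft exact-match counter while kernels at lower means capture weaker matches.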
In PEDRM, the invention fuses personalized information into the interaction matrices. When an interaction matrix involving the entities in the query is computed, the predicted entity link probabilities are used as weights for the entity interactions, so that they reflect relevance to the user's personalized intent. In addition, an interaction matching matrix R between the entity relationships and the query vector is added to extract further matching features:
Figure BDA0002350559940000087
Figure BDA0002350559940000088
where the relationship between the query and a document entity can be characterized as
Figure BDA0002350559940000089
The interaction matrix of entity relationships is added because matching between entity vectors does not necessarily reflect the full degree of matching between queries and documents. For example, for the query "Obama's wife", the entities "Michelle" and "U.S.A" are both related to the entity "Obama", but only the relationship "is wife" between "Michelle" and "Obama" satisfies the query. The interaction feature f_m is therefore computed as:
Figure BDA0002350559940000091
PCERM is a relatively simple interaction matching component that extracts personalized matching interaction features using only a 3-channel CNN:
Figure BDA0002350559940000092
f_m = MLP(Flat(ReLU(C))),
where
Figure BDA0002350559940000093
denotes concatenation along the first dimension, W_CNN and b_CNN are the parameters of the convolution kernels in the CNN, a and b are the convolution kernel sizes, and Flat denotes the flattening operation, which flattens a matrix into a vector.
After the personalized relevance of the documents has been computed and the documents have been ranked, the current query and the user's click feedback under it are used to adjust the entity link probabilities of the other historical queries in the current session, because user intents within the same session are relatively consistent. For example, if the current query and the documents the user clicked contain the entity "software", the ambiguous historical query "Java" in the session can be taken to refer to the entity "Java language". The adjusted link results make the user portrait built for subsequent queries more accurate; when the user later issues a query about IDEs, web pages about Eclipse, which is suited to Java development, can be ranked near the top.
Once entity link probabilities can be adjusted, however, the situation becomes more complex. When the link probability of one entity changes, the link probabilities of the entities associated with other text fragments may also need to change, because the intents within a session are consistent. The idea of the invention is therefore to select the entity with the highest link probability as a reliable link and then use this entity to adjust the probabilities of the candidate entities associated with other text fragments.
Specifically, after the personalized ranking is completed, the entity probabilities in the current query are first adjusted using the entity information in the documents the user clicked:
Figure BDA0002350559940000094
where
Figure BDA0002350559940000095
refers to the clicked documents, and the superscript t indicates that the query is the t-th query in the current session. Next, among the candidate entities ε_t of the current query, find the entity with the highest link probability
Figure BDA0002350559940000101
If p ≤ δ (where δ is set to 0.5), the adjustment process ends and the user's (t+1)-th query is processed next.
If p > δ, assume that the reliably linked entity is associated with the a-th text fragment in the query, and take the entity information of that text fragment:
Figure BDA0002350559940000102
to adjust the probabilities ε_k of the other candidate entities in the session, where 1 < k < t-1. Based on the similarity of the entity vectors and the similarity of the query texts, the link probabilities of the other entities are adjusted as follows:
Figure BDA0002350559940000103
Next, find the candidate entity with the highest link probability in the whole session
Figure BDA0002350559940000104
If p > δ and the text fragment associated with the selected entity has not been selected before, repeat the above steps and use the entity information of that text fragment to adjust the link probabilities of the other candidate entities; otherwise the adjustment process ends and the user's (t+1)-th query is processed next. In summary, the adjustment process ends when no entity in the session has a link probability greater than the threshold δ or when all text fragments associated with entities have been selected. In the present invention, the input to the training model is in units of sessions, so the parameters w, w_1, and w_2 used in the adjustment process are optimized by training when the loss of the whole session is minimized.
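The control flow of this adjustment loop can be sketched as below. Only the loop structure is taken from the text; the actual probability update is given in an omitted figure, so it is represented by a hypothetical callback `update`. The sketch also simplifies in two ways that are assumptions: it follows the summary sentence by always picking the most probable entity among fragments not yet selected, and it folds the initial click-based adjustment of the current query into the same loop.

```python
from typing import Callable, Dict, Set

# text fragment -> candidate entity -> link probability
Probs = Dict[str, Dict[str, float]]


def adjust_session(
    probs: Probs,
    update: Callable[[Probs, str, str], None],
    delta: float = 0.5,
) -> Set[str]:
    """Repeatedly take the most confident link as an anchor for the others.

    `update(probs, fragment, entity)` is a hypothetical hook that rewrites
    the probabilities of the other fragments in place. Returns the set of
    fragments that served as reliable anchors.
    """
    used: Set[str] = set()
    while True:
        remaining = [
            (p, fragment, entity)
            for fragment, cands in probs.items()
            if fragment not in used
            for entity, p in cands.items()
        ]
        if not remaining:
            break  # every fragment has already been selected
        p, fragment, entity = max(remaining)
        if p <= delta:
            break  # no sufficiently reliable link is left
        used.add(fragment)
        update(probs, fragment, entity)
    return used


def toy_update(probs: Probs, fragment: str, entity: str) -> None:
    # Hypothetical rule: a confident "software" link favors the language sense.
    if entity == "software":
        probs["Java"] = {"Java language": 0.9, "Java island": 0.1}


session = {
    "Java": {"Java language": 0.4, "Java island": 0.6},
    "IDE": {"software": 0.95},
}
anchors = adjust_session(session, toy_update)
print(anchors, session["Java"])
```

Each pass adds one new fragment to `used`, so the loop ends after at most as many passes as there are text fragments in the session.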
The invention trains the model with the session as the unit and adopts a pairwise loss function, so that:
Figure BDA0002350559940000105
where s is a query session of user u,
Figure BDA0002350559940000106
is the user's search history before query q, d+ denotes a positive example document in the candidate document set under query q
Figure BDA0002350559940000107
and d- denotes a negative example document.
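As an illustration of a pairwise objective, the sketch below uses one common formulation, the logistic pairwise loss -log(sigmoid(s+ - s-)) summed over document pairs. This is a stand-in chosen for illustration: the patent's own loss appears only in the omitted figure and may include pair weights or other terms. The function name and the score pairs are hypothetical.

```python
import math
from typing import Iterable, Tuple


def pairwise_loss(score_pairs: Iterable[Tuple[float, float]]) -> float:
    """Sum of -log(sigmoid(s_pos - s_neg)) over (positive, negative) pairs.

    Uses the identity -log(sigmoid(d)) = softplus(-d), computed in a
    numerically stable way.
    """
    loss = 0.0
    for s_pos, s_neg in score_pairs:
        z = -(s_pos - s_neg)
        loss += max(z, 0.0) + math.log1p(math.exp(-abs(z)))
    return loss


# Scores of (clicked, skipped) documents for the queries of one session.
good_ranking = pairwise_loss([(2.0, 0.0), (1.5, 0.5)])
bad_ranking = pairwise_loss([(0.0, 2.0), (0.5, 1.5)])
print(good_ranking, bad_ranking)
```

The loss shrinks as positive documents are scored further above negative ones, which is the behavior a pairwise ranking objective is meant to reward.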
To model user intent and the user portrait better, the invention uses entity information from a knowledge base to enhance personalized search. The invention first performs personalized entity linking to resolve the ambiguity of the query, so that the model can learn the user's intent better. Based on the predicted user intent, the invention uses the entity information in the historical search records and builds the user portrait through a memory network, thereby modeling the user's personalized preferences better. After the personalized scores of the documents have been computed and the documents ranked, the invention uses the current query and the user's click feedback to adjust the entity link results of historical queries, so that the user's history is analyzed more accurately, which helps model the user's interests further. By using entity information, the invention effectively enhances personalized search and can greatly improve the user experience.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solution of the present invention, not to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or substitutions do not cause the corresponding technical solutions to depart from the spirit and scope of the embodiments of the present invention.

Claims (5)

1. A search method for enhancing personalized retrieval by using entity information, characterized by comprising the following steps: step 1, personalized entity linking, in which the user history is used to improve entity linking for the query and the entities are used to enhance the modeling of user intent; step 2, constructing a user preference portrait based on the predicted intent, in which a memory neural network uses historical entity information to build an entity-enhanced, fine-grained user preference portrait; and step 3, computing personalized relevance from the user intent model and the fine-grained user preference portrait model, and ranking the documents accordingly.
2. The method of claim 1, wherein the personalized entity linking of queries is performed as follows: the user history is composed of a series of search sessions
Figure FDA0002350559930000011
where $S_1, \dots, S_m$ are sessions, $m$ is the number of sessions, and the $m$-th session is the current session; a session $S_h$ consists of a series of queries and their corresponding candidate document sets:
Figure FDA0002350559930000012
where $h$ is the id identifying the session and $x_h$ is the number of queries in the session; after the user issues the $t$-th query $q_t$ in the current session, the candidate document set
Figure FDA0002350559930000013
is ranked in a personalized manner, according to the interests reflected in the user history, so that it matches the user's search intent for $q_t$, where $d_{|D_t|}$ is the $|D_t|$-th candidate document; the user history is then divided into a short-term history and a long-term history, the short-term history being the historical search records in the current session
Figure FDA0002350559930000014
where $s$ denotes the short-term history; the long-term history is the search records in the previous sessions
Figure FDA0002350559930000015
where $l$ denotes the long-term history and $m-1$ indicates that the $m-1$ sessions preceding the current session form the long-term history;
if the query $q$ contains $x$ text fragments related to entities, the candidate entity set of the query is:
Figure FDA0002350559930000016
where $n_i$ identifies the candidate entities associated with the $i$-th text fragment in the query; the query entity vector is represented as:
Figure FDA0002350559930000017
where $p_{i,j}$ is the link probability of entity $e_{i,j}$, and $e_{i,j}$ also denotes the entity's pre-trained vector, which is subsequently trained further; the entity vector of the document is represented as:
Figure FDA0002350559930000021
where $c_i$ is the entity's frequency of occurrence in the document;
the text vector of the document and the text vector of the query are respectively:
Figure FDA0002350559930000022
where $w_i$ is a word vector pre-trained with GloVe;
then sequential history modeling is carried out: an LSTM layer first models the sequence of the user's historical query behaviors, and an attention mechanism based on the current query gives higher attention to the relevant historical behaviors; for the short-term history, the query text vector of each historical search behavior is concatenated with the text vector of the corresponding clicked documents and used as the input of the LSTM layer, which yields the short-term user intent $t_s$:
Figure FDA0002350559930000023
Figure FDA0002350559930000024
where
Figure FDA0002350559930000025
Figure FDA0002350559930000026
is the average of the text vectors of the corresponding clicked documents,
Figure FDA0002350559930000027
is the output of the LSTM layer at time step $i$, $\alpha_i$ is the attention weight of the output at each time step,
Figure FDA0002350559930000028
is the normalized probability function, and MLP denotes a fully connected layer; for the long-term history, the long-term user intent $t_l$ is obtained from the above formulas by replacing
Figure FDA0002350559930000029
and
Figure FDA00023505599300000210
with
Figure FDA00023505599300000211
and the average of the text vectors of the corresponding clicked documents;
then the historical entity information is modeled: the LSTM and the attention mechanism give higher weight to the historical queries related to the current query, and the entity information in those queries is taken as the relevant historical entity information; the text vectors of the historical queries are used as the input of the LSTM layer, and the short-term relevant entity vector $e_s$ in the short-term history is:
Figure FDA00023505599300000212
Figure FDA00023505599300000213
wherein
Figure FDA00023505599300000214
for the long-term history, $Q_s$ and
Figure FDA00023505599300000215
in the formula are replaced by $Q_l$ and
Figure FDA0002350559930000031
which yields the long-term relevant entity vector $e_l$;
the entity linking relevance based on the personalized history is:
Figure FDA0002350559930000032
where $g(x, y) = \tanh(x^{T} \cdot \mathrm{MLP}(y))$.
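The formulas of claim 2 are available here only as figure references, so the following Python sketch illustrates the described quantities under stated assumptions and is not the claimed formulas. It assumes that the query entity vector is the link-probability-weighted sum of the candidate entity vectors, that the document entity vector is a frequency-weighted average, and that the attention weights are a softmax over $g$ scores between the LSTM outputs and the current query. The MLP inside $g$ is reduced to a single hypothetical weight matrix `W`, and the LSTM outputs are passed in as ready-made vectors; all function names are illustrative.

```python
import math


def weighted_sum(weights, vectors):
    """Return sum_i weights[i] * vectors[i] for vectors of equal length."""
    dim = len(vectors[0])
    return [sum(w * v[k] for w, v in zip(weights, vectors)) for k in range(dim)]


def query_entity_vector(candidates):
    """candidates: (link_probability, entity_vector) pairs over all text
    fragments of the query. Assumed form: sum of p_ij * e_ij."""
    return weighted_sum([p for p, _ in candidates], [e for _, e in candidates])


def document_entity_vector(entities):
    """entities: (count, entity_vector) pairs.
    Assumed form: average of the entity vectors weighted by frequency c_i."""
    total = sum(c for c, _ in entities)
    return weighted_sum([c / total for c, _ in entities], [e for _, e in entities])


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    return [x / norm for x in exps]


def g(x, y, W):
    """g(x, y) = tanh(x^T * MLP(y)), with the MLP simplified to the
    linear map W (an assumption made for this sketch)."""
    mlp_y = [sum(W[r][c] * y[c] for c in range(len(y))) for r in range(len(W))]
    return math.tanh(sum(a * b for a, b in zip(x, mlp_y)))


def attention_pool(hidden_states, query_vec, W):
    """Weight each LSTM output by its relevance to the current query,
    then return the weighted sum and the attention weights alpha_i."""
    alphas = softmax([g(h, query_vec, W) for h in hidden_states])
    return weighted_sum(alphas, hidden_states), alphas


if __name__ == "__main__":
    identity = [[1.0, 0.0], [0.0, 1.0]]
    intent, alphas = attention_pool([[1.0, 0.0], [0.0, 1.0]], [1.0, 0.0], identity)
    print(intent, alphas)
```

In this toy run the first historical state points in the same direction as the current query, so it receives the larger attention weight, which is the behavior the claim describes for relevant historical behaviors.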
3. The method of claim 2, wherein the user preference portrait is constructed as follows: an entity-enhanced user portrait is built with an entity memory neural network, which over the short-term history has:
Figure FDA0002350559930000033
where
Figure FDA0002350559930000034
Figure FDA0002350559930000035
is the mean of the entity vectors of the corresponding clicked documents.
Then the entity vector of the current query is used as the predicted user intent vector to construct the user preference portrait: a first read of the short-term entity memory neural network through an attention mechanism gives the short-term entity portrait
Figure FDA0002350559930000036
as:
Figure FDA0002350559930000037
where $\beta_i$ is the attention weight of the $i$-th value in the memory neural network, $P_e$ is a matrix parameter, and
Figure FDA0002350559930000038
is the entity vector of the current query. The entity vector of the current query is concatenated with the user portrait just read to form a new user intent vector, and a second read is performed:
Figure FDA0002350559930000039
where $W_e$ is a matrix parameter and $\beta'_i$ is the attention weight of the $i$-th value of the memory neural network. For the long-term history, the keys of the memory neural network
Figure FDA00023505599300000310
and the values
Figure FDA00023505599300000311
are replaced by
Figure FDA00023505599300000312
and the mean of the entity vectors of the corresponding clicked documents, and two reads give the long-term entity portrait
Figure FDA00023505599300000313
Then a user interest portrait based on the original text information is constructed with a text memory neural network, which over the short-term history has:
Figure FDA00023505599300000314
where
Figure FDA00023505599300000315
Figure FDA00023505599300000316
is the mean of the text vectors of the corresponding clicked documents.
The original text vector of the query
Figure FDA00023505599300000317
is concatenated with the implicit user intent vector $t_s$ modeled by the LSTM to form the user intent vector; the user's text preference portrait is read with an attention mechanism in a single read, which gives the short-term user text portrait based on the short-term history
Figure FDA00023505599300000318
Figure FDA0002350559930000041
For the long-term history, the keys of the memory neural network
Figure FDA0002350559930000042
and the values
Figure FDA0002350559930000043
are replaced by
Figure FDA0002350559930000044
and the mean of the text vectors of the corresponding clicked documents, which gives the long-term user text portrait
Figure FDA0002350559930000045
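The two-step read of the entity memory described in claim 3 can be sketched in Python as follows. Because the read formulas are only available as figure references, this is a minimal illustration under assumptions: attention scores are taken as dot products between the intent vector and the keys, the weights $\beta_i$ are a softmax over those scores, and `P_e` and `W_e` are plain matrices standing in for the matrix parameters $P_e$ and $W_e$. The function names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    norm = sum(exps)
    return [x / norm for x in exps]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * x for m, x in zip(row, vector)) for row in matrix]


def read_memory(intent, keys, values):
    """One attention read: score every key against the intent vector,
    normalize the scores, and return the weighted sum of the values."""
    betas = softmax([sum(a * b for a, b in zip(intent, k)) for k in keys])
    dim = len(values[0])
    portrait = [sum(b * v[i] for b, v in zip(betas, values)) for i in range(dim)]
    return portrait, betas


def two_hop_read(query_entity_vec, keys, values, P_e, W_e):
    """keys: entity vectors of historical queries; values: mean entity
    vectors of the corresponding clicked documents."""
    # First read: the current query's entity vector, projected by P_e,
    # acts as the predicted user intent.
    first_intent = matvec(P_e, query_entity_vec)
    first_portrait, _ = read_memory(first_intent, keys, values)
    # Second read: concatenate the query vector with the portrait just read
    # and project the result back to the key dimension with W_e.
    second_intent = matvec(W_e, query_entity_vec + first_portrait)
    return read_memory(second_intent, keys, values)


if __name__ == "__main__":
    keys = [[1.0, 0.0], [0.0, 1.0]]
    values = [[1.0, 0.0], [0.0, 1.0]]
    P_e = [[1.0, 0.0], [0.0, 1.0]]
    W_e = [[1.0, 0.0, 1.0, 0.0], [0.0, 1.0, 0.0, 1.0]]
    portrait, betas = two_hop_read([2.0, 0.0], keys, values, P_e, W_e)
    print(portrait, betas)
```

The same routine covers the long-term case by passing the long-term keys and values; the single-read text memory corresponds to one call to `read_memory` with the concatenated text intent vector.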
4. The method of claim 3, wherein the personalized relevance is obtained from the user intent model and the fine user preference portrait model, and the ranking performed, as follows: given the user history
Figure FDA0002350559930000046
a candidate document $d$, the user intent under query $q$
Figure FDA0002350559930000047
and the user preference vector
Figure FDA0002350559930000048
the user intent relevance, that is, the relevance between the document and the user intent vector, is first calculated as
Figure FDA0002350559930000049
where $g(x, y) = \tanh(x^{T} \cdot \mathrm{MLP}(y))$,
Figure FDA00023505599300000410
Figure FDA00023505599300000411
the user preference relevance, that is, the relevance between the document and the user preference portrait, is then calculated as
Figure FDA00023505599300000412
the query relevance, that is, the match between the document and the current query, is then calculated as $f(d, q) = [g(d, q), \mathrm{MLP}(f_d), f_m]$,
Figure FDA00023505599300000413
where $f_d$ represents traditional click features, such as the number of times the user has historically clicked the URL under the same query, and $f_m$ is obtained through two interaction matching components that use the entities; finally, the relevance score of candidate document $d$ under query $q$ is calculated as:
Figure FDA00023505599300000414
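The final scoring formula of claim 4 is only available as a figure reference, so the Python sketch below shows one possible way to combine the three relevance signals and rank the candidates; it is an illustration under assumptions, not the claimed formula. It treats the user intent relevance and the user preference relevance as scalars, reduces $\mathrm{MLP}(f_d)$ to a single weighted sum, and combines all features with one linear layer. All names and weights are hypothetical.

```python
def query_relevance_features(g_dq, click_features, click_weights, f_m):
    """Build f(d, q) = [g(d, q), MLP(f_d), f_m].
    MLP(f_d) is reduced to one weighted sum of the click features
    (an assumption made for this sketch)."""
    mlp_fd = sum(w * f for w, f in zip(click_weights, click_features))
    return [g_dq, mlp_fd] + list(f_m)


def relevance_score(intent_rel, preference_rel, query_feats, weights):
    """Combine user intent relevance, user preference relevance, and the
    query relevance features with a single linear layer (a stand-in for
    the scoring formula shown only as a figure)."""
    features = [intent_rel, preference_rel] + list(query_feats)
    return sum(w * f for w, f in zip(weights, features))


def rank(scores):
    """Return document ids sorted by descending relevance score."""
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    feats = query_relevance_features(0.5, [2.0, 4.0], [0.25, 0.125], [0.1, 0.2])
    score = relevance_score(0.3, 0.6, feats, [1.0, 1.0, 1.0, 1.0, 1.0, 1.0])
    print(feats, score)
    print(rank({"d1": 0.8, "d2": 1.8, "d3": 1.0}))
```

In a trained model the weights would be learned jointly with the rest of the network; fixed values are used here only to make the data flow visible.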
5. The method of claim 1, wherein, after the personalized relevance of the documents has been computed and the documents ranked, an entity linking adjustment is performed that adjusts the entity linking of the other historical queries in the current session by using the current query and the user's click feedback under the current query.
CN201911413378.3A 2019-12-31 2019-12-31 Searching method for enhancing personalized retrieval effect by utilizing entity information Active CN111125538B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911413378.3A CN111125538B (en) 2019-12-31 2019-12-31 Searching method for enhancing personalized retrieval effect by utilizing entity information

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911413378.3A CN111125538B (en) 2019-12-31 2019-12-31 Searching method for enhancing personalized retrieval effect by utilizing entity information

Publications (2)

Publication Number Publication Date
CN111125538A true CN111125538A (en) 2020-05-08
CN111125538B CN111125538B (en) 2023-05-23

Family

ID=70506503

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911413378.3A Active CN111125538B (en) 2019-12-31 2019-12-31 Searching method for enhancing personalized retrieval effect by utilizing entity information

Country Status (1)

Country Link
CN (1) CN111125538B (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112069399A (en) * 2020-08-25 2020-12-11 中国人民大学 Personalized search system based on interactive matching
CN112163147A (en) * 2020-06-09 2021-01-01 中森云链(成都)科技有限责任公司 Recommendation method for website session scene
CN112182154A (en) * 2020-09-25 2021-01-05 中国人民大学 Personalized search model for eliminating keyword ambiguity by utilizing personal word vector
CN112782982A (en) * 2020-12-31 2021-05-11 海南大学 Intent-driven essential computation-oriented programmable intelligent control method and system
US11947548B2 (en) 2021-11-29 2024-04-02 Walmart Apollo, Llc Systems and methods for providing search results based on a primary intent

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140149399A1 (en) * 2010-07-22 2014-05-29 Google Inc. Determining user intent from query patterns
CN104462325A (en) * 2014-12-02 2015-03-25 百度在线网络技术(北京)有限公司 Search recommendation method and device
CN107943919A (en) * 2017-11-21 2018-04-20 华中科技大学 A kind of enquiry expanding method of session-oriented formula entity search
US20180349477A1 (en) * 2017-06-06 2018-12-06 Facebook, Inc. Tensor-Based Deep Relevance Model for Search on Online Social Networks

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112163147A (en) * 2020-06-09 2021-01-01 中森云链(成都)科技有限责任公司 Recommendation method for website session scene
CN112069399A (en) * 2020-08-25 2020-12-11 中国人民大学 Personalized search system based on interactive matching
CN112069399B (en) * 2020-08-25 2023-06-02 中国人民大学 Personalized search system based on interaction matching
CN112182154A (en) * 2020-09-25 2021-01-05 中国人民大学 Personalized search model for eliminating keyword ambiguity by utilizing personal word vector
CN112182154B (en) * 2020-09-25 2023-10-10 中国人民大学 Personalized search model for eliminating keyword ambiguity by using personal word vector
CN112782982A (en) * 2020-12-31 2021-05-11 海南大学 Intent-driven essential computation-oriented programmable intelligent control method and system
US11947548B2 (en) 2021-11-29 2024-04-02 Walmart Apollo, Llc Systems and methods for providing search results based on a primary intent

Also Published As

Publication number Publication date
CN111125538B (en) 2023-05-23

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant