CN106844436B - Query result sorting method and device - Google Patents

Info

Publication number
CN106844436B
Authority
CN
China
Prior art keywords
score
user
query
correlation
dimension
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611159193.0A
Other languages
Chinese (zh)
Other versions
CN106844436A (en)
Inventor
苟秋媛
梁东
高原
吴霄
米献艳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Xingxuan Technology Co Ltd
Original Assignee
Beijing Xingxuan Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Xingxuan Technology Co Ltd
Priority to CN201611159193.0A
Publication of CN106844436A
Application granted
Publication of CN106844436B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2264 Multidimensional index structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24553 Query execution of query operations
    • G06F16/24558 Binary matching operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2462 Approximate or statistical queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/248 Presentation of query results

Abstract

The invention provides a method and a device for ranking query results. The method comprises: calculating a statement-domain-user relevance score for each query result item; and determining the ranking order of the query result items according to the statement-domain-user relevance score. Because the ranking order takes into account the relevance of the query result item to the query statement, to the query domain and to the querying user, the method and the device give more accurate ranking results and improve the user experience.

Description

Query result sorting method and device
Technical Field
The present invention relates to the field of communications, and in particular, to a method and an apparatus for ranking query results.
Background
Currently, in vertical domains of the internet (such as catering or movies), the mainstream way of displaying search recall results is to present the user with a selectable list organized by merchant. The ranking position of a recalled document directly influences both the merchant's sales volume and the user's ordering and search experience. Ranking the search results for the query statement input by the user is therefore very important.
However, conventional search ranking methods determine the ranking position of a query result item based on the relevance between the query statement and the query result item. Such methods take into account neither the query domain nor the specific characteristics of the querying user, which degrades the user's search experience.
Disclosure of Invention
To solve the above technical problem effectively, the invention provides a method and a device for ranking query results.
In one aspect, an embodiment of the present invention provides a method for ranking query results, where the method includes:
calculating a statement-domain-user relevance score of the query result item (a score that simultaneously reflects the relevance of the query result item to the query statement, the query domain and the querying user);
and determining the ranking order of the query result items according to the statement-domain-user relevance score.
The method provided by the invention determines the ranking order of the query result items by taking into account the relevance of the query result items to the query statement, the query domain and the querying user, thereby giving more accurate ranking results and improving the search experience of the user.
In some embodiments of the invention, the method further comprises:
receiving a query statement input by a user;
and acquiring the query result item according to the query statement.
In some embodiments of the invention, calculating the statement-domain-user relevance score of the query result item comprises:
calculating the statement-domain-user relevance score according to the statement relevance score, the domain relevance score and the user relevance score of the query result item.
In some embodiments of the invention, the method further comprises: calculating the domain relevance score (a score that reflects the relevance between a query result item and a query domain) based on the relevance of the query result item to the current query domain.
In some embodiments of the invention, the method further comprises: calculating the user relevance score (a score that reflects the relevance between a query result item and the querying user) according to the relevance of the query result item to the user.
In some embodiments of the invention, calculating the domain relevance score based on the relevance of the query result item to the current query domain comprises:
obtaining a word-segmentation result set for each dimension of the query result item;
adjusting the dimension-level domain relevance score (a score that reflects the relevance between a dimension and the query domain) of each dimension according to the result of matching the word-segmentation result set against the keyword set of the query domain (the adjustment includes increasing the score, decreasing the score, or keeping the score unchanged);
and summing the adjusted dimension-level domain relevance scores to calculate the domain relevance score.
Because query terms in a vertical domain are highly repetitive, calculating the domain relevance score from the keyword set of the query domain obtains the relevance between the query result item and the current query domain efficiently and accurately.
In some embodiments of the invention, calculating the user relevance score according to the relevance of the query result item to the user comprises:
identifying whether the query result item has dimensions that belong to the user's click dimension set;
and if the query result item has such dimensions, summing the dimension-level user relevance scores corresponding to those dimensions to calculate the user relevance score.
In some embodiments of the invention, the method further comprises:
acquiring a query log of the user in response to the query statement;
generating the click dimension set (the set of dimensions in the query result items clicked by the user) according to the query log;
and setting a dimension-level user relevance score (a score that reflects the relevance between a dimension and the querying user) for each element in the click dimension set based on the user's click rate.
In some embodiments of the invention, generating the click dimension set from the query log comprises:
generating the click dimension set according to the clicked query result items (the query result items clicked by the user) whose click rate in the query log is greater than a threshold.
Because the click dimension set is built only from clicked query result items whose click rate in the query log is greater than the threshold, accidental click operations such as mis-clicks are excluded, and the ranking result of the method reflects the relevance of the query result items to the current querying user more accurately.
In some embodiments of the invention, calculating the statement-domain-user relevance score from the statement relevance score, the domain relevance score and the user relevance score of the query result item comprises:
performing score suppression on the statement relevance score, the domain relevance score and the user relevance score respectively;
and summing the statement relevance score, the domain relevance score and the user relevance score obtained after score suppression to calculate the statement-domain-user relevance score.
Because the statement-domain-user relevance score is calculated from component scores that have each undergone score suppression, the ranking result obtained from it is more reasonable.
In another aspect, an embodiment of the present invention further provides an apparatus for sorting query results, where the apparatus includes:
a first calculation module, configured to calculate the statement-domain-user relevance score of the query result item;
and a determining module, configured to determine the ranking order of the query result items according to the statement-domain-user relevance score.
The device provided by the invention determines the ranking order of the query result items by taking into account the relevance of the query result items to the query statement, the query domain and the querying user, thereby giving more accurate ranking results and improving the search experience of the user.
In some embodiments of the invention, the apparatus further comprises:
a receiving module, configured to receive the query statement input by the user;
and an acquisition module, configured to acquire the query result item according to the query statement.
In some embodiments of the invention, the first calculation module comprises:
a first calculation unit, configured to calculate the statement-domain-user relevance score according to the statement relevance score, the domain relevance score and the user relevance score of the query result item.
In some embodiments of the invention, the apparatus further comprises:
a second calculation module, configured to calculate the domain relevance score based on the relevance of the query result item to the current query domain.
In some embodiments of the invention, the apparatus further comprises:
a third calculation module, configured to calculate the user relevance score according to the relevance of the query result item to the user.
In some embodiments of the invention, the second calculation module comprises:
an acquisition unit, configured to acquire a word-segmentation result set for each dimension of the query result item;
a score adjusting unit, configured to adjust the dimension-level domain relevance score of each dimension according to the result of matching the word-segmentation result set against the keyword set of the query domain;
and a second calculation unit, configured to sum the adjusted dimension-level domain relevance scores to calculate the domain relevance score.
Because query terms in a vertical domain are highly repetitive, calculating the domain relevance score from the keyword set of the query domain obtains the relevance between the query result item and the current query domain efficiently and accurately.
In some embodiments of the invention, the third calculation module comprises:
an identification unit, configured to identify whether the query result item has dimensions belonging to the user's click dimension set;
and a third calculation unit, configured to sum, when the query result item has such dimensions, the dimension-level user relevance scores corresponding to those dimensions to calculate the user relevance score.
In some embodiments of the invention, the apparatus further comprises:
a log obtaining module, configured to acquire the query log of the user in response to the query statement;
a generating module, configured to generate the click dimension set according to the query log;
and a score setting module, configured to set a dimension-level user relevance score for each element in the click dimension set based on the user's click rate.
Calculating the user relevance score based on the user's click rate allows the relevance between the query result item and the current querying user to be obtained accurately.
In some embodiments of the invention, the generating module comprises:
a generating unit, configured to generate the click dimension set according to the clicked query result items whose click rate in the query log is greater than a threshold.
Because the click dimension set is built only from clicked query result items whose click rate in the query log is greater than the threshold, accidental click operations such as mis-clicks are excluded, and the ranking result reflects the relevance of the query result items to the current querying user more accurately.
In some embodiments of the invention, the first calculation unit comprises:
a score suppression component, configured to perform score suppression on the statement relevance score, the domain relevance score and the user relevance score respectively;
and a calculating component, configured to sum the statement relevance score, the domain relevance score and the user relevance score obtained after score suppression to calculate the statement-domain-user relevance score.
Because the statement-domain-user relevance score is calculated from component scores that have each undergone score suppression, the ranking result obtained from it is more reasonable.
Drawings
FIG. 1 is a flow diagram of a method of ranking query results according to method embodiment 1 of the present invention;
FIG. 2 is a flow chart of a method of ranking query results according to method embodiment 2 of the present invention;
FIG. 3 illustrates one embodiment of the process S100 illustrated in FIG. 1;
FIG. 4 is a schematic structural diagram of a query result ranking device according to device embodiment 1 of the present invention;
FIG. 5 is a schematic structural diagram of a query result ranking device according to device embodiment 2 of the present invention.
Detailed Description
Various aspects of the invention are described in detail below with reference to the figures and the detailed description. Well-known modules, units and their interconnections, links, communications or operations with each other are not shown or described in detail. Furthermore, the described features, architectures, or functions can be combined in any manner in one or more implementations. It will be understood by those skilled in the art that the various embodiments described below are illustrative only and are not intended to limit the scope of the present invention. It will also be readily understood that the modules or units, or steps, of the embodiments described herein and illustrated in the figures can be combined and designed in a wide variety of different configurations.
[ METHOD EMBODIMENT 1 ]
Fig. 1 is a flowchart of a query result ranking method according to embodiment 1 of the present invention. Referring to fig. 1, the method includes:
S100: Calculate the statement-domain-user relevance score of the query result item (a score that simultaneously reflects the relevance of the query result item to the query statement, the query domain and the querying user).
S200: Determine the ranking order of the query result items according to the statement-domain-user relevance score.
Each query result item includes one or more dimensions.
The method provided by the invention determines the ranking order of the query result items by taking into account the relevance of the query result items to the query statement, the query domain and the querying user, thereby giving more accurate ranking results and improving the search experience of the user.
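Steps S100 and S200 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `rank_results` and the `score_fn` callback are hypothetical, and the source does not say whether a higher score ranks first, so descending order is assumed here.

```python
from typing import Callable, Iterable, List, TypeVar

T = TypeVar("T")


def rank_results(items: Iterable[T], score_fn: Callable[[T], float]) -> List[T]:
    """Score every query result item (S100), then order the items (S200).

    Assumption: a higher combined score means an earlier ranking position.
    """
    scored = [(score_fn(item), item) for item in items]      # S100
    # list.sort is stable, so items with equal scores keep their recall order.
    scored.sort(key=lambda pair: pair[0], reverse=True)       # S200
    return [item for _, item in scored]


if __name__ == "__main__":
    toy_scores = {"shop_a": 1.0, "shop_b": 3.0, "shop_c": 3.0}
    print(rank_results(["shop_a", "shop_b", "shop_c"], toy_scores.get))
```

Keeping the scoring function separate from the sort means the later embodiments only change how the score is computed, not how the order is derived from it.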
[ METHOD EMBODIMENT 2 ]
The method provided by this embodiment includes all the processing shown in FIG. 1, which is not described again here. As shown in FIG. 2, the method provided by this embodiment further includes the following steps:
S300: Receive a query statement input by a user.
S400: Acquire the query result item according to the query statement.
[ METHOD EMBODIMENT 3 ]
The method provided by this embodiment includes all the processing shown in FIG. 2, which is not described again here. In this embodiment, process S100 is implemented as follows:
Calculate the statement-domain-user relevance score according to the statement relevance score, the domain relevance score and the user relevance score of the query result item.
[ METHOD EMBODIMENT 4 ]
The method provided by this embodiment includes all the processing of method embodiment 3, which is not described again here. The method provided by this embodiment further includes the following processes:
(1): Calculate the statement relevance score according to the relevance between the query result item and the query statement.
(2): Calculate the domain relevance score based on the relevance of the query result item to the current query domain.
(3): Calculate the user relevance score according to the relevance of the query result item to the user.
A person skilled in the art can arrange the order of execution of processes (1), (2) and (3) as appropriate to the actual situation.
[ METHOD EMBODIMENT 5 ]
The method provided by this embodiment includes all the processing of method embodiment 4, which is not described again here. In this embodiment, process (2) of method embodiment 4 is implemented as follows:
(i): Acquire a word-segmentation result set for each dimension of the query result item.
The word-segmentation result set is the set of words or phrases generated by performing word segmentation, dimension by dimension, on the text of the query result item (that is, using a comprehensive lexicon and semantic-based techniques to split the sentences of the text into target words or phrases).
(ii): Adjust the dimension-level domain relevance score (a score that reflects the relevance between a dimension and the query domain) of each dimension of the query result item according to the result of matching the word-segmentation result set against the keyword set of the current query domain.
The keyword set is the set of search terms in the current query domain whose search rate is greater than a threshold.
The adjustment takes one of two forms: increasing the score or keeping the score unchanged.
In the initialization stage, a uniform dimension-level domain relevance score, which may be a weighted score, is set for the different dimensions of the different query result items.
Accordingly, if the word-segmentation result set is found to contain elements belonging to the keyword set, the dimension-level domain relevance score of the corresponding dimension is increased (for example, by increasing its weight). If the word-segmentation result set contains no element belonging to the keyword set, the dimension-level domain relevance score of the corresponding dimension is kept unchanged.
(iii): Sum the adjusted dimension-level domain relevance scores to calculate the domain relevance score (the domain relevance score at the query-result-item level).
To obtain a more accurate and reasonable ranking result, a person skilled in the art may update the keyword set periodically at a predetermined interval, or update it in real time, that is, update the keyword set of the query domain as soon as a new search term is detected in the current query domain.
Because query terms in a vertical domain are highly repetitive, calculating the domain relevance score from the keyword set of the query domain obtains the relevance between the query result item and the current query domain efficiently and accurately.
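Processes (i) to (iii) above can be sketched as follows. The sketch assumes that word segmentation has already produced a set of terms per dimension. The constants `BASE_SCORE` and `BOOST` and the additive form of the increase are hypothetical choices, since the source only says that the score is increased, for example by raising a weight.

```python
from typing import Dict, Set

BASE_SCORE = 1.0  # hypothetical uniform initial dimension-level score
BOOST = 0.5       # hypothetical increase applied on a keyword match


def domain_relevance_score(
    segmented: Dict[str, Set[str]],
    keywords: Set[str],
    base: float = BASE_SCORE,
    boost: float = BOOST,
) -> float:
    """Sum the adjusted dimension-level scores of one query result item.

    segmented maps a dimension name to its word-segmentation result set;
    keywords is the keyword set of the current query domain.
    """
    total = 0.0
    for terms in segmented.values():
        score = base                 # initialization stage: uniform score
        if terms & keywords:         # (ii) at least one term is a domain keyword
            score += boost           # increase the score
        total += score               # (iii) sum over the dimensions
    return total


if __name__ == "__main__":
    item = {"name": {"spicy", "hotpot"}, "address": {"main", "street"}}
    print(domain_relevance_score(item, {"hotpot", "pizza"}))
```

In this toy input only the `name` dimension contains a domain keyword, so it contributes the boosted score while `address` contributes the unchanged base score.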
[ METHOD EMBODIMENT 6 ]
The method provided by this embodiment includes all the processing of method embodiment 5, which is not described again here. In this embodiment, process (3) of method embodiment 4 is implemented as follows:
(iv): Identify whether the query result item has dimensions that belong to the user's click dimension set. If so, perform process (v); if not, perform process (vi).
The click dimension set is the set of dimensions in the query result items clicked by the user.
(v): Sum the dimension-level user relevance scores (scores that reflect the relevance between a dimension and the querying user) corresponding to the dimensions of the query result item that belong to the click dimension set, to calculate the user relevance score (the user relevance score at the query-result-item level).
(vi): Set the user relevance score to a preset value (for example 0; a person skilled in the art may of course set it to another value according to actual needs).
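Processes (iv) to (vi) amount to a lookup followed by a sum. In the sketch below, dimensions are represented as plain string identifiers and the click dimension set is a dictionary from dimension to its dimension-level user score. Both representations are assumptions made for illustration.

```python
from typing import Dict, Iterable


def user_relevance_score(
    item_dimensions: Iterable[str],
    click_dimension_scores: Dict[str, float],
    default: float = 0.0,
) -> float:
    """Item-level user score from dimension-level scores of clicked dimensions."""
    # (iv) which dimensions of this item belong to the click dimension set?
    matched = [d for d in item_dimensions if d in click_dimension_scores]
    if not matched:
        return default                                        # (vi) preset value
    return sum(click_dimension_scores[d] for d in matched)    # (v) sum


if __name__ == "__main__":
    clicks = {"dish_name": 0.5, "category": 0.25}
    print(user_relevance_score(["dish_name", "category", "address"], clicks))
```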
[ METHOD EMBODIMENT 7 ]
The method provided by this embodiment includes all the processing of method embodiment 6, which is not described again here. The method provided by this embodiment further includes the following processes:
(4) Acquire a query log of the user in response to the query statement.
(5) Generate the click dimension set according to the acquired query log.
(6) Set a dimension-level user relevance score for each element in the generated click dimension set based on the user's click rate.
Preferably, to exclude accidental click operations such as mis-clicks, the click dimension set may be generated from the clicked query result items in the acquired query log whose click rate is greater than a threshold.
A clicked query result item is a query result item that the user clicked, as recorded in the query log.
Because the click dimension set is built only from clicked query result items whose click rate in the query log is greater than the threshold, accidental click operations such as mis-clicks are excluded, and the ranking result of the method reflects the relevance of the query result items to the current querying user more accurately.
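The thresholded generation of the click dimension set can be sketched as below. The log format, a list of `(dimensions, click_rate)` pairs per clicked item, is a hypothetical simplification, and the threshold value in the usage example is arbitrary.

```python
from typing import Iterable, List, Set, Tuple

ClickedItem = Tuple[List[str], float]  # (dimensions of the item, its click rate)


def click_dimension_set(clicked_items: Iterable[ClickedItem], threshold: float) -> Set[str]:
    """Collect the dimensions of clicked items whose click rate exceeds threshold."""
    dimensions: Set[str] = set()
    for item_dimensions, click_rate in clicked_items:
        if click_rate > threshold:          # drop likely accidental clicks
            dimensions.update(item_dimensions)  # the set deduplicates dimensions
    return dimensions


if __name__ == "__main__":
    log = [(["name", "dish"], 0.3), (["address"], 0.01), (["dish", "category"], 0.2)]
    print(sorted(click_dimension_set(log, threshold=0.05)))
```

Here the item with click rate 0.01 falls below the threshold, so its `address` dimension never enters the set.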
[ METHOD EMBODIMENT 8 ]
The method provided by this embodiment includes all the processing of method embodiment 7, which is not described again here. As shown in FIG. 3, in this embodiment process S100 shown in FIG. 1 is implemented as follows:
S110: Perform score suppression on the statement relevance score, the domain relevance score and the user relevance score of the query result item respectively.
Score suppression means capping the corresponding score at a threshold. For example, the calculated statement relevance score, domain relevance score and user relevance score are compared with a statement relevance score threshold, a domain relevance score threshold and a user relevance score threshold, respectively. A score that is greater than its threshold is set to the threshold; a score that is less than or equal to its threshold is kept unchanged.
S120: Sum the statement relevance score, the domain relevance score and the user relevance score obtained after score suppression to calculate the statement-domain-user relevance score.
Because the statement-domain-user relevance score is calculated from component scores that have each undergone score suppression, the ranking result obtained from it is more reasonable.
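Steps S110 and S120 reduce to a per-score cap followed by a sum. The sketch below follows that description directly. The `Caps` container and the threshold values in the usage example are hypothetical.

```python
from typing import NamedTuple


class Caps(NamedTuple):
    """Upper limits for the three component scores (values are assumptions)."""
    statement: float
    domain: float
    user: float


def suppress(score: float, cap: float) -> float:
    """S110: a score above its threshold is set to the threshold, else unchanged."""
    return min(score, cap)


def combined_score(statement: float, domain: float, user: float, caps: Caps) -> float:
    """S120: sum the three suppressed component scores."""
    return (
        suppress(statement, caps.statement)
        + suppress(domain, caps.domain)
        + suppress(user, caps.user)
    )


if __name__ == "__main__":
    print(combined_score(5.0, 2.0, 9.0, Caps(statement=4.0, domain=3.0, user=6.0)))
```

Capping each component before the sum keeps one unusually large component, such as a very high user score, from dominating the combined score.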
In addition, the invention also provides a query ranking method, described below taking a query document as the query result item. The method comprises the following steps:
Step 1: Receive a query statement input by a user.
Step 2: Acquire one or more query documents for the query statement.
Each query document includes one or more dimensions.
Step 3: Perform the following processing for each acquired query document:
A. Calculate a statement relevance score of the query document based on the relevance between the query document and the query statement.
For process A, for example, neural network techniques may be used to learn vector representations of words (each word is represented as a vector of real numbers), and the statement relevance score may be obtained by calculating the similarity between the query document and the query statement, each represented by its word vectors.
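One possible reading of process A is sketched below. The source does not specify how word vectors are combined or which similarity is used, so averaging the word vectors and taking cosine similarity are assumptions. The toy `embeddings` table stands in for vectors that would be learned by a neural network.

```python
import math
from typing import Dict, List, Sequence


def mean_vector(tokens: Sequence[str], embeddings: Dict[str, List[float]]) -> List[float]:
    """Average the vectors of the tokens that have an embedding."""
    vectors = [embeddings[t] for t in tokens if t in embeddings]
    if not vectors:
        return []
    size = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(size)]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 for empty or zero-length vectors."""
    if not a or not b:
        return 0.0
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def statement_relevance_score(
    query_tokens: Sequence[str],
    document_tokens: Sequence[str],
    embeddings: Dict[str, List[float]],
) -> float:
    """Similarity between the query statement and the query document."""
    return cosine(mean_vector(query_tokens, embeddings),
                  mean_vector(document_tokens, embeddings))


if __name__ == "__main__":
    toy = {"noodle": [1.0, 0.0], "pasta": [1.0, 0.0], "soup": [0.0, 1.0]}
    print(statement_relevance_score(["noodle"], ["pasta", "soup"], toy))
```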
B. Calculate a domain relevance score of the query document based on the relevance of the query document to the current query domain.
Process B can be implemented as follows:
B1. Acquire a word-segmentation result set for each dimension of the query document.
The word-segmentation result set is the set of words or phrases generated by performing word segmentation, dimension by dimension, on the text of the query document (that is, using a comprehensive lexicon and semantic-based techniques to split the sentences of the text into target words or phrases). To further improve the ranking effect, the document text needs to undergo data normalization before word segmentation, such as traditional-to-simplified Chinese conversion, regular-expression matching, secondary verification and confidence analysis.
B2. Adjust the dimension-level domain relevance score of each dimension of the query document according to the result of matching the word-segmentation result set against the keyword set of the current query domain.
The keyword set is the set of search terms in the current query domain whose search rate is greater than a threshold. A person skilled in the art may update the keyword set periodically at a predetermined interval, or update it in real time, for example updating the keyword set of the query domain as soon as a new search term is detected in the current query domain.
The adjustment takes one of two forms: increasing the score or keeping the score unchanged.
In the initialization stage, a uniform dimension-level domain relevance score, which may be a weighted score, is set for the different dimensions of the different query documents.
Accordingly, if the word-segmentation result set is found to contain elements belonging to the keyword set, the dimension-level domain relevance score of the corresponding dimension is increased (for example, by increasing its weight). If the word-segmentation result set contains no element belonging to the keyword set, the dimension-level domain relevance score of the corresponding dimension is kept unchanged.
B3. Sum the adjusted dimension-level domain relevance scores of the dimensions of the query document that are recall dimensions for the query statement, to calculate the domain relevance score (the domain relevance score at the query-document level).
A dimension in which the query document successfully matches the query statement is called a recall dimension.
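Step B3 differs from a plain sum in that only recall dimensions contribute. The sketch below illustrates that restriction. Treating a dimension as matched when it shares at least one term with the query is an assumption, because the source does not define how the match is determined, and both function names are hypothetical.

```python
from typing import Dict, Set


def recall_dimensions(segmented: Dict[str, Set[str]], query_terms: Set[str]) -> Set[str]:
    """Dimensions whose terms overlap the query terms (assumed match criterion)."""
    return {dim for dim, terms in segmented.items() if terms & query_terms}


def domain_score_over_recall_dimensions(
    adjusted_scores: Dict[str, float], recalled: Set[str]
) -> float:
    """B3: sum the adjusted dimension-level scores of the recall dimensions only."""
    return sum(score for dim, score in adjusted_scores.items() if dim in recalled)


if __name__ == "__main__":
    document = {
        "name": {"lucky", "noodle"},
        "dish": {"beef", "noodle"},
        "address": {"main", "street"},
    }
    recalled = recall_dimensions(document, {"noodle"})
    adjusted = {"name": 1.5, "dish": 1.5, "address": 1.0}
    print(domain_score_over_recall_dimensions(adjusted, recalled))
```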
C. A user relevance score for a query document is calculated based on the relevance of the query document to the user.
For process C, this can be achieved by:
c1, identifying whether the query document has the dimension belonging to the click dimension set corresponding to the user. If so, perform C2, otherwise, perform C3.
The query ranking method provided by the invention further executes the following processing to obtain the click dimension set and the dimension-level user correlation score of each element in the click dimension set.
(1) In response to the query statement, acquire the query log of the user.
(2) Generate the click dimension set from the clicked documents in the acquired query log (here, the clicked documents are the documents in the query log whose click rate is greater than a threshold).
The click dimension set is the set of dimensions of the query documents clicked by the user. For example, all query documents clicked by the user are extracted from the query log, and the dimensions of all extracted query documents are deduplicated to obtain the click dimension set.
(3) Set a dimension-level user correlation score for each element of the generated click dimension set based on the click rate of the user.
For example, if the dimension W1 in the click dimension set belongs to both the query document F1 (click rate R1) and the query document F2 (click rate R2), the dimension-level user correlation score SW1 of the dimension W1 is calculated from R1 and R2, for example SW1 = R1 + R2.
C2. Sum the dimension-level user correlation scores of the dimensions of the query document that belong to the click dimension set to calculate the user correlation score (the query-document-level user correlation score).
C3. Set the user correlation score to 0 (of course, those skilled in the art may set the user correlation score to another value according to actual needs).
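Process C, together with steps (1) to (3), can be illustrated with the short sketch below. It assumes a query log already reduced to (dimensions, click rate) pairs, uses the additive example SW1 = R1 + R2 given above, and treats dimensions as plain string identifiers. The function names, the log layout and the threshold value are hypothetical.

```python
from collections import defaultdict


def build_click_dimension_scores(query_log, click_rate_threshold=0.0):
    """Build the click dimension set with a dimension-level user score per element.

    query_log: iterable of (dimensions, click_rate) pairs, one per clicked document
    Returns a dict: dimension -> sum of click rates of the clicked documents
    that contain it (documents at or below the threshold are ignored).
    """
    scores = defaultdict(float)
    for dimensions, click_rate in query_log:
        if click_rate > click_rate_threshold:
            for dim in set(dimensions):  # deduplicate dimensions within a document
                scores[dim] += click_rate
    return dict(scores)


def user_correlation_score(doc_dimensions, click_scores, default=0.0):
    """C1 to C3: sum the scores of the document's dimensions found in the click set."""
    hits = [d for d in set(doc_dimensions) if d in click_scores]
    if not hits:
        return default  # C3: no dimension belongs to the click dimension set
    return sum(click_scores[d] for d in hits)  # C2


# Hypothetical log: the third document falls below the click-rate threshold.
log = [(["W1", "W2"], 0.25), (["W1"], 0.5), (["W3"], 0.01)]
click_scores = build_click_dimension_scores(log, click_rate_threshold=0.05)
matched = user_correlation_score(["W1", "W4"], click_scores)
unmatched = user_correlation_score(["W4"], click_scores)
```

Here W1 appears in two clicked documents, so its dimension-level score is the sum of both click rates, and a document sharing no dimension with the click set falls back to the default of 0.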
D. Calculate the sentence domain user correlation score from the calculated sentence correlation score, domain correlation score and user correlation score.
For process D, this can be achieved by:
D1. Perform score suppression processing on each of the calculated sentence correlation score, domain correlation score and user correlation score.
Score suppression processing means capping the corresponding score with a threshold. For example, the calculated sentence correlation score, domain correlation score and user correlation score are compared with a sentence correlation score threshold, a domain correlation score threshold and a user correlation score threshold, respectively. If a score is greater than its corresponding threshold, it is set to that threshold; if it is less than or equal to its corresponding threshold, it is kept unchanged.
D2. Sum the sentence correlation score, the domain correlation score and the user correlation score obtained after the score suppression processing to calculate the sentence domain user correlation score.
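Steps D1 and D2 amount to clipping each score at its own threshold and adding the results. A minimal sketch follows; the function names and the threshold values are illustrative assumptions, since the source does not give concrete thresholds.

```python
def suppress(score, threshold):
    """D1: cap a score at its threshold; scores at or below it are unchanged."""
    return min(score, threshold)


def sentence_domain_user_score(sentence, domain, user, thresholds):
    """D2: sum the three suppressed scores.

    thresholds: (sentence_max, domain_max, user_max), hypothetical caps
    """
    sentence_max, domain_max, user_max = thresholds
    return (
        suppress(sentence, sentence_max)
        + suppress(domain, domain_max)
        + suppress(user, user_max)
    )


# The sentence and user scores exceed their caps; the domain score does not.
combined = sentence_domain_user_score(8.0, 3.0, 12.0, thresholds=(5.0, 4.0, 10.0))
```

Capping each component separately keeps any single signal, such as a very high user correlation score, from dominating the sum.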
Step 4: Sort the one or more query result items according to the calculated sentence domain user correlation scores.
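Step 4 can be sketched as a sort over (item, score) pairs. The source does not state the sort direction or how ties are handled; the sketch assumes higher scores rank first and relies on Python's stable sort to keep tied items in their original order. The function name and the item identifiers are hypothetical.

```python
def rank_results(items_with_scores):
    """Return items ordered by sentence domain user correlation score.

    items_with_scores: list of (item, score) pairs
    Assumption: descending order; ties keep their input order (stable sort).
    """
    ordered = sorted(items_with_scores, key=lambda pair: pair[1], reverse=True)
    return [item for item, _ in ordered]


ranked = rank_results([("a", 1.0), ("b", 3.0), ("c", 3.0)])
```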
[ DEVICE EMBODIMENT 1 ]
Fig. 4 is a schematic structural diagram of a query result ranking device according to device embodiment 1 of the present invention. Referring to Fig. 4, the apparatus 1000 includes a first calculation module 100 and a determining module 200. Specifically:
The first calculation module 100 is configured to calculate the sentence domain user correlation score of the query result item (a score reflecting the relevance of the query result item to the query statement, the query domain and the querying user).
The determining module 200 is configured to determine the ranking order of the query result items according to the sentence domain user correlation score calculated by the first calculation module 100.
Wherein the query result items comprise, for example, one or more dimensions.
The method provided by the invention can determine the ranking order of the query result items by considering the relevance between the query result items and the query statement, the query field and the query user, thereby giving more accurate ranking results and improving the search experience of the user.
[ DEVICE EMBODIMENT 2 ]
The apparatus provided in this embodiment includes all the modules shown in Fig. 4, which are not described again here. As shown in Fig. 5, the apparatus provided in this embodiment further includes a receiving module 300 and an obtaining module 400. Specifically:
The receiving module 300 is configured to receive a query statement input by a user.
The obtaining module 400 is configured to obtain the query result item according to the query statement.
[ DEVICE EMBODIMENT 3 ]
The apparatus provided in this embodiment includes all the modules in apparatus embodiment 2, which are not described again here. In this embodiment, the first calculation module 100 includes a first calculation unit. Specifically:
The first calculation unit is configured to calculate the sentence domain user correlation score according to the sentence correlation score, the domain correlation score and the user correlation score of the query result item.
[ DEVICE EMBODIMENT 4 ]
The apparatus provided in this embodiment includes all the modules and units in apparatus embodiment 3, which are not described again here, and further includes a second calculation module, a third calculation module and a fourth calculation module. Specifically:
The second calculation module is configured to calculate the domain correlation score based on the relevance of the query result item to the current query domain.
The third calculation module is configured to calculate the user correlation score according to the relevance of the query result item to the user.
The fourth calculation module is configured to calculate the sentence correlation score according to the relevance of the query result item to the query statement.
[ DEVICE EMBODIMENT 5 ]
The apparatus provided in this embodiment includes all the modules and units in apparatus embodiment 4, which are not described again here. The second calculation module includes an obtaining unit, a score adjusting unit and a second calculation unit. Specifically:
The obtaining unit is configured to obtain the word segmentation result set of each dimension of the query result item.
The score adjusting unit is configured to adjust the dimension-level domain correlation score of each dimension of the query result item according to the result of matching the word segmentation result set against the keyword set of the query domain.
The second calculation unit is configured to sum the adjusted dimension-level domain correlation scores to calculate the domain correlation score.
Because query terms in a vertical domain are highly repetitive, calculating the domain correlation score from the keyword set of the query domain makes it possible to obtain the relevance of the query result item to the current query domain efficiently and accurately.
[ DEVICE EMBODIMENT 6 ]
The apparatus provided in this embodiment includes all the modules and units in apparatus embodiment 5, which are not described again here. The third calculation module includes an identification unit, a third calculation unit and a fourth calculation unit. Specifically:
The identification unit is configured to identify whether the query result item has a dimension belonging to the click dimension set of the user.
The third calculation unit is configured to, when the query result item has a dimension belonging to the click dimension set, sum the dimension-level user correlation scores of the dimensions belonging to the click dimension set to calculate the user correlation score.
The fourth calculation unit is configured to, when the query result item has no dimension belonging to the click dimension set, set the user correlation score to a preset value (for example, 0 or another value).
[ DEVICE EMBODIMENT 7 ]
The apparatus provided in this embodiment includes all the modules and units in apparatus embodiment 6, which are not described again here. The apparatus provided in this embodiment further includes a log acquisition module, a generating module and a score setting module. Specifically:
The log acquisition module is configured to acquire the query log of the user in response to the query statement.
The generating module is configured to generate the click dimension set according to the query log.
The score setting module is configured to set a dimension-level user correlation score for each element of the click dimension set based on the click rate of the user.
By calculating the user correlation score based on the click rate of the user, the method and apparatus can accurately obtain the relevance of the query result item to the current querying user.
Preferably, to exclude accidental click operations such as mis-clicks, the generating module may include a generating unit configured to generate the click dimension set from the clicked query result items in the query log (the query result items clicked by the user) whose click rate is greater than a threshold.
Because the click dimension set is built only from clicked query result items whose click rate is greater than the threshold, accidental click operations such as mis-clicks are excluded, and the ranking result can more accurately reflect the relevance of the query result items to the current querying user.
[ DEVICE EMBODIMENT 8 ]
The apparatus provided in this embodiment includes all the modules and units in apparatus embodiment 7, which are not described again here. The first calculation unit includes a score suppression component and a calculation component. Specifically:
The score suppression component is configured to perform score suppression processing on each of the sentence correlation score, the domain correlation score and the user correlation score of the query result item.
Score suppression processing means capping the corresponding score with a threshold. For example, the calculated sentence correlation score, domain correlation score and user correlation score are compared with a sentence correlation score threshold, a domain correlation score threshold and a user correlation score threshold, respectively. If a score is greater than its corresponding threshold, it is set to that threshold; if it is less than or equal to its corresponding threshold, it is kept unchanged.
The calculation component is configured to sum the sentence correlation score, the domain correlation score and the user correlation score obtained after the score suppression processing to calculate the sentence domain user correlation score.
Because the sentence domain user correlation score is calculated from the sentence correlation score, the domain correlation score and the user correlation score obtained after score suppression processing, the ranking result based on it is more reasonable.
By implementing the query result ordering method and the query result ordering device provided by the invention, the ordering order of the query result item can be determined by considering the correlation between the query result item and the query statement, the query field and the query user, so that a more accurate ordering result can be given, and the search experience of the user is improved.
Although the gist of the present invention is described with reference to various embodiments, the gist of the present invention is not limited to these embodiments. On the contrary, the intent of the invention is to cover alternatives, modifications and equivalents as may be apparent to those skilled in the art.
Those skilled in the art will clearly understand that the present invention may be implemented entirely in software, or by a combination of software and a hardware platform. Based on this understanding, the part of the technical solution of the present invention that contributes over the background art may be embodied, in whole or in part, in the form of a software product. The software product may be stored in a storage medium, such as a ROM/RAM, a magnetic disk or an optical disk, and includes instructions for causing a computer device (which may be a personal computer, a server, a smart phone, a network device, etc.) to execute the method according to each embodiment, or some parts of the embodiments, of the present invention.
As used herein, the term "software" or the like refers to any type of computer code or set of computer-executable instructions in a general sense that is executed to program a computer or other processor to perform various aspects of the present inventive concepts as discussed above. Furthermore, it should be noted that according to one aspect of the embodiment, one or more computer programs implementing the method of the present invention when executed do not need to be on one computer or processor, but may be distributed in modules in multiple computers or processors to execute various aspects of the present invention.
Computer-executable instructions may take many forms, such as program modules, executed by one or more computers or other devices. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. In particular, the functionality of the program modules may be combined or split between various embodiments as desired.
Also, technical solutions of the present invention may be embodied as a method, and at least one example of the method has been provided. The actions may be performed in any suitable order and may be presented as part of the method. Thus, embodiments may be configured such that acts may be performed in an order different than illustrated, which may include performing some acts simultaneously (although in the illustrated embodiments, the acts are sequential).
The definitions given and used herein should be understood with reference to dictionaries, definitions in documents incorporated by reference, and/or their ordinary meanings.
In the claims, as well as in the specification above, all transitional phrases such as "comprising," "having," "containing," "carrying," "involving," "consisting essentially of …," and the like are to be understood to be open-ended, i.e., to mean including but not limited to. Only "consisting of …" is a closed or semi-closed transitional phrase.
The terms and expressions used in the specification of the present invention have been set forth for illustrative purposes only and are not meant to be limiting. It will be appreciated by those skilled in the art that changes could be made to the details of the above-described embodiments without departing from the underlying principles thereof. The scope of the invention is, therefore, indicated by the appended claims, in which all terms are intended to be interpreted in their broadest reasonable sense unless otherwise indicated.

Claims (16)

1. A method for ranking query results, the method comprising:
calculating the sentence field user correlation score of the query result item;
determining the ranking order of the query result items according to the sentence domain user correlation score;
wherein the calculating the sentence domain user correlation score of the query result item comprises:
calculating a sentence domain user correlation score according to the sentence correlation score, the domain correlation score and the user correlation score of the query result item;
wherein the method further comprises: calculating the domain correlation score based on the correlation of the query result item and the current query domain;
wherein said calculating the domain-relevance score based on the relevance of the query result item to the current query domain comprises:
obtaining a word segmentation result set of each dimension of the query result item;
adjusting the dimension-level domain correlation score of each dimension according to the result of matching the word segmentation result set against the keyword set of the query domain;
summing the adjusted dimension-level domain correlation scores to calculate the domain correlation score;
the keyword set is a set of search terms with a search rate larger than a threshold value in the current query field.
2. The method of claim 1, wherein the method further comprises:
receiving a query statement input by a user;
and acquiring the query result item according to the query statement.
3. The method of claim 1, wherein the method further comprises:
calculating the user relevance score according to the relevance of the query result item and the user.
4. The method of claim 3, wherein said calculating the user relevance score according to the relevance of the query result item to the user comprises:
identifying whether the query result item has dimensions that belong to the user's set of click dimensions;
and if the query result item has the dimension, summing the dimension-level user correlation scores corresponding to the dimension to calculate a user correlation score.
5. The method of claim 4, wherein the method further comprises:
responding to the query statement, and acquiring a query log of the user;
generating the click dimension set according to the query log;
setting a dimension-level user-related score for each element in the click dimension set based on the click rate of the user.
6. The method of claim 5, wherein generating the set of click dimensions from the query log comprises:
and generating the click dimension set according to click query result items with click rates larger than a threshold value in the query log.
7. The method of any of claims 1-6, wherein computing the sentence domain user relevance score from the sentence relevance score, the domain relevance score, and the user relevance score of the query result item comprises:
respectively performing score suppression processing on the sentence correlation score, the field correlation score and the user correlation score;
and summing the sentence related score, the field related score and the user related score obtained after the score suppression processing to calculate the sentence field user related score.
8. An apparatus for ranking query results, the apparatus comprising:
the first calculation module is used for calculating the sentence field user correlation score of the query result item;
the determining module is used for determining the ranking order of the query result items according to the sentence domain user correlation score;
wherein the first computing module comprises:
the first calculation unit is used for calculating the sentence domain user correlation score according to the sentence correlation score, the domain correlation score and the user correlation score of the query result item;
the apparatus further comprises a second calculation module, which is used for calculating the domain correlation score based on the correlation between the query result item and the current query domain;
wherein the second calculation module comprises:
the acquisition unit is used for acquiring a word segmentation result set of each dimension of the query result item;
the score adjusting unit is used for adjusting the dimension-level domain correlation score of each dimension according to the result of matching the word segmentation result set against the keyword set of the query domain;
a second calculation unit for summing the adjusted dimension-level domain correlation scores to calculate the domain correlation score;
the keyword set is a set of search terms with a search rate larger than a threshold value in the current query field.
9. The apparatus of claim 8, wherein the apparatus further comprises:
the receiving module is used for receiving the query statement input by the user;
and the acquisition module is used for acquiring the query result item according to the query statement.
10. The apparatus of claim 8, wherein the apparatus further comprises:
and the third calculation module is used for calculating the user correlation score according to the correlation between the query result item and the user.
11. The apparatus of claim 10, wherein the third computing module comprises:
the identification unit is used for identifying whether the query result item has dimensions belonging to the click dimension set of the user;
and the third calculation unit is used for summing the dimension-level user correlation scores corresponding to the dimensions to calculate the user correlation scores under the condition that the query result items have the dimensions.
12. The apparatus of claim 11, wherein the apparatus further comprises:
the log obtaining module is used for responding to the query statement and obtaining the query log of the user;
the generating module is used for generating the click dimension set according to the query log;
and the score setting module is used for setting a dimension-level user related score for each element in the click dimension set based on the click rate of the user.
13. The apparatus of claim 12, wherein the generating module comprises:
and the generating unit is used for generating the click dimension set according to the click query result items with the click rate larger than the threshold value in the query log.
14. The apparatus of any of claims 8 to 13, wherein the first computing unit comprises:
the score suppression component is used for respectively performing score suppression processing on the sentence correlation score, the field correlation score and the user correlation score;
and the calculating component is used for summing the sentence related score, the field related score and the user related score obtained after the score suppression processing so as to calculate the sentence field user related score.
15. A computer storage medium having stored thereon computer instructions executable by a processor to perform the method of any one of claims 1 to 7.
16. A computer device, comprising:
a memory having a computer program stored thereon;
a processor for executing the computer program to implement the method of any one of claims 1 to 7.
CN201611159193.0A 2016-12-15 2016-12-15 Query result sorting method and device Active CN106844436B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611159193.0A CN106844436B (en) 2016-12-15 2016-12-15 Query result sorting method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611159193.0A CN106844436B (en) 2016-12-15 2016-12-15 Query result sorting method and device

Publications (2)

Publication Number Publication Date
CN106844436A CN106844436A (en) 2017-06-13
CN106844436B true CN106844436B (en) 2020-07-31

Family

ID=59139263

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611159193.0A Active CN106844436B (en) 2016-12-15 2016-12-15 Query result sorting method and device

Country Status (1)

Country Link
CN (1) CN106844436B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101719145A (en) * 2009-11-17 2010-06-02 北京大学 Individuation searching method based on book domain ontology
CN102486781A (en) * 2010-12-03 2012-06-06 阿里巴巴集团控股有限公司 Method and device for sorting searches
CN103294693A (en) * 2012-02-27 2013-09-11 华为技术有限公司 Searching method, server and system

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20090119276A1 (en) * 2007-11-01 2009-05-07 Antoine Sorel Neron Method and Internet-based Search Engine System for Storing, Sorting, and Displaying Search Results

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
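Research and Design of a Personalized Retrieval System Based on Classification Technology; Li Guangyao; China Master's Theses Full-text Database, Information Science and Technology; 20131215; pp. 18-29 *
Li Guangyao. Research and Design of a Personalized Retrieval System Based on Classification Technology. China Master's Theses Full-text Database, Information Science and Technology. 2013, *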

Also Published As

Publication number Publication date
CN106844436A (en) 2017-06-13

Similar Documents

Publication Publication Date Title
CN108829822B (en) Media content recommendation method and device, storage medium and electronic device
CN106815252B (en) Searching method and device
WO2019201098A1 (en) Question and answer interactive method and apparatus, computer device and computer readable storage medium
EP2866421B1 (en) Method and apparatus for identifying a same user in multiple social networks
US9251292B2 (en) Search result ranking using query clustering
WO2021204269A1 (en) Classification model training, and object classification
US20210049298A1 (en) Privacy preserving machine learning model training
JP2016511906A (en) Ranking product search results
EP2778985A1 (en) Search result ranking by department
US8832015B2 (en) Fast binary rule extraction for large scale text data
US20210366006A1 (en) Ranking of business object
CN110633464A (en) Semantic recognition method, device, medium and electronic equipment
CN111275205A (en) Virtual sample generation method, terminal device and storage medium
CN113988157A (en) Semantic retrieval network training method and device, electronic equipment and storage medium
Ertekin et al. Approximating the crowd
CN113537630A (en) Training method and device of business prediction model
CN104933099B (en) Method and device for providing target search result for user
CN114116997A (en) Knowledge question answering method, knowledge question answering device, electronic equipment and storage medium
CA3119416C (en) Combining statistical methods with a knowledge graph
CN113821588A (en) Text processing method and device, electronic equipment and storage medium
CN110555747A (en) method and device for determining target user
CN110516033A (en) A kind of method and apparatus calculating user preference
CN115794898B (en) Financial information recommendation method and device, electronic equipment and storage medium
CN106844436B (en) Query result sorting method and device
CN116383340A (en) Information searching method, device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Building N3, building 12, No. 27, Jiancai Chengzhong Road, Haidian District, Beijing 100086

Applicant after: Beijing Xingxuan Technology Co.,Ltd.

Address before: 100085 Beijing, Haidian District on the road to the information on the ground floor of the 1 to the 3 floor of the 2 floor, room 11, 202

Applicant before: Beijing Xiaodu Information Technology Co.,Ltd.

GR01 Patent grant