CN108536709B - Search optimization method and device - Google Patents

Search optimization method and device

Info

Publication number
CN108536709B
Authority
CN
China
Prior art keywords
entity
relevance
query request
correlation
final
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710124430.8A
Other languages
Chinese (zh)
Other versions
CN108536709A (en)
Inventor
李梅雯
王啸风
孟嘉
邵蓥侠
傅强
冯是聪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Mininglamp Software System Co ltd
Original Assignee
Beijing Mininglamp Software System Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Mininglamp Software System Co ltd
Priority to CN201710124430.8A
Publication of CN108536709A
Application granted
Publication of CN108536709B

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a search optimization method and a device, wherein the search optimization method comprises the following steps: after receiving a query request, acquiring an entity set related to the query request; for any entity in the entity set, acquiring an associated entity which has an association relationship with the entity in the entity set; acquiring a first correlation between the entity and the query request and a second correlation between the associated entity and the query request, and determining a final correlation between the entity and the query request according to the first correlation and the second correlation. The scheme of the embodiment of the invention determines the final relevance of the entity by using the relevance of the associated entity, and outputs the entity after sequencing according to the final relevance, so that the relevance between the entity and the search can be more accurately reflected, and the user can be more accurately and quickly helped to locate the entity concerned by the user.

Description

Search optimization method and device
Technical Field
The present invention relates to information processing technologies, and in particular, to a search optimization method and apparatus.
Background
In the field of entity search, the most popular result-ranking approach at present calculates the relevance between a hit entity and a query mainly with the PSF (Practical Scoring Function) algorithm in Lucene, supplemented by search intention recognition. Search intention recognition means inferring the user's real need from the request the user enters, which is a classification problem solved by rules or by machine learning. This search method considers only the attributes of a single entity and is therefore limited and one-sided: it reduces the visibility of the results the user really needs and prevents the user from quickly locating the entity objects of real interest.
Disclosure of Invention
In order to solve the technical problems, the invention provides a search optimization method and a search optimization device, which are used for optimizing the existing search method and are beneficial to a user to quickly locate an entity object really interested by the user.
In order to achieve the object of the present invention, the present invention provides a search optimization method, comprising:
after receiving a query request, acquiring an entity set related to the query request;
for any entity in the entity set, acquiring an associated entity which has an associated relationship with the entity in the entity set;
and acquiring a first correlation between the entity and the query request and a second correlation between the associated entity and the query request, and determining the final correlation between the entity and the query request according to the first correlation and the second correlation.
Optionally, the method further includes: and after the final relevance of each entity in the entity set is determined, sorting the entities in the entity set according to the final relevance of each entity, and outputting the sorted entities.
Optionally, the obtaining an associated entity in the entity set that has an association relationship with the entity includes:
and acquiring the associated entities of which the association relation with the entities in the entity set meets the preset conditions.
Optionally, the preset conditions are: the relationship weight value of the associated entity relative to the entity is greater than 0.
Optionally, the determining a final relevance of the entity to the query request according to the first relevance and the second relevance includes:
determining a final relevance of the entity to the query request according to the first relevance, the second relevance, and a relationship weight value of the associated entity relative to the entity.
Optionally, the determining a final relevance of the entity to the query request according to the first relevance and the second relevance includes:
$$\mathrm{score\_mx}(q, d) = \alpha \cdot \mathrm{score}(q, d) + (1 - \alpha) \cdot \frac{1 - e^{-z}}{1 + e^{-z}}$$
and said
$$z = \sum_{i=1}^{m} \mathrm{boost}_i \cdot \mathrm{score}(q, dr_i)$$
where score_mx(q, d) is the final relevance of the entity d to the query request q; score(q, d) is the first relevance of the entity d to the query request q; α is the weight of the entity, with 0 ≤ α ≤ 1; score(q, $dr_i$) is the second relevance of the associated entity $dr_i$ of the entity to the query request; $\mathrm{boost}_i$ is the relationship weight value of the associated entity $dr_i$ relative to the entity; and i = 1, ..., m.
An embodiment of the present invention further provides a search optimization apparatus, including:
the search unit is used for acquiring an entity set relevant to the query request after receiving the query request;
an associated entity determining unit, configured to acquire, for any entity in the entity set, an associated entity in the entity set, where the associated entity has an association relationship with the entity;
and the optimizing unit is used for acquiring a first correlation between the entity and the query request, acquiring a second correlation between the associated entity and the query request, and determining the final correlation between the entity and the query request according to the first correlation and the second correlation.
Optionally, the apparatus further comprises: and the output unit is used for sequencing the entities in the entity set according to the final relevance of each entity and then outputting the sequenced entities after determining the final relevance of each entity in the entity set.
Optionally, the obtaining, by the associated entity determining unit, an associated entity having an association relationship with the entity in the entity set includes:
and acquiring the associated entities of which the association relation with the entities in the entity set meets the preset conditions.
Optionally, the preset conditions are: the relationship weight value of the associated entity relative to the entity is greater than 0.
Optionally, the determining, by the optimization unit, a final relevance of the entity to the query request according to the first relevance and the second relevance includes:
determining a final relevance of the entity to the query request according to the first relevance, the second relevance, and a relationship weight value of the associated entity relative to the entity.
Optionally, the determining, by the optimization unit, a final relevance of the entity to the query request according to the first relevance and the second relevance includes:
$$\mathrm{score\_mx}(q, d) = \alpha \cdot \mathrm{score}(q, d) + (1 - \alpha) \cdot \frac{1 - e^{-z}}{1 + e^{-z}}$$
and said
$$z = \sum_{i=1}^{m} \mathrm{boost}_i \cdot \mathrm{score}(q, dr_i)$$
where score_mx(q, d) is the final relevance of the entity d to the query request q; score(q, d) is the first relevance of the entity d to the query request q; α is the weight of the entity d, with 0 ≤ α ≤ 1; score(q, $dr_i$) is the second relevance of the associated entity $dr_i$ of the entity d to the query request q; $\mathrm{boost}_i$ is the relationship weight value of the associated entity $dr_i$ relative to the entity d; and i = 1, ..., m.
Compared with the prior art, the scheme of the embodiment of the invention determines the final relevance of the entity by using the relevance of the associated entity of the entity, and outputs the entity after sequencing according to the final relevance, so that the relevance between the entity and the search can be more accurately reflected, and the user can be more accurately and quickly helped to locate the entity concerned by the user.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
Drawings
The accompanying drawings are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the example serve to explain the principles of the invention and not to limit the invention.
FIG. 1 is a flowchart of a search optimization method according to an embodiment of the present invention;
FIG. 2 is a block diagram of a search optimization apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail below with reference to the accompanying drawings. It should be noted that, in the present application, the embodiments and features of the embodiments may be arbitrarily combined with each other without conflict.
The steps illustrated in the flow charts of the figures may be performed in a computer system such as a set of computer-executable instructions. Also, while a logical order is shown in the flow diagrams, in some cases, the steps shown or described may be performed in an order different than here.
In the prior art, when the PSF is used for search calculation, the information used comes entirely from the attributes of the entity, and relationships, another key kind of information affecting the entity, are not considered. The embodiment of the present invention takes into account other entities that have an association relationship with the entity: the characteristics of these associated entities are as representative of the entity as its own attributes, so the relevance values of the associated entities are used to correct the relevance value of the entity.
Fig. 1 is a flowchart of a search optimization method according to an embodiment of the present invention. As shown in fig. 1, a search optimization method provided in an embodiment of the present invention includes:
step 101, receiving a query request;
step 102, acquiring an entity set relevant to the query request;
step 103, for any entity in the entity set, acquiring an associated entity in the entity set which has an association relationship with the entity;
step 104, obtaining a first correlation between the entity and the query request and a second correlation between the associated entity and the query request, and determining a final correlation between the entity and the query request according to the first correlation and the second correlation.
In another embodiment of the present invention, the method further includes step 105, after determining the final relevance of each entity in the entity set, sorting the entities in the entity set according to the final relevance of each entity, and outputting the sorted entities.
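The flow of steps 101 to 105 can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not the patented implementation: the function name `optimize_search` and the three callables `search`, `associated`, and `combine` are hypothetical placeholders for the search algorithm, the associated-entity lookup, and the final-relevance rule, respectively.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical callable types; none of these names come from the patent text.
Search = Callable[[str], Dict[str, float]]  # query -> {entity: first relevance}
Associated = Callable[[str, Dict[str, float]], List[Tuple[str, float]]]
Combine = Callable[[float, List[Tuple[float, float]]], float]


def optimize_search(query: str, search: Search, associated: Associated,
                    combine: Combine) -> List[Tuple[str, float]]:
    """Return (entity, final relevance) pairs, most relevant first."""
    hits = search(query)  # steps 101-102: receive the query, get the hit set
    final: Dict[str, float] = {}
    for entity, first in hits.items():
        # Step 103: associated entities must themselves be in the hit set.
        related = [(hits[other], weight)
                   for other, weight in associated(entity, hits)
                   if other in hits]
        # Step 104: combine first relevance with (second relevance, weight) pairs.
        final[entity] = combine(first, related)
    # Step 105: output entities in descending order of final relevance.
    return sorted(final.items(), key=lambda item: item[1], reverse=True)
```

Keeping the three stages as parameters reflects the statement that other search methods and other combination rules may be substituted as needed.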
The entity in step 102 may be a type of object that is not further subdivided in reality, generally defined as a type of object that is the main subject of analysis in the system, such as a person, a vehicle, or a household. Each entity includes attribute information that represents characteristics of the entity. For example, when the entity is a person, the attribute information may include one or a combination of the following: name, age, gender, native place, whether there is a criminal record, key person categories (such as drug-related, terrorism-related, etc.), behavioral habits, and the like.
The association relationship between entities is a type of object that describes the various relationships between entities, generally defined as the association between entities. For example, when the entities are people, the association relationships include spouse, father-son, husband-wife, mother-son, peer, cohabitation, and classmate relationships, and may also include the affiliation between a person and a household. Some of the association relationships between persons can be extracted from the household register.
The attribute information of the entities and the association relationship between the entities can be stored in a database. The database may be a graph database, attribute information of the entities is marked by points, and association relations between the entities are marked by edges.
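As a rough illustration of this storage model, the following sketch keeps entities as nodes carrying attribute dictionaries and association relationships as typed edges. It is an in-memory stand-in, not a real graph database; the class name `EntityGraph` and its methods are hypothetical, and edges are assumed to be symmetric, which the text does not state.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class EntityGraph:
    """Hypothetical in-memory graph: nodes hold attributes, edges hold relation types."""
    nodes: Dict[str, Dict[str, object]] = field(default_factory=dict)
    edges: Dict[str, List[Tuple[str, str]]] = field(default_factory=dict)

    def add_entity(self, entity_id: str, **attributes: object) -> None:
        # A point (node) stores the attribute information of one entity.
        self.nodes[entity_id] = dict(attributes)
        self.edges.setdefault(entity_id, [])

    def add_relationship(self, a: str, b: str, kind: str) -> None:
        # An edge stores one association relationship; assumed symmetric here.
        self.edges.setdefault(a, []).append((b, kind))
        self.edges.setdefault(b, []).append((a, kind))

    def neighbours(self, entity_id: str) -> List[Tuple[str, str]]:
        return list(self.edges.get(entity_id, []))


graph = EntityGraph()
graph.add_entity("person_1", name="A", age=30)
graph.add_entity("person_2", name="B", age=31)
graph.add_relationship("person_1", "person_2", "spouse")
```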
In step 102, an existing search algorithm, for example the PSF algorithm, may be used to obtain the entity set related to the query request. The present application is not limited thereto, and other search methods may also be used. When the entity set is obtained with an existing algorithm, the relevance of each entity in the set to the query request can be obtained at the same time. The entities in the entity set are referred to as hit entities and are denoted {entity 1, entity 2, ..., entity n}.
In step 103, for any entity in the entity set, the corresponding associated entities are obtained; each entity may have one or more associated entities. For example, the associated entities of entity 1 may be denoted {entity 1_1, entity 1_2, ..., entity 1_m1}, those of entity 2 {entity 2_1, entity 2_2, ..., entity 2_m2}, and those of entity n {entity n_1, entity n_2, ..., entity n_mn}. Each associated entity is itself an entity in the entity set.
In step 104, for entity 1 in the entity set, the relevance of entity 1 to the query request is obtained, together with the relevance to the query request of each of its associated entities entity 1_1, entity 1_2, ..., entity 1_m1. The final relevance of entity 1 is determined from the relevance of entity 1 to the query request and the relevances of entity 1_1, entity 1_2, ..., entity 1_m1 to the query request. By analogy, the final relevance of the other entities in the entity set, such as entity 2, ..., entity n, can be obtained.
In an embodiment of the present invention, the associated entity has a relationship weight value with respect to the entity.
Different relationship weight values are set for different relationships, according to how close each relationship is and how strongly it influences the entity, so as to distinguish their contributions. For example, in a "drug-related" entity relevance query, the relationship weight value of the spouse relationship may be set greater than that of the parent-child relationship, which in turn is greater than that of the classmate relationship: for instance 50% for spouse, 30% for parent-child, and 10% for classmate. This expresses that, for the behavior characteristic of drug use, a spouse influences a person more strongly than a parent-child or classmate relationship does. This is only an example, and the relationship weight values may be determined according to actual needs. Relationship weight values may be represented by boost values. The specific boost value needs to be set by taking into account both the application field and the range of the boost values used for the attribute part. There may be multiple relationship weight values; for example, the relationship weight value of a spouse may differ for different behaviors of an entity. The relationship weight values can initially come from historical experience, and machine learning can be introduced later to refine them continuously.
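One way to hold such weights is a lookup table keyed first by query scenario and then by relationship type, so that the same relationship can carry different boost values for different behaviors. The sketch below uses the example percentages from the text; the names `BOOST_TABLE`, `boost_for`, and the scenario and relationship keys are hypothetical.

```python
from typing import Dict

# Hypothetical table: scenario -> relationship type -> boost value.
# The three numbers are the example weights given in the text (50%, 30%, 10%).
BOOST_TABLE: Dict[str, Dict[str, float]] = {
    "drug_related": {
        "spouse": 0.5,
        "parent_child": 0.3,
        "classmate": 0.1,
    },
}


def boost_for(scenario: str, relationship: str) -> float:
    """Return the boost value, or 0.0 when the relationship has no influence."""
    return BOOST_TABLE.get(scenario, {}).get(relationship, 0.0)
```

Returning 0.0 for unlisted relationships is a design choice of this sketch: it lets a "greater than 0" condition exclude them without special cases.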
In another embodiment of the present invention, in the step 103, acquiring an associated entity in the entity set, where the associated entity has an association relationship with the entity, includes:
and acquiring the associated entities of which the association relation with the entities in the entity set meets the preset conditions.
In another embodiment of the present invention, the preset condition is that the relationship weight value of the associated entity relative to the entity is greater than 0. The preset condition may also be set to other conditions as needed. This preset condition reflects that different associated entities influence the entity differently: to reduce computation, associated entities that have no influence on the entity in a given application scenario are excluded from the calculation. For example, when certain characteristic behaviors of a person entity (e.g., drug abuse) are queried, an associated entity (a vehicle) linked by a person-vehicle relationship has no influence on the entity. In most cases, entities of the same type (people) are the ones that influence each other. With this setting, the computation for a large number of unrelated entities is filtered out directly.
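The filtering described here can be sketched as a single comprehension. The function name `positive_associates` and the edge layout (a mapping from an entity to `(neighbour, boost)` pairs) are assumptions made for illustration only.

```python
from typing import Dict, List, Set, Tuple


def positive_associates(entity: str,
                        edges: Dict[str, List[Tuple[str, float]]],
                        hit_set: Set[str]) -> List[Tuple[str, float]]:
    """Keep neighbours that are in the hit set and whose boost is greater than 0."""
    return [(other, boost)
            for other, boost in edges.get(entity, [])
            if other in hit_set and boost > 0]


# Illustrative data: a vehicle linked by a person-vehicle relationship has boost 0,
# and "stranger" is related to the person but was not hit by the query.
edges = {"person": [("spouse", 0.5), ("vehicle", 0.0), ("stranger", 0.3)]}
kept = positive_associates("person", edges, {"person", "spouse", "vehicle"})
```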
In yet another embodiment of the present invention, the determining the final relevance of the entity to the query request according to the first relevance and the second relevance comprises:
determining a final relevance of the entity to the query request according to the first relevance, the second relevance, and a relationship weight value of the associated entity relative to the entity.
In yet another embodiment of the present invention, the determining the final relevance of the entity to the query request according to the first relevance and the second relevance comprises:
for any entity d in the entity set, the relevance of the entity d to the query request q is score(q, d). For each associated entity $dr_i$ whose association relationship with the hit entity d satisfies the preset condition, the relevance score(q, $dr_i$) to the query request q is determined. The final relevance score_mx(q, d) of the entity d is then determined from the relevance score(q, d) of the entity d and the relevance score(q, $dr_i$) of each of its associated entities. One implementation is as follows:
$$\mathrm{score\_mx}(q, d) = \alpha \cdot \mathrm{score}(q, d) + (1 - \alpha) \cdot \frac{1 - e^{-z}}{1 + e^{-z}}$$
and said
$$z = \sum_{i=1}^{m} \mathrm{boost}_i \cdot \mathrm{score}(q, dr_i)$$
where α is the influence weight of the entity d itself on the search relevance calculation and, correspondingly, (1 − α) is the influence weight of all the associated entities of the entity d; z is the total contribution of the associated entities to the entity d; $\mathrm{boost}_i$ is the relationship weight value of the associated entity $dr_i$ relative to the entity d; and i = 1, ..., m. In this embodiment, an associated entity $dr_i$ whose association relationship with the hit entity d satisfies the preset condition is referred to as an associated entity having a positive relationship with the entity d.
In the entity query scenario, the relevance between one hit entity and the search is not only related to the attribute of the hit entity, but also to the hit situation and degree of other entities with incidence relation. To characterize the influence of the second part on the weights, the concept of z (the contribution z of the associated entity to the entity d) is proposed. And z represents the sum of the contributions of all the associated entities having positive relations with the entity d to the entity and the query relevance score in the query. z is calculated by multiplying the correlation obtained by each associated entity according to the existing algorithm by a weight determined by the relationship between the associated entity and the entity d, and adding the sum to quantitatively represent the influence/representative strength of the surrounding human environment and the social relationship on the entity d.
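The calculation of z described above is a weighted sum, which can be sketched as follows. The function name `contribution_z` is hypothetical; the sample numbers are the spouse (weight 0.5, score 0.04) and classmate (weight 0.1, score 0.20) of one hit entity in Example one.

```python
from typing import Iterable, Tuple


def contribution_z(associates: Iterable[Tuple[float, float]]) -> float:
    """Sum boost_i * score(q, dr_i) over (boost, score) pairs of associated entities."""
    return sum(boost * score for boost, score in associates)


# Spouse: boost 0.5, score 0.04; classmate: boost 0.1, score 0.20.
z = contribution_z([(0.5, 0.04), (0.1, 0.20)])
```

An entity with no positively related associated entities gets z = 0, so its final relevance depends only on its own score.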
In this embodiment, α is set according to the practical application by weighing the influence of the entity's own attribute values against the influence of the associated entities on the relevance of the entity d to the query request. The value of α generally lies in the range [0, 1]; for example, α may be 0.8, and other values may be used as needed.
In this example, by introducing
$$\frac{1 - e^{-z}}{1 + e^{-z}}$$
the score contributed by the associated entities is confined to a bounded range, which reflects that the influence of the associated entities has an approximate upper limit: beyond a certain value, each additional unit of z adds progressively less to the whole. The final relevance is thereby normalized so that results are comparable across different queries.
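The bounded, diminishing-returns behaviour can be checked directly. The lines below assume the squashing term is $(1 - e^{-z})/(1 + e^{-z})$, an assumption that is consistent with the numerical values in Example one; the symbol $f$ is introduced here only for illustration.

$$f(z) = \frac{1 - e^{-z}}{1 + e^{-z}} = \tanh\left(\frac{z}{2}\right), \qquad f(0) = 0, \qquad \lim_{z \to \infty} f(z) = 1$$

$$f'(z) = \frac{2e^{-z}}{(1 + e^{-z})^{2}} > 0, \qquad f''(z) = -\frac{2e^{-z}\,(1 - e^{-z})}{(1 + e^{-z})^{3}} < 0 \quad \text{for } z > 0$$

Since $z \ge 0$ when all scores and boost values are non-negative, $f$ increases with $z$ but stays below 1, and its slope shrinks from $f'(0) = 1/2$ toward 0, so each further unit of contribution from associated entities raises the final relevance by less.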
Of course, the calculation manner of score _ mx (q, d) is only an example, and other calculation methods may be used as needed to comprehensively consider the correlation between the entity and its associated entity, which is not limited in the present application. For example, instead of calculating z, the final relevance of the entity d may be calculated directly according to the relevance of the associated entity, the relationship weight value, and the relevance of the entity d.
An embodiment of the present invention further provides a search optimization apparatus, as shown in fig. 2, including:
the searching unit 201 is configured to, after receiving a query request, obtain an entity set related to the query request;
an associated entity determining unit 202, configured to acquire, for any entity in the entity set, an associated entity in the entity set, where the associated entity has an association relationship with the entity;
an optimizing unit 203, configured to obtain a first correlation between the entity and the query request, obtain a second correlation between the associated entity and the query request, and determine a final correlation between the entity and the query request according to the first correlation and the second correlation.
In another embodiment of the present invention, the apparatus further comprises: an output unit 204, configured to determine the final relevance of each entity in the entity set, sort the entities in the entity set according to the final relevance of each entity, and output the sorted entities.
In another embodiment of the present invention, the obtaining, by the associated entity determining unit 202, an associated entity in the entity set that has an association relationship with the entity includes:
and acquiring the associated entities of which the association relation with the entities in the entity set meets the preset conditions.
In another embodiment of the present invention, the preset conditions are: the relationship weight value of the associated entity relative to the entity is greater than 0.
In another embodiment of the present invention, the optimizing unit 203 determining the final relevance of the entity to the query request according to the first relevance and the second relevance comprises:
determining a final relevance of the entity to the query request according to the first relevance, the second relevance, and a relationship weight value of the associated entity relative to the entity.
In another embodiment of the present invention, the optimizing unit 203 determining the final relevance of the entity to the query request according to the first relevance and the second relevance comprises:
$$\mathrm{score\_mx}(q, d) = \alpha \cdot \mathrm{score}(q, d) + (1 - \alpha) \cdot \frac{1 - e^{-z}}{1 + e^{-z}}$$
and said
$$z = \sum_{i=1}^{m} \mathrm{boost}_i \cdot \mathrm{score}(q, dr_i)$$
where score_mx(q, d) is the final relevance of the entity d to the query request q; score(q, d) is the first relevance of the entity d to the query request q; α is the weight of the entity d, with 0 ≤ α ≤ 1; score(q, $dr_i$) is the second relevance of the associated entity $dr_i$ of the entity d to the query request q; $\mathrm{boost}_i$ is the relationship weight value of the associated entity $dr_i$ relative to the entity d; and i = 1, ..., m.
It should be noted that, for the technical details of each unit in the search optimization apparatus, reference may be made to the description in the search optimization method, and details are not repeated here.
The scheme of the embodiment of the invention can be applied to a big data application system in the field of public security, is used for inquiring the entity, particularly obtains more accurate and comprehensive result sequencing in the inquiry of the human entity behavior characteristics compared with the traditional method, provides favorable help for related business personnel to quickly process the business, accelerates the speed of the business personnel for detecting the case to a certain extent, and greatly improves the experience of the user on the search function.
The invention is further illustrated by the following specific example. In existing applications, a query often returns a large number of results, for reasons such as duplicate names, and because the PSF algorithm considers only the attributes of the entities, the ranking has certain limitations. A person entity ranked near the top may be less dangerous than one ranked below it, which reduces the effectiveness of the search.
Example one
In this example, a user who needs to find a drug addict described as "lost internal heat, Zhao shan Chuan" enters the search keyword "the drug addict in the lost internal heat, Zhao shan Chuan".
The search engine first selects all person entities hit by the keyword and scores them according to the relevance between their attributes and the keyword. For simplicity, assume the hit entity set contains only 11 people, whose names, identification numbers, and first scores are as follows:
{ (Leyi, 102398 XXXXXXXXX 0740,0.03),
(Zhao shan, 102398XXXXXXX 0723,0.25),
(Zhao Yi's father, 102398XXXXXX 0523,0.04),
(Zhao Yi Ma, 102398XXXXXXX 0723,0.04),
(Zhao shan, 212398XXXXXXX 0723,0.26),
(Zhaoqi, 102398XXXXXX 0522,0.04),
(Gao-ao, 212398XXXXXXXX 0723,0.04),
(Zhao shan, 262398XXXXXXX 0723,0.25),
(Wan somebody, 432398XXXXXXX 0723,0.02),
(Liu Gong, 432398XXXXXXX 0723,0.14),
(somebody, 423193XXXXXXX 0723,0.20) }
If the score calculation ended here, as in the existing algorithm, and the entities were ranked in descending order of score and displayed on the front-end page, the following would be shown:
(Zhao shan Chuan, 212398XXXXXXX 0723)
(Zhao shan Chuan, 262398XXXXXXX 0723)
(Zhao shan Chuan, 102398XXXXXXX 0723)
(somebody square, 423193XXXXXXX 0723)
(Liu Gong, 432398XXXXXXX 0723)
(Zhao Yi's father, 102398XXXXXX 0523)
(Zhao Yi Ma, 102398XXXXXXX 0723)
(Zhangyi, 102398XXXXXX 0522)
(Gao Zhi, 212398XXXXXXX 0723)
(Li somewhat, 102398XXXXXXX 0740)
(Wangzhi somewhat, 432398XXXXXXXX 0723)
However, as noted above, this ranking considers only each entity's own attributes, whereas people in society interact, and the influence and associations between them are very important. Therefore, when the solution of the embodiment of the present invention is applied, the relationships among the 11 persons are first obtained:
Spouse relationship (weight 0.5)
(Li somewhat, 102398XXXXXXXX 0740), (Zhao shan, 102398XXXXXX 0723)
(Zhao somebody, 102398XXXXXX 0522), (Zhao shan, 212398XXXXXX 0723)
(Zhao Yi's father, 102398XXXXXX 0523), (Zhao Yi Ma, 102398XXXXXXXX 0723)
Father/mother-son relationship (weight 0.3)
(Zhao Yi's father, 102398XXXXXX 0523), (Zhao shan, 102398XXXXXX 0723)
(Zhao Yi Ma, 102398XXXXXXX 0723), (Zhao shan, 102398XXXXXXX 0723)
Classmate relationship (weight 0.1)
(Liu Gong, 432398XXXXXX 0723), (Zhao shan, 102398XXXXXX 0723)
(somebody, 423193XXXXXX 0723), (Zhao shan, 212398XXXXXX 0723)
Unrelated isolated points
(Gao Zhi, 212398XXXXXXX 0723)
(Zhao shan Chuan, 262398XXXXXXX 0723)
(Wan somewhat, 432398XXXXXXX 0723)
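The algorithm of the above embodiment is then applied to recalculate the relevance score of each entity from these weights:
$$\mathrm{score\_mx}(q, d) = \alpha \cdot \mathrm{score}(q, d) + (1 - \alpha) \cdot \frac{1 - e^{-z}}{1 + e^{-z}}$$
and said
$$z = \sum_{i=1}^{m} \mathrm{boost}_i \cdot \mathrm{score}(q, dr_i)$$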
For the behavior characteristic of drug use, an empirical value is used: α in the formula is set to 0.7, meaning that the score from the person's own attributes accounts for 70% of the final score and the scores of the associated entities account for 30%. After recalculation with the formula, the final score of each entity (last column) is as follows:
(name, ID, score(q, d), score_mx(q, d))
{ (Li somewhat, 102398 XXXXXXXXX 0740,0.03,0.03897843105874305046),
(Zhao shan, 102398XXXXXXX 0723,0.25,0.18294813956009582154),
(Zhao Yi's father, 102398XXXXXX 0523,0.04,0.04223929247593665204),
(Zhao Yi Ma, 102398XXXXXXX 0723,0.04,0.04223929247593665204),
(Zhao shan, 212398XXXXXXX 0723,0.26,0.18799920012797927954),
(Zhaoqi, 102398XXXXXX 0522,0.04,0.04747258383239125145),
(Gao Zhi, 212398XXXXXXX 0723,0.04,0.028),
(Zhao shan, 262398XXXXXXX 0723,0.25,0.175),
(Wan somewhat, 432398XXXXXXX 0723,0.02,0.014),
(Liu Gong, 432398XXXXXXX 0723,0.14,0.10174980469970625927),
(somebody, 423193XXXXXXX 0723,0.20,0.14389978031485070414) }
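The recalculation can be reproduced for individual rows with a short script. This is a sketch under an explicit assumption: the squashing term is taken to be (1 − e^(−z)) / (1 + e^(−z)), which equals tanh(z/2). The function name `final_relevance` is hypothetical; α = 0.7, the relationship weights, and the first scores are those given in this example.

```python
import math
from typing import Iterable, Tuple

ALPHA = 0.7  # value stated in this example


def final_relevance(first: float, associates: Iterable[Tuple[float, float]],
                    alpha: float = ALPHA) -> float:
    """alpha * score(q, d) + (1 - alpha) * tanh(z / 2), z = sum(boost_i * score_i)."""
    z = sum(boost * score for boost, score in associates)
    # Assumed squashing term: (1 - e^-z) / (1 + e^-z), which equals tanh(z / 2).
    return alpha * first + (1 - alpha) * math.tanh(z / 2)


# Zhao shan (102398...0723), first score 0.25, as (boost, score) pairs:
# spouse 0.03, father 0.04, mother 0.04, classmate 0.14.
zhao = final_relevance(0.25, [(0.5, 0.03), (0.3, 0.04), (0.3, 0.04), (0.1, 0.14)])

# An isolated entity with first score 0.04 keeps only alpha times its own score.
isolated = final_relevance(0.04, [])

print(f"{zhao:.10f} {isolated:.3f}")
```

Under these assumptions the first value agrees with the tabulated final score of that entity to about ten decimal places, and the second matches the 0.028 listed for an isolated entity with first score 0.04.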
Then sorting according to final correlation (last column value):
(Zhao shan Chuan, 102398XXXXXXX 0723)
(Zhao shan Chuan, 212398XXXXXXX 0723)
(Zhao shan Chuan, 262398XXXXXXX 0723)
(Zhangyi, 102398XXXXXX 0522)
(Zhao Yi's father, 102398XXXXXX 0523)
(Zhao Yi Ma, 102398XXXXXXX 0723)
(Li somewhat, 102398XXXXXXX 0740)
(Gao Zhi, 212398XXXXXXX 0723)
(somebody square, 423193XXXXXXX 0723)
(Wangzhi somewhat, 432398XXXXXXXX 0723)
(Liu Gong, 432398XXXXXXX 0723)
The change in order is evident, and practice has shown that the entities ranked higher carry a greater risk of drug involvement.
In order to simplify the problem, the number of entities taken in the example is small, and the number of data in the actual application environment is extremely large, so that the sorting optimization of the search results is very meaningful and can help the user to quickly locate the result records in which the user is interested.
In contrast, by adopting the searching method provided by the embodiment of the invention, the social relationship and the surrounding environment of the entity are considered, so that the correlation between the entity and the search can be more accurately reflected, and the user can be more accurately and quickly helped to locate the entity concerned by the user. The entity arranged in the front in the actual query result is proved to have higher risk and worse influence, is more concerned and interesting by the public security department, and accelerates the speed of case detection to a certain extent. Even the public security department can be helped to find hidden drug addicts in advance by reasonably adjusting the value of alpha.
In the embodiment of the invention, association relationship data between entities is introduced into the calculation of the relevance between an entity and the query request. In big data applications, and particularly in entity behavior characteristic detection, this brings the presented search results closer to the user's intention and needs: the results finally shown to the user are ordered more accurately, the user can locate the entities of concern quickly and accurately, and the user's satisfaction with the query results is greatly improved.
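One way to picture how association data enters the calculation is as a lookup that, for each entity in the result set, keeps only the associated entities that are also in the result set and whose relationship weight is greater than 0. The sketch below is an illustration under stated assumptions: the relation types, weights, identifiers, and the helper `associated_entities` are hypothetical and do not come from the patent.

```python
from collections import defaultdict

# Hypothetical relation records: (entity, associated entity, relation type, weight).
RELATIONS = [
    ("e1", "e2", "family", 0.9),
    ("e1", "e3", "same_address", 0.5),
    ("e1", "e9", "colleague", 0.4),     # e9 is not in the result set below
    ("e2", "e1", "family", 0.9),
    ("e3", "e1", "same_address", 0.0),  # weight 0 fails the preset condition
]


def associated_entities(entity_set, relations):
    """Map each entity to its (associated entity, weight) pairs.

    A pair is kept only if the associated entity is also in the result set
    and the relationship weight is greater than 0.
    """
    members = set(entity_set)
    result = defaultdict(list)
    for source, target, _relation, weight in relations:
        if source in members and target in members and weight > 0:
            result[source].append((target, weight))
    # Entities without any qualifying association map to an empty list.
    return {entity: result.get(entity, []) for entity in entity_set}


entity_set = ["e1", "e2", "e3"]  # hypothetical search result
links = associated_entities(entity_set, RELATIONS)
print(links)
```

An entity such as `e3` ends up with no associated entities, in which case only its own relevance would contribute to its final relevance.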
Although the embodiments of the present invention have been described above, the above description is only for the convenience of understanding the present invention, and is not intended to limit the present invention. It will be apparent to those skilled in the art that various modifications and variations can be made in the form and details of the present invention without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (12)

1. A method of search optimization, comprising:
after receiving a query request, acquiring an entity set related to the query request;
for any entity in the entity set, acquiring an associated entity which has an association relationship with the entity in the entity set;
acquiring a first relevance between the entity and the query request and a second relevance between the associated entity corresponding to the entity and the query request, and determining a final relevance of the entity to the query request according to the first relevance and the second relevance;
wherein said determining a final relevance of the entity to the query request based on the first relevance and the second relevance comprises:
determining the final relevance of the entity to the query request according to the first relevance, the second relevance, the set weight of the entity and the set relationship weight value of the associated entity relative to the entity.
2. The method of claim 1, wherein the method further comprises: after the final relevance of each entity in the entity set is determined, sorting the entities in the entity set according to the final relevance of each entity, and outputting the sorted entities.
3. The method of claim 1, wherein the obtaining the associated entity in the entity set having an association relationship with the entity comprises:
acquiring the associated entities in the entity set whose association relationship with the entity meets a preset condition.
4. The method of claim 3, wherein the preset condition is: the relationship weight value of the associated entity relative to the entity is greater than 0.
5. The method of claim 1, wherein:
the number of the association relations is a positive integer greater than or equal to 1, and the number of the associated entities corresponding to each entity is a natural number greater than or equal to 0.
6. The method of any of claims 1 to 5, wherein said determining a final relevance of the entity to the query request based on the first relevance and the second relevance comprises:
Figure FDA0002655548710000021
and said
Figure FDA0002655548710000022
wherein score_mx(q, d) is the final relevance of the entity to the query request, score(q, d) is the first relevance of the entity to the query request, α is the weight of the entity, with 0 ≤ α ≤ 1, score(q, dr_i) is the second relevance of the associated entity dr_i of the entity to the query request, boost_i is the relationship weight value of the associated entity dr_i relative to the entity, and i = 1, ..., m.
7. A search optimization apparatus, comprising:
a search unit, configured to acquire, after receiving a query request, an entity set related to the query request;
an associated entity determining unit, configured to acquire, for any entity in the entity set, an associated entity in the entity set, where the associated entity has an association relationship with the entity;
an optimization unit, configured to acquire a first relevance between the entity and the query request, acquire a second relevance between the associated entity corresponding to the entity and the query request, and determine a final relevance of the entity to the query request according to the first relevance and the second relevance;
wherein the determining, by the optimization unit, a final relevance of the entity to the query request based on the first relevance and the second relevance comprises:
determining a final relevance of the entity to the query request according to the first relevance, the second relevance, and a relationship weight value of the associated entity relative to the entity.
8. The apparatus of claim 7, wherein the apparatus further comprises: an output unit, configured to, after the final relevance of each entity in the entity set is determined, sort the entities in the entity set according to the final relevance of each entity and output the sorted entities.
9. The apparatus of claim 7, wherein the obtaining of the associated entity having the association relationship with the entity in the entity set by the associated entity determining unit comprises:
acquiring the associated entities in the entity set whose association relationship with the entity meets a preset condition.
10. The apparatus of claim 9, wherein the preset condition is: the relationship weight value of the associated entity relative to the entity is greater than 0.
11. The apparatus of claim 7, wherein:
the number of the association relations is a positive integer greater than or equal to 1, and the number of the associated entities corresponding to each entity is a natural number greater than or equal to 0.
12. The apparatus of any of claims 7 to 11, wherein the optimization unit to determine a final relevance of the entity to the query request based on the first relevance and the second relevance comprises:
Figure FDA0002655548710000031
and said
Figure FDA0002655548710000032
wherein score_mx(q, d) is the final relevance of the entity d to the query request q, score(q, d) is the first relevance of the entity d to the query request q, α is the weight of the entity d, with 0 ≤ α ≤ 1, score(q, dr_i) is the second relevance of the associated entity dr_i of the entity d to the query request q, boost_i is the relationship weight value of the associated entity dr_i relative to the entity d, and i = 1, ..., m.
CN201710124430.8A 2017-03-03 2017-03-03 Search optimization method and device Active CN108536709B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201710124430.8A CN108536709B (en) 2017-03-03 2017-03-03 Search optimization method and device

Publications (2)

Publication Number Publication Date
CN108536709A CN108536709A (en) 2018-09-14
CN108536709B true CN108536709B (en) 2021-04-30

Family

ID=63488570

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201710124430.8A Active CN108536709B (en) 2017-03-03 2017-03-03 Search optimization method and device

Country Status (1)

Country Link
CN (1) CN108536709B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109344299A (en) * 2018-11-12 2019-02-15 考拉征信服务有限公司 Object search method, apparatus, electronic equipment and computer readable storage medium
CN110457313A (en) * 2019-07-12 2019-11-15 平安普惠企业管理有限公司 A kind of application configuration management method, server and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102063432A (en) * 2009-11-12 2011-05-18 阿里巴巴集团控股有限公司 Retrieval method and retrieval system
CN103678481A (en) * 2003-09-30 2014-03-26 雅虎公司 Method and apparatus for search scoring
CN104572651A (en) * 2013-10-11 2015-04-29 华为技术有限公司 Picture ordering method and device
CN104794163A (en) * 2015-03-25 2015-07-22 中国人民大学 Entity set extension method
CN105786969A (en) * 2016-02-01 2016-07-20 百度在线网络技术(北京)有限公司 Information display method and apparatus

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8793260B2 (en) * 2012-04-05 2014-07-29 Microsoft Corporation Related pivoted search queries
CN105068661B (en) * 2015-09-07 2018-09-07 百度在线网络技术(北京)有限公司 Man-machine interaction method based on artificial intelligence and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant