WO2020019562A1

WO2020019562A1 - Search sorting method and device, electronic device, and storage medium

Info

Publication number: WO2020019562A1
Application number: PCT/CN2018/113348
Authority: WO
Inventors: 彭钊
Original assignee: 天津字节跳动科技有限公司
Priority date: 2018-07-27
Filing date: 2018-11-01
Publication date: 2020-01-30
Also published as: CN109063108B; CN109063108A

Abstract

A search sorting method and system, an electronic device and a storage medium. The method comprises: acquiring search keywords, and determining a plurality of initial search results that match the keywords (S210); extracting a text similarity, an update time dimension and a user association degree which are associated with each of the initial search results (S220); acquiring a corresponding text similarity weight, update time dimension weight and user association degree weight according to the text similarity, the update time dimension and the user association degree, and performing fusion calculation on each of the initial search results according to the text similarity weight, the update time dimension weight and the user association degree weight so as to obtain a comprehensive weight of each of the initial search results (S230); and sorting the plurality of initial search results according to the comprehensive weights (S240). By sorting the initial search results from multiple columns together, the described method allows a target result to be found quickly, thereby saving operation time and improving search efficiency.

Description

Search sorting method, device, electronic equipment and storage medium

Cross-reference to related applications

This application claims priority to Chinese Patent Application No. 201810847290.1, filed by Tianjin ByteDance Technology Co., Ltd. on July 27, 2018 and entitled “Search Sorting Method, Device, Electronic Equipment and Storage Medium”, the entire contents of which are incorporated herein by reference.

Technical field

The present application relates to the technical field of enterprise instant messaging systems, and in particular, to a search sorting method, device, electronic device, and storage medium.

Background technique

With the rapid development of smart devices, chat application software has become increasingly common. Chat application software makes it easy for users in different places to communicate, and includes both personal chat application software and enterprise chat application software. When using enterprise chat application software, a user who needs to find relevant information activates a search function, for example to search for chat messages, contacts, or group chats, in order to quickly find the information or quickly establish a chat link.

At present, when implementing the search function of enterprise chat application software, the following problems are found:

The search results of enterprise chat application software are displayed separately by object type: contacts, group chats, messages, and so on are each shown in their own column, and the objects within each column are sorted by time. Users have to look through the displayed columns one by one, so finding relevant information is tedious and time-consuming.

Summary of the Invention

Based on this, it is necessary to provide a search sorting method, device, electronic device, and storage medium capable of sorting in multiple dimensions in response to the above technical problems.

A search ranking method, the method includes:

Acquiring search keywords, and determining a plurality of initial search results matching the plurality of keywords;

Extracting text similarity, update time dimension and user relevance degree related to each of the initial search results;

Obtaining the corresponding text similarity weight, update time dimension weight and user relevance degree weight according to the text similarity, update time dimension and user relevance degree, and performing fusion calculation on each of the initial search results according to the text similarity weight, update time dimension weight and user relevance degree weight to obtain a comprehensive weight of each of the initial search results;

Sorting the plurality of initial search results according to the comprehensive weight.

A search sorting device, the device includes:

An initial search result extraction module, which obtains search keywords and determines a plurality of initial search results matching the plurality of keywords;

A feature factor extraction module that extracts text similarity, update time dimension, and user association degree related to each of the initial search results;

A weight calculation module, which obtains the corresponding text similarity weight, update time dimension weight, and user relevance degree weight according to the text similarity, update time dimension, and user association degree, and performs a fusion calculation on each of the initial search results according to the text similarity weight, update time dimension weight, and user relevance degree weight to obtain a comprehensive weight of each of the initial search results;

A sorting module, which sorts the plurality of initial search results according to the comprehensive weight.

An electronic device includes a memory and a processor. The memory stores a computer program. When the processor executes the computer program, the steps of the above search sorting method are implemented.

A computer-readable storage medium stores a computer program thereon. When the computer program is executed by a processor, the steps of the above search sorting method are implemented.

The above search sorting method, device, electronic device and storage medium extract the update time dimension so that recency is taken into account in the ranking, use the degree of user association to rank higher those initial search results that share common characteristics with the user, and sort the search results across multiple dimensions. This makes the sorting intelligent, so that users can quickly find relevant information, which simplifies operation and improves search efficiency.

BRIEF DESCRIPTION OF THE DRAWINGS

In order to explain the technical solutions in the embodiments of the present disclosure more clearly, one or more embodiments will be exemplarily described through the pictures in the accompanying drawings, and these exemplary descriptions do not limit the embodiments, in which:

FIG. 1 is an application environment diagram of a search ranking method in an embodiment;

FIG. 2 is a schematic flowchart of a search ranking method according to an embodiment;

FIG. 3 is a schematic flowchart of a step of obtaining a text similarity weight in an embodiment;

FIG. 4 is a schematic flowchart of a step of obtaining weights of an update time dimension according to an embodiment;

FIG. 5 is a schematic flowchart of a step of obtaining a user association degree weight in an embodiment;

FIG. 6 is a structural block diagram of a search and ranking device in an embodiment;

FIG. 7 is a structural block diagram of a feature factor extraction module in an embodiment;

FIG. 8 is a structural block diagram of a weight calculation module in an embodiment;

FIG. 9 is an internal structure diagram of an electronic device in an embodiment;

FIG. 10 is a block diagram of a server search subject in an embodiment.

Detailed description

In order to make the purpose, technical solution, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are only used to explain the application, and are not used to limit the application.

The search ranking method provided in this application can be applied to the application environment shown in FIG. 1. The terminal 102 communicates with the server 104 through a network. Search keywords are entered at the terminal 102. The server 104 obtains the search keywords and determines a plurality of initial search results that match the plurality of keywords; extracts the text similarity, update time dimension and user relevance degree related to each of the initial search results; obtains the corresponding text similarity weight, update time dimension weight, and user relevance degree weight according to the text similarity, update time dimension, and user relevance degree, and performs fusion calculation on each of the initial search results according to the text similarity weight, update time dimension weight, and user relevance degree weight to obtain a comprehensive weight of each of the initial search results; and sorts the plurality of initial search results according to the comprehensive weight. The sorted results are displayed on the terminal 102. The terminal 102 may be, but is not limited to, various personal computers, notebook computers, smart phones, tablet computers, and portable wearable devices. The server 104 may be implemented by an independent server or a server cluster composed of multiple servers.

In one embodiment, as shown in FIG. 2, a search ranking method is provided. Taking the application of the method to the server in FIG. 1 as an example, the method includes the following steps:

Step 210: Obtain a search keyword, and determine multiple initial search results that match the multiple keywords.

The search keywords are the input, such as characters, words, and symbols, entered by the user when using a search engine to find related information. The initial search results include multiple fields. Specifically, the initial search results refer to contacts or group chats.

Specifically, a search keyword is input at the terminal, and the terminal obtains the search keyword input by the user and sends it to the server.

Step 220: Extract the text similarity, update time dimension and user association degree related to each of the initial search results.

The fields contained in each initial search result include one or more of: object type, object status, object name, initial recall search engine score, chat update time, last message location, object pinyin name, object English name, and department information. Among them, the object types include chat applications and emails, and the object status indicates whether the object is registered and whether it has left.

As a preferred embodiment, before extracting the text similarity, the update time dimension, and the degree of user relevance related to each of the initial search results, the method includes filtering the initial search results. Filtering the initial search results includes: initial search results for users who have left and have no chat history are not sorted; and initial search results for unregistered users are ranked last. Whether a chat history exists can be determined from the chat update time or from the last message location.
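The filtering rule above can be sketched as follows. This is a minimal illustration, not the implementation of the embodiment: the `Result` record and its field names are hypothetical, "not sorted" is assumed to mean that the result is left out of the ranking, and a missing `chat_update_time` is assumed to mean that there is no chat history.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Result:
    """Hypothetical record for one initial search result."""
    name: str
    registered: bool                 # object status: registered or not
    has_left: bool                   # object status: has left or not
    chat_update_time: Optional[int]  # epoch seconds of last chat, None if no history


def filter_results(results: List[Result]) -> Tuple[List[Result], List[Result]]:
    """Split results into (to_rank, ranked_last) according to the filtering rule."""
    to_rank: List[Result] = []
    ranked_last: List[Result] = []
    for r in results:
        if r.has_left and r.chat_update_time is None:
            continue  # departed user with no chat history: excluded from sorting
        if not r.registered:
            ranked_last.append(r)  # unregistered users go to the end of the list
        else:
            to_rank.append(r)
    return to_rank, ranked_last


results = [
    Result("Zhang San", True, False, 1_532_000_000),
    Result("Zhang Si", True, True, None),
    Result("Li Si", False, False, None),
]
to_rank, ranked_last = filter_results(results)
```

Only the results in `to_rank` would then go through weight calculation; those in `ranked_last` are appended after the sorted list.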

Step 230: Obtain corresponding text similarity weights, update time dimension weights, and user correlation degree weights according to the text similarity, update time dimension, and user association degree, and perform fusion calculation on each of the initial search results according to the text similarity weights, update time dimension weights, and user correlation degree weights to obtain a comprehensive weight of each of the initial search results.

Among them, the text similarity weight represents the degree of matching between the search keywords and the initial search result, the update time dimension weight represents how recently the chat record of the initial search result was updated, and the user relevance degree weight indicates the degree to which the initial search result is a target of interest to the user.

Step 240: Sort the multiple initial search results according to the comprehensive weight.

When sorting, the results can be ordered by weight from largest to smallest, or from smallest to largest. With this technical solution, the results are not sorted separately within each column; they are sorted together according to their weights, so that relevant information can be found quickly.
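As a brief illustration of step 240, the sketch below sorts results of different object types in a single list by their comprehensive weight. The function name, the result labels, and the weight values are assumptions chosen for the example only.

```python
from typing import List, Tuple


def sort_by_weight(weighted: List[Tuple[str, float]], descending: bool = True) -> List[str]:
    """Order result names by comprehensive weight, largest first by default."""
    ordered = sorted(weighted, key=lambda item: item[1], reverse=descending)
    return [name for name, _ in ordered]


# Contacts and group chats are ranked together rather than column by column.
weighted_results = [
    ("contact: Zhang San", 0.36),
    ("group: Project A", 0.81),
    ("contact: Li Si", 0.12),
]
ranked = sort_by_weight(weighted_results)
```

Passing `descending=False` gives the small-to-large order that the embodiment also allows.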

In this embodiment, the degree of user association is determined by common feature data of a user currently performing a search and the initial search result.

In the above search ranking method, the update time dimension parameters are extracted to ensure that the ranking is performed according to time, the initial search results that have common characteristics with the user are ranked higher by the degree of user association, and the search results are ranked by multiple dimensions. The sorting is intelligent, so that users can quickly find relevant information, simplifying operations and improving search efficiency.

In one embodiment, as shown in FIG. 3, the obtaining the text similarity weight includes:

S321. Calculate a hit rate, an order consistency index, a position closeness, and a coverage rate of the keywords in the initial search result.

S322. Calculate a text similarity weight according to the hit ratio, the order consistency index, the position closeness, and the coverage ratio.

In one embodiment, the step of calculating a text similarity weight according to the hit rate, the order consistency index, the position closeness, and the coverage includes: obtaining an offset value and a correction value for each of the hit rate, the order consistency index, the position closeness, and the coverage; and performing fusion calculation according to the hit rate, order consistency index, position closeness, and coverage together with the offset values and correction values to obtain the text similarity weight. The offset values and correction values can be determined through machine learning. Obtaining the offset values and correction values includes: obtaining an offset value and a correction value according to the hit rate, obtaining an offset value and a correction value according to the order consistency index, obtaining an offset value and a correction value according to the position closeness, and obtaining an offset value and a correction value according to the coverage.

In one embodiment, the specific formula for calculating the text similarity weight is:

text_similar = (a * hit + b) * (c * sequence + d) * (e * position + f) * (g * cover + h)

where text_similar is the text similarity weight, hit is the text hit rate, sequence is the order consistency index, position is the position closeness, and cover is the coverage. Among them, a and b are the offset value and correction value of the hit rate, c and d are the offset value and correction value of the order consistency index, e and f are the offset value and correction value of the position closeness, and g and h are the offset value and correction value of the coverage. The larger the offset value, the more important the item.

The text hit rate is the ratio of the number of search keywords that hit in the corresponding text document to the total number of search keywords. The higher the ratio, the closer the initial search result is to the search target.

The order consistency index indicates how consistent the order of the search keywords is with the order in which they appear in the corresponding text document. It is based on the proportion of inversions (reverse-order pairs): for example, (1, 2, 3) has 0 inversions, which is the most ordered arrangement, and (3, 2, 1) has 3 inversions, which is the most disordered arrangement.

The position closeness is the ratio of the number of hits to the sum of the number of hits and the number of hit intervals. For example, suppose the keywords are "Zhang San Zhang Si Li Si" and the hit initial search result is the group "Zhang San, Li Si". The hit keywords are "Zhang San" and "Li Si", so the number of hits is 2 and the number of hit intervals is 1 (because Zhang Si lies between them), giving position closeness = 2 / (1 + 2) = 2/3.

Coverage represents the ratio of the hit keywords to the total fields of all hit text documents.
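The quantities in this formula can be sketched in code as follows. This is an illustrative reading under stated assumptions, not the embodiment's implementation: a keyword is assumed to "hit" when it occurs as a substring of the result text, a hit interval is assumed to be a missed keyword lying between the first and last hit keywords of the query, the coverage is passed in as a precomputed number because the source does not define it precisely, and the default offset and correction values are placeholders (the source says they can be learned).

```python
from typing import List


def hit_rate(keywords: List[str], text: str) -> float:
    """Fraction of the search keywords that occur in the result text."""
    if not keywords:
        return 0.0
    return sum(1 for k in keywords if k in text) / len(keywords)


def inversion_count(order: List[int]) -> int:
    """Number of pairs that appear in reverse order, e.g. 3 for (3, 2, 1)."""
    n = len(order)
    return sum(1 for i in range(n) for j in range(i + 1, n) if order[i] > order[j])


def position_closeness(keywords: List[str], text: str) -> float:
    """hits / (hits + intervals), with intervals counted between hit keywords."""
    hit_idx = [i for i, k in enumerate(keywords) if k in text]
    if not hit_idx:
        return 0.0
    hits = len(hit_idx)
    intervals = (hit_idx[-1] - hit_idx[0] + 1) - hits  # missed keywords in between
    return hits / (hits + intervals)


def text_similar(hit: float, sequence: float, position: float, cover: float,
                 a: float = 1.0, b: float = 0.0, c: float = 1.0, d: float = 0.0,
                 e: float = 1.0, f: float = 0.0, g: float = 1.0, h: float = 0.0) -> float:
    """Fusion formula from the text; a..h default to placeholder values."""
    return (a * hit + b) * (c * sequence + d) * (e * position + f) * (g * cover + h)


keywords = ["Zhang San", "Zhang Si", "Li Si"]
group_name = "Zhang San, Li Si group"
closeness = position_closeness(keywords, group_name)  # 2 hits, 1 interval
rate = hit_rate(keywords, group_name)                  # 2 of 3 keywords hit
```

With these inputs the position closeness reproduces the 2/3 of the worked example in the text.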

In one embodiment, as shown in FIG. 4, the obtaining the update time dimension weights includes:

S421. Obtain a time interval between the last chat time and the current time according to the initial search result.

S422. Calculate the ratio of the attenuation constant to the sum of the time interval and the attenuation constant to obtain the chat update time weight.

In one embodiment, the calculation formula of the update time dimension weight is as follows:

update_time_weight = factor / (factor + update_time_secs);

Among them, update_time_weight is the weight of the update time dimension, and factor is the time-decay constant, in seconds. Here, it is calculated based on a 30-day decay, so factor = 30 * 24 * 3600 = 2592000. update_time_secs is the number of seconds from the last chat time to the present. For example, if the last chat time was 30 days ago, then update_time_secs = 30 * 24 * 3600 = 2592000, and the update time dimension weight is update_time_weight = 2592000 / (2592000 + 2592000) = 1/2.
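The update time formula can be written directly as a small function. The sketch below uses the 30-day decay constant given in the text; the function and constant names are chosen for illustration.

```python
DECAY_SECONDS = 30 * 24 * 3600  # 30-day decay constant from the text: 2592000 seconds


def update_time_weight(update_time_secs: float, factor: float = DECAY_SECONDS) -> float:
    """factor / (factor + update_time_secs): 1.0 for a chat just now, tending to 0 with age."""
    return factor / (factor + update_time_secs)


just_now = update_time_weight(0)
thirty_days_ago = update_time_weight(30 * 24 * 3600)
ninety_days_ago = update_time_weight(90 * 24 * 3600)
```

The weight falls to one half when the elapsed time equals the decay constant, and to one quarter at three times the constant, so older chats are down-weighted smoothly rather than cut off.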

In one embodiment, as shown in FIG. 5, the acquiring user association degree weight includes:

S521. Calculate the number of common contacts, the common department characteristic value, the common office location characteristic value, and the number of common personal tags between the initial search result and the user currently performing the search;

S522. Calculate a user association degree weight according to the number of common contacts, a characteristic value of a common department, a characteristic value of a common office location, and a number of common personal tags.

In one embodiment, the step of calculating the weight of the degree of user association based on the number of common contacts, the common department characteristic value, the common office location characteristic value, and the number of common personal tags includes: obtaining an offset value and a correction value for each of the number of common contacts, the common department characteristic value, the common office location characteristic value, and the number of common personal tags; and performing fusion calculation according to the number of common contacts, the common department characteristic value, the common office location characteristic value, and the number of common personal tags together with the offset values and correction values to obtain the user relevance degree weight. The offset values and correction values can be determined through machine learning. Obtaining the offset values and correction values includes: obtaining an offset value and a correction value according to the number of common contacts, obtaining an offset value and a correction value according to the common department characteristic value, obtaining an offset value and a correction value according to the common office location characteristic value, and obtaining an offset value and a correction value according to the number of common personal tags.

Among them, the degree of user association is used to describe the common characteristics of users and contacts. Common characteristics include: people contacted, departments, offices, and personal tags. The user is the user who performs the search, and the contact is the contact corresponding to the initial search result. For example, if User A and Contact B have many contacts in common, User A and Contact B are highly related; if User A and Contact B have not yet been in contact but share many common characteristics, Contact B is an object that User A is likely to search for. By calculating the degree of user association, the user's personalized search needs can be met, and contacts that share characteristics with the user are ranked higher.

In one of the embodiments, the degree of user association is mined from offline data and calculated from a plurality of common characteristics. The specific calculation formula for the user association degree weight is as follows:

user_relevant_weight = (i * same_user_num + j) * (k * same_department + l) * (m * same_place + n) * (o * same_tag + p);

Among them, user_relevant_weight is the user association degree weight. same_user_num is the number of common contacts, that is, the number of contacts shared by the searching user and the contact corresponding to the initial search result; its value is an integer greater than 0. same_department is the common department characteristic value: it is 1 when the two are in the same department and 0 when they are not. same_place is the common office location characteristic value: it is 1 when the two are in the same office and 0 when they are not. same_tag is the number of common personal tags, that is, the number of tags the two users share; for example, if both have the "travel" and "reading" tags, the value of same_tag is 2. Among them, i and j are the offset value and correction value of the number of common contacts, k and l are the offset value and correction value of the common department characteristic value, m and n are the offset value and correction value of the common office location characteristic value, and o and p are the offset value and correction value of the number of common personal tags. A larger offset value indicates that the item is more important.
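A minimal sketch of this formula follows. The default offset and correction values of 1.0 are placeholders chosen for the example (the source says they can be learned), and counting common tags with a set intersection is an assumption about how same_tag might be derived.

```python
from typing import Set


def common_tag_count(user_tags: Set[str], contact_tags: Set[str]) -> int:
    """Number of personal tags shared by the searching user and the contact."""
    return len(user_tags & contact_tags)


def user_relevant_weight(same_user_num: int, same_department: int,
                         same_place: int, same_tag: int,
                         i: float = 1.0, j: float = 1.0,
                         k: float = 1.0, l: float = 1.0,
                         m: float = 1.0, n: float = 1.0,
                         o: float = 1.0, p: float = 1.0) -> float:
    """Fusion formula from the text; i..p default to placeholder values."""
    return ((i * same_user_num + j) * (k * same_department + l)
            * (m * same_place + n) * (o * same_tag + p))


same_tag = common_tag_count({"travel", "reading", "music"}, {"travel", "reading"})
# 3 common contacts, same department, different office, 2 common tags
weight = user_relevant_weight(same_user_num=3, same_department=1,
                              same_place=0, same_tag=same_tag)
```

Because the terms are multiplied, a correction value of zero would make the whole product zero whenever the corresponding feature is zero (for example, a different office), which is one reason a nonzero correction value is useful in a formula of this shape.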

In one embodiment, performing the fusion calculation according to the text similarity weight, the update time dimension weight, and the user relevance degree weight to obtain the comprehensive weight of each of the initial search results includes: normalizing the text similarity weight, the update time dimension weight, and the user relevance degree weight to decimals between 0 and 1; and performing a fusion calculation according to the normalized text similarity weight, update time dimension weight, and user relevance degree weight to obtain a comprehensive weight of each initial search result.
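The text does not say how the weights are normalized. One common choice is min-max scaling across the candidate results, sketched below purely as an assumption; the function name and the handling of the all-equal case are illustrative.

```python
from typing import List


def min_max_normalize(values: List[float]) -> List[float]:
    """Scale values into [0, 1] across all candidate results (assumed method)."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        # All candidates have the same raw weight; treating them as equally
        # strong (1.0) is an arbitrary choice made for this sketch.
        return [1.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


raw_user_weights = [24.0, 4.0, 14.0]
normalized = min_max_normalize(raw_user_weights)
```

Normalizing each of the three weights separately puts them on a comparable scale before they are multiplied in the fusion step.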

In one embodiment, obtaining the corresponding text similarity weight, update time dimension weight, and user correlation degree weight according to the text similarity, update time dimension, and user association degree, and performing fusion calculation on each of the initial search results according to these weights to obtain a comprehensive weight of each of the initial search results, includes: calculating the text similarity weight, update time dimension weight, and user relevance degree weight according to the text similarity, the update time dimension, and the degree of user association; obtaining an offset value and a correction value for each of the text similarity weight, update time dimension weight, and user relevance degree weight; for each of the three weights, multiplying the weight by its corresponding offset value and adding the corresponding correction value to obtain a fusion coefficient; and multiplying the fusion coefficients together to obtain the comprehensive weight of each of the initial search results. The offset values and correction values can be determined through machine learning. Obtaining the offset values and correction values includes: obtaining an offset value and a correction value according to the text similarity weight, obtaining an offset value and a correction value according to the update time dimension weight, and obtaining an offset value and a correction value according to the user correlation degree weight.

In a specific embodiment, the formula for calculating the comprehensive weight is as follows:

weight = (a1 * text_weight + b1) * (a2 * update_time_weight + b2) * (a3 * user_relevant_weight + b3)

Among them, weight represents the comprehensive weight of the initial search result, text_weight represents the text similarity weight, update_time_weight represents the update time dimension weight (the chat update time weight), and user_relevant_weight represents the user relevance degree weight. In the formula, a1, a2, and a3 are offset values, and b1, b2, and b3 are correction values. a1 * text_weight + b1 is calculated to obtain the first fusion coefficient, a2 * update_time_weight + b2 is calculated to obtain the second fusion coefficient, and a3 * user_relevant_weight + b3 is calculated to obtain the third fusion coefficient; the fusion coefficients are multiplied together to obtain the comprehensive weight of the initial search result.
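The comprehensive weight formula can be sketched as follows. The default offset and correction values are placeholders, and the candidate names and their three normalized weights are made-up inputs used only to show how the fusion coefficients combine.

```python
from typing import Dict, Tuple


def comprehensive_weight(text_weight: float, update_time_weight: float,
                         user_relevant_weight: float,
                         a1: float = 1.0, b1: float = 0.0,
                         a2: float = 1.0, b2: float = 0.0,
                         a3: float = 1.0, b3: float = 0.0) -> float:
    """Product of the three fusion coefficients (a * weight + b)."""
    first = a1 * text_weight + b1
    second = a2 * update_time_weight + b2
    third = a3 * user_relevant_weight + b3
    return first * second * third


# Hypothetical candidates: (text_weight, update_time_weight, user_relevant_weight)
candidates: Dict[str, Tuple[float, float, float]] = {
    "Zhang San": (0.9, 0.5, 0.8),
    "Li Si": (0.6, 1.0, 0.2),
}
weights = {name: comprehensive_weight(*w) for name, w in candidates.items()}
ranking = sorted(weights, key=weights.get, reverse=True)
```

In this example the more recent chat with "Li Si" does not outweigh the stronger text match and closer association of "Zhang San", which illustrates how the three dimensions trade off in a multiplicative fusion.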

It should be understood that although the steps in the flowcharts of FIGS. 2-5 are displayed sequentially according to the directions of the arrows, these steps are not necessarily performed in the order indicated by the arrows. Unless explicitly stated in this document, the execution of these steps is not strictly limited in order, and these steps can be performed in other orders. Moreover, at least a part of the steps in FIGS. 2-5 may include multiple sub-steps or stages. These sub-steps or stages are not necessarily performed at the same time, but may be performed at different times. The execution order of these sub-steps or stages is not necessarily sequential; they may be performed in turn or alternately with at least a part of another step or of the sub-steps or stages of another step.

In one embodiment, as shown in FIG. 6, a search ranking device is provided, which includes: an initial search result extraction module 601, a feature factor extraction module 602, a weight calculation module 603, and a ranking module 604, where:

An initial search result extraction module 601 is configured to obtain a search keyword and determine a plurality of initial search results that match the plurality of keywords.

A feature factor extraction module 602 is configured to extract a text similarity, an update time dimension, and a user association degree related to each of the initial search results.

The initial search result is a text document matching the search keywords; the text similarity, update time dimension, and user relevance are obtained from the initial search results, and some information related to the keywords is extracted according to the text document.

As a preferred embodiment, the search and ranking device further includes a filtering module, configured to filter the initial search results. Filtering the initial search results includes: initial search results for users who have left and have no chat history are not sorted; and initial search results for unregistered users are ranked last. Whether a chat history exists can be determined from the chat update time or from the last message location.

A weight calculation module 603, configured to obtain corresponding text similarity weights, update time dimension weights, and user correlation degree weights according to the text similarity, update time dimension, and user association degree, and to perform fusion calculation on each of the initial search results according to the text similarity weights, the update time dimension weights, and the user correlation degree weights to obtain a comprehensive weight of each of the initial search results.

A sorting module 604 is configured to sort the multiple initial search results according to the comprehensive weight.

Among them, the initial search results are contacts or groups. The fields contained in each initial search result include one or more of: object type, object status, object name, initial recall search engine score, chat update time, last message location, object pinyin name, object English name, and department information. Among them, the object types include chat applications and emails, and the object status indicates whether the object is registered and whether it has left.

In one embodiment, as shown in FIG. 7, the feature factor extraction module 602 includes a text similarity weight calculation unit 701, an update time dimension weight calculation unit 702, and a user relevance degree weight calculation unit 703, where:

The text similarity weight calculation unit 701 is configured to calculate a hit rate, an order consistency index, a position closeness, and a coverage rate of the keywords in the initial search result, and to calculate the text similarity weight according to the hit rate, the order consistency index, the position closeness, and the coverage rate.

In one embodiment, the text similarity weight calculation unit includes: a first offset value and correction value acquisition subunit, configured to obtain an offset value and a correction value for each of the hit rate, the order consistency index, the position closeness, and the coverage; and a text similarity fusion calculation subunit, configured to perform fusion calculation according to the hit rate, order consistency index, position closeness, and coverage together with the offset values and correction values to obtain the text similarity weight. The offset values and correction values can be determined through machine learning. Obtaining the offset values and correction values includes: obtaining an offset value and a correction value according to the hit rate, obtaining an offset value and a correction value according to the order consistency index, obtaining an offset value and a correction value according to the position closeness, and obtaining an offset value and a correction value according to the coverage.

An update time dimension weight calculation unit 702 is configured to obtain a time interval between the last chat time and the current time according to the initial search result, and to calculate the ratio of the attenuation constant to the sum of the time interval and the attenuation constant to obtain the chat update time weight.

In one embodiment, the formula for calculating the update time dimension weight is as follows:

update_time_weight = factor / (factor + update_time_secs);

The user association degree weight calculation unit 703 is configured to calculate the number of common contacts, the common department characteristic value, the common office location characteristic value, and the number of common personal tags between the initial search result and the user currently performing the search, and to calculate the user relevance degree weight according to the number of common contacts, the common department characteristic value, the common office location characteristic value, and the number of common personal tags.

The user association degree weight calculation unit 703 includes: a second offset value and correction value acquisition subunit, configured to obtain an offset value and a correction value for each of the number of common contacts, the common department characteristic value, the common office location characteristic value, and the number of common personal tags; and a user correlation degree fusion calculation subunit, configured to perform fusion calculation according to the number of common contacts, the common department characteristic value, the common office location characteristic value, and the number of common personal tags together with the offset values and correction values to obtain the user relevance degree weight. The offset values and correction values can be determined through machine learning. Obtaining the offset values and correction values includes: obtaining an offset value and a correction value according to the number of common contacts, obtaining an offset value and a correction value according to the common department characteristic value, obtaining an offset value and a correction value according to the common office location characteristic value, and obtaining an offset value and a correction value according to the number of common personal tags.

The user association degree describes the characteristics that the user and a contact have in common. Common characteristics include: people both have been in contact with, departments, office locations, and personal tags. The user is the person performing the search, and the contact is the contact corresponding to an initial search result. For example, if User A and Contact B have many contacts in common, User A and Contact B are highly associated. If User A and Contact B have not yet been in contact but share many common characteristics, Contact B is likely to be the object User A is searching for. Calculating the user association degree supports personalized search: contacts that share characteristics with the user are ranked higher.

Here, user_relevant_weight is the user association degree weight. same_user_num is the number of common contacts, that is, the number of contacts shared between the searching user and the contact corresponding to the initial search result; its value is an integer greater than 0. The common department characteristic value is 1 when the user and the contact are in the same department and 0 otherwise. same_place is the common office location characteristic value; it is 1 when the user and the contact are in the same office location and 0 otherwise. same_tag is the number of common personal tags, that is, the number of tags the user and the contact share; for example, if both have the "travel" and "reading" tags, same_tag is 2. i and j are the offset value and correction value for the number of common contacts, k and l are the offset value and correction value for the common department characteristic value, m and n are the offset value and correction value for the common office location characteristic value, and o and p are the offset value and correction value for the number of common personal tags. A larger offset value indicates that the corresponding item is more important.
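The variable definitions above can be illustrated with a minimal sketch. The text names an offset value and a correction value per feature but does not reproduce the fusion formula here, so the sketch assumes each feature is transformed as offset × value + correction and that the four terms are multiplied together, mirroring the fusion coefficient described elsewhere in this document. The `Affine` class, the `same_department` name, and all numeric parameters are hypothetical placeholders, not learned values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Affine:
    """Offset value (multiplier) and correction value (additive term) for one feature."""
    offset: float
    correction: float

    def apply(self, value: float) -> float:
        return self.offset * value + self.correction


# Placeholder parameters (i, j), (k, l), (m, n), (o, p); in the described
# method these would be determined through machine learning.
EXAMPLE_PARAMS = {
    "same_user_num": Affine(0.5, 1.0),    # i, j
    "same_department": Affine(1.0, 1.0),  # k, l
    "same_place": Affine(1.0, 1.0),       # m, n
    "same_tag": Affine(0.5, 1.0),         # o, p
}


def user_relevant_weight(same_user_num: int, same_department: int,
                         same_place: int, same_tag: int,
                         params: dict = EXAMPLE_PARAMS) -> float:
    """Assumed form: product of (offset * feature + correction) over the four features."""
    features = {
        "same_user_num": same_user_num,
        "same_department": same_department,
        "same_place": same_place,
        "same_tag": same_tag,
    }
    weight = 1.0
    for name, value in features.items():
        weight *= params[name].apply(value)
    return weight
```

With positive correction values, a feature equal to 0 (for example, a different office location) lowers the weight without forcing it to zero, which is one plausible reason for pairing each offset with a correction term.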

In one embodiment, the weight calculation module 603 includes:

A normalization unit 801, configured to normalize the text similarity weight, the update time dimension weight, and the user association degree weight to decimals between 0 and 1;

A fusion calculation unit 802 is configured to perform fusion calculation according to the normalized text similarity weight, update time dimension weight, and user association degree weight to obtain a comprehensive weight of each of the initial search results.
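As a rough illustration of the normalization unit, the sketch below rescales each kind of weight to the range 0 to 1 across the set of initial search results. The source does not state which normalization is used; min-max scaling is an assumption made here, and the function name `min_max_normalize` is hypothetical.

```python
from typing import List


def min_max_normalize(values: List[float]) -> List[float]:
    """Rescale values to [0, 1] with min-max scaling (an assumed choice)."""
    if not values:
        return []
    lo, hi = min(values), max(values)
    if hi == lo:
        # All results tie on this dimension; treat them as equally strong.
        return [1.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


# Each dimension is normalized independently before fusion.
text_weights = min_max_normalize([2.0, 4.0, 6.0])
time_weights = min_max_normalize([0.1, 0.9, 0.5])
user_weights = min_max_normalize([12.0, 3.0, 3.0])
```

Normalizing per dimension keeps one weight with a large numeric range from dominating the fused score.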

In one embodiment, the weight calculation module includes: a weight acquisition unit, configured to calculate a text similarity weight, an update time dimension weight, and a user association degree weight according to the text similarity, the update time dimension, and the user association degree; an offset value and correction value acquisition unit, configured to obtain an offset value and a correction value for each of the text similarity weight, the update time dimension weight, and the user association degree weight; a fusion coefficient calculation unit, configured to obtain a fusion coefficient for each of the text similarity weight, the update time dimension weight, and the user association degree weight by multiplying the weight by its corresponding offset value and adding the corresponding correction value; and a comprehensive weight calculation unit, configured to multiply the fusion coefficients to obtain a comprehensive weight of each of the initial search results.
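The fusion described in this embodiment can be sketched as follows: each weight is multiplied by its offset value and added to its correction value to give a fusion coefficient, and the coefficients are multiplied to give the comprehensive weight. The function names and the parameter values used in the usage note are illustrative assumptions.

```python
from typing import Sequence, Tuple


def fusion_coefficient(weight: float, offset: float, correction: float) -> float:
    """Fusion coefficient for one dimension: offset * weight + correction."""
    return offset * weight + correction


def comprehensive_weight(weights: Sequence[float],
                         params: Sequence[Tuple[float, float]]) -> float:
    """Multiply the per-dimension fusion coefficients.

    weights: (text similarity, update time dimension, user association degree)
    params:  one (offset, correction) pair per weight, in the same order
    """
    if len(weights) != len(params):
        raise ValueError("each weight needs one (offset, correction) pair")
    result = 1.0
    for w, (offset, correction) in zip(weights, params):
        result *= fusion_coefficient(w, offset, correction)
    return result
```

For example, `comprehensive_weight((0.5, 0.25, 1.0), [(2.0, 0.0), (4.0, 1.0), (1.0, 1.0)])` multiplies the coefficients 1.0, 2.0, and 2.0.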

The above search sorting device extracts the update time dimension parameter so that results are ordered by recency, ranks initial search results that share common characteristics with the user higher according to the user association degree, and sorts the search results across multiple dimensions. This intelligent sorting lets users quickly find relevant information, which simplifies operation and improves search efficiency.

For the specific limitations of the search sorting device, refer to the foregoing limitations of the search sorting method, which are not repeated here. Each module in the search sorting device can be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, the processor of the electronic device in hardware form, or may be stored in the memory of the electronic device in software form, so that the processor can call and execute the operations corresponding to the above modules.

In one embodiment, an electronic device is provided. The electronic device may be a server, and the internal structure diagram may be as shown in FIG. 9. The electronic device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the electronic device is used to provide computing and control capabilities. The memory of the electronic device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running an operating system and computer programs in a non-volatile storage medium. The database of the electronic device is used to store search-sorted data. The network interface of the electronic device is used to communicate with an external terminal through a network connection. The computer program is executed by a processor to implement a search ranking method.

Those skilled in the art can understand that the structure shown in FIG. 9 is only a block diagram of the part of the structure related to the solution of the present application, and does not constitute a limitation on the electronic device to which the solution of the present application is applied. A specific electronic device may include more or fewer parts than shown in the figure, combine certain parts, or have a different arrangement of parts.

In one embodiment, as shown in FIG. 10, ElasticSearch (hereinafter referred to as ES) is an open source distributed search engine. ES is used for data storage, and it can quickly recall the matching initial search results by building an inverted index. The Searcher passes the search request issued by the application layer to ES and obtains the initial search results corresponding to the search request. The Ranker combines the initial search results with the text similarity, the update time dimension, and the user association degree to calculate comprehensive weights and sort the results, and returns the sorted results to the Searcher. The initial search results recalled by ES include initial search engine scores, but these scores alone cannot meet the needs of multi-dimensional sorting; the search sorting method of the embodiments of the present application can be used to sort the initial search results. The Searcher and the Ranker can be implemented on a server.
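The division of labor between the Searcher and the Ranker can be sketched with an in-memory stand-in for the ES recall step. Everything below is a simplified assumption: the `Hit` fields, the `recall` callable, and the 24-hour attenuation constant are hypothetical, no real ElasticSearch client is used, and the score is a plain product of the three weights (offset values of 1 and correction values of 0). The update time weight follows the ratio stated in the claims: attenuation constant divided by the sum of the time interval and the attenuation constant.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass(frozen=True)
class Hit:
    """One recalled result; field names are illustrative only."""
    name: str
    text_weight: float        # text similarity weight in [0, 1]
    hours_since_chat: float   # interval between last chat time and now
    user_weight: float        # user association degree weight in [0, 1]


def time_weight(interval: float, decay: float = 24.0) -> float:
    """Attenuation constant / (time interval + attenuation constant)."""
    return decay / (interval + decay)


class Ranker:
    """Computes a comprehensive weight per hit and sorts in descending order."""

    def score(self, hit: Hit) -> float:
        return hit.text_weight * time_weight(hit.hours_since_chat) * hit.user_weight

    def rank(self, hits: List[Hit]) -> List[Hit]:
        return sorted(hits, key=self.score, reverse=True)


class Searcher:
    """Forwards the query to the recall backend, then hands the hits to the Ranker."""

    def __init__(self, recall: Callable[[str], List[Hit]], ranker: Ranker):
        self.recall = recall  # stand-in for the ES query
        self.ranker = ranker

    def search(self, query: str) -> List[Hit]:
        return self.ranker.rank(self.recall(query))
```

Keeping recall and ranking behind separate objects means the ranking logic can be changed or tested without touching the search backend.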

In one embodiment, an electronic device is provided, which includes a memory and a processor. A computer program is stored in the memory, and the processor executes the computer program to implement the following steps: acquiring search keywords, and determining a plurality of initial search results matching the keywords; extracting a text similarity, an update time dimension, and a user association degree related to each of the initial search results; obtaining a corresponding text similarity weight, update time dimension weight, and user association degree weight according to the text similarity, the update time dimension, and the user association degree, and performing fusion calculation according to the text similarity weight, the update time dimension weight, and the user association degree weight, together with the text similarity parameter, the update time dimension parameter, and the user association degree parameter, to obtain a comprehensive weight of each of the initial search results; and sorting the plurality of initial search results according to the comprehensive weights.

In one embodiment, a computer-readable storage medium is provided on which a computer program is stored. When the computer program is executed by a processor, the following steps are implemented: acquiring search keywords, and determining a plurality of initial search results matching the keywords; extracting a text similarity, an update time dimension, and a user association degree related to each of the initial search results; obtaining a corresponding text similarity weight, update time dimension weight, and user association degree weight according to the text similarity, the update time dimension, and the user association degree, and performing fusion calculation according to the text similarity weight, the update time dimension weight, and the user association degree weight, together with the text similarity parameter, the update time dimension parameter, and the user association degree parameter, to obtain a comprehensive weight of each of the initial search results; and sorting the plurality of initial search results according to the comprehensive weights.

A person of ordinary skill in the art can understand that all or part of the processes in the methods of the foregoing embodiments can be implemented by a computer program instructing related hardware. The computer program can be stored in a non-volatile computer-readable storage medium, and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, database, or other media used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in various forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

The technical features of the above embodiments can be combined arbitrarily. To keep the description concise, not all possible combinations of the technical features in the above embodiments have been described. However, as long as there is no contradiction in a combination of these technical features, it should be considered to be within the scope described in this specification.

The above embodiments express only several implementations of the present application. Their descriptions are relatively specific and detailed, but they should not be understood as limiting the scope of the patent. It should be noted that those of ordinary skill in the art can make several modifications and improvements without departing from the concept of the present application, all of which fall within the protection scope of the present application. Therefore, the protection scope of this patent application shall be subject to the appended claims.

Claims

1. A search sorting method, characterized in that the method comprises:

acquiring search keywords, and determining a plurality of initial search results matching the keywords;

extracting a text similarity, an update time dimension, and a user association degree related to each of the initial search results;

obtaining a corresponding text similarity weight, update time dimension weight, and user association degree weight according to the text similarity, the update time dimension, and the user association degree, and performing fusion calculation on each of the initial search results according to the text similarity weight, the update time dimension weight, and the user association degree weight to obtain a comprehensive weight of each of the initial search results; and

sorting the plurality of initial search results according to the comprehensive weights.

2. The method according to claim 1, wherein obtaining the text similarity weight comprises:

calculating a hit rate, an order consistency index, a position closeness, and a coverage rate of the keywords in the initial search result; and

calculating the text similarity weight according to the hit rate, the order consistency index, the position closeness, and the coverage rate.

3. The method according to claim 2, wherein calculating the text similarity weight according to the hit rate, the order consistency index, the position closeness, and the coverage rate comprises:

obtaining an offset value and a correction value for each of the hit rate, the order consistency index, the position closeness, and the coverage rate; and

performing fusion calculation according to the hit rate, the order consistency index, the position closeness, the coverage rate, and the offset values and correction values to obtain the text similarity weight.

4. The method according to claim 1, wherein obtaining the update time dimension weight comprises:

obtaining, according to the initial search result, the time interval between the last chat time and the current time; and

calculating the ratio of an attenuation constant to the sum of the time interval and the attenuation constant to obtain the update time dimension weight.

5. The method according to claim 1, wherein obtaining the user association degree weight comprises:

calculating the number of common contacts, the common department characteristic value, the common office location characteristic value, and the number of common personal tags between the initial search result and the user currently searching; and

calculating the user association degree weight according to the number of common contacts, the common department characteristic value, the common office location characteristic value, and the number of common personal tags.

6. The method according to claim 5, wherein calculating the user association degree weight according to the number of common contacts, the common department characteristic value, the common office location characteristic value, and the number of common personal tags comprises:

obtaining an offset value and a correction value for each of the number of common contacts, the common department characteristic value, the common office location characteristic value, and the number of common personal tags; and

performing fusion calculation according to the number of common contacts, the common department characteristic value, the common office location characteristic value, the number of common personal tags, and the offset values and correction values to obtain the user association degree weight.

7. The method according to any one of claims 1 to 6, wherein performing fusion calculation according to the text similarity weight, the update time dimension weight, and the user association degree weight to obtain the comprehensive weight of each of the initial search results comprises:

normalizing the text similarity weight, the update time dimension weight, and the user association degree weight to decimals between 0 and 1; and

performing fusion calculation according to the normalized text similarity weight, update time dimension weight, and user association degree weight to obtain the comprehensive weight of each of the initial search results.

8. The method according to any one of claims 1 to 6, wherein obtaining the corresponding text similarity weight, update time dimension weight, and user association degree weight according to the text similarity, the update time dimension, and the user association degree, and performing fusion calculation on each of the initial search results according to the text similarity weight, the update time dimension weight, and the user association degree weight to obtain the comprehensive weight of each of the initial search results comprises:

calculating the text similarity weight, the update time dimension weight, and the user association degree weight according to the text similarity, the update time dimension, and the user association degree;

obtaining an offset value and a correction value for each of the text similarity weight, the update time dimension weight, and the user association degree weight;

for each of the text similarity weight, the update time dimension weight, and the user association degree weight, adding the product of the weight and its corresponding offset value to the corresponding correction value to obtain a fusion coefficient; and

multiplying the fusion coefficients to obtain the comprehensive weight of each of the initial search results.

9. The method according to any one of claims 1 to 8, wherein, before extracting the text similarity, the update time dimension, and the user association degree related to each of the initial search results, the method further comprises:

screening the initial search results, including:

excluding from sorting the initial search results of departed users with no chat history; and

ranking the initial search results of unregistered users last.

10. A search sorting device, characterized in that the device comprises:

an initial search result extraction module, configured to acquire search keywords and determine a plurality of initial search results matching the keywords;

a feature factor extraction module, configured to extract a text similarity, an update time dimension, and a user association degree related to each of the initial search results;

a weight calculation module, configured to obtain a corresponding text similarity weight, update time dimension weight, and user association degree weight according to the text similarity, the update time dimension, and the user association degree, and to perform fusion calculation on each of the initial search results according to the text similarity weight, the update time dimension weight, and the user association degree weight to obtain a comprehensive weight of each of the initial search results; and

a sorting module, configured to sort the plurality of initial search results according to the comprehensive weights.

11. An electronic device, comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method according to any one of claims 1 to 9 when executing the computer program.

12. A computer-readable storage medium having stored thereon a computer program, characterized in that when the computer program is executed by a processor, the steps of the method according to any one of claims 1 to 9 are implemented.