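WO2022262849A1 - Search result output method, apparatus, computer device, and readable storage medium - Google Patents

Search result output method, apparatus, computer device, and readable storage medium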

Info

Publication number
WO2022262849A1
WO2022262849A1 PCT/CN2022/099454 CN2022099454W WO2022262849A1 WO 2022262849 A1 WO2022262849 A1 WO 2022262849A1 CN 2022099454 W CN2022099454 W CN 2022099454W WO 2022262849 A1 WO2022262849 A1 WO 2022262849A1
Authority
WO
WIPO (PCT)
Prior art keywords
search
search results
queue
result queue
matching degree
Prior art date
Application number
PCT/CN2022/099454
Other languages
English (en)
French (fr)
Inventor
苑爱泉
Original Assignee
浙江口碑网络技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 浙江口碑网络技术有限公司
Publication of WO2022262849A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9537 Spatial or temporal dependent retrieval, e.g. spatiotemporal queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9538 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0623 Item investigation
    • G06Q30/0625 Directed, with specific intent or strategy
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q30/00 Commerce
    • G06Q30/06 Buying, selling or leasing transactions
    • G06Q30/0601 Electronic shopping [e-shopping]
    • G06Q30/0641 Shopping interfaces

Definitions

  • the present application relates to the technical field of the Internet, in particular to a search result output method, device, computer equipment and readable storage medium.
  • The search platform queries a series of results related to the search content entered by the user, estimates the CTR (click-through rate) of these results, evaluates the estimates according to a preset standard, sorts the results accordingly, and outputs the sorted results to the user for viewing.
  • Search results that are sorted only by preset criteria have limitations: the top-ranked search results may not meet the user's real search needs, so the output search results are inaccurate and of poor versatility.
  • the present application provides a search result output method, device, computer equipment and readable storage medium.
  • The main purpose is to solve the problem that current search results have limitations: the top-ranked search results may not meet the user's real search needs, so the output search results are inaccurate and of poor versatility.
  • a method for outputting search results comprising:
  • the initial result queue including a plurality of search results related to the search content determined based on the search content input by the user;
  • said acquiring the initial result queue includes:
  • dividing the multiple search results into multiple information groups includes:
  • the semantic matching degree indicates the degree of semantic relevance between the search result and the search content
  • the scene matching degree indicates the degree of relevance between the search result and the search scene
  • the determining the semantic matching degree and scene matching degree of the search result includes:
  • the determining a plurality of semantic matching degree value intervals and a plurality of scene matching degree value intervals includes:
  • Collecting a first preset number of sample parameters; performing semantic training on the first preset number of sample parameters based on a machine learning deep model to obtain a plurality of semantic matching degree values; constructing the plurality of semantic matching degree value intervals according to the magnitude relationship among the plurality of semantic matching degree values; performing scene training on the first preset number of sample parameters based on the machine learning deep model to obtain a plurality of scene matching degree values; and constructing the plurality of scene matching degree value intervals according to the magnitude relationship among the plurality of scene matching degree values.
  • the sorting of the search results included in each of the multiple information groups according to multiple preset sorting factors to obtain an intermediate result queue includes:
  • the plurality of sorted information groups are combined in descending order of grouping levels corresponding to the plurality of information groups to obtain the intermediate result queue.
  • the calculation of the relevance score of each search result included in the information group based on the plurality of preset ranking factors includes:
  • the acquiring multiple factor weights corresponding to the multiple preset ranking factors includes:
  • the search intent, the search industry, and each of the preset ranking factors are respectively trained to obtain the multiple factor weights.
  • adjusting the order of the search results included in the intermediate result queue based on the correlation between the initial result queue and the search results of the same rank in the intermediate result queue to obtain the target result queue includes:
  • Inputting the adjusted intermediate result queue to the evaluator for evaluation, obtaining the queue score output by the evaluator, and marking the adjusted intermediate result queue with the queue score, wherein the evaluator is trained on a plurality of sample queues and indicates a score for each sample queue in the plurality of sample queues;
  • Determining a second preset number, intercepting from the head of the target intermediate result queue a number of search results equal to the second preset number, and using the intercepted search results as the target result queue.
  • the method also includes:
  • a search result output device includes:
  • An acquisition module configured to acquire an initial result queue, where the initial result queue includes a plurality of search results related to the search content determined based on the search content input by the user;
  • A dividing module configured to divide the multiple search results into multiple information groups according to the matching degree between each of the multiple search results and the search content and the search scene in which the user performs the search;
  • a sorting module configured to sort the search results included in each of the multiple information groups according to a plurality of preset sorting factors to obtain an intermediate result queue
  • An adjustment module configured to adjust the order of the search results included in the intermediate result queue based on the correlation between the initial result queue and the search results of the same rank in the intermediate result queue to obtain a target result queue
  • An output module configured to output the target result queue.
  • The obtaining module is configured to receive the search content input by the user; parse the search content and query the multiple search results related to the search content; estimate the click-through rate (CTR) of the multiple search results and output multiple estimated click-through rates; and sort the multiple search results according to the multiple estimated click-through rates to obtain the initial result queue.
  • The division module is configured to determine a plurality of semantic matching degree value intervals and a plurality of scene matching degree value intervals; for each search result in the plurality of search results, determine the semantic matching degree and the scene matching degree of the search result, where the semantic matching degree indicates the degree of semantic correlation between the search result and the search content, and the scene matching degree indicates the degree of relevance between the search result and the search scene; compare the semantic matching degree and the scene matching degree of the search result with the plurality of semantic matching degree value intervals and the plurality of scene matching degree value intervals; and divide search results whose semantic matching degree and scene matching degree fall in the same value intervals into the same group to obtain the plurality of information groups.
  • The dividing module is configured to, for each of the plurality of search results, query the semantic matching degree between the search result and the search content; determine the geographic location at which the user inputs the search content and the store location of the store related to the search result; and calculate the location distance between the geographic location and the store location to obtain the scene matching degree indicated by the location distance.
  • The division module is configured to query a preset division standard and extract the multiple semantic matching degree value intervals and the multiple scene matching degree value intervals from the preset division standard; or, collect a first preset number of sample parameters, perform semantic training on the first preset number of sample parameters based on a machine learning deep model to obtain a plurality of semantic matching degree values, construct the multiple semantic matching degree value intervals according to the magnitude relationship among the plurality of semantic matching degree values, perform scene training on the first preset number of sample parameters based on the machine learning deep model to obtain a plurality of scene matching degree values, and construct the multiple scene matching degree value intervals according to the magnitude relationship among the plurality of scene matching degree values.
  • The ranking module is configured to, for each information group in the plurality of information groups, calculate the relevance score of each search result included in the information group based on the plurality of preset ranking factors; sort all the search results included in the information group in descending order of relevance score to obtain the sorted information group; and combine the sorted information groups to obtain the intermediate result queue.
  • The sorting module is configured to, for each of the search results included in the information group, query the multiple factor scores of the search result on the multiple preset sorting factors; obtain a plurality of factor weights corresponding to the plurality of preset ranking factors; and perform a weighted calculation on the plurality of factor scores based on the plurality of factor weights to obtain the relevance score of the search result.
  • The sorting module is configured to, for each of the plurality of preset sorting factors, query the search intent and search industry corresponding to the search content input by the user; and, based on a linear function, train on the search intent, the search industry, and each of the preset ranking factors to obtain the multiple factor weights.
  • The adjustment module is configured to, for every two search results at the same position in the initial result queue and the intermediate result queue, compare the two search results and determine the correlation between them; and, according to the correlation between the two search results, determine the order of the two search results and adjust the intermediate result queue to obtain the adjusted intermediate result queue;
  • The adjusted intermediate result queue is input to the evaluator for evaluation, the queue score output by the evaluator is obtained, and the adjusted intermediate result queue is marked with the queue score, where the evaluator is trained on multiple sample queues and indicates the score of each sample queue in the plurality of sample queues; the above comparison process is repeated, comparing the adjusted intermediate result queue with the initial result queue and readjusting the adjusted intermediate result queue until the number of adjustments reaches the preset number of rounds, so that a number of queue scores equal to the preset number of rounds is obtained; and the target queue score with the highest score and the target intermediate result queue are extracted from those queue scores
  • the device also includes:
  • a query module configured to query a preset regulation strategy, and determine regulation requirements of the preset regulation strategy
  • the adjustment module is further configured to adjust the order of the search results included in the target result queue according to the control requirements, so as to obtain the regulated target result queue;
  • the output module is further configured to output the regulated target result queue.
  • A computer device including a memory and a processor, where the memory stores a computer program and, when the processor executes the computer program, the steps of the method described in any one of the above-mentioned first aspects are implemented.
  • a readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps of the method described in any one of the above-mentioned first aspects are implemented.
  • The application provides a search result output method, device, computer equipment, and readable storage medium. After obtaining the initial result queue according to the user's search content, the application divides the search results in the initial result queue into multiple information groups according to the matching degree between each search result in the initial result queue and the search content and the search scene in which the user is searching, sorts the search results within each information group according to multiple preset sorting factors, and integrates the sorted information groups to obtain an intermediate result queue.
  • The order of the search results included in the intermediate result queue is adjusted, and the target result queue is obtained and output, so that both the scene in which the user searches and the correlation between search results are considered when sorting the target result queue. This diversifies the sorting of search results, ensures that the top search results in the target result queue fit the user's search needs, and improves the accuracy and versatility of the output search results.
  • FIG. 1 shows a schematic flowchart of a method for outputting search results provided by an embodiment of the present application
  • FIG. 2A shows a schematic flowchart of a method for outputting search results provided by an embodiment of the present application
  • FIG. 2B shows a schematic diagram of a method for outputting search results provided by an embodiment of the present application
  • FIG. 3A shows a schematic structural diagram of a search result output device provided by an embodiment of the present application
  • FIG. 3B shows a schematic structural diagram of a search result output device provided by an embodiment of the present application.
  • FIG. 4 shows a schematic diagram of an apparatus structure of a computer device provided by an embodiment of the present application.
  • the embodiment of the present application provides a search result output method, as shown in Figure 1, the method includes:
  • the initial result queue includes a plurality of search results related to the search content determined based on the search content input by the user.
  • After the initial result queue is obtained according to the user's search content, the search results in the initial result queue are divided into multiple information groups according to the matching degree between each search result in the initial result queue and the search content and the search scene in which the user searches; the search results included in each information group are sorted according to multiple preset sorting factors; and the sorted information groups are integrated to obtain an intermediate result queue.
  • The order of the search results included in the intermediate result queue is adjusted, and the target result queue is obtained and output, so that both the scene in which the user searches and the correlation between search results are considered when sorting the target result queue. This diversifies the sorting of search results, ensures that the top search results in the target result queue fit the user's search needs, and improves the accuracy and versatility of the output search results.
  • the embodiment of the present application provides a search information output method, as shown in Figure 2A, the method includes:
  • The search platform is the main entry point through which users search for information and an important link between users and information. As the most commonly used search tool, the search platform has become an indispensable part of people's lives. When the search platform provides search services, it finds a large number of relevant search results according to the search content entered by the user. Because search results vary in quality, the search platform is designed with a sorting mechanism: it sorts the obtained search results according to that mechanism and outputs the sorted search results to the user.
  • This application proposes a search information output method. After the initial result queue is obtained according to the user's search content, the search results in the initial result queue are divided into multiple information groups according to the matching degree between each search result in the initial result queue and the search content and the search scene in which the user is searching; the search results included in each information group are sorted according to multiple preset sorting factors; and the sorted information groups are integrated to obtain the intermediate result queue.
  • The order of the search results included in the intermediate result queue is adjusted, and the target result queue is obtained and output, so that both the scene in which the user searches and the correlation between search results are considered when sorting the target result queue. This diversifies the sorting of search results, ensures that the top search results in the target result queue fit the user's search needs, and improves the accuracy and versatility of the output search results.
  • the initial result queue includes a plurality of search results related to the search content determined based on the search content input by the user.
  • the search content input by the user is received, the search content is parsed, and multiple search results related to the search content are queried.
  • the CTR is estimated for the multiple search results, and multiple estimated click rates of the multiple search results are output.
  • the multiple search results are sorted according to multiple estimated click-through rates to obtain an initial result queue.
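  • The construction of the initial result queue can be sketched as follows. This is a minimal illustration, not the platform's implementation: the `SearchResult` type, the `estimated_ctr` field, and the sample values are hypothetical, and sorting from highest to lowest estimated CTR is an assumption.

```python
from dataclasses import dataclass


@dataclass
class SearchResult:
    result_id: str
    estimated_ctr: float  # hypothetical: produced by some CTR estimation model


def build_initial_queue(results):
    """Order search results by estimated CTR, highest first (assumed order)."""
    return sorted(results, key=lambda r: r.estimated_ctr, reverse=True)


# Hypothetical sample data for illustration only.
results = [
    SearchResult("A", 0.02),
    SearchResult("B", 0.11),
    SearchResult("C", 0.07),
]
initial_queue = build_initial_queue(results)
```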
  • The multiple search results need to be divided into multiple information groups according to the matching degree between each search result and the search content and the search scene in which the user performs the search, establishing experience-based binning to ensure the interpretability of search results.
  • the process of dividing multiple search results into multiple information groups is actually a process of classifying the search results.
  • The first dimension is the semantic matching degree, which refers to the degree of semantic correlation between the search results and the search content. The second dimension is the scene matching degree, which refers to the degree of correlation between the search results and the search scene; specifically, it can be how closely the search results match the user's LBS (Location Based Service), physical, temporal, and spatial context, the most common measure being distance.
  • multiple semantic matching degree value intervals and multiple scene matching degree value intervals are determined.
  • the value interval of semantic matching degree and the value interval of scene matching degree are used to divide the information group by synthesizing the parameters of the two dimensions at the same time.
  • three semantic matching degree value intervals and three scene matching degree value intervals may be set.
  • The three semantic matching degree value intervals can be (0-30%), (30%-60%), and (60%-90%), labeled "irrelevant", "weak correlation", and "strong correlation", respectively.
  • The three scene matching degree value intervals can be (0-500 meters), (500-2000 meters), and (above 2000 meters), labeled "short distance", "middle distance", and "long distance", respectively.
  • Search results from different industries can tolerate different thresholds. For example, when users search for entertainment venues, they can accept a longer distance, so the interval can be extended to (0-5000 meters). In some industries, however, all the search results may lie within 5000 meters and therefore fall in the "short distance" interval, so the tiers cannot be distinguished and the division is meaningless. Therefore, in the embodiment of the present application, two ways can be used to determine the multiple semantic matching degree value intervals and the multiple scene matching degree value intervals.
  • One way is for staff to manually set preset division standards for different industries. The preset division standard defines multiple semantic matching degree value intervals and multiple scene matching degree value intervals, so the search platform directly queries the preset division standard and extracts the multiple semantic matching degree value intervals and the multiple scene matching degree value intervals from it.
  • Another way is to collect a first preset number of sample parameters, for example 100,000 sample parameters; perform semantic training on the first preset number of sample parameters based on a machine learning deep model to obtain multiple semantic matching degree values; construct the multiple semantic matching degree value intervals according to the magnitude relationship among those values; perform scene training on the first preset number of sample parameters based on the machine learning deep model to obtain multiple scene matching degree values; construct the multiple scene matching degree value intervals according to the magnitude relationship among those values; and use an online dynamic clustering algorithm to form the multiple value intervals, so that the value intervals better fit users' search scenarios.
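  • As a rough sketch of how value intervals could be built from the magnitude relationship among sample matching degree values, the snippet below sorts the values and cuts them at quantiles. This is an assumption made for illustration: a plain quantile split stands in for the model training and the online dynamic clustering algorithm described above, and `build_intervals` and the sample values are hypothetical.

```python
from statistics import quantiles


def build_intervals(values, n_intervals=3):
    """Split the observed value range into n_intervals contiguous intervals.

    Cut points are quantiles of the sample, so each interval holds a
    similar share of the samples (a stand-in for the clustering step).
    """
    cuts = quantiles(sorted(values), n=n_intervals)
    bounds = [min(values)] + cuts + [max(values)]
    return list(zip(bounds[:-1], bounds[1:]))


# Hypothetical semantic matching degree values for illustration only.
sample_values = [0.05, 0.12, 0.20, 0.33, 0.41, 0.48, 0.55, 0.63, 0.71, 0.86]
semantic_intervals = build_intervals(sample_values)
```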
  • The semantic matching degree between the search result and the search content can be queried directly. The geographic location at which the user enters the search content and the store location of the store related to the search result are then determined, and the location distance between the geographic location and the store location is calculated to obtain the scene matching degree indicated by the location distance.
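  • The distance-based scene matching degree can be sketched as below. The great-circle (haversine) formula is one common way to compute the distance between two coordinates; the source does not say which distance measure is used, so this choice is an assumption, as are the function names and the handling of interval boundaries. The thresholds follow the example intervals of 0-500 meters, 500-2000 meters, and above 2000 meters.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def scene_label(distance_m):
    """Map a distance to a scene matching interval label (boundaries assumed half-open)."""
    if distance_m < 500:
        return "short distance"
    if distance_m < 2000:
        return "middle distance"
    return "long distance"


# Hypothetical user and store coordinates, 0.01 degrees of latitude apart.
distance = haversine_m(30.0, 120.0, 30.01, 120.0)
label = scene_label(distance)
```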
  • The order of the operation of determining the multiple semantic matching degree value intervals and multiple scene matching degree value intervals and the operation of determining the semantic matching degree and scene matching degree of each search result is not limited. The semantic matching degree and the scene matching degree of each search result can also be determined first, followed by the multiple semantic matching degree value intervals and the multiple scene matching degree value intervals.
  • The semantic matching degree and scene matching degree of each search result are compared with the multiple semantic matching degree value intervals and the multiple scene matching degree value intervals, and search results whose semantic matching degree and scene matching degree fall in the same value intervals are divided into the same group. This yields the multiple information groups and realizes nested sorting over the two dimensions.
  • Taking three semantic matching degree value intervals and three scene matching degree value intervals as an example, dividing the information groups in effect builds a two-dimensional orthogonal grid.
  • One dimension is the three semantic matching degree value intervals, "strong correlation", "weak correlation", and "irrelevant"; the other dimension is the three scene matching degree value intervals, "short distance", "middle distance", and "long distance".
  • Search results whose scene matching degree is "middle distance" are divided into the information group corresponding to the blank area, and search results whose semantic matching degree is "irrelevant" and whose scene matching degree is "long distance" are divided into the information group corresponding to the black area.
  • The semantic matching degree of some search results is "weak correlation" while their scene matching degree is "short distance". In this case, according to the division method in Figure 2B, these search results are divided into the information group corresponding to the blank area.
  • an information group name may be set for each information group.
  • The information group name of the information group corresponding to the shaded area can be "excellent". The search results included in the information group corresponding to the blank area are weakly correlated and at a middle distance, so the information group name of that group can be "good". The search results included in the information group corresponding to the black area are irrelevant and at a long distance, so the information group name of that group may be "poor".
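  • The two-dimensional grouping can be sketched as follows. The full grid of Figure 2B is not reproduced in the text, so the rule used here, taking the weaker of the two tiers as the group level, is an assumption. It agrees with the cases stated above (a "weak correlation" result at "short distance" lands in the "good" group), but the other grid cells are guesses, and the names in the code are hypothetical.

```python
# Tiers listed from worst (index 0) to best (index 2).
SEMANTIC_TIERS = ["irrelevant", "weak correlation", "strong correlation"]
SCENE_TIERS = ["long distance", "middle distance", "short distance"]
GROUP_NAMES = ["poor", "good", "excellent"]


def group_name(semantic_label, scene_label):
    """Assign an information group from the two interval labels.

    Assumed rule: the group level is the weaker of the two tiers.
    """
    level = min(SEMANTIC_TIERS.index(semantic_label), SCENE_TIERS.index(scene_label))
    return GROUP_NAMES[level]


example = group_name("weak correlation", "short distance")
```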
  • the plurality of preset sorting factors may be one or more of query matching degree, scene matching degree, user matching degree and store quality degree.
  • The query matching degree can be the text correlation between the search results and the search content, category consistency, entity consistency, knowledge correlation, general large-model classification, and so on. The scene matching degree can be the distance tiering and smoothed distance scores of the search results, POI (Point Of Interest) type matching, and so on. The user matching degree can be the estimated click-through rate and conversion rate of the search result. The store quality degree can be the store service quality score and store material quality score of the store corresponding to the search result. Because each search result has multiple different factor scores on the multiple preset sorting factors, it cannot be sorted directly. The multiple factor scores therefore need to be fused into one relevance score, and the results are then sorted within the group according to that relevance score.
  • the following takes any search result in any information group as an example to describe the generation process of the relevance score:
  • score1 and score2 are the scores of two factors
  • w1 and w2 are the factor weights corresponding to the scores of the two factors.
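  • The fusion of factor scores into one relevance score can be sketched as a weighted sum. The formula itself is not reproduced in the text, so the plain weighted sum below is an assumption based on the description of a weighted calculation over factor scores and factor weights; the factor names and numbers are hypothetical.

```python
def relevance_score(factor_scores, factor_weights):
    """Fuse per-factor scores into one score as a weighted sum (assumed form).

    Both arguments map a factor name to a number; every factor in
    factor_scores must have a weight in factor_weights.
    """
    return sum(factor_weights[name] * score for name, score in factor_scores.items())


# Hypothetical scores and weights for two preset ranking factors.
scores = {"query_match": 0.8, "scene_match": 0.5}
weights = {"query_match": 0.6, "scene_match": 0.4}
fused = relevance_score(scores, weights)
```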
  • w is the factor weight
  • f is used to indicate the linear function used
  • intention_id is the search intent, specifically the brand, type, content, address and other attributes of the search content
  • trade_id is the search industry, such as catering, medicine, retail, entertainment, fitness, etc.
  • is the preset ranking factor for the current training.
  • Formula 3 can then be applied to the calculated factor weights, so that the factor weights corresponding to each information group are relatively optimal weights.
  • w ⁇ is a better weight
  • argmaxw indicates the ⁇ R/ ⁇
  • R is the result of weighted summation of each factor weight obtained according to the position from high to low
  • indicates Rounds of weighted summation.
  • the relevance score of each search result can be calculated respectively, and the relevance score of each search result can be obtained.
  • the correlation scores are calculated and sorted for the search results included in each of the multiple information groups, and multiple sorted information groups are obtained.
  • the sorted multiple information groups are combined in order of rank from high to low to obtain an intermediate result queue.
  • the obtained intermediate result queue is "ACDBE FHIGJ OLKMN”.
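  • The in-group sorting and merging into an intermediate result queue can be sketched as follows. The group levels, result identifiers, and scores are hypothetical; the sketch assumes a higher level number means a better group.

```python
def build_intermediate_queue(groups):
    """Sort inside each group by relevance score, then join groups by level.

    groups: list of (level, members), where members is a list of
    (result_id, relevance_score) pairs and a higher level is a better group.
    """
    queue = []
    for _level, members in sorted(groups, key=lambda g: g[0], reverse=True):
        ranked = sorted(members, key=lambda m: m[1], reverse=True)
        queue.extend(result_id for result_id, _score in ranked)
    return queue


# Hypothetical groups for illustration only.
groups = [
    (1, [("F", 0.9), ("G", 0.2)]),
    (2, [("B", 0.3), ("A", 0.8)]),
    (0, [("O", 0.5)]),
]
intermediate_queue = build_intermediate_queue(groups)
```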
  • the search results have already formed an intermediate result queue.
  • The internal connections between the search results and the context of the intermediate result queue are considered comprehensively, and the intermediate result queue is adjusted and rearranged so that it becomes more reasonable and forms a new target result queue.
  • the process of adjusting the intermediate result queue consists of two steps, as follows:
  • Step 1: Compare the initial result queue with the intermediate result queue, and adjust the intermediate result queue.
  • For every two search results at the same position in the initial result queue and the intermediate result queue, the two search results are compared to determine the correlation between them. According to that correlation, the order of the two search results is determined and the intermediate result queue is adjusted to obtain the adjusted intermediate result queue. For example, suppose the initial result queue is "12345" and the intermediate result queue is "23541". Then "1" in the initial result queue is compared with "2" in the intermediate result queue, and when it is determined that "2" is better than "1", "21" is recorded.
  • GRU Gate Recurrent Unit, gated recurrent unit
  • Attention attention structure modeling
  • Step 2 Evaluate the adjusted intermediate result queue, and output the queue score.
  • the other is the collaborative relationship between stores, which has nothing to do with the location relationship, which helps to extract more long-term relationship dependencies. For example, assuming that the user only likes hot pot, no matter how the hot pot and rice noodles are sorted, it will not affect the user's choice of hot pot. There is no interaction between search results. And assuming that the user likes hot pot and rice noodles at the same time, there will be a strong internal connection between hot pot and rice noodles, and whichever is ranked first will be easily selected by the user. Therefore, only by capturing the influence of the contextual environment in the result list can we truly achieve context awareness.
  • an evaluator is set in the search platform.
  • the evaluator is trained based on multiple sample queues and indicates the score of each sample queue in multiple sample queues.
  • the evaluator can score the adjusted intermediate result queue based on the correlation between user intent and search results
  • the result queue with the highest score is subsequently selected as the final result queue.
  • the evaluator can be obtained by modeling the collaborative relationship between user intent and search results with Bi-LSTM (Bidirectional Long Short-Term Memory) and Self-attention (self-attention mechanism); the evaluator indicates the scores corresponding to a large number of sample queues.
  • the adjusted intermediate result queue is input to the evaluator for evaluation, and the queue score output by the evaluator is obtained, and the adjusted intermediate result queue is marked with the queue score. For example, suppose the adjusted intermediate result queue is "34521", and the score corresponding to the sample queue "34521" in the evaluator is 10 points, then the queue score of the adjusted intermediate result queue is 10 points.
  • the comparison in step 1 and the scoring in step 2 are repeated: the adjusted intermediate result queue is compared with the initial result queue and readjusted until the number of adjustments reaches the preset number of rotations, which yields as many queue scores as the preset number of rotations. Subsequently, the target queue score with the highest queue score, and the target intermediate result queue marked with that score, are extracted from those queue scores.
  • because the output result list needs a bounded length, a second preset number may be determined; that many search results are taken from the head of the target intermediate result queue and used as the target result queue.
  • the second preset number is applied when determining the target result queue; in practice, however, the adjusted intermediate result queue can be truncated to the second preset number as soon as it is obtained in step 1, reducing the load of subsequently building the evaluator and evaluating queues with it. Furthermore, the second preset number can be set according to the length of the information group "excellent": it can equal or exceed that length, so that all search results in the information group "excellent" are considered while the selection can also extend into the information group "good", preventing good search results from being filtered out.
  • in some search scenes, regulation requirements are set as campaigns are launched and merchants' demands change, such as scattering search results of the same brand, strengthening distance order preservation, weighting newly added search results, anti-cheating, and so on. Therefore, in practice, the preset regulation strategy needs to be queried, its regulation requirements determined, and the order of the search results included in the target result queue adjusted accordingly to obtain the regulated target result queue, which is subsequently output.
  • the preset control strategy can be divided into three types.
  • the first is the service strategy, such as breaking up search results of the same brand, enhancing distance order preservation, and so on.
  • the second is the order-preserving strategy, which regulates the search content in the target result queue according to the real-time traffic in the search platform.
  • the third is the anti-cheating strategy, which is connected to an external anti-cheating system and intervenes in the target result queue based on the anti-cheating system.
  • after the target result queue is obtained, it can be output.
  • the multi-dimensional grouping of the initial result queue, sorting within the group, queue rearrangement, and service scheduling are realized, and the search results are sorted according to their excellence, so that the excellent search results are ranked first.
  • both in-group sorting and queue rearrangement are learned with large-scale deep machine learning models, which can handle complex, high-dimensional parameter spaces, so that the method proposed in this application generalizes well across many scenes.
  • after the initial result queue is obtained according to the user's search content, the search results in the initial result queue are divided into multiple information groups according to the matching degree between each search result in the initial result queue and the search content and the search scene in which the user performs the search; the search results included in the multiple information groups are sorted within each group according to multiple preset sorting factors, and the sorted information groups are integrated to obtain the intermediate result queue.
  • based on the correlation between search results at the same rank in the initial result queue and the intermediate result queue, the order of the search results included in the intermediate result queue is adjusted, and the target result queue is obtained and output. In this way, both the scene in which the user performs the search and the correlation between search results are taken into account when sorting the target result queue, which diversifies the sorting of search results, ensures that the top-ranked search results in the target result queue fit the user's search needs, improves the accuracy of the output search results, and generalizes well.
  • the embodiment of the present application provides a search result output device; as shown in FIG. 3A, the device includes: an acquisition module 301, a division module 302, a sorting module 303, an adjustment module 304 and an output module 305.
  • the acquiring module 301 is configured to acquire an initial result queue, where the initial result queue includes a plurality of search results related to the search content determined based on the search content input by the user.
  • the division module 302 is configured to divide the plurality of search results into multiple information groups according to the matching degree between each search result in the plurality of search results and the search content and the search scene in which the user performs the search.
  • the sorting module 303 is configured to sort the search results included in each of the multiple information groups according to a plurality of preset sorting factors to obtain an intermediate result queue.
  • the adjustment module 304 is configured to adjust the order of the search results included in the intermediate result queue based on the correlation between the initial result queue and the search results of the same rank in the intermediate result queue to obtain a target result queue.
  • the output module 305 is configured to output the target result queue.
  • the acquisition module 301 is configured to receive the search content input by the user; parse the search content, query the multiple search results related to the search content; Carry out a click-through rate CTR prediction for each search result, and output a plurality of estimated click-through rates of the plurality of search results; sort the plurality of search results according to the plurality of estimated click-through rates to obtain the initial result queue.
  • the division module 302 is configured to determine a plurality of semantic matching degree value intervals and a plurality of scene matching degree value intervals; for each search result in the plurality of search results, determine the semantic matching degree and the scene matching degree of the search result
  • the semantic matching degree indicates the degree of semantic correlation between the search result and the search content
  • the scene matching degree indicates the degree of relevance between the search result and the search scene
  • the semantic matching degree and the scene matching degree of the search result are compared with the multiple semantic matching degree value intervals and the multiple scene matching degree value intervals; search results whose semantic matching degree and scene matching degree both fall in the same value interval are divided into the same group to obtain the multiple information groups.
  • the division module 302 is configured to, for each of the multiple search results, query the semantic matching degree between the search result and the search content; determine the geographic location of the user when the search content is input and the store location of the store related to the search result; calculate the location distance between the geographic location and the store location, and obtain the scene matching degree indicated by the location distance.
  • the division module 302 is configured to query a preset division standard and extract the multiple semantic matching degree value intervals and the multiple scene matching degree value intervals from it; or, to collect a first preset number of sample parameters, perform semantic training on them based on a deep machine learning model to obtain multiple semantic matching degree values, construct the multiple semantic matching degree value intervals according to the magnitude relationship among those values, perform scene training on the first preset number of sample parameters based on the deep machine learning model to obtain multiple scene matching degree values, and construct the multiple scene matching degree value intervals according to the magnitude relationship among those values.
  • the sorting module 303 is configured to, for each information group in the plurality of information groups, calculate the relevance score of each search result included in the information group based on the plurality of preset sorting factors; sort all the search results included in the information group in descending order of relevance score to obtain a sorted information group; and combine the multiple sorted information groups in descending order of the group levels corresponding to the multiple information groups to obtain the intermediate result queue.
  • the sorting module 303 is configured to, for each of the search results included in the information group, query the multiple factor scores of the search result on the multiple preset sorting factors; obtain a plurality of factor weights corresponding to the plurality of preset sorting factors, and perform a weighted calculation on the plurality of factor scores based on the plurality of factor weights to obtain the relevance score of the search result.
  • the sorting module 303 is configured to, for each of the multiple preset sorting factors, query the search intent and search industry corresponding to the search content input by the user; and, based on a linear function, train on the search intent, the search industry and each of the preset sorting factors respectively to obtain the multiple factor weights.
  • the adjustment module 304 is configured to, for every two search results at the same rank in the initial result queue and the intermediate result queue, compare the two search results and determine the correlation between them; according to that correlation, determine the order of the two search results and adjust the intermediate result queue to obtain the adjusted intermediate result queue; input the adjusted intermediate result queue into the evaluator for evaluation, obtain the queue score output by the evaluator, and mark the adjusted intermediate result queue with the queue score
  • the evaluator is trained on multiple sample queues and indicates the score of each sample queue in the multiple sample queues; the above comparison process is repeated, with the adjusted intermediate result queue compared with the initial result queue and readjusted until the number of adjustments reaches the preset number of rotations, yielding as many queue scores as the preset number of rotations; from those queue scores, the target queue score with the highest queue score and the target intermediate result queue marked with it are extracted; a second preset number is determined, that many search results are taken from the head of the target intermediate result queue, and they are used as the target result queue.
  • the device further includes: a query module 306 .
  • the inquiry module 306 is configured to inquire about a preset regulation strategy, and determine regulation requirements of the preset regulation strategy.
  • the adjustment module 304 is further configured to adjust the order of the search results included in the target result queue according to the regulation requirement, so as to obtain the regulated target result queue.
  • the output module 305 is also configured to output the regulated target result queue.
  • with the device provided in the embodiment of this application, after the initial result queue is obtained according to the user's search content, the search results in the initial result queue are divided into multiple information groups according to the matching degree between each search result in the initial result queue and the search content and the search scene in which the user performs the search
  • the search results included in the multiple information groups are sorted within each group according to multiple preset sorting factors, and the sorted information groups are integrated to obtain the intermediate result queue.
  • based on the correlation between search results at the same rank in the initial result queue and the intermediate result queue, the order of the search results included in the intermediate result queue is adjusted, and the target result queue is obtained and output, so that both the scene in which the user performs the search and the correlation between search results are taken into account when sorting the target result queue
  • this diversifies the sorting of search results, ensures that the top-ranked search results in the target result queue fit the user's search needs, improves the accuracy of the output search results, and generalizes well.
  • a device is provided which includes a bus, a processor, a memory, and a communication interface, and may also include an input/output interface and a display device, wherein the functional units can communicate with one another through the bus.
  • the memory stores computer programs
  • the processor is used to execute the programs stored in the memory and execute the search result output method in the above-mentioned embodiments.
  • a computer-readable storage medium on which a computer program is stored, and when the computer program is executed by a processor, the steps of the search result output method are realized.
  • the present application can be realized by hardware, or by software plus a necessary general hardware platform.
  • the technical solution of the present application can be embodied in the form of a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, USB flash drive, or portable hard disk) and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to execute the methods described in the various implementation scenarios of the present application.
  • modules in the devices in the implementation scenario can be distributed among the devices in the implementation scenario according to the description of the implementation scenario, or can be located in one or more devices different from the implementation scenario according to corresponding changes.
  • the modules of the above implementation scenarios can be combined into one module, or can be further split into multiple sub-modules.

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Business, Economics & Management (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Accounting & Taxation (AREA)
  • Finance (AREA)
  • Physics & Mathematics (AREA)
  • Economics (AREA)
  • General Business, Economics & Management (AREA)
  • Strategic Management (AREA)
  • Marketing (AREA)
  • Development Economics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

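A search result output method, device, computer device and readable storage medium are provided. The search result output method includes: acquiring an initial result queue; dividing the multiple search results into multiple information groups according to the matching degree between each of the multiple search results and the search content and the search scene in which the user performs the search (102); sorting the search results included in each of the multiple information groups according to multiple preset sorting factors to obtain an intermediate result queue (103); adjusting the order of the search results included in the intermediate result queue based on the correlation between search results at the same rank in the initial result queue and the intermediate result queue, to obtain a target result queue (104); and outputting the target result queue (105).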

Description

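Search result output method, device, computer device and readable storage medium
Technical Field
The present application relates to the field of Internet technology, and in particular to a search result output method, device, computer device and readable storage medium.
Background
In recent years, with the rapid development of Internet technology, Internet applications have penetrated deeply into all kinds of fields, big data has grown explosively, and massive amounts of data and information are scattered across cyberspace. When users need to obtain information and data, they can search through a search platform, which then outputs the relevant search content. As an important link between users and information, a search platform generally outputs a large number of relevant search results for the content the user searches for, providing the user with as much reference information as possible.
In the related art, a search platform queries a series of results related to the search content input by the user, performs CTR (Click-Through-Rate) estimation on those results, evaluates the estimates against preset criteria, sorts the results accordingly, and outputs the sorted results for the user to view.
In the process of implementing the present application, the applicant found that the related art has at least the following problem:
Search results output after sorting by preset criteria are limited: the top-ranked search results may not fit the user's real search needs, so the output search results are not accurate enough and generalize poorly.
Summary
In view of this, the present application provides a search result output method, device, computer device and readable storage medium. The main purpose is to solve the problem that current search results are limited, that the top-ranked search results may not fit the user's real search needs, and that the output search results are therefore not accurate enough and generalize poorly.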
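According to a first aspect of the present application, a search result output method is provided, the method including:
acquiring an initial result queue, the initial result queue including multiple search results related to the search content, determined based on the search content input by a user;
dividing the multiple search results into multiple information groups according to the matching degree between each of the multiple search results and the search content and the search scene in which the user performs the search;
sorting the search results included in each of the multiple information groups according to multiple preset sorting factors to obtain an intermediate result queue;
adjusting the order of the search results included in the intermediate result queue based on the correlation between search results at the same rank in the initial result queue and the intermediate result queue, to obtain a target result queue;
outputting the target result queue.
Optionally, acquiring the initial result queue includes:
receiving the search content input by the user;
parsing the search content and querying the multiple search results related to the search content;
performing click-through rate (CTR) estimation on the multiple search results and outputting multiple estimated click-through rates of the multiple search results;
sorting the multiple search results according to the multiple estimated click-through rates to obtain the initial result queue.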
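Optionally, dividing the multiple search results into multiple information groups according to the matching degree between each of the multiple search results and the search content and the search scene in which the user performs the search includes:
determining multiple semantic matching degree value intervals and multiple scene matching degree value intervals;
for each of the multiple search results,
determining the semantic matching degree and the scene matching degree of the search result, the semantic matching degree indicating the degree of semantic relevance between the search result and the search content, and the scene matching degree indicating the degree of relevance between the search result and the search scene;
comparing the semantic matching degree and the scene matching degree of the search result with the multiple semantic matching degree value intervals and the multiple scene matching degree value intervals;
dividing search results whose semantic matching degree and scene matching degree both fall in the same value interval into the same group, to obtain the multiple information groups.
Optionally, determining the semantic matching degree and the scene matching degree of the search result includes:
querying the semantic matching degree between the search result and the search content;
determining the geographic location of the user when the search content is input and the store location of the store related to the search result;
calculating the location distance between the geographic location and the store location, and obtaining the scene matching degree indicated by the location distance.
Optionally, determining the multiple semantic matching degree value intervals and the multiple scene matching degree value intervals includes:
querying a preset division standard and extracting the multiple semantic matching degree value intervals and the multiple scene matching degree value intervals from the preset division standard; or,
collecting a first preset number of sample parameters; performing semantic training on the first preset number of sample parameters based on a deep machine learning model to obtain multiple semantic matching degree values, and constructing the multiple semantic matching degree value intervals according to the magnitude relationship among those values; and performing scene training on the first preset number of sample parameters based on the deep machine learning model to obtain multiple scene matching degree values, and constructing the multiple scene matching degree value intervals according to the magnitude relationship among those values.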
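Optionally, sorting the search results included in each of the multiple information groups according to multiple preset sorting factors to obtain the intermediate result queue includes:
for each of the multiple information groups,
calculating the relevance score of each search result included in the information group based on the multiple preset sorting factors;
sorting all the search results included in the information group in descending order of relevance score to obtain a sorted information group;
combining the multiple sorted information groups in descending order of the group levels corresponding to the multiple information groups, to obtain the intermediate result queue.
Optionally, calculating the relevance score of each search result included in the information group based on the multiple preset sorting factors includes:
for each search result included in the information group,
querying multiple factor scores of the search result on the multiple preset sorting factors;
acquiring multiple factor weights corresponding to the multiple preset sorting factors, and performing a weighted calculation on the multiple factor scores based on the multiple factor weights to obtain the relevance score of the search result.
Optionally, acquiring the multiple factor weights corresponding to the multiple preset sorting factors includes:
for each of the multiple preset sorting factors, querying the search intent and search industry corresponding to the search content input by the user;
training on the search intent, the search industry and each preset sorting factor respectively, based on a linear function, to obtain the multiple factor weights.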
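Optionally, adjusting the order of the search results included in the intermediate result queue based on the correlation between search results at the same rank in the initial result queue and the intermediate result queue, to obtain the target result queue, includes:
for every two search results at the same rank in the initial result queue and the intermediate result queue, comparing the two search results to determine the correlation between them;
determining the order of the two search results according to the correlation between them and adjusting the intermediate result queue, to obtain an adjusted intermediate result queue;
inputting the adjusted intermediate result queue into an evaluator for evaluation to obtain a queue score output by the evaluator, and marking the adjusted intermediate result queue with the queue score, the evaluator being trained on multiple sample queues and indicating the score of each of the multiple sample queues;
repeating the above comparison process, comparing the adjusted intermediate result queue with the initial result queue and readjusting the adjusted intermediate result queue until the number of adjustments reaches a preset number of rotations, to obtain as many queue scores as the preset number of rotations;
extracting, from those queue scores, the target queue score with the highest queue score and the target intermediate result queue marked with the target queue score;
determining a second preset number, taking that many search results from the head of the target intermediate result queue, and using them as the target result queue.
Optionally, after adjusting the order of the search results included in the intermediate result queue based on the correlation between search results at the same rank in the initial result queue and the intermediate result queue to obtain the target result queue, the method further includes:
querying a preset regulation strategy and determining the regulation requirements of the preset regulation strategy;
adjusting the order of the search results included in the target result queue according to the regulation requirements, to obtain a regulated target result queue;
outputting the regulated target result queue.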
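According to a second aspect of the present application, a search result output device is provided, the device including:
an acquisition module, configured to acquire an initial result queue, the initial result queue including multiple search results related to the search content, determined based on the search content input by a user;
a division module, configured to divide the multiple search results into multiple information groups according to the matching degree between each of the multiple search results and the search content and the search scene in which the user performs the search;
a sorting module, configured to sort the search results included in each of the multiple information groups according to multiple preset sorting factors to obtain an intermediate result queue;
an adjustment module, configured to adjust the order of the search results included in the intermediate result queue based on the correlation between search results at the same rank in the initial result queue and the intermediate result queue, to obtain a target result queue;
an output module, configured to output the target result queue.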
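Optionally, the acquisition module is configured to receive the search content input by the user; parse the search content and query the multiple search results related to the search content; perform click-through rate (CTR) estimation on the multiple search results and output multiple estimated click-through rates of the multiple search results; and sort the multiple search results according to the multiple estimated click-through rates to obtain the initial result queue.
Optionally, the division module is configured to determine multiple semantic matching degree value intervals and multiple scene matching degree value intervals; for each of the multiple search results, determine the semantic matching degree and the scene matching degree of the search result, the semantic matching degree indicating the degree of semantic relevance between the search result and the search content, and the scene matching degree indicating the degree of relevance between the search result and the search scene; compare the semantic matching degree and the scene matching degree of the search result with the multiple semantic matching degree value intervals and the multiple scene matching degree value intervals; and divide search results whose semantic matching degree and scene matching degree both fall in the same value interval into the same group, to obtain the multiple information groups.
Optionally, the division module is configured to, for each of the multiple search results, query the semantic matching degree between the search result and the search content; determine the geographic location of the user when the search content is input and the store location of the store related to the search result; and calculate the location distance between the geographic location and the store location and obtain the scene matching degree indicated by the location distance.
Optionally, the division module is configured to query a preset division standard and extract the multiple semantic matching degree value intervals and the multiple scene matching degree value intervals from it; or, to collect a first preset number of sample parameters, perform semantic training on them based on a deep machine learning model to obtain multiple semantic matching degree values, construct the multiple semantic matching degree value intervals according to the magnitude relationship among those values, perform scene training on the first preset number of sample parameters based on the deep machine learning model to obtain multiple scene matching degree values, and construct the multiple scene matching degree value intervals according to the magnitude relationship among those values.
Optionally, the sorting module is configured to, for each of the multiple information groups, calculate the relevance score of each search result included in the information group based on the multiple preset sorting factors; sort all the search results included in the information group in descending order of relevance score to obtain a sorted information group; and combine the multiple sorted information groups in descending order of the group levels corresponding to the multiple information groups, to obtain the intermediate result queue.
Optionally, the sorting module is configured to, for each search result included in the information group, query multiple factor scores of the search result on the multiple preset sorting factors; and acquire multiple factor weights corresponding to the multiple preset sorting factors and perform a weighted calculation on the multiple factor scores based on the multiple factor weights to obtain the relevance score of the search result.
Optionally, the sorting module is configured to, for each of the multiple preset sorting factors, query the search intent and search industry corresponding to the search content input by the user; and train on the search intent, the search industry and each preset sorting factor respectively, based on a linear function, to obtain the multiple factor weights.
Optionally, the adjustment module is configured to, for every two search results at the same rank in the initial result queue and the intermediate result queue, compare the two search results to determine the correlation between them; determine the order of the two search results according to that correlation and adjust the intermediate result queue to obtain an adjusted intermediate result queue; input the adjusted intermediate result queue into an evaluator for evaluation to obtain a queue score output by the evaluator, and mark the adjusted intermediate result queue with the queue score, the evaluator being trained on multiple sample queues and indicating the score of each of the multiple sample queues; repeat the above comparison process, comparing the adjusted intermediate result queue with the initial result queue and readjusting it until the number of adjustments reaches a preset number of rotations, to obtain as many queue scores as the preset number of rotations; extract, from those queue scores, the target queue score with the highest queue score and the target intermediate result queue marked with the target queue score; and determine a second preset number, take that many search results from the head of the target intermediate result queue, and use them as the target result queue.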
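Optionally, the device further includes:
a query module, configured to query a preset regulation strategy and determine the regulation requirements of the preset regulation strategy;
the adjustment module is further configured to adjust the order of the search results included in the target result queue according to the regulation requirements, to obtain a regulated target result queue;
the output module is further configured to output the regulated target result queue.
According to a third aspect of the present application, a computer device is provided, including a memory and a processor, the memory storing a computer program, and the processor implementing the steps of the method of any one of the above first aspect when executing the computer program.
According to a fourth aspect of the present application, a readable storage medium is provided, on which a computer program is stored, the computer program implementing the steps of the method of any one of the above first aspect when executed by a processor.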
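With the above technical solution, in the search result output method, device, computer device and readable storage medium provided by the present application, after the initial result queue is obtained according to the user's search content, the search results in the initial result queue are divided into multiple information groups according to the matching degree between each search result in the initial result queue and the search content and the search scene in which the user performs the search; the search results included in the multiple information groups are sorted within each group according to multiple preset sorting factors; and the sorted information groups are integrated to obtain the intermediate result queue. Then, based on the correlation between search results at the same rank in the initial result queue and the intermediate result queue, the order of the search results included in the intermediate result queue is adjusted to obtain and output the target result queue. In this way, both the scene in which the user performs the search and the correlation between search results are taken into account when sorting the target result queue, which diversifies the sorting of search results, ensures that the top-ranked search results in the target result queue fit the user's search needs, improves the accuracy of the output search results, and generalizes well.
The above is only an overview of the technical solution of the present application. So that the technical means of the present application can be understood more clearly and implemented according to the content of the specification, and so that the above and other purposes, features and advantages of the present application become more apparent, specific embodiments of the present application are set out below.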
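Brief Description of the Drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered limiting of the present application. Throughout the drawings, the same reference symbols denote the same components. In the drawings:
FIG. 1 shows a schematic flowchart of a search result output method provided by an embodiment of the present application;
FIG. 2A shows a schematic flowchart of a search result output method provided by an embodiment of the present application;
FIG. 2B shows a schematic diagram of a search result output method provided by an embodiment of the present application;
FIG. 3A shows a schematic structural diagram of a search result output device provided by an embodiment of the present application;
FIG. 3B shows a schematic structural diagram of a search result output device provided by an embodiment of the present application;
FIG. 4 shows a schematic structural diagram of a computer device provided by an embodiment of the present application.
Detailed Description
Exemplary embodiments of the present application will be described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the present application, it should be understood that the present application may be implemented in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that the present application can be understood more thoroughly and its scope fully conveyed to those skilled in the art.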
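An embodiment of the present application provides a search result output method. As shown in FIG. 1, the method includes:
101. Acquire an initial result queue, the initial result queue including multiple search results related to the search content, determined based on the search content input by a user.
102. Divide the multiple search results into multiple information groups according to the matching degree between each of the multiple search results and the search content and the search scene in which the user performs the search.
103. Sort the search results included in each of the multiple information groups according to multiple preset sorting factors to obtain an intermediate result queue.
104. Adjust the order of the search results included in the intermediate result queue based on the correlation between search results at the same rank in the initial result queue and the intermediate result queue, to obtain a target result queue.
105. Output the target result queue.
With the method provided by the embodiments of the present application, after the initial result queue is obtained according to the user's search content, the search results in the initial result queue are divided into multiple information groups according to the matching degree between each search result in the initial result queue and the search content and the search scene in which the user performs the search; the search results included in the multiple information groups are sorted within each group according to multiple preset sorting factors; and the sorted information groups are integrated to obtain the intermediate result queue. Then, based on the correlation between search results at the same rank in the initial result queue and the intermediate result queue, the order of the search results included in the intermediate result queue is adjusted to obtain and output the target result queue. In this way, both the scene in which the user performs the search and the correlation between search results are taken into account when sorting the target result queue, which diversifies the sorting of search results, ensures that the top-ranked search results in the target result queue fit the user's search needs, improves the accuracy of the output search results, and generalizes well.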
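An embodiment of the present application provides a search information output method. As shown in FIG. 2A, the method includes:
201. Acquire an initial result queue.
A search platform offers users the largest entry point for finding information and is an important link between users and information; as the most commonly used search tool, it has become an indispensable part of people's lives. When providing a search service, the search platform finds a large number of relevant search results according to the search content input by the user. Since search results differ in quality, the search platform is designed with a sorting mechanism, sorts the search results according to that mechanism, and outputs the sorted search results to the user. At present, when sorting search results, many search platforms use models such as CTR (Click Through Rate) or CVR (Conversion Rate) to estimate the click-through rate of each search result and rank results with higher estimated click-through rates first, so that users get high-quality search results first. Other search platforms set sorting rules directly and sort the search results by those rules. However, the applicant recognized that many other points need to be considered when sorting search results, such as the user's search experience, search efficiency and merchants' demands. Sorting that relies solely on the estimated click-through rate or on sorting rules is not flexible enough: although it is highly interpretable, the final ranking is not smooth, so it is not a good way to sort. Moreover, this approach discards complex, high-dimensional sorting parameters, is hard to apply across different search scenes, and generalizes poorly.
Therefore, the present application proposes a search information output method. After the initial result queue is obtained according to the user's search content, the search results in the initial result queue are divided into multiple information groups according to the matching degree between each search result in the initial result queue and the search content and the search scene in which the user performs the search; the search results included in the multiple information groups are sorted within each group according to multiple preset sorting factors; and the sorted information groups are integrated to obtain the intermediate result queue. Then, based on the correlation between search results at the same rank in the initial result queue and the intermediate result queue, the order of the search results included in the intermediate result queue is adjusted to obtain and output the target result queue. In this way, both the scene in which the user performs the search and the correlation between search results are taken into account when sorting the target result queue, which diversifies the sorting of search results, ensures that the top-ranked search results in the target result queue fit the user's search needs, improves the accuracy of the output search results, and generalizes well.
To implement the technical solution of the present application, the initial result queue must first be acquired; the subsequent grouping, in-group sorting, rearrangement and adjustment are all based on it. The initial result queue includes multiple search results related to the search content, determined based on the search content input by the user. To generate the initial result queue, first, the search content input by the user is received and parsed, and the multiple search results related to it are queried. Then, CTR estimation is performed on the multiple search results, and multiple estimated click-through rates are output. Finally, the multiple search results are sorted according to the estimated click-through rates to obtain the initial result queue.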
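The queue-building steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the application's implementation: `predict_ctr` stands in for whatever CTR estimation model the platform uses, and the result names and CTR values are invented for the example.

```python
from typing import Callable, List


def build_initial_queue(results: List[str],
                        predict_ctr: Callable[[str], float]) -> List[str]:
    """Sort recalled results by estimated click-through rate, highest first."""
    # sorted() is stable, so results with equal estimates keep their recall order.
    return sorted(results, key=predict_ctr, reverse=True)


# Hypothetical estimated click-through rates for three recalled results.
toy_ctr = {"store_a": 0.12, "store_b": 0.31, "store_c": 0.07}
initial_queue = build_initial_queue(list(toy_ctr), toy_ctr.get)
print(initial_queue)  # ['store_b', 'store_a', 'store_c']
```

Passing the estimator as a callable keeps the sorting step independent of how the click-through rates are produced.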
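202. Divide the multiple search results into multiple information groups according to the matching degree between each of the multiple search results and the search content and the search scene in which the user performs the search.
In the embodiments of the present application, after the initial result queue is acquired, the multiple search results need to be divided into multiple information groups according to the matching degree between each of the multiple search results and the search content and the search scene in which the user performs the search, establishing experience-based tiers to ensure the interpretability of the search results.
Dividing the multiple search results into multiple information groups is in effect a process of tiering the search results. In a search scene, at least two dimensions of search experience need to be considered when dividing the information groups. The first dimension is the semantic matching degree, which refers to the degree of semantic relevance between a search result and the search content. The second dimension is the scene matching degree, which refers to the degree of relevance between a search result and the search scene; specifically, it may be how closely the search result matches the user in terms of LBS (Location Based Service), physical, or spatio-temporal proximity, most commonly distance. The process of dividing the multiple search results into multiple information groups is as follows:
First, multiple semantic matching degree value intervals and multiple scene matching degree value intervals are determined. These intervals are used to combine the parameters of the two dimensions when dividing the information groups. Specifically, three semantic matching degree value intervals and three scene matching degree value intervals may be set. For example, the three semantic matching degree value intervals may be (0~30%), (30%~60%) and (60%~90%), labeled "unrelated", "weakly related" and "strongly related" respectively. The three scene matching degree value intervals may be (0~500 m), (500 m~2000 m) and (above 2000 m), labeled "near", "medium" and "far" respectively.
It should be noted that search results from different industries tolerate different thresholds. For example, users searching for entertainment venues can accept greater distances, so the scene matching degree value interval for the label "near" could in practice be extended to (0~5000 m). For some industries, however, all the search results may lie within 5000 m and would all fall in the "near" interval, so no tiers could be distinguished and the division would be meaningless. Therefore, in the embodiments of the present application, the multiple semantic matching degree value intervals and the multiple scene matching degree value intervals can be determined in two ways. In one way, staff manually set a preset division standard per industry, which defines the multiple semantic matching degree value intervals and the multiple scene matching degree value intervals; the search platform then simply queries the preset division standard and extracts the intervals from it. In the other way, a first preset number of sample parameters, for example 100,000, is collected; semantic training is performed on the sample parameters based on a deep machine learning model to obtain multiple semantic matching degree values, from whose magnitude relationship the multiple semantic matching degree value intervals are constructed; and scene training is performed on the sample parameters based on the deep machine learning model to obtain multiple scene matching degree values, from whose magnitude relationship the multiple scene matching degree value intervals are constructed. An online dynamic clustering algorithm is used to form the value intervals, so that they better fit the user's search scene.
Then, for each of the multiple search results, its semantic matching degree and scene matching degree need to be determined. Specifically, for each search result, the semantic matching degree between the search result and the search content can be queried directly; the geographic location of the user when inputting the search content and the store location of the store related to the search result are determined; the location distance between the geographic location and the store location is calculated; and the scene matching degree indicated by the location distance is obtained.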
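The text does not say how the location distance is calculated. As one plausible sketch, assuming both locations are available as latitude/longitude pairs in degrees, the great-circle (haversine) distance can be used; the function name `haversine_m` and the mean Earth radius value are choices made for this illustration.

```python
import math


def haversine_m(lat1, lon1, lat2, lon2, radius_m=6_371_000.0):
    """Great-circle distance in meters between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    # Haversine formula: a is the squared half-chord length between the points.
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_m * math.asin(math.sqrt(a))


# Hypothetical user and store coordinates, one degree of longitude apart on the equator.
distance = haversine_m(0.0, 0.0, 0.0, 1.0)
```

The resulting distance in meters can then be compared directly with distance-based scene matching degree value intervals such as those in the example above.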
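It should be noted that the order of the two operations above (determining the value intervals, and determining each search result's semantic matching degree and scene matching degree) is not restricted; the matching degrees of each search result may be determined first and the value intervals afterwards.
Finally, the semantic matching degree and scene matching degree of each search result are compared with the multiple semantic matching degree value intervals and the multiple scene matching degree value intervals, and search results whose semantic matching degree and scene matching degree both fall in the same value interval are placed in the same group, yielding the multiple information groups and realizing nested sorting over the two dimensions. Continuing the example of three semantic matching degree value intervals and three scene matching degree value intervals: when dividing the information groups, a two-dimensional orthogonal grid as shown in FIG. 2B can be built according to Boolean logic. One dimension of the grid is the three semantic matching degree value intervals, "strongly related", "weakly related" and "unrelated"; the other is the three scene matching degree value intervals, "near", "medium" and "far". When dividing, search results whose semantic matching degree is "strongly related" and whose scene matching degree is "near" are placed in the information group corresponding to the shaded region; those that are "weakly related" and "medium" are placed in the information group corresponding to the blank region; and those that are "unrelated" and "far" are placed in the information group corresponding to the black region. It should be noted that in practice some search results have a semantic matching degree of "weakly related" but a scene matching degree of "near"; in that case, following the division in FIG. 2B, the search result is placed in the information group corresponding to the blank region. In addition, to distinguish the information groups and ease their later integration, each information group can be given a name. Still referring to FIG. 2B: since the information group corresponding to the shaded region contains search results that are highly relevant and near, it can be named "excellent"; since the information group corresponding to the blank region contains search results that are weakly relevant and at medium distance, it can be named "good"; and since the information group corresponding to the black region contains search results that are unrelated and far, it can be named "poor".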
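The grid division can be illustrated with a short Python sketch. The thresholds come from the example intervals in the text; everything else is an assumption made for illustration. In particular, FIG. 2B is not reproduced here, so the sketch assigns every off-diagonal cell the worse of its two levels, which agrees with the four cases the text spells out but may differ from the figure for other cells. Boundary values and semantic scores above 90% are also handled by assumption.

```python
GROUP_NAMES = ["excellent", "good", "poor"]  # shaded, blank, black regions


def semantic_level(score):
    """0 = strongly related, 1 = weakly related, 2 = unrelated."""
    if score >= 0.60:
        return 0
    if score >= 0.30:
        return 1
    return 2


def distance_level(meters):
    """0 = near, 1 = medium, 2 = far."""
    if meters <= 500:
        return 0
    if meters <= 2000:
        return 1
    return 2


def assign_group(semantic_score, distance_m):
    # Assumption: a cell takes the worse (larger) of its two levels.
    level = max(semantic_level(semantic_score), distance_level(distance_m))
    return GROUP_NAMES[level]


print(assign_group(0.45, 300))  # good  (weakly related but near)
```

A table-driven mapping from (semantic level, distance level) to group name would be the natural replacement once the exact layout of the grid is known.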
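203. Sort the search results included in each of the multiple information groups according to multiple preset sorting factors to obtain an intermediate result queue.
In the embodiments of the present application, after the information groups are divided, the search results included in each of the multiple information groups need to be sorted according to multiple preset sorting factors to obtain the intermediate result queue, realizing in-group sorting and integration of the information groups. The multiple preset sorting factors may be one or more of query matching degree, scene matching degree, user matching degree and store quality. The query matching degree may be the text relevance, category consistency, entity consistency, knowledge relevance, general large-model score, etc. between the search result and the search content; the scene matching degree may be the distance tier, smoothed distance score, POI (Point of Interest) type match, etc. of the search result; the user matching degree may be the estimated click-through rate and conversion rate corresponding to the search result; the store quality may be the store service quality score, store material quality score, etc. of the store corresponding to the search result. Since each search result has multiple different factor scores on the multiple preset sorting factors, which cannot be sorted on directly, the factor scores need to be fused into a single relevance score, by which in-group sorting is then performed. The generation of the relevance score is described below, taking any search result in any information group as an example:
First, the multiple factor scores of the search result on the multiple preset sorting factors are queried. Then, the multiple factor weights corresponding to the multiple preset sorting factors are acquired, and a weighted calculation is performed on the factor scores based on the factor weights to obtain the relevance score of the search result. The weighted calculation is shown in Formula 1:
Formula 1: $\text{relevance score} = w_1 \times \text{score}_1 + w_2 \times \text{score}_2 + \dots$
Here, $\text{score}_1$ and $\text{score}_2$ are two factor scores, and $w_1$ and $w_2$ are the corresponding factor weights. It should be noted that search results from different industries differ in sensitivity to the same preset sorting factor. For example, the catering industry is sensitive to scene matching degree, since users usually want to eat nearby, whereas the photography industry is not, since users usually do not care about the distance to a photography store. Therefore, for each of the multiple preset sorting factors, the search intent and search industry corresponding to the search content input by the user can be queried, and training is performed on the search intent, the search industry and each preset sorting factor respectively, based on a linear function, to obtain the multiple factor weights. Specifically, the factor weights can be trained based on Formula 2 below:
Formula 2: $w = f(\text{intention\_id}, \text{trade\_id}, \theta)$
Here, $w$ is the factor weight; $f$ denotes the linear function used; intention_id is the search intent, which may specifically be attributes of the search content such as brand, type, content and address; trade_id is the search industry, such as catering, medicine, retail, entertainment, fitness, and so on; and $\theta$ is the preset sorting factor currently being trained. To further optimize the factor weights so that they satisfy the constraints of each information group's value intervals, Formula 3 below can be applied to the calculated factor weights, so that the factor weights corresponding to the information groups are all relatively optimal weights.
Formula 3: $w' = \arg\max_w \left( \sum_\tau R / |\tau| \right)$
Here, $w'$ is the relatively optimal weight; $\arg\max_w$ indicates the $\sum_\tau R / |\tau|$ that corresponds to $w$ attaining its maximum; $R$ is the result of the weighted summation of the obtained factor weights, taken by position from high to low; and $\tau$ indicates the round of weighted summation.
By repeating the above process of calculating a search result's relevance score, a relevance score can be calculated for each search result.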
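The weighted fusion of factor scores into one relevance score can be sketched as follows. The factor names mirror the four preset sorting factors named in the text, but the identifiers, weights and scores are invented for the example; in the application the weights would come from training rather than being hard-coded.

```python
def relevance_score(factor_scores, factor_weights):
    """Weighted sum of factor scores: w1*score1 + w2*score2 + ..."""
    return sum(factor_weights[name] * score
               for name, score in factor_scores.items())


# Hypothetical weights and scores for one search result.
weights = {"query_match": 0.4, "scene_match": 0.3,
           "user_match": 0.2, "store_quality": 0.1}
scores = {"query_match": 0.9, "scene_match": 0.5,
          "user_match": 0.7, "store_quality": 0.8}

print(round(relevance_score(scores, weights), 2))  # 0.73
```

Keeping weights in a mapping keyed by factor name makes it straightforward to swap in a different weight set per search intent or industry.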
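After the relevance score of each search result is calculated, for each of the multiple information groups, all the search results in the group are sorted in descending order of relevance score to obtain a sorted information group, realizing in-group sorting. Repeating this process calculates relevance scores for, and sorts, the search results in every information group, yielding multiple sorted information groups. As the naming of information groups in step 202 shows, the information groups differ in quality; therefore, after the in-group sorting is complete, the sorted information groups need to be combined in descending order of their group levels to obtain the intermediate result queue. For example, if the information group "excellent" is "ACDBE", the information group "good" is "FHIGJ" and the information group "poor" is "OLKMN", the resulting intermediate result queue is "ACDBE FHIGJ OLKMN".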
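In-group sorting followed by merging in group-level order can be sketched as below. The group names follow the example in the text; the toy relevance scores are invented so that the output reproduces the example queue, and `score_of` stands in for the relevance score computation.

```python
GROUP_ORDER = ("excellent", "good", "poor")  # highest group level first


def build_intermediate_queue(groups, score_of):
    """Sort each group by relevance score (descending), then concatenate by level."""
    queue = []
    for name in GROUP_ORDER:
        queue.extend(sorted(groups.get(name, []), key=score_of, reverse=True))
    return queue


# Hypothetical scores chosen so each group sorts into the order used in the text.
toy_scores = {item: 15 - i for i, item in enumerate("ACDBEFHIGJOLKMN")}
groups = {"good": list("FGHIJ"), "excellent": list("ABCDE"), "poor": list("KLMNO")}

queue = build_intermediate_queue(groups, toy_scores.get)
print("".join(queue))  # ACDBEFHIGJOLKMN
```

Because groups are concatenated whole, a result in a lower group can never outrank a result in a higher group, whatever its relevance score.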
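204. Adjust the order of the search results included in the intermediate result queue based on the correlation between search results at the same rank in the initial result queue and the intermediate result queue, to obtain a target result queue.
In the embodiments of the present application, through steps 202 to 203 above, the search results have already formed an intermediate result queue. However, search results are inherently related to one another, and the user's intent also changes bidirectionally while browsing. Therefore, in the embodiments of the present application, the inherent relations among search results and the influence of the intermediate result queue's context on the user are considered together to adjust and rearrange the intermediate result queue, making the adjusted queue more reasonable and forming a new target result queue. Adjusting the intermediate result queue includes two steps, as follows:
Step 1: Compare the initial result queue with the intermediate result queue and adjust the intermediate result queue.
When comparing the initial result queue with the intermediate result queue, for every two search results at the same rank in the two queues, the two search results are compared to determine the correlation between them; according to that correlation, the order of the two search results is determined and the intermediate result queue is adjusted, yielding the adjusted intermediate result queue. For example, suppose the initial result queue is "12345" and the intermediate result queue is "23541". Then "1" in the initial result queue is compared with "2" in the intermediate result queue, and when "2" is determined to be better than "1", "21" is recorded. Next, "2" in the initial result queue is compared with "3" in the intermediate result queue; when "3" is determined to be better than "2", "3" is added to the recorded content, giving "321". This continues until the whole initial result queue has been compared with the intermediate result queue, and the adjusted intermediate result queue obtained is "32145".
In practice, the inherent relation between two search results can be identified by modeling with GRU (Gate Recurrent Unit, gated recurrent unit) and Attention structures, and the two search results are then compared through a Pointer-network (pointer network); iterating in this way yields the adjusted intermediate result queue.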
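Step 2: Evaluate the adjusted intermediate result queue and output a queue score.
In a search scene, whether the user interacts with the finally output result queue depends not only on the user and the search results themselves but also, to a great extent, on the context. Two kinds of influence are mainly considered here. One is the bidirectional change of the user's intent while browsing. For example, when scrolling through search results, a user who has reached the 7th result may want to recall what the 3rd result was and scroll back up to find it; that is, the user does not only browse forward but may also look back. In general, when browsing a page, besides the change of intent that occurs while browsing the search results in order, later search results also affect the user's intent, especially in a two-column feed. The other is the collaborative relation between stores; this influence, which is independent of position, helps extract longer-term dependencies. For example, suppose the user only likes hot pot; then however hot pot and rice noodles are ordered, the user's choice of hot pot is unaffected: even if rice noodles are ranked first, the user will scroll down to find hot pot, so in this scene the two search results do not affect each other. If instead the user likes both hot pot and rice noodles, there is a strong inherent relation between the two, and whichever is ranked first is more likely to be chosen. Therefore, only by capturing the influence of the context in the result list can true context awareness be achieved.
Therefore, an evaluator is provided in the search platform. The evaluator is trained on multiple sample queues and indicates the score of each of them; it can score the adjusted intermediate result queue according to the relation between user intent and search results, so that the highest-scoring result queue can later be selected as the final result queue. Specifically, the evaluator can be obtained by modeling the collaborative relation between user intent and search results with Bi-LSTM (Bidirectional Long Short-Term Memory) and Self-attention (self-attention mechanism). Since the evaluator indicates the scores corresponding to a large number of sample queues, the adjusted intermediate result queue is input into the evaluator for evaluation, the queue score output by the evaluator is obtained, and the adjusted intermediate result queue is marked with the queue score. For example, if the adjusted intermediate result queue is "34521" and the score corresponding to the sample queue "34521" in the evaluator is 10, the queue score of the adjusted intermediate result queue is 10.
The comparison in step 1 and the scoring in step 2 are repeated: the adjusted intermediate result queue is compared with the initial result queue and readjusted, until the number of adjustments reaches the preset number of rotations, yielding as many queue scores as the preset number of rotations. Then the target queue score with the highest queue score, and the target intermediate result queue marked with it, are extracted from those queue scores. Since the output result list needs a bounded length, a second preset number may be determined; that many search results are taken from the head of the target intermediate result queue and used as the target result queue.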
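The adjust, score, repeat, select and truncate loop can be sketched as below. Only the control flow is taken from the text. The `adjust` and `evaluate` callables are placeholders: the toy versions here (rotate the queue by one position; count positions that agree with the initial queue) are deliberately trivial stand-ins and are not the GRU/Pointer-network adjustment or the Bi-LSTM evaluator described in the application.

```python
def select_target_queue(initial, intermediate, adjust, evaluate, rounds, top_n):
    """Run `rounds` adjust+score iterations, keep the best queue, truncate to top_n."""
    if rounds < 1:
        raise ValueError("rounds must be at least 1")
    best_queue, best_score = None, float("-inf")
    current = list(intermediate)
    for _ in range(rounds):
        current = adjust(initial, current)   # step 1: readjust the queue
        score = evaluate(current)            # step 2: score the adjusted queue
        if score > best_score:
            best_queue, best_score = list(current), score
    return best_queue[:top_n], best_score


initial = [1, 2, 3, 4, 5]


def toy_adjust(initial_queue, queue):
    # Placeholder adjustment: rotate left by one position.
    return queue[1:] + queue[:1]


def toy_evaluate(queue):
    # Placeholder evaluator: number of positions matching the initial queue.
    return sum(1 for a, b in zip(initial, queue) if a == b)


target, score = select_target_queue(initial, [2, 3, 5, 4, 1],
                                    toy_adjust, toy_evaluate, rounds=5, top_n=3)
print(target, score)  # [1, 2, 3] 3
```

Here `rounds` plays the role of the preset number of rotations and `top_n` the role of the second preset number.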
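It should be noted that in the embodiments of the present application, the second preset number is applied when determining the target result queue; in practice, however, the adjusted intermediate result queue can be truncated to the second preset number as soon as it is obtained in step 1, reducing the load of subsequently building the evaluator and evaluating queues with it. Furthermore, the second preset number can be set according to the length of the information group "excellent": it can equal or exceed that length, so that all search results in the information group "excellent" are considered while the selection can also extend into the information group "good", preventing good search results from being filtered out.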
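205. Adjust the order of the search results included in the target result queue according to the regulation requirements.
In the embodiments of the present application, in some search scenes regulation requirements are set as campaigns are launched and merchants' demands change, such as scattering search results of the same brand, strengthening distance order preservation, weighting newly added search results, anti-cheating, and so on. Therefore, in practice, the preset regulation strategy needs to be queried, its regulation requirements determined, and the order of the search results included in the target result queue adjusted accordingly to obtain the regulated target result queue, which is subsequently output.
The preset regulation strategies fall into three types. The first is the service strategy, such as scattering search results of the same brand and strengthening distance order preservation. The second is the order-preserving strategy, which regulates the search content in the target result queue according to the real-time traffic in the search platform. The third is the anti-cheating strategy, which connects to an external anti-cheating system and intervenes in the target result queue based on that system.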
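As an illustration of one service strategy, the sketch below scatters results of the same brand with a simple greedy pass. The text does not specify how scattering is done, so the algorithm, the function name `scatter_brands` and the `brand_of` accessor are all assumptions made for this example.

```python
def scatter_brands(queue, brand_of):
    """Greedy scatter: avoid placing two results of the same brand next to each other."""
    remaining = list(queue)
    out = []
    while remaining:
        prev_brand = brand_of(out[-1]) if out else None
        # Pick the earliest remaining result whose brand differs from the previous
        # one; fall back to the head of the queue if every remaining brand matches.
        idx = next((i for i, r in enumerate(remaining)
                    if brand_of(r) != prev_brand), 0)
        out.append(remaining.pop(idx))
    return out


# Hypothetical results named "<brand><n>"; the brand is the first character.
regulated = scatter_brands(["a1", "a2", "b1", "a3", "c1"], lambda r: r[0])
print(regulated)  # ['a1', 'b1', 'a2', 'c1', 'a3']
```

The greedy pass always takes the earliest eligible result, so the original ranking is disturbed as little as possible; when one brand dominates the queue, adjacent duplicates can still remain at the tail.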
206. Output the target result queue.
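In this embodiment, once the target result queue is obtained, it is output. Through the above process, multi-dimensional grouping of the initial result queue, intra-group sorting, queue rearrangement, and service-driven reordering are achieved; search results are ordered by quality so that excellent search results are ranked first. Moreover, in this application, both intra-group sorting and queue rearrangement are learned by large-scale deep machine learning models, which can handle complex, high-dimensional parameter spaces, so the proposed method generalizes well across many scenarios.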
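With the method provided by this embodiment, after the initial result queue is obtained according to the user's search content, the search results in the initial result queue are divided into multiple information groups according to the matching degree between each search result and the search content and the search scenario the user is in when searching; the search results included in the information groups are sorted within each group according to multiple preset ranking factors; and the sorted information groups are merged to obtain the intermediate result queue. Then, based on the correlation between search results at the same position in the initial result queue and the intermediate result queue, the order of the search results included in the intermediate result queue is adjusted to obtain and output the target result queue. In this way, both the scenario the user is in when searching and the correlation between search results are taken into account when the target result queue is ordered. This achieves diversified ranking of search results, ensures that the search results ranked first in the target result queue fit the user's search needs, improves the accuracy of the output search results, and offers good generality.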
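Further, as a specific implementation of the method shown in FIG. 1, an embodiment of this application provides a search result output apparatus. As shown in FIG. 3A, the apparatus includes an acquisition module 301, a division module 302, a sorting module 303, an adjustment module 304, and an output module 305.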
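The acquisition module 301 is configured to acquire an initial result queue, the initial result queue including multiple search results that are determined based on search content input by a user and are related to the search content.
The division module 302 is configured to divide the multiple search results into multiple information groups according to the matching degree between each of the multiple search results and the search content and the search scenario the user is in when searching.
The sorting module 303 is configured to sort the search results included in each of the multiple information groups according to multiple preset ranking factors to obtain an intermediate result queue.
The adjustment module 304 is configured to adjust the order of the search results included in the intermediate result queue, based on the correlation between search results at the same position in the initial result queue and the intermediate result queue, to obtain a target result queue.
The output module 305 is configured to output the target result queue.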
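In a specific application scenario, the acquisition module 301 is configured to: receive the search content input by the user; parse the search content and query the multiple search results related to the search content; perform click-through rate (CTR) estimation on the multiple search results and output multiple estimated click-through rates for them; and sort the multiple search results according to the multiple estimated click-through rates to obtain the initial result queue.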
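In a specific application scenario, the division module 302 is configured to: determine multiple semantic matching degree value intervals and multiple scenario matching degree value intervals; for each of the multiple search results, determine the semantic matching degree and the scenario matching degree of the search result, where the semantic matching degree indicates how semantically related the search result is to the search content and the scenario matching degree indicates how related the search result is to the search scenario; compare the semantic matching degree and the scenario matching degree of the search result with the multiple semantic matching degree value intervals and the multiple scenario matching degree value intervals; and place search results whose semantic matching degree and scenario matching degree both fall in the same value interval into the same group, obtaining the multiple information groups.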
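In a specific application scenario, the division module 302 is configured to: for each of the multiple search results, query the semantic matching degree between the search result and the search content; determine the geographic location of the user when inputting the search content and the store location of the store related to the search result; and calculate the distance between the geographic location and the store location and obtain the scenario matching degree indicated by that distance.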
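In a specific application scenario, the division module 302 is configured to: query a preset division standard and extract from it the multiple semantic matching degree value intervals and the multiple scenario matching degree value intervals; or, collect a first preset number of sample parameters, perform semantic training on the first preset number of sample parameters based on a deep machine learning model to obtain multiple semantic matching degree values, construct the multiple semantic matching degree value intervals according to the magnitude relationship among those values, perform scenario training on the first preset number of sample parameters based on the deep machine learning model to obtain multiple scenario matching degree values, and construct the multiple scenario matching degree value intervals according to the magnitude relationship among those values.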
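The division performed by the division module 302 can be sketched as follows, under several assumptions that the description leaves open: the distance is computed with the haversine formula, the scenario matching degree is taken as `1 / (1 + distance_km)` (an invented mapping, used only so that closer stores score higher), the value intervals come from preset boundary lists, and each result is a dict with hypothetical `semantic` and `store_pos` fields.

```python
import math
from bisect import bisect_right
from collections import defaultdict
from typing import Dict, List, Sequence, Tuple

Position = Tuple[float, float]  # (latitude, longitude) in degrees


def haversine_km(a: Position, b: Position) -> float:
    """Great-circle distance between two points, in kilometres."""
    lat1, lat2 = math.radians(a[0]), math.radians(b[0])
    dlat = lat2 - lat1
    dlon = math.radians(b[1] - a[1])
    h = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def scenario_matching_degree(user_pos: Position, store_pos: Position) -> float:
    """Assumed mapping from distance to matching degree (closer is higher)."""
    return 1.0 / (1.0 + haversine_km(user_pos, store_pos))


def divide_into_groups(
    results: List[dict],
    user_pos: Position,
    semantic_bounds: Sequence[float],
    scenario_bounds: Sequence[float],
) -> Dict[Tuple[int, int], List[dict]]:
    """Group results by the pair of intervals their two degrees fall into."""
    groups: Dict[Tuple[int, int], List[dict]] = defaultdict(list)
    for r in results:
        sem_idx = bisect_right(semantic_bounds, r["semantic"])
        scn_idx = bisect_right(
            scenario_bounds, scenario_matching_degree(user_pos, r["store_pos"])
        )
        groups[(sem_idx, scn_idx)].append(r)
    return dict(groups)
```

With one boundary per dimension this yields up to four groups; how those groups map to grades such as "excellent" and "good" is left to the preset division standard.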
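In a specific application scenario, the sorting module 303 is configured to: for each of the multiple information groups, calculate a relevance score for each search result included in the information group based on the multiple preset ranking factors; sort all search results included in the information group in descending order of relevance score to obtain a sorted information group; and combine the sorted information groups in descending order of the group levels corresponding to the multiple information groups to obtain the intermediate result queue.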
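In a specific application scenario, the sorting module 303 is configured to: for each search result included in the information group, query multiple factor scores of the search result on the multiple preset ranking factors; and obtain multiple factor weights corresponding to the multiple preset ranking factors and perform a weight calculation on the multiple factor scores based on the multiple factor weights to obtain the relevance score of the search result.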
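The intra-group sorting and group combination performed by the sorting module 303 can be sketched as below. The "weight calculation" is assumed here to be a weighted sum, which the description does not confirm; the factor names, the `factors` field, and the numeric group levels are hypothetical.

```python
from typing import Dict, List, Tuple


def relevance_score(factor_scores: Dict[str, float], factor_weights: Dict[str, float]) -> float:
    """Weighted sum of factor scores (assumed form of the weight calculation)."""
    return sum(w * factor_scores[name] for name, w in factor_weights.items())


def build_intermediate_queue(
    groups: List[Tuple[int, List[dict]]],
    factor_weights: Dict[str, float],
) -> List[dict]:
    """Sort each group by relevance score, then concatenate groups by level.

    `groups` holds (group_level, results) pairs; a higher level means a
    better group, so higher levels are placed first in the queue.
    """
    queue: List[dict] = []
    for _, results in sorted(groups, key=lambda g: g[0], reverse=True):
        ranked = sorted(
            results,
            key=lambda r: relevance_score(r["factors"], factor_weights),
            reverse=True,
        )
        queue.extend(ranked)
    return queue
```

Sorting groups before results keeps every result of a higher-level group ahead of all results of lower-level groups, whatever their individual scores.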
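In a specific application scenario, the sorting module 303 is configured to: for each of the multiple preset ranking factors, query the search intent and the search industry corresponding to the search content input by the user; and, based on a linear function, train on the search intent, the search industry, and each preset ranking factor respectively to obtain the multiple factor weights.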
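In a specific application scenario, the adjustment module 304 is configured to: for every two search results at the same position in the initial result queue and the intermediate result queue, compare the two search results and determine the correlation between them; determine the order of the two search results according to that correlation and adjust the intermediate result queue to obtain an adjusted intermediate result queue; input the adjusted intermediate result queue to an evaluator for evaluation, obtain the queue score output by the evaluator, and label the adjusted intermediate result queue with the queue score, where the evaluator is trained on multiple sample queues and indicates a score for each of the multiple sample queues; repeat the above comparison process, comparing the adjusted intermediate result queue with the initial result queue and readjusting the adjusted intermediate result queue, until the number of adjustments reaches a preset number of rounds, to obtain as many queue scores as the preset number of rounds; extract, from those queue scores, the target queue score with the highest value and the target intermediate result queue labeled with the target queue score; and determine a second preset number, take the second preset number of search results from the head of the target intermediate result queue, and use them as the target result queue.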
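In a specific application scenario, as shown in FIG. 3B, the apparatus further includes a query module 306.
The query module 306 is configured to query a preset regulation strategy and determine the regulation requirements of the preset regulation strategy.
The adjustment module 304 is further configured to adjust the order of the search results included in the target result queue according to the regulation requirements to obtain a regulated target result queue.
The output module 305 is further configured to output the regulated target result queue.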
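With the apparatus provided by this embodiment, after the initial result queue is obtained according to the user's search content, the search results in the initial result queue are divided into multiple information groups according to the matching degree between each search result and the search content and the search scenario the user is in when searching; the search results included in the information groups are sorted within each group according to multiple preset ranking factors; and the sorted information groups are merged to obtain the intermediate result queue. Then, based on the correlation between search results at the same position in the initial result queue and the intermediate result queue, the order of the search results included in the intermediate result queue is adjusted to obtain and output the target result queue. In this way, both the scenario the user is in when searching and the correlation between search results are taken into account when the target result queue is ordered. This achieves diversified ranking of search results, ensures that the search results ranked first in the target result queue fit the user's search needs, improves the accuracy of the output search results, and offers good generality.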
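It should be noted that, for other descriptions of the functional units of the search result output apparatus provided by this embodiment, reference may be made to the corresponding descriptions of FIG. 1 and FIG. 2A; details are not repeated here.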
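In an exemplary embodiment, referring to FIG. 4, a device is further provided. The device includes a bus, a processor, a memory, and a communication interface, and may further include an input/output interface and a display device; the functional units communicate with one another over the bus. The memory stores a computer program, and the processor is configured to execute the program stored in the memory to perform the search result output method of the above embodiments.
A computer-readable storage medium is also provided, which stores a computer program that, when executed by a processor, implements the steps of the search result output method.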
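From the description of the above implementations, those skilled in the art can clearly understand that this application can be implemented in hardware, or in software together with a necessary general-purpose hardware platform. On this understanding, the technical solution of this application can be embodied as a software product, which can be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) and includes a number of instructions that cause a computer device (such as a personal computer, a server, or a network device) to execute the methods described in the implementation scenarios of this application.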
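Those skilled in the art will understand that the accompanying drawings are only schematic diagrams of a preferred implementation scenario, and that the modules or processes in the drawings are not necessarily required to implement this application.
Those skilled in the art will understand that the modules of an apparatus in an implementation scenario may be distributed in the apparatus as described for that scenario, or may be changed accordingly and located in one or more apparatuses different from that of the scenario. The modules of the above implementation scenarios may be combined into one module or further split into multiple sub-modules.
The serial numbers used in this application are for description only and do not indicate the relative merits of the implementation scenarios.
The above discloses only several specific implementation scenarios of this application; however, this application is not limited to them, and any variation that those skilled in the art can conceive shall fall within the scope of protection of this application.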

Claims (22)

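  1. A search result output method, characterized by comprising:
    acquiring an initial result queue, the initial result queue including multiple search results that are determined based on search content input by a user and are related to the search content;
    dividing the multiple search results into multiple information groups according to the matching degree between each of the multiple search results and the search content and the search scenario the user is in when searching;
    sorting the search results included in each of the multiple information groups according to multiple preset ranking factors to obtain an intermediate result queue;
    adjusting the order of the search results included in the intermediate result queue, based on the correlation between search results at the same position in the initial result queue and the intermediate result queue, to obtain a target result queue;
    outputting the target result queue.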
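  2. The method according to claim 1, characterized in that acquiring the initial result queue comprises:
    receiving the search content input by the user;
    parsing the search content and querying the multiple search results related to the search content;
    performing click-through rate (CTR) estimation on the multiple search results and outputting multiple estimated click-through rates for the multiple search results;
    sorting the multiple search results according to the multiple estimated click-through rates to obtain the initial result queue.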
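  3. The method according to claim 1, characterized in that dividing the multiple search results into multiple information groups according to the matching degree between each of the multiple search results and the search content and the search scenario the user is in when searching comprises:
    determining multiple semantic matching degree value intervals and multiple scenario matching degree value intervals;
    for each of the multiple search results,
    determining the semantic matching degree and the scenario matching degree of the search result, the semantic matching degree indicating how semantically related the search result is to the search content, and the scenario matching degree indicating how related the search result is to the search scenario;
    comparing the semantic matching degree and the scenario matching degree of the search result with the multiple semantic matching degree value intervals and the multiple scenario matching degree value intervals;
    placing search results whose semantic matching degree and scenario matching degree both fall in the same value interval into the same group to obtain the multiple information groups.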
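  4. The method according to claim 3, characterized in that determining the semantic matching degree and the scenario matching degree of the search result comprises:
    querying the semantic matching degree between the search result and the search content;
    determining the geographic location of the user when inputting the search content and the store location of the store related to the search result;
    calculating the distance between the geographic location and the store location and obtaining the scenario matching degree indicated by the distance.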
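  5. The method according to claim 3, characterized in that determining the multiple semantic matching degree value intervals and the multiple scenario matching degree value intervals comprises:
    querying a preset division standard and extracting from the preset division standard the multiple semantic matching degree value intervals and the multiple scenario matching degree value intervals; or,
    collecting a first preset number of sample parameters, performing semantic training on the first preset number of sample parameters based on a deep machine learning model to obtain multiple semantic matching degree values, constructing the multiple semantic matching degree value intervals according to the magnitude relationship among the multiple semantic matching degree values, performing scenario training on the first preset number of sample parameters based on the deep machine learning model to obtain multiple scenario matching degree values, and constructing the multiple scenario matching degree value intervals according to the magnitude relationship among the multiple scenario matching degree values.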
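  6. The method according to claim 1, characterized in that sorting the search results included in each of the multiple information groups according to multiple preset ranking factors to obtain an intermediate result queue comprises:
    for each of the multiple information groups,
    calculating a relevance score for each search result included in the information group based on the multiple preset ranking factors;
    sorting all search results included in the information group in descending order of relevance score to obtain a sorted information group;
    combining the sorted information groups in descending order of the group levels corresponding to the multiple information groups to obtain the intermediate result queue.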
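  7. The method according to claim 6, characterized in that calculating a relevance score for each search result included in the information group based on the multiple preset ranking factors comprises:
    for each search result included in the information group,
    querying multiple factor scores of the search result on the multiple preset ranking factors;
    obtaining multiple factor weights corresponding to the multiple preset ranking factors, and performing a weight calculation on the multiple factor scores based on the multiple factor weights to obtain the relevance score of the search result.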
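  8. The method according to claim 7, characterized in that obtaining the multiple factor weights corresponding to the multiple preset ranking factors comprises:
    for each of the multiple preset ranking factors, querying the search intent and the search industry corresponding to the search content input by the user;
    based on a linear function, training on the search intent, the search industry, and each preset ranking factor respectively to obtain the multiple factor weights.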
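  9. The method according to claim 1, characterized in that adjusting the order of the search results included in the intermediate result queue, based on the correlation between search results at the same position in the initial result queue and the intermediate result queue, to obtain a target result queue comprises:
    for every two search results at the same position in the initial result queue and the intermediate result queue, comparing the two search results and determining the correlation between the two search results;
    determining the order of the two search results according to the correlation between the two search results and adjusting the intermediate result queue to obtain an adjusted intermediate result queue;
    inputting the adjusted intermediate result queue to an evaluator for evaluation, obtaining the queue score output by the evaluator, and labeling the adjusted intermediate result queue with the queue score, the evaluator being trained on multiple sample queues and indicating a score for each of the multiple sample queues;
    repeating the above comparison process, comparing the adjusted intermediate result queue with the initial result queue and readjusting the adjusted intermediate result queue, until the number of adjustments reaches a preset number of rounds, to obtain as many queue scores as the preset number of rounds;
    extracting, from the queue scores whose number equals the preset number of rounds, the target queue score with the highest value and the target intermediate result queue labeled with the target queue score;
    determining a second preset number, taking the second preset number of search results from the head of the target intermediate result queue, and using the second preset number of search results as the target result queue.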
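  10. The method according to claim 1, characterized in that, after adjusting the order of the search results included in the intermediate result queue, based on the correlation between search results at the same position in the initial result queue and the intermediate result queue, to obtain a target result queue, the method further comprises:
    querying a preset regulation strategy and determining the regulation requirements of the preset regulation strategy;
    adjusting the order of the search results included in the target result queue according to the regulation requirements to obtain a regulated target result queue;
    outputting the regulated target result queue.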
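  11. A search result output apparatus, characterized by comprising:
    an acquisition module, configured to acquire an initial result queue, the initial result queue including multiple search results that are determined based on search content input by a user and are related to the search content;
    a division module, configured to divide the multiple search results into multiple information groups according to the matching degree between each of the multiple search results and the search content and the search scenario the user is in when searching;
    a sorting module, configured to sort the search results included in each of the multiple information groups according to multiple preset ranking factors to obtain an intermediate result queue;
    an adjustment module, configured to adjust the order of the search results included in the intermediate result queue, based on the correlation between search results at the same position in the initial result queue and the intermediate result queue, to obtain a target result queue;
    an output module, configured to output the target result queue.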
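  12. The apparatus according to claim 11, characterized in that the acquisition module is configured to: receive the search content input by the user; parse the search content and query the multiple search results related to the search content; perform click-through rate (CTR) estimation on the multiple search results and output multiple estimated click-through rates for the multiple search results; and sort the multiple search results according to the multiple estimated click-through rates to obtain the initial result queue.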
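  13. The apparatus according to claim 11, characterized in that the division module is configured to: determine multiple semantic matching degree value intervals and multiple scenario matching degree value intervals; for each of the multiple search results, determine the semantic matching degree and the scenario matching degree of the search result, the semantic matching degree indicating how semantically related the search result is to the search content, and the scenario matching degree indicating how related the search result is to the search scenario; compare the semantic matching degree and the scenario matching degree of the search result with the multiple semantic matching degree value intervals and the multiple scenario matching degree value intervals; and place search results whose semantic matching degree and scenario matching degree both fall in the same value interval into the same group to obtain the multiple information groups.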
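  14. The apparatus according to claim 13, characterized in that the division module is configured to: for each of the multiple search results, query the semantic matching degree between the search result and the search content; determine the geographic location of the user when inputting the search content and the store location of the store related to the search result; and calculate the distance between the geographic location and the store location and obtain the scenario matching degree indicated by the distance.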
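  15. The apparatus according to claim 13, characterized in that the division module is configured to: query a preset division standard and extract from the preset division standard the multiple semantic matching degree value intervals and the multiple scenario matching degree value intervals; or, collect a first preset number of sample parameters, perform semantic training on the first preset number of sample parameters based on a deep machine learning model to obtain multiple semantic matching degree values, construct the multiple semantic matching degree value intervals according to the magnitude relationship among the multiple semantic matching degree values, perform scenario training on the first preset number of sample parameters based on the deep machine learning model to obtain multiple scenario matching degree values, and construct the multiple scenario matching degree value intervals according to the magnitude relationship among the multiple scenario matching degree values.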
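  16. The apparatus according to claim 11, characterized in that the sorting module is configured to: for each of the multiple information groups, calculate a relevance score for each search result included in the information group based on the multiple preset ranking factors; sort all search results included in the information group in descending order of relevance score to obtain a sorted information group; and combine the sorted information groups in descending order of the group levels corresponding to the multiple information groups to obtain the intermediate result queue.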
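  17. The apparatus according to claim 16, characterized in that the sorting module is configured to: for each search result included in the information group, query multiple factor scores of the search result on the multiple preset ranking factors; and obtain multiple factor weights corresponding to the multiple preset ranking factors and perform a weight calculation on the multiple factor scores based on the multiple factor weights to obtain the relevance score of the search result.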
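  18. The apparatus according to claim 17, characterized in that the sorting module is configured to: for each of the multiple preset ranking factors, query the search intent and the search industry corresponding to the search content input by the user; and, based on a linear function, train on the search intent, the search industry, and each preset ranking factor respectively to obtain the multiple factor weights.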
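  19. The apparatus according to claim 11, characterized in that the adjustment module is configured to: for every two search results at the same position in the initial result queue and the intermediate result queue, compare the two search results and determine the correlation between the two search results; determine the order of the two search results according to the correlation between the two search results and adjust the intermediate result queue to obtain an adjusted intermediate result queue; input the adjusted intermediate result queue to an evaluator for evaluation, obtain the queue score output by the evaluator, and label the adjusted intermediate result queue with the queue score, the evaluator being trained on multiple sample queues and indicating a score for each of the multiple sample queues; repeat the above comparison process, comparing the adjusted intermediate result queue with the initial result queue and readjusting the adjusted intermediate result queue, until the number of adjustments reaches a preset number of rounds, to obtain as many queue scores as the preset number of rounds; extract, from the queue scores whose number equals the preset number of rounds, the target queue score with the highest value and the target intermediate result queue labeled with the target queue score; and determine a second preset number, take the second preset number of search results from the head of the target intermediate result queue, and use the second preset number of search results as the target result queue.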
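  20. The apparatus according to claim 11, characterized in that the apparatus further comprises:
    a query module, configured to query a preset regulation strategy and determine the regulation requirements of the preset regulation strategy;
    the adjustment module being further configured to adjust the order of the search results included in the target result queue according to the regulation requirements to obtain a regulated target result queue;
    the output module being further configured to output the regulated target result queue.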
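  21. A computer device, comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method according to any one of claims 1 to 10.
  22. A readable storage medium storing a computer program, characterized in that the computer program, when executed by a processor, implements the steps of the method according to any one of claims 1 to 10.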
PCT/CN2022/099454 2021-06-17 2022-06-17 Search result output method and apparatus, computer device, and readable storage medium WO2022262849A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202110673372.0 2021-06-17
CN202110673372.0A CN113254810B (zh) 2021-06-17 2021-06-17 Search result output method and apparatus, computer device, and readable storage medium

Publications (1)

Publication Number Publication Date
WO2022262849A1 true WO2022262849A1 (zh) 2022-12-22

Family

ID=77188557

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2022/099454 WO2022262849A1 (zh) 2021-06-17 2022-06-17 Search result output method and apparatus, computer device, and readable storage medium

Country Status (2)

Country Link
CN (1) CN113254810B (zh)
WO (1) WO2022262849A1 (zh)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
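CN113254810B (zh) * 2021-06-17 2021-10-29 浙江口碑网络技术有限公司 Search result output method and apparatus, computer device, and readable storage medium
CN113672700B (zh) * 2021-08-18 2023-10-13 北京达佳互联信息技术有限公司 Content item search method and apparatus, electronic device, and storage medium
CN113407856B (zh) * 2021-08-19 2022-04-29 北京金堤征信服务有限公司 Search result ranking method and apparatus, and electronic device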

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150234827A1 (en) * 2012-08-22 2015-08-20 Baidu Online Network Technology (Beijing) Co., Ltd Method, apparatus, and device for ranking search results
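CN105302898A (zh) * 2015-10-23 2016-02-03 天津车之家科技有限公司 Search ranking method and apparatus based on a click model
CN110046308A (zh) * 2019-03-07 2019-07-23 北京搜狗科技发展有限公司 Ranking strategy determination method and apparatus, and electronic device
CN111563207A (zh) * 2020-07-14 2020-08-21 口碑(上海)信息技术有限公司 Search result ranking method and apparatus, storage medium, and computer device
CN112000871A (zh) * 2020-08-21 2020-11-27 北京三快在线科技有限公司 Method, apparatus, device, and storage medium for determining a search result list
CN113254810A (zh) * 2021-06-17 2021-08-13 浙江口碑网络技术有限公司 Search result output method and apparatus, computer device, and readable storage medium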

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103646106B (zh) * 2013-12-23 2016-05-25 山东大学 Web topic ranking method based on content similarity
CN104951468A (zh) * 2014-03-28 2015-09-30 阿里巴巴集团控股有限公司 Data search processing method and system
CN107368510B (zh) * 2017-04-10 2018-08-31 口碑(上海)信息技术有限公司 Store search ranking method and apparatus

Also Published As

Publication number Publication date
CN113254810A (zh) 2021-08-13
CN113254810B (zh) 2021-10-29


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 22824321

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE