WO2021196541A1 - Method, apparatus, device, and computer-readable storage medium for searching content - Google Patents
- Publication number
- WO2021196541A1 (PCT/CN2020/117129)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- historical search
- search
- historical
- result
- records
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2457—Query processing with adaptation to user needs
- G06F16/24578—Query processing with adaptation to user needs using ranking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/245—Query processing
- G06F16/2453—Query optimisation
- G06F16/24534—Query rewriting; Transformation
- G06F16/24539—Query rewriting; Transformation using cached or materialised query results
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/332—Query formulation
- G06F16/3325—Reformulation based on results of preceding query
- G06F16/3326—Reformulation based on results of preceding query using relevance feedback from the user, e.g. relevance feedback on documents, documents sets, document terms or passages
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/33—Querying
- G06F16/3331—Query processing
- G06F16/3349—Reuse of stored results of previous queries
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9536—Search customisation based on social or collaborative filtering
Definitions
- the embodiments of the present disclosure mainly relate to the field of data processing, and more specifically, to methods, apparatuses, devices, and computer-readable storage media for searching content.
- To ease the difficulty of information retrieval, many search engines have appeared to help users find information. A search engine collects information from a large number of websites, stores it locally, and builds information databases by processing it. Users can then quickly obtain the content they are looking for by entering a search term in the search engine. However, many problems still need to be solved in the process of using search engines to find content.
- a method for searching content includes obtaining multiple historical search records related to multiple historical search requests in response to receiving a search request for a target search term, each historical search record including a historical search term targeted by a corresponding historical search request.
- the method further includes determining a first historical search record matching the target search term from a plurality of historical search records.
- the method further includes determining a second historical search record associated with the first historical search record from the plurality of historical search records based on the relationship between the plurality of historical search records.
- the method further includes determining an expanded result for the target search item based on the search result corresponding to the second historical search record.
- an apparatus for searching content includes: a historical search record obtaining module configured to obtain, in response to receiving a search request for a target search item, multiple historical search records related to multiple historical search requests, each historical search record including the historical search item targeted by the corresponding historical search request; a first historical search record determination module configured to determine, from the multiple historical search records, a first historical search record matching the target search item; a second historical search record determination module configured to determine, based on the relationship between the multiple historical search records, a second historical search record associated with the first historical search record from the multiple historical search records; and an expanded result determination module configured to determine the expanded result for the target search item based on the search result corresponding to the second historical search record.
- an electronic device including one or more processors; and a storage device for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method according to the first aspect of the present disclosure.
- a computer-readable storage medium having a computer program stored thereon, and when the program is executed by a processor, the method according to the first aspect of the present disclosure is implemented.
- FIG. 1 shows a schematic diagram of an example 100 of providing a recommendation result according to a traditional solution
- FIG. 2 shows a schematic diagram of an environment 200 in which multiple embodiments of the present disclosure can be implemented
- FIG. 3 shows a flowchart of a method 300 for searching content according to some embodiments of the present disclosure
- FIG. 4 shows a flowchart of a method 400 for obtaining multiple historical search records according to some embodiments of the present disclosure
- FIG. 5 shows a flowchart of a method 500 for determining historical search record categories and relationships according to some embodiments of the present disclosure
- FIG. 6 shows a flowchart of a method 600 for determining the relationship between historical search records according to some embodiments of the present disclosure
- FIG. 7 shows a block diagram of an apparatus 700 for searching content according to some embodiments of the present disclosure
- FIG. 8 shows a block diagram of an apparatus 800 for searching content according to some embodiments of the present disclosure.
- FIG. 9 shows a block diagram of a device 900 capable of implementing various embodiments of the present disclosure.
- FIG. 1 shows a schematic diagram of an example 100 in which a traditional solution provides recommended search terms. After the user enters "Liu**" in the search engine, two recommendation boxes 102 and 104 are provided. Some recommended search terms are provided in box 102, and some recommended search terms are also provided in box 104.
- the recommended search items given by the traditional solution cannot directly meet the relevant needs of the user, and the user is required to click the search item to manually filter the document resources that can meet the needs in the new search page.
- the search term text in the traditional solution is generally short, and its attractiveness as recommended content is weak, and the search term is generated through user-generated content, and it is difficult to control its quality and safety.
- an improved solution for searching content is proposed.
- multiple historical search records related to multiple historical search requests are first obtained, wherein each historical search record includes the historical search item targeted by the corresponding historical search request. Then, the first historical search record matching the target search item is determined from the multiple historical search records. Based on the relationship between the multiple historical search records, a second historical search record associated with the first historical search record is determined from the multiple historical search records. Finally, based on the search result corresponding to the second historical search record, the expanded result for the target search item is determined.
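The four steps above can be sketched end to end as follows. This is a minimal illustration rather than the disclosed implementation: the `HistoricalRecord` layout, the exact-match lookup, the choice of the single most strongly associated record, and the sample association value `0.8` are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class HistoricalRecord:
    term: str                                     # historical search item
    related: dict = field(default_factory=dict)   # associated term -> degree of association (assumed layout)
    results: list = field(default_factory=list)   # stored search results for this term (assumed)


def expanded_results(target_term, records):
    """Return the stored results of the record most associated with the matching record."""
    by_term = {r.term: r for r in records}
    first = by_term.get(target_term)              # first record: exact match on the search item
    if first is None or not first.related:
        return []
    # second record: the associated term with the highest degree of association
    second_term = max(first.related, key=first.related.get)
    second = by_term.get(second_term)
    return list(second.results) if second else []


records = [
    HistoricalRecord("How much is the Mercedes-Benz C200",
                     related={"Photo of Mercedes-Benz C200": 0.8}),
    HistoricalRecord("Photo of Mercedes-Benz C200", results=["photo-gallery-page"]),
]
print(expanded_results("How much is the Mercedes-Benz C200", records))
```

A target search item with no matching record simply yields an empty expanded result in this sketch.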
- Figure 2 shows a schematic diagram of an environment 200 in which multiple embodiments of the present disclosure can be implemented.
- a terminal device 204 and a computing device 208 are included in this example environment 200.
- the computing device 208 provides the user 202 with an expanded result 212 for the search request 206 based on the search request 206 from the terminal device 204.
- the terminal device 204 may run an application or program for searching, such as a search engine application.
- the terminal device 204 receives the target search item input by the user 202, for example, the user 202 inputs "How much is the Mercedes-Benz C200".
- the terminal device 204 then generates a search request 206 for the target search term and sends the search request 206 to the computing device 208.
- Terminal devices 204 include, but are not limited to, personal computers, server computers, handheld or laptop devices, mobile devices (such as mobile phones, personal digital assistants (PDAs), media players, etc.), multi-processor systems, consumer electronics, minicomputers, mainframe computers, distributed computing environments including any of the above systems or devices, etc.
- Computing devices 208 include, but are not limited to, personal computers, server computers, handheld or laptop devices, multi-processor systems, consumer electronics, minicomputers, mainframe computers, distributed computing environments including any of the above systems or devices, and virtual machines or other computing devices in a cloud platform.
- After the computing device 208 receives the search request 206 from the terminal device 204, it not only generates search results for the target search item in the search request 206 but also obtains an expanded result 212 according to that target search item.
- The computing device 208 searches the multiple historical search records 210 for a matching historical search record by comparing the target search item with the historical search items in those records.
- FIG. 2 shows that the computing device 208 receives a plurality of historical search records 210 from other devices, which is only an example and not a specific limitation of the present disclosure.
- the plurality of historical search records 210 may also be generated within the computing device 208 or by the computing device 208 when the search request 206 is received.
- the multiple historical search records 210 are determined by log data in the search log.
- Each historical search record in the plurality of historical search records 210 includes a historical search item targeted by a corresponding historical search request.
- Each historical search record further includes a key entity. The key entity is determined by performing entity recognition on the historical search items in the log data and then selecting from the recognized entities based on the number of times each entity occurs in the historical search items.
- Each historical search record further includes a category of demand corresponding to the historical search item.
- In addition to the historical search term, each historical search record also includes an associated historical search term and the degree of association with that associated historical search term.
- the computing device 208 searches the multiple historical search records 210 for historical search terms that are the same as the target search term, for example, searches for historical search records where the historical search term is "How much is a Mercedes-Benz C200?" In some embodiments, the computing device 208 searches the plurality of historical search records 210 for historical search terms whose matching degree with the target search term is higher than a threshold degree.
- the above examples are only used to describe the present disclosure, but not to specifically limit the present disclosure.
- After the computing device 208 finds the first historical search record that matches the target search term, the computing device 208 also obtains the relationship between the multiple historical search records 210. The computing device 208 then determines the second historical search record associated with the first historical search record based on the relationship between the multiple historical search records; for example, the historical search item in the second historical search record is "Photo of Mercedes-Benz C200". Alternatively or additionally, the computing device 208 may also determine one or more other historical search records. In some embodiments, the relationship between the multiple historical search records is the degree of association between the multiple categories of the multiple historical search records. In some embodiments, the relationship between the multiple historical search records is the degree of association between the multiple historical search records themselves.
- the computing device 208 then obtains the expanded result 212 based on the historical search terms of the second historical search record. The computing device 208 then provides the expanded result 212 and/or the target search result obtained from the target search term to the user 202.
- Figure 2 above shows a schematic diagram of an environment 200 in which multiple embodiments of the present disclosure can be implemented.
- the following describes a flowchart of a method 300 for searching content according to some embodiments of the present disclosure in conjunction with FIG. 3.
- the method 300 may be implemented by the computing device 208 in FIG. 2 or any other suitable device.
- the computing device 208 determines whether a search request 206 for the target search term is received. Upon receiving the search request 206, at block 304, the computing device 208 obtains a plurality of historical search records 210 related to the plurality of historical search requests. Wherein, each historical search record includes a historical search item targeted by a corresponding historical search request.
- Each historical search record of the plurality of historical search records 210 includes a historical search term. In some embodiments, each historical search record also includes a key entity corresponding to the historical search term. In some embodiments, each historical search record includes the historical search term, the key entity corresponding to the historical search term, and the corresponding demand category. In some embodiments, each historical search record includes a historical search term, an associated historical search term, and the degree of association between the two. The above examples are only used to describe the present disclosure, not to specifically limit it.
- In some embodiments, the plurality of historical search records 210 are obtained by the computing device 208 from other servers or computers. In some embodiments, the multiple historical search records 210 have already been generated in the computing device 208. In some embodiments, the plurality of historical search records 210 are generated online by the computing device 208 when the user 202 performs a search. The process by which the computing device 208 obtains the multiple historical search records 210 will be described with reference to FIG. 4.
- The computing device 208 determines the first historical search record matching the target search term from the plurality of historical search records 210. After the computing device 208 obtains the multiple historical search records 210 and the target search term, it searches the multiple historical search records 210 for the first historical search record, that is, the record matching the target search term.
- the target search term is exactly the same as the historical search term in the first historical search record.
- the degree of matching between the target search term and the historical search term in the first historical search record is higher than a predetermined matching threshold.
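The two matching conditions above (an identical search term, or a matching degree above a predetermined threshold) can be illustrated with a short sketch. The disclosure does not say how the matching degree is computed; the use of `difflib.SequenceMatcher` and the threshold value `0.8` are assumptions chosen only for illustration.

```python
from difflib import SequenceMatcher


def find_first_record(target_term, historical_terms, threshold=0.8):
    """Return the best-matching historical term, or None if none exceeds the threshold."""
    if target_term in historical_terms:           # case 1: exactly the same search term
        return target_term
    best, best_score = None, 0.0
    for term in historical_terms:
        # case 2: similarity ratio in [0, 1] used as a stand-in for the matching degree
        score = SequenceMatcher(None, target_term.lower(), term.lower()).ratio()
        if score > best_score:
            best, best_score = term, score
    return best if best_score > threshold else None


history = ["How much is the Mercedes-Benz C200", "Photo of Mercedes-Benz C200"]
print(find_first_record("How much is a Mercedes-Benz C200?", history))
```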
- the computing device 208 determines a second historical search record associated with the first historical search record from the plurality of historical search records 210 based on the relationship between the plurality of historical search records 210. In some embodiments, in addition to obtaining the second historical search record, the computing device 208 also obtains other historical search records associated with the first historical search record.
- each historical search record in the plurality of historical search records 210 includes historical search terms and key entities, or each historical search record includes historical search terms and key entities and the category of each historical search record.
- the relationship between multiple historical search records 210 is the degree of association between multiple categories.
- the computing device 208 determines a second category associated with the first category of the first historical search record based on the relationship between the plurality of historical search records 210.
- the computing device 208 determines from the plurality of historical search records 210 a second historical search record of the second category, the second historical search record including the key entity of the first historical search record.
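The category-based selection described above might look like the following sketch. The record layout (dictionaries with `term`, `entity`, and `category` keys), the `category_assoc` table, its sample values, and the `min_assoc` cutoff are hypothetical; the source only states that the second record belongs to an associated category and includes the key entity of the first record.

```python
def second_records_by_category(first, records, category_assoc, min_assoc=0.5):
    """Select records in a category associated with the first record's category
    that share the first record's key entity."""
    related_categories = {
        cat
        for cat, degree in category_assoc.get(first["category"], {}).items()
        if degree >= min_assoc                    # keep sufficiently associated categories only
    }
    return [
        r for r in records
        if r is not first
        and r["category"] in related_categories
        and r["entity"] == first["entity"]        # must include the same key entity
    ]


first = {"term": "How much is the Mercedes-Benz C200",
         "entity": "Mercedes-Benz C200", "category": "price"}
records = [
    first,
    {"term": "Photo of Mercedes-Benz C200", "entity": "Mercedes-Benz C200", "category": "picture"},
    {"term": "Photo of BMW X5", "entity": "BMW X5", "category": "picture"},
]
category_assoc = {"price": {"picture": 0.7, "fuel consumption": 0.3}}  # illustrative values
selected = second_records_by_category(first, records, category_assoc)
print([r["term"] for r in selected])
```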
- To determine the category of each historical search record, the computing device 208 uses the plurality of historical search items included in the plurality of historical search records 210. The computing device 208 then determines the relationship between the plurality of historical search records 210 based on the categories. In this way, the categories and the relationship between the multiple historical search records can be determined more quickly and accurately. The process of determining the categories and the category-related relationship between multiple historical search records will be described later in conjunction with FIG. 5.
- the relationship between the plurality of historical search records 210 is obtained. This relationship describes the degree of association between each historical search record in the plurality of historical search records 210 and its corresponding historical search record.
- the computing device 208 may determine a set of historical search records associated with the first historical search record based on the relationship between the plurality of historical search records 210, where the first historical search record has a degree of association with each historical search record in the set.
- the computing device 208 determines a second historical search record from a set of historical search records based on the degree of association. With this method, the second historical search record with a high matching degree can be found quickly and accurately.
- the process of determining the degree of association between each historical search record and its corresponding historical search record will be described below in conjunction with FIG. 6.
- the computing device 208 determines an expanded result 212 for the target search term based on the search result corresponding to the second historical search record.
- the computing device 208 after obtaining the second historical search record, obtains search results for historical search terms in the second historical search record. In some embodiments, the computing device 208 uses the historical search terms in the second historical search record to re-search, so as to obtain the search result in real time. As an alternative, in some embodiments, the computing device 208 may also look up historical search results regarding historical search terms in the second historical search record. For example, the computing device 208 can look up the aforementioned historical search results from the search log. It should be understood that the above examples are only used to describe the present disclosure, not to specifically limit the present disclosure. The computing device 208 may obtain the search results for the historical search items in the second historical search record in a variety of ways.
- the computing device 208 determines the search result obtained from the search term in the second historical search record as the expanded result 212. In this way, information suitable for users can be quickly and automatically expanded.
- the computing device 208 uses the second historical search record to obtain historical search results for the historical search terms in the second historical search record. For example, the computing device 208 can look up the historical search results in the daily incremental search logs. Then, the computing device 208 determines, from the historical search results, the part of the historical search results that has been accessed by the user 202, and determines that part as the expanded result 212. In this way, the expanded result 212 related to the user can be determined more accurately.
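Restricting the historical search results to those the user actually accessed can be sketched as below. The click-log format (a list of `(user_id, url)` pairs) and the sample identifiers are assumptions for illustration; the source does not describe how access is recorded.

```python
def visited_results(historical_results, click_log, user_id):
    """Keep only the historical results that this user accessed, preserving result order."""
    clicked = {url for uid, url in click_log if uid == user_id}
    return [url for url in historical_results if url in clicked]


historical_results = ["c200-photos", "c200-interior", "c200-wallpaper"]
click_log = [("user-202", "c200-interior"), ("user-7", "c200-photos")]
print(visited_results(historical_results, click_log, "user-202"))
```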
- the computing device 208 after obtaining the second historical search record, also obtains the information flow generated when the user 202 searches for historical search terms in the second historical search record.
- the information stream is a historical information stream recorded in the log record that is provided to the user when the user uses the historical search item in the second historical search record to search. Historical information flow can be news, various network information, push advertisements, etc.
- the computing device 208 determines the information flow browsed by the user 202 during the search as the expanded result 212. For example, if the user 202 who performed the search in the second historical search record also viewed an information stream pushed from the web server while searching, the viewed information stream is used as the expanded result 212. Alternatively or additionally, an attention tag established by the user 202 needs to be present in the information stream being viewed. In this way, the sources of expanded results can be increased and more expanded results can be provided.
- the computing device 208 may provide the expanded result 212 to the terminal device 204, or the computing device 208 may provide the expanded result 212 and the target search result for the target search term to the terminal device 204. In this way, users can quickly obtain expanded results and target search results.
- the computing device 208 determines a first score of the expanded result 212, the first score indicating the degree of relevance between the expanded result 212 and the historical search term in the second historical search record.
- the score is generated by a neural network model.
- The score of each result in the expanded result 212 is determined by inputting information about that result, such as its user click distribution, estimated user click-through rate, title, content, and length, together with the historical search item of the second historical search record, into the neural network model.
- The neural network model is determined from sample data: sample user click distributions, sample estimated click-through rates, sample search result items, sample search items, the title, content, length, and other information of sample expanded results, and sample scores.
- The computing device 208 also determines a second score of the target search result, the second score indicating the degree of relevance between the target search result and the target search term. The second score is likewise determined by inputting the title, content, length, target search item, and other information of each result in the target search result into the above neural network model.
- the computing device 208 determines the priority of the expanded result 212 and the target search result based on the first score and the second score. Then, the computing device 208 provides the expanded result 212 and the target search result according to the priority. Alternatively or additionally, the computing device 208 may also set restrictions on the display of the expanded result 212. For example, at most a first number of expanded results 212 may appear among a predetermined number of provided results, or a limit may be set on the number of consecutive expanded results 212. The above examples are only used to describe the present disclosure, not to specifically limit it. Those skilled in the art can set the restrictions as needed. In this way, users can be provided with higher-quality and more accurate target search results and recommendation results.
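One way to combine the two score lists under a display restriction is sketched here. The scores are taken as given (in the disclosure they come from a neural network model, which is not reproduced), and the parameter names `page_size` and `max_expanded` are hypothetical stand-ins for the "predetermined number of results" and the "first number" of expanded results.

```python
def merge_results(target, expanded, page_size=10, max_expanded=3):
    """Merge (item, score) lists by descending score, capping expanded items per page."""
    pool = [(score, "target", item) for item, score in target]
    pool += [(score, "expanded", item) for item, score in expanded]
    pool.sort(key=lambda entry: entry[0], reverse=True)   # higher score means higher priority

    page, used = [], 0
    for score, source, item in pool:
        if len(page) == page_size:
            break
        if source == "expanded":
            if used == max_expanded:              # restriction: too many expanded results already
                continue
            used += 1
        page.append((item, source))
    return page


target = [("t1", 0.9), ("t2", 0.5)]
expanded = [("e1", 0.8), ("e2", 0.7), ("e3", 0.6)]
print(merge_results(target, expanded, page_size=4, max_expanded=1))
```

A limit on consecutive expanded results could be added in the same loop by tracking the source of the previously appended item.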
- the computing device 208 also establishes a target data source for obtaining search results corresponding to the second historical search record.
- the target data source may be generated by another device, and then the computing device 208 obtains the target data source from the other device. By establishing the target data source, the quality of the target data source can be improved, so that high-quality content can be provided to users.
- when the computing device 208 establishes the target data source, it first determines the scores of multiple documents in the multiple original data sources, where the score of each document indicates the quality of the document.
- the score of a document is determined from the following factors: (1) media site scoring, which includes site scores based on automatic link analysis methods and site scores marked by experts; (2) media author scoring, which includes author registration marked by experts, author popularity obtained through big data analysis, and author popularity synthesized from reader feedback such as likes and comments; and (3) the richness of media texts, pictures, and videos.
- the computing device 208 determines the document whose score exceeds the threshold score among the multiple documents as the document in the target data source. In this way, high-quality candidate results can be obtained through truncation operations.
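The truncation by quality score can be illustrated as follows. The source names the scoring factors but not how they are combined, so the weighted sum, the weights, the normalized `[0, 1]` factor values, and the threshold `0.6` are all assumptions for this sketch.

```python
def document_score(doc, weights=(0.4, 0.4, 0.2)):
    """Combine site, author, and richness factors into one quality score (assumed weighted sum)."""
    w_site, w_author, w_rich = weights
    return w_site * doc["site"] + w_author * doc["author"] + w_rich * doc["richness"]


def build_target_source(docs, threshold=0.6):
    """Keep the ids of documents whose quality score exceeds the threshold score."""
    return [doc["id"] for doc in docs if document_score(doc) > threshold]


docs = [
    {"id": "a", "site": 0.9, "author": 0.8, "richness": 0.5},
    {"id": "b", "site": 0.3, "author": 0.4, "richness": 0.2},
]
print(build_target_source(docs))
```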
- FIG. 4 shows a flowchart of a method 400 for obtaining multiple historical search records according to some embodiments of the present disclosure.
- the method 400 in FIG. 4 may be executed by the computing device 208 in FIG. 2 or any other suitable device.
- the computing device 208 determines from the search log a set of historical search terms for which a set of historical search requests are targeted.
- search log entries of all users are stored in the search log. Therefore, a set of historical search terms can be determined from the search log.
- the computing device 208 determines multiple entities from a set of historical search terms, each entity identifying an object associated with a corresponding historical search term.
- the computing device 208 performs entity recognition on each historical search item in a set of historical search items, for example, to identify the entity through a named entity recognition method.
- the computing device 208 determines key entities from the multiple entities based on the number of occurrences of the multiple entities in a set of historical search terms.
- the computing device 208 determines, from the set of historical search terms, a subset of historical search terms that each include a single entity. Then, the computing device 208 determines at least one historical search item from that subset such that the number of occurrences, in the subset, of the single entity included in the at least one historical search item exceeds the first threshold number of times. The computing device 208 determines the single entity included in the at least one historical search item as a key entity. Through this method, key entities can be quickly and accurately determined.
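A minimal sketch of this counting step, assuming entity recognition has already been run and its output is available as a mapping from each historical search term to its list of recognized entities. The mapping, the sample terms, and the threshold value are hypothetical.

```python
from collections import Counter


def key_entities(term_entities, min_count=1):
    """Return entities whose occurrence count among single-entity terms exceeds min_count."""
    counts = Counter(
        entities[0]
        for entities in term_entities.values()
        if len(entities) == 1                     # only terms that include a single entity
    )
    return {entity for entity, n in counts.items() if n > min_count}


term_entities = {
    "Mercedes-Benz C200 price": ["Mercedes-Benz C200"],
    "Mercedes-Benz C200 photo": ["Mercedes-Benz C200"],
    "BMW X5 vs Audi Q5": ["BMW X5", "Audi Q5"],
    "Audi Q5 review": ["Audi Q5"],
}
print(key_entities(term_entities))
```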
- when determining a key entity, the computing device 208 determines, from the multiple entities, a high-frequency entity whose number of occurrences exceeds a second threshold number of times, based on the number of occurrences of the multiple entities in the set of historical search terms. The computing device 208 then determines the high-frequency entity as a key entity based on determining that the weight of the high-frequency entity in the corresponding historical search item exceeds a threshold weight, where the weight indicates the importance of the high-frequency entity in the corresponding historical search item.
- the computing device 208 determines the weight based on the location of the high-frequency entity in the corresponding historical search term. In some embodiments, the computing device 208 determines the weight based on the relationship between the length of the high-frequency entity and the length of the corresponding historical search term. In some embodiments, the computing device 208 may also determine the weight based on the combination of the foregoing methods and using any other suitable information.
- the above examples are only used to describe the present disclosure, but not to specifically limit the present disclosure. The weights can also be obtained by combining the above methods or in other ways. Through the above method, the weight can be determined accurately and quickly.
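As an illustration of combining the two signals mentioned above (position and length relationship), the following sketch computes a weight in the range 0 to 1. The specific formula is an assumption made for this example, not something stated in the text: it treats earlier positions as more important and blends the position score with the length ratio using a hypothetical `position_share` parameter.

```python
def entity_weight(entity: str, term: str, position_share: float = 0.5) -> float:
    """Blend a position score and a length ratio into one weight.

    Assumptions of this sketch: an entity nearer the start of the term is
    more important, and the two signals are combined linearly.
    """
    if not entity or not term:
        return 0.0
    start = term.find(entity)
    if start < 0:
        return 0.0  # entity does not occur in the term
    length_ratio = len(entity) / len(term)
    max_start = len(term) - len(entity)
    # 1.0 when the entity starts the term, 0.0 when it ends the term.
    position_score = 1.0 if max_start == 0 else 1.0 - start / max_start
    return position_share * position_score + (1.0 - position_share) * length_ratio


whole = entity_weight("abc", "abc")
front = entity_weight("ab", "abcd")
back = entity_weight("cd", "abcd")
```

An entity that makes up the whole term receives the maximum weight, and an entity at the front of a term outranks an equally long entity at the back.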
- the computing device 208 selects multiple historical search terms that include key entities from a set of historical search terms. After determining the key entity, the computing device 208 uses the key entity to determine a historical search item that only includes the key entity.
- the computing device 208 generates a plurality of historical search records 210 based on the plurality of historical search terms and key entities.
- each historical search record in the plurality of historical search records 210 includes at least a historical search item and its corresponding key entity.
- the aforementioned multiple historical search records 210 may be generated by other devices based on the search logs, and the computing device 208 receives multiple historical search records 210 from other devices.
- FIG. 5 shows a flowchart of a method 500 for determining categories of historical search records and relationships between them according to some embodiments of the present disclosure.
- the method 500 in FIG. 5 may be executed by the computing device 208 in FIG. 2 or any other suitable device.
- each historical search record in the plurality of historical search records 210 includes key entities in addition to historical search terms.
- the computing device 208 obtains the remaining part of each of the multiple historical search items by removing the corresponding key entity from it. For example, when the historical search items are "How much is Mercedes-Benz C200", "Price of Mercedes-Benz C200", and "Picture of Mercedes-Benz C200", and the key entity is "Mercedes-Benz C200", the remaining parts are "How much is", "Price of", and "Picture of".
- the computing device 208 determines demand information associated with a plurality of historical search terms based on at least the remaining portion.
- the computing device 208 determines the user's demand information for the remaining part, for example, determines the remaining part "how much", “price”, and "picture of" as the demand information.
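The removal of the key entity to obtain the demand-bearing remainder can be sketched with plain string operations. The function name `remaining_part` is hypothetical, and the sketch assumes the key entity appears verbatim in the search term.

```python
def remaining_part(search_term: str, key_entity: str) -> str:
    """Return the search term with the key entity removed and whitespace normalized."""
    if not key_entity:
        return " ".join(search_term.split())
    without_entity = search_term.replace(key_entity, " ")
    return " ".join(without_entity.split())


terms = [
    "How much is Mercedes-Benz C200",
    "Price of Mercedes-Benz C200",
    "Picture of Mercedes-Benz C200",
]
demands = [remaining_part(term, "Mercedes-Benz C200") for term in terms]
```

The resulting strings are the raw demand information; grouping equivalent expressions such as "How much is" and "Price of" into one category is the job of the later clustering step.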
- the computing device 208 determines the category of the plurality of historical search records 210 based on the demand information.
- the computing device 208 uses a clustering operation to process the demand information to determine the categories of the multiple historical search records 210, for example, a k-means method is used to process the demand information.
- the computing device 208 may also determine the category of the demand information in other suitable ways, such as manually classifying. The above examples are only used to describe the present disclosure, but not to specifically limit the present disclosure.
- In this way, the demand categories of multiple historical search items can be accurately determined, and the multiple historical search records can be classified.
- the computing device 208 determines search times or search results for multiple historical search terms from the search log. After determining each category, the computing device 208 needs to determine the association relationship between each category. Therefore, the computing device 208 will determine the log records for multiple historical search items from the search log, and then determine the search time and search results of these log records.
- the computing device 208 determines the degree of relevance between the multiple categories based on the search time or the search results.
- the computing device 208 increases the degree of correlation between two categories by 1 when it determines from the log that the same user produced two historical search records of different categories within a predetermined period of time; alternatively or additionally, the historical search terms of the two historical search records have the same key entity. For example, if the user 202 searches for "price of Mercedes-Benz C200" and "picture of Mercedes-Benz C200" within a predetermined time period, the degree of correlation between the category corresponding to "price" and the category corresponding to "picture" is increased by 1.
- In this way, the degree of correlation between multiple categories can be determined.
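The counting rule above can be sketched as follows. The log layout (user, timestamp in seconds, category, key entity) and the function name are assumptions of this example. The sketch requires both the time-window condition and the same-key-entity condition, which is only one of the combinations the text allows.

```python
from collections import defaultdict
from itertools import combinations


def category_correlation(log_entries, window_seconds):
    """Count co-occurrences of different categories per user within a time window.

    log_entries: iterable of (user_id, timestamp_seconds, category, key_entity).
    Returns {frozenset({category_a, category_b}): count}.
    """
    by_user = defaultdict(list)
    for user, ts, category, entity in log_entries:
        by_user[user].append((ts, category, entity))

    correlation = defaultdict(int)
    for entries in by_user.values():
        entries.sort()  # chronological order, so t2 >= t1 below
        for (t1, c1, e1), (t2, c2, e2) in combinations(entries, 2):
            if c1 != c2 and e1 == e2 and t2 - t1 <= window_seconds:
                correlation[frozenset((c1, c2))] += 1
    return dict(correlation)


# Hypothetical log entries.
log = [
    ("u1", 0, "price", "Mercedes-Benz C200"),
    ("u1", 60, "picture", "Mercedes-Benz C200"),
    ("u1", 10000, "review", "Mercedes-Benz C200"),
    ("u2", 0, "price", "BMW X3"),
]
correlation = category_correlation(log, window_seconds=600)
```

Keying the result by an unordered pair keeps the correlation symmetric, which matches the idea of a correlation matrix between categories.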
- the computing device 208 determines the relationship between the plurality of historical search records 210 based on the degree of correlation between the plurality of categories. For example, when a historical search record has a first category, one or more other categories with a high degree of correlation to the first category can be determined. The key entity of that historical search record is then combined with the one or more other categories to identify other historical search records associated with the historical search record.
- In this way, the degree of correlation between multiple categories can be determined quickly and accurately, which helps ensure the accuracy of the recommended results when searching.
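Looking up associated records through category correlation might look like the sketch below. Records are represented as plain dictionaries with hypothetical keys `term`, `key_entity`, and `category`, and the correlation table is keyed by unordered category pairs; both layouts, and the `min_correlation` cut-off, are assumptions of this example.

```python
def related_records(first, records, correlation, min_correlation=1):
    """Return terms of records sharing the first record's key entity whose
    category is sufficiently correlated with the first record's category."""
    related = []
    for rec in records:
        if rec is first or rec["key_entity"] != first["key_entity"]:
            continue
        if rec["category"] == first["category"]:
            continue
        pair = frozenset((first["category"], rec["category"]))
        degree = correlation.get(pair, 0)
        if degree >= min_correlation:
            related.append((degree, rec["term"]))
    # Highest correlation first; ties broken alphabetically for determinism.
    related.sort(key=lambda item: (-item[0], item[1]))
    return [term for _, term in related]


# Hypothetical records and correlation counts.
records = [
    {"term": "Mercedes-Benz C200 price", "key_entity": "Mercedes-Benz C200", "category": "price"},
    {"term": "Mercedes-Benz C200 picture", "key_entity": "Mercedes-Benz C200", "category": "picture"},
    {"term": "Mercedes-Benz C200 fuel consumption", "key_entity": "Mercedes-Benz C200", "category": "fuel"},
    {"term": "BMW X3 picture", "key_entity": "BMW X3", "category": "picture"},
]
correlation = {frozenset(("price", "picture")): 5, frozenset(("price", "fuel")): 2}
matches = related_records(records[0], records, correlation, min_correlation=1)
```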
- FIG. 6 shows a flowchart of a method 600 for determining a relationship between historical search records according to some embodiments of the present disclosure.
- the method 600 in FIG. 6 can be executed by the computing device 208 in FIG. 2 or any other suitable device.
- the computing device 208 determines search times or search results for multiple historical search terms from the search log.
- the search log stores many search log items of users, and the search time and search results of multiple historical search items can be determined through the search log items.
- the computing device 208 determines the degree of association between the plurality of historical search records 210 based on the search time or the search result.
- the computing device 208 determines that there is a correlation between the two search records based on the same user executing two search items within a predetermined period of time or the search results of the two search items having the same result item. For example, if the user 202 executes two historical search items within a predetermined period of time, the degree of association between two historical search records including the two search items may be increased by one. If there are a predetermined number of identical result items in the search results corresponding to the two historical search items, the degree of association between the two historical search records can be increased by one.
- the computing device 208 determines the relationship between the plurality of historical search records 210 based on the degree of association between the plurality of historical search records 210.
- the computing device 208 determines the association relationship between the plurality of historical search records 210 based on the determined degree of association.
- the multiple historical search records 210 and the association relationship between the multiple historical search records 210 may be generated by other devices, and the computing device 208 may obtain them from those devices.
- In this way, the association relationship between multiple historical search records can be determined quickly and accurately, so that the extended result can also be determined quickly and accurately.
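The two signals used for the degree of association (same user within a time window, and shared result items) can be sketched as below. The `SearchRecord` class, its fields, and the parameter names are hypothetical, and adding one per qualifying user pair is only one way to read "increased by one".

```python
from dataclasses import dataclass


@dataclass
class SearchRecord:
    term: str
    searches: list   # (user_id, timestamp_seconds) pairs from the search log
    result_ids: set  # identifiers of the result items returned for the term


def association_degree(a, b, window_seconds, min_shared_results):
    """Accumulate the degree of association between two historical search records."""
    degree = 0
    # Signal 1: the same user executed both search items within the window.
    for user_a, ts_a in a.searches:
        for user_b, ts_b in b.searches:
            if user_a == user_b and abs(ts_a - ts_b) <= window_seconds:
                degree += 1
    # Signal 2: the two result lists share at least a predetermined number of items.
    if len(a.result_ids & b.result_ids) >= min_shared_results:
        degree += 1
    return degree


rec_a = SearchRecord("term a", [("u1", 0), ("u2", 100)], {1, 2, 3})
rec_b = SearchRecord("term b", [("u1", 30), ("u2", 5000)], {2, 3, 4})
degree = association_degree(rec_a, rec_b, window_seconds=60, min_shared_results=2)
```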
- the device 700 includes a high-quality result screening module 702, a related demand mining module 704, a recommendation result matching module 706, and a search result and recommendation result mixing module 708.
- the relevant demand mining module 704 digs out relevant demands based on the original search terms.
- the specific expression form of the relevant demands may be in the form of search terms, keyword combinations, semantic vectors, and the like.
- the recommendation result matching module 706 searches the results filtered by the high-quality result screening module 702 and finds resources that can meet the related needs as the recommendation results; finally, the search result and recommendation result mixing module 708 mixes the recommendation results with the normal results retrieved by the search engine to form the final result list, which is returned to the user.
- the related demand mining module 704 mines related needs based on the user's original search terms.
- the technical methods used include the following. Content-based mining method: first, the content of the search item is split, and two concepts are defined: the search key entity and the demand dimension.
- the search key entity (core subject) is the subject string that can be extracted from the user's search sequence during the search process and that represents the user's core intent. For example, for the search term "How much is the Mercedes-Benz c200", the key entity is "Mercedes-Benz c200", and "How much" describes the user's demand regarding the key entity, which here is asking the price.
- NER named entity recognition
- the key entity meets three conditions: 1) the key entity itself is searched as a search item a large number of times; 2) the key entity frequently appears as a substring in multiple historical search items; 3) among all the search terms that contain the subject string, the average weight of the subject string is relatively high.
- the demand dimension is the attribute of the key product entity.
- the string that remains after the key entity is removed is taken as the demand.
- the different expressions of the initially obtained demand substrings may be the same demand, for example, the demand for "How much is the Mercedes-Benz c200" and the "Mercedes-Benz c200 price” is the same.
- the correlation matrix between different dimensions is calculated to represent the close relationship between the dimensions.
- historical search items are split into key entities and demand information, and, using the mined demand-category correlation matrix, search items whose demands are strongly correlated are taken as the related extended demand set of the current search item.
- the graph-based mining method mines the set of search terms that are strongly related to the current search term as the related expansion requirements of the current search term.
- the core keywords are extracted, and the user's focus tag is established.
- the recommendation result matching module 706 matches, from the resource library, results that can meet the related requirements that have been mined.
- the technical methods used include the following. Matching based on the search and retrieval system: the extended search term is used to query the retrieval system to obtain results that satisfy the extended search term, and all the results are merged according to the strength of the association as the recommendation result of the search term.
- Matching based on user search big data According to user behaviors such as co-occurrence and a bit, mining articles related to the extended search term as the recommendation result of the search term.
- Matching based on big data from user search and information flow browsing: user focus tags are mined and counted from user search and information flow browsing data, and articles are recalled through focus matching as the user's personalized recommendation results.
- the target search result and extended result mixing module 708 mainly performs search result scoring, recommendation result scoring, and mixing. Search results are scored based on a fusion model of historical click distribution and estimated user click-through rate. Recommendation results are scored based on historical click distribution, estimated user click-through rate, and the like.
- the results are then mixed and sorted from high to low based on the search result scores and the recommendation result scores.
- diversity control will also be carried out, including diversity control based on the density of recommended results, and diversity control of the density of recommended results on the same theme.
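A minimal sketch of the mixing step is given below, assuming both kinds of results already carry comparable scores. The density control is interpreted here as a cap on consecutive recommendation results, with excess recommendations dropped; that interpretation, the function name, and the `max_run` parameter are assumptions of this example.

```python
def mix_results(search_results, recommendations, max_run=1):
    """Merge two scored lists from high to low score, allowing at most
    `max_run` recommendation results in a row.

    Each input item is an (item_id, score) pair.
    Returns a list of (item_id, source) pairs.
    """
    tagged = [(score, "search", item) for item, score in search_results]
    tagged += [(score, "recommendation", item) for item, score in recommendations]
    # The sort is stable, so on equal scores search results stay ahead.
    tagged.sort(key=lambda entry: entry[0], reverse=True)

    mixed, run = [], 0
    for _, source, item in tagged:
        if source == "recommendation":
            if run >= max_run:
                continue  # density control: skip this recommendation
            run += 1
        else:
            run = 0
        mixed.append((item, source))
    return mixed


search = [("s1", 0.9), ("s2", 0.5)]
recs = [("r1", 0.8), ("r2", 0.7), ("r3", 0.4)]
mixed = mix_results(search, recs, max_run=1)
```

A variant could defer skipped recommendations to a later slot instead of dropping them, or additionally limit how many recommendations share the same theme.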
- the high-quality result screening module 702 scores the document resources based on some basic quality factors, and truncates based on the scores to obtain high-quality candidate results.
- the basic quality factors include: media site scoring, including site scoring based on automatic link analysis methods and site scoring marked by experts; media author scoring, including author registration marked by experts, author popularity obtained through big data analysis, and author popularity based on reader feedback such as likes and comments; and the richness of media text, pictures, and videos.
- FIG. 8 shows a schematic block diagram of an apparatus 800 for searching content according to an embodiment of the present disclosure.
- the apparatus 800 may include a historical search record obtaining module 802 configured to obtain, in response to receiving a search request for a target search item, multiple historical search records related to multiple historical search requests, each historical search record including the historical search item targeted by the corresponding historical search request.
- the device 800 further includes a target search term matching module 804 configured to determine a first historical search record matching the target search term from a plurality of historical search records.
- the device 800 further includes a historical search record determination module 806 configured to determine a second historical search record associated with the first historical search record from the multiple historical search records based on the relationship between the multiple historical search records.
- the device 800 further includes an extended result determination module 808 configured to determine an extended result for the target search item based on the search result corresponding to the second historical search record.
- the historical search record obtaining module 802 includes: a first historical search item determination module configured to determine, from the search log, a set of historical search items targeted by a set of historical search requests; an entity determination module configured to determine multiple entities from the set of historical search items, each entity identifying an object associated with a corresponding historical search item; a first key entity determination module configured to determine a key entity from the multiple entities based on the number of occurrences of the multiple entities in the set of historical search items; a selection module configured to select, from the set of historical search items, multiple historical search items that include the key entity; and a generation module configured to generate multiple historical search records based on the multiple historical search items and the key entity.
- the first key entity determination module includes: a historical search item set determination module configured to determine, from the set of historical search items, a historical search item set including a single entity; a second historical search item determination module configured to determine at least one historical search item from the historical search item set, where the number of occurrences in the historical search item set of the single entity included in the at least one historical search item exceeds a first threshold number of times; and a single-entity key entity determination module configured to determine the single entity included in the at least one historical search item as the key entity.
- the key entity determination module includes: a high-frequency entity determination module configured to determine, from the multiple entities, a high-frequency entity whose number of occurrences exceeds a second threshold number of times, based on the number of occurrences of the multiple entities in the set of historical search items; and a second key entity determination module configured to determine the high-frequency entity as the key entity based on determining that the weight of the high-frequency entity in the corresponding historical search item exceeds a threshold weight, where the weight indicates the importance of the high-frequency entity in the corresponding historical search item.
- the second key entity determination module includes: a position determination module configured to determine the position of the high-frequency entity in the corresponding historical search item; and a length relationship determination module configured to determine the relationship between the length of the high-frequency entity and the length of the corresponding historical search item.
- the device 800 further includes: a category determination module configured to determine the categories of multiple historical search records based on multiple historical search items included in the multiple historical search records; and a historical search record relationship determination module configured to determine the relationship between the multiple historical search records based on the categories.
- each historical search record in the plurality of historical search records further includes a key entity, and the category determination module includes: a remaining part determination module configured to obtain the remaining part of each of the multiple historical search items by removing the corresponding key entity from the multiple historical search items; a demand information determination module configured to determine the demand information associated with the multiple historical search items based at least on the remaining parts; and a historical search record category determination module configured to determine the categories of the multiple historical search records based on the demand information.
- the multiple historical search records have multiple categories, and the historical search record relationship determination module includes: a first search time or search result determination module configured to determine, from the search log, the search times or search results for the multiple historical search items; a correlation degree determination module configured to determine the degree of correlation between the multiple categories based on the search times or search results; and a correlation-degree-based relationship determination module configured to determine the relationship between the multiple historical search records based on the degree of correlation between the multiple categories.
- the apparatus 800 further includes: a second search time or search result configuration module configured to determine, from the search log, the search times or search results for the multiple historical search items; an association degree determination module configured to determine the degree of association between the multiple historical search records based on the search times or search results; and an association-degree-based relationship determination module configured to determine the relationship between the multiple historical search records based on the degree of association between the multiple historical search records.
- each historical search record in the plurality of historical search records further includes a key entity and a category of that historical search record, and the historical search record determination module 806 includes: a second category determination module configured to determine, based on the relationship between the multiple historical search records, a second category associated with the first category of the first historical search record; and a categorized second historical search record determination module configured to determine, from the multiple historical search records, a second historical search record having the second category, the second historical search record including the key entity of the first historical search record.
- the historical search record determination module 806 includes: a historical search record set determination module configured to determine, based on the relationship between the multiple historical search records, a set of historical search records associated with the first historical search record, the first historical search record having a degree of association with each historical search record in the set; and an association-degree-based historical search record determination module configured to determine the second historical search record from the set of historical search records based on the degree of association.
- the extended result determination module 808 includes: a first search result acquisition module configured to acquire search results for the historical search item in the second historical search record; and a search-result extended result determination module configured to determine the search results as the extended result.
- the extended result determination module 808 includes: a second search result acquisition module configured to acquire historical search results for the historical search item in the second historical search record; a partial historical search result determination module configured to determine, from the historical search results, the part of the historical search results that has been accessed by users; and a partial historical search result extension module configured to determine that part of the historical search results as the extended result.
- the extended result determination module 808 includes: an information flow module configured to obtain the information flow generated when the user searches for the historical search item in the second historical search record; and an information-flow extended result module configured to determine the extended result based on the information flow.
- the device 800 further includes at least one of the following: a first providing device configured to provide extended results; and a second providing device configured to provide extended results and target search results for target search terms.
- the second providing device includes: a first score determination module configured to determine a first score of the extended result, the first score indicating the degree of relevance between the extended result and the historical search item in the second historical search record; a second score determination module configured to determine a second score of the target search result, the second score indicating the degree of relevance between the target search result and the target search item; a priority determination module configured to determine the priority of the extended result and the target search result based on the first score and the second score; and an extended result and search result providing module configured to provide the extended result and the target search result based on the priority.
- the device 800 further includes a target data source establishment module configured to establish a target data source for obtaining search results corresponding to the second historical search record.
- the target data source establishment module includes: a document score determination module configured to determine the scores of multiple documents in multiple original data sources, the score of each document indicating the quality of the document; and a target data source document determination module configured to determine the documents whose scores exceed the threshold score among the multiple documents as the documents in the target data source.
- FIG. 9 shows a schematic block diagram of an electronic device 900 that can be used to implement embodiments of the present disclosure.
- the device 900 may be used to implement the terminal device 204 and the computing device 208 in FIG. 2.
- the device 900 includes a computing unit 901, which can perform various appropriate actions and processing according to computer program instructions stored in a read-only memory (ROM) 902 or computer program instructions loaded from a storage unit 908 into a random access memory (RAM) 903.
- the computing unit 901, the ROM 902, and the RAM 903 are connected to each other through a bus 904.
- An input/output (I/O) interface 905 is also connected to the bus 904.
- the I/O interface 905 includes: an input unit 906, such as a keyboard, a mouse, etc.; an output unit 907, such as various types of displays, speakers, etc.; a storage unit 908, such as a magnetic disk, an optical disk, etc.; and a communication unit 909, such as a network card, a modem, a wireless communication transceiver, etc.
- the communication unit 909 allows the device 900 to exchange information/data with other devices through a computer network such as the Internet and/or various telecommunication networks.
- the computing unit 901 may be any of various general-purpose and/or special-purpose processing components with processing and computing capabilities. Some examples of the computing unit 901 include, but are not limited to, a central processing unit (CPU), a graphics processing unit (GPU), various dedicated artificial intelligence (AI) computing chips, various computing units that run machine learning model algorithms, a digital signal processor (DSP), and any appropriate processor, controller, microcontroller, etc.
- the computing unit 901 executes the various methods and processes described above, such as methods 300, 400, 500, and 600.
- the methods 300, 400, 500, and 600 may be implemented as computer software programs that are tangibly contained in a machine-readable medium, such as the storage unit 908.
- part or all of the computer program may be loaded and/or installed on the device 900 via the ROM 902 and/or the communication unit 909.
- when the computer program is loaded into the RAM 903 and executed by the computing unit 901, one or more steps of the methods 300, 400, 500, and 600 described above can be executed.
- the computing unit 901 may be configured to execute the methods 300, 400, 500, and 600 in any other suitable manner (for example, by means of firmware).
- exemplary types of hardware logic components include: Field Programmable Gate Array (FPGA), Application Specific Integrated Circuit (ASIC), Application Specific Standard Product (ASSP), System on Chip (SOC), Complex Programmable Logic Device (CPLD), and so on.
- the program code for implementing the method of the present disclosure can be written in any combination of one or more programming languages. The program code can be provided to the processor or controller of a general-purpose computer, a special-purpose computer, or another programmable data processing device, so that when the program code is executed by the processor or controller, the functions/operations specified in the flowcharts and/or block diagrams are implemented.
- the program code can be executed entirely on the machine, partly on the machine, as an independent software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
- a machine-readable medium may be a tangible medium, which may contain or store a program for use by the instruction execution system, apparatus, or device or in combination with the instruction execution system, apparatus, or device.
- the machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium.
- the machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing.
- more specific examples of machine-readable storage media include electrical connections based on one or more wires, portable computer disks, hard disks, random access memory (RAM), read-only memory (ROM), erasable programmable read-only memory (EPROM or flash memory), optical fiber, portable compact disk read-only memory (CD-ROM), optical storage devices, magnetic storage devices, or any suitable combination of the foregoing.
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computational Linguistics (AREA)
- Mathematical Physics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Claims (38)
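- A method for searching content, comprising: in response to receiving a search request for a target search item, obtaining a plurality of historical search records related to a plurality of historical search requests, each historical search record including the historical search item targeted by the corresponding historical search request; determining, from the plurality of historical search records, a first historical search record matching the target search item; determining, based on a relationship between the plurality of historical search records, a second historical search record associated with the first historical search record from the plurality of historical search records; and determining an extended result for the target search item based on a search result corresponding to the second historical search record.
- The method according to claim 1, further comprising: determining categories of the plurality of historical search records based on a plurality of historical search items included in the plurality of historical search records; and determining the relationship between the plurality of historical search records based on the categories.
- The method according to claim 2, wherein each historical search record of the plurality of historical search records further includes a key entity, and wherein determining the categories of the plurality of historical search records comprises: obtaining a respective remaining part of each of the plurality of historical search items by removing the corresponding key entity from the plurality of historical search items; determining demand information associated with the plurality of historical search items based at least on the remaining parts; and determining the categories of the plurality of historical search records based on the demand information.
- The method according to claim 2, wherein the plurality of historical search records have a plurality of categories, and wherein determining the relationship between the plurality of historical search records comprises: determining, from a search log, search times or search results for the plurality of historical search items; determining a degree of correlation between the plurality of categories based on the search times or the search results; and determining the relationship between the plurality of historical search records based on the degree of correlation between the plurality of categories.
- The method according to claim 1, further comprising: determining, from a search log, search times or search results for the plurality of historical search items; determining a degree of association between the plurality of historical search records based on the search times or the search results; and determining the relationship between the plurality of historical search records based on the degree of association between the plurality of historical search records.
- The method according to claim 1, wherein each historical search record of the plurality of historical search records further includes a key entity and a category of that historical search record, and wherein determining the second historical search record comprises: determining, based on the relationship between the plurality of historical search records, a second category associated with a first category of the first historical search record; and determining, from the plurality of historical search records, the second historical search record having the second category, the second historical search record including the key entity of the first historical search record.
- The method according to claim 1, wherein determining the second historical search record comprises: determining, based on the relationship between the plurality of historical search records, a set of historical search records associated with the first historical search record, the first historical search record having a degree of association with each historical search record in the set of historical search records; and determining the second historical search record from the set of historical search records based on the degree of association.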
- 根据权利要求1所述的方法,其中获取所述多个历史搜索记录包括:从搜索日志中确定一组历史搜索请求所针对的一组历史搜索项;从所述一组历史搜索项中确定多个实体,每个实体标识与对应历史搜索项相关联的对象;基于所述多个实体在所述一组历史搜索项中的出现次数,从所述多个实体中确定关键实体;从所述一组历史搜索项中选择包括所述关键实体的多个历史搜索项;以及基于所述多个历史搜索项和所述关键实体生成所述多个历史搜索记录。
- 根据权利要求8所述的方法,其中确定所述关键实体包括:从所述一组历史搜索项中确定包括单个实体的历史搜索项集合;从所述历史搜索项集合确定至少一个历史搜索项,所述至少一个历史搜索项包括的单个实体在所述历史搜索项集合中的出现次数超过第一阈值次数;以及将所述至少一个历史搜索项包括的单个实体确定为所述关键实体。
- 根据权利要求8所述的方法,其中确定所述关键实体包括:基于所述多个实体在所述一组历史搜索项中的出现次数,从所述多个实体中确定出现次数超过第二阈值次数的高频实体;以及根据确定所述高频实体在对应的历史搜索项中的权重超过阈值权重,将所述高频实体确定为所述关键实体,其中所述权重指示所述高频实体在所述对应的历史搜索项中的重要性。
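- The method according to claim 10, wherein the weight is determined based on at least one of: a position of the high-frequency entity in the corresponding historical search item, and a relationship between a length of the high-frequency entity and a length of the corresponding historical search item.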
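- The method according to claim 1, wherein determining the extended result for the target search item comprises: acquiring a search result for the historical search item in the second historical search record; and determining the search result as the extended result.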
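- The method according to claim 1, wherein determining the extended result for the target search item comprises: acquiring historical search results for the historical search item in the second historical search record; determining, from the historical search results, a portion of the historical search results that has been accessed by users; and determining that portion of the historical search results as the extended result.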
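- The method according to claim 1, wherein determining the extended result for the target search item comprises: acquiring an information flow generated when a user searched for the historical search item in the second historical search record; and determining the extended result based on the information flow.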
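- The method according to claim 1, further comprising at least one of: providing the extended result; and providing the extended result and a target search result for the target search item.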
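- The method according to claim 15, wherein providing the extended result and the target search result comprises: determining a first score of the extended result, the first score indicating a degree of relevance between the extended result and the historical search item in the second historical search record; determining a second score of the target search result, the second score indicating a degree of relevance between the target search result and the target search item; determining priorities of the extended result and the target search result based on the first score and the second score; and providing the extended result and the target search result based on the priorities.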
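- The method according to claim 1, further comprising: establishing a target data source for obtaining the search result corresponding to the second historical search record.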
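- The method according to claim 17, wherein establishing the target data source comprises: determining scores of a plurality of documents in a plurality of original data sources, the score of each document indicating the quality of that document; and determining, among the plurality of documents, documents whose scores exceed a threshold score as documents in the target data source.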
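- An apparatus for searching content, comprising: a historical search record acquisition module configured to acquire, in response to receiving a search request for a target search item, a plurality of historical search records related to a plurality of historical search requests, each historical search record comprising the historical search item targeted by the corresponding historical search request; a target search item matching module configured to determine, from the plurality of historical search records, a first historical search record that matches the target search item; a historical search record determination module configured to determine, based on a relationship among the plurality of historical search records, a second historical search record associated with the first historical search record from the plurality of historical search records; and an extended result determination module configured to determine an extended result for the target search item based on a search result corresponding to the second historical search record.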
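- The apparatus according to claim 19, further comprising: a category determination module configured to determine categories of the plurality of historical search records based on a plurality of historical search items comprised in the plurality of historical search records; and a historical search record relationship determination module configured to determine the relationship among the plurality of historical search records based on the categories.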
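- The apparatus according to claim 20, wherein each of the plurality of historical search records further comprises a key entity, and wherein the category determination module comprises: a remaining part determination module configured to obtain a respective remaining part of each of the plurality of historical search items by removing the corresponding key entity from the plurality of historical search items; a demand information determination module configured to determine demand information associated with the plurality of historical search items based at least on the remaining parts; and a historical search record category determination module configured to determine the categories of the plurality of historical search records based on the demand information.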
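- The apparatus according to claim 20, wherein the plurality of historical search records have a plurality of categories, and wherein the historical search record relationship determination module comprises: a first search time or search result determination module configured to determine, from a search log, search times or search results for the plurality of historical search items; a correlation degree determination module configured to determine a degree of correlation among the plurality of categories based on the search times or the search results; and a correlation-degree-based relationship determination module configured to determine the relationship among the plurality of historical search records based on the degree of correlation among the plurality of categories.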
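- The apparatus according to claim 19, further comprising: a second search time or search result configuration module configured to determine, from a search log, search times or search results for the plurality of historical search items; an association degree determination module configured to determine a degree of association among the plurality of historical search records based on the search times or the search results; and an association-degree-based relationship determination module configured to determine the relationship among the plurality of historical search records based on the degree of association among the plurality of historical search records.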
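- The apparatus according to claim 19, wherein each of the plurality of historical search records further comprises a key entity and a category of that historical search record, and wherein the historical search record determination module comprises: a second category determination module configured to determine, based on the relationship among the plurality of historical search records, a second category associated with a first category of the first historical search record; and a categorized second historical search record determination module configured to determine, from the plurality of historical search records, a second historical search record having the second category, the second historical search record comprising the key entity of the first historical search record.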
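- The apparatus according to claim 19, wherein the historical search record determination module comprises: a historical search record group determination module configured to determine, based on the relationship among the plurality of historical search records, a group of historical search records associated with the first historical search record, the first historical search record having a degree of association with each historical search record in the group; and an association-degree-based historical search record determination module configured to determine the second historical search record from the group of historical search records based on the degrees of association.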
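- The apparatus according to claim 19, wherein the historical search record acquisition module comprises: a first historical search item determination module configured to determine, from a search log, a group of historical search items targeted by a group of historical search requests; an entity determination module configured to determine a plurality of entities from the group of historical search items, each entity identifying an object associated with the corresponding historical search item; a first key entity determination module configured to determine a key entity from the plurality of entities based on the numbers of occurrences of the plurality of entities in the group of historical search items; a selection module configured to select, from the group of historical search items, a plurality of historical search items comprising the key entity; and a generation module configured to generate the plurality of historical search records based on the plurality of historical search items and the key entity.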
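- The apparatus according to claim 26, wherein the first key entity determination module comprises: a historical search item set determination module configured to determine, from the group of historical search items, a set of historical search items each comprising a single entity; a second historical search item determination module configured to determine at least one historical search item from the set of historical search items, the number of occurrences in the set of the single entity comprised in the at least one historical search item exceeding a first threshold number; and a single-entity key entity determination module configured to determine the single entity comprised in the at least one historical search item as the key entity.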
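- The apparatus according to claim 26, wherein the key entity determination module comprises: a high-frequency entity determination module configured to determine, from the plurality of entities and based on the numbers of occurrences of the plurality of entities in the group of historical search items, a high-frequency entity whose number of occurrences exceeds a second threshold number; and a second key entity determination module configured to determine, in accordance with a determination that a weight of the high-frequency entity in the corresponding historical search item exceeds a threshold weight, the high-frequency entity as the key entity, wherein the weight indicates the importance of the high-frequency entity in the corresponding historical search item.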
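- The apparatus according to claim 28, wherein the second key entity determination module comprises: a position determination module configured to determine a position of the high-frequency entity in the corresponding historical search item; and a length relationship determination module configured to determine a relationship between a length of the high-frequency entity and a length of the corresponding historical search item.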
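- The apparatus according to claim 19, wherein the extended result determination module comprises: a first search result acquisition module configured to acquire a search result for the historical search item in the second historical search record; and a search-result-based extended result determination module configured to determine the search result as the extended result.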
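- The apparatus according to claim 19, wherein the extended result determination module comprises: a second search result acquisition module configured to acquire historical search results for the historical search item in the second historical search record; a partial historical search result determination module configured to determine, from the historical search results, a portion of the historical search results that has been accessed by users; and a partial historical search result extension module configured to determine that portion of the historical search results as the extended result.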
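- The apparatus according to claim 19, wherein the extended result determination module comprises: an information flow module configured to acquire an information flow generated when a user searched for the historical search item in the second historical search record; and an information-flow-based extended result module configured to determine the extended result based on the information flow.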
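- The apparatus according to claim 19, further comprising at least one of: a first providing apparatus configured to provide the extended result; and a second providing apparatus configured to provide the extended result and a target search result for the target search item.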
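- The apparatus according to claim 33, wherein the second providing apparatus comprises: a first score determination module configured to determine a first score of the extended result, the first score indicating a degree of relevance between the extended result and the historical search item in the second historical search record; a second score determination module configured to determine a second score of the target search result, the second score indicating a degree of relevance between the target search result and the target search item; a priority determination module configured to determine priorities of the extended result and the target search result based on the first score and the second score; and an extended result and search result providing module configured to provide the extended result and the target search result based on the priorities.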
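- The apparatus according to claim 19, further comprising: a target data source establishment module configured to establish a target data source for obtaining the search result corresponding to the second historical search record.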
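- The apparatus according to claim 35, wherein the target data source establishment module comprises: a document score determination module configured to determine scores of a plurality of documents in a plurality of original data sources, the score of each document indicating the quality of that document; and a target data source document determination module configured to determine, among the plurality of documents, documents whose scores exceed a threshold score as documents in the target data source.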
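- An electronic device, comprising: one or more processors; and a storage apparatus for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the method according to any one of claims 1-18.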
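- A computer-readable storage medium having a computer program stored thereon, the computer program, when executed by a processor, implementing the method according to any one of claims 1-18.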
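As an illustration only, the flow recited in claims 1 and 6 can be sketched in Python. Everything named here is an assumption made for the sketch and does not come from the claims: the `HistoricalSearchRecord` fields, exact string equality as the matching rule, the `related_categories` mapping standing in for the relationship among records, and the sample queries and document IDs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HistoricalSearchRecord:
    item: str        # the historical search item (query text)
    key_entity: str  # key entity contained in the item
    category: str    # category of the record (the user's demand)


def find_first_record(target_item, records):
    """Return the record matching the target search item (assumed: exact match)."""
    for record in records:
        if record.item == target_item:
            return record
    return None


def find_second_records(first, records, related_categories):
    """Records in a related category that share the first record's key entity."""
    wanted = related_categories.get(first.category, [])
    return [r for r in records
            if r is not first
            and r.category in wanted
            and r.key_entity == first.key_entity]


def extended_results(target_item, records, related_categories, results_by_item):
    """Collect search results of the associated records as the extended result."""
    first = find_first_record(target_item, records)
    if first is None:
        return []
    extended = []
    for second in find_second_records(first, records, related_categories):
        extended.extend(results_by_item.get(second.item, []))
    return extended


# Hypothetical sample data.
records = [
    HistoricalSearchRecord("Model X price", "Model X", "price"),
    HistoricalSearchRecord("Model X fuel consumption", "Model X", "fuel consumption"),
    HistoricalSearchRecord("Model Y fuel consumption", "Model Y", "fuel consumption"),
]
related_categories = {"price": ["fuel consumption"]}
results_by_item = {
    "Model X fuel consumption": ["doc-17", "doc-42"],
    "Model Y fuel consumption": ["doc-99"],
}

print(extended_results("Model X price", records, related_categories, results_by_item))
```

In this sketch the record for "Model Y fuel consumption" is skipped because it does not contain the key entity of the first record, which mirrors the last limitation of claim 6.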
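The key entity selection of claims 10 and 11 (occurrence count above a threshold, then a weight above a threshold) might be sketched as follows. The claims say only that the weight may depend on the entity's position in the search item and on the ratio of lengths; the equal-weighted formula in `entity_weight`, the assumption that entities have already been extracted per item, and the sample log are all hypothetical choices made for this sketch.

```python
from collections import Counter


def entity_weight(entity, item):
    """Hypothetical weight: earlier position and larger length share score higher."""
    position = item.find(entity)
    if position < 0 or not item:
        return 0.0
    position_score = 1.0 - position / len(item)  # 1.0 when the entity starts the item
    length_score = len(entity) / len(item)       # share of the item the entity covers
    return 0.5 * position_score + 0.5 * length_score


def key_entities(items_with_entities, min_count, min_weight):
    """Return entities whose count exceeds min_count and whose weight in some
    item exceeds min_weight (one qualifying item suffices in this sketch)."""
    counts = Counter(entity
                     for _, entities in items_with_entities
                     for entity in set(entities))
    keys = set()
    for item, entities in items_with_entities:
        for entity in entities:
            if counts[entity] > min_count and entity_weight(entity, item) > min_weight:
                keys.add(entity)
    return keys


# Hypothetical search log: (historical search item, entities found in it).
log = [
    ("Model X price", ["Model X"]),
    ("Model X fuel consumption", ["Model X"]),
    ("dealer near me for Model X", ["Model X"]),
    ("Model Y price", ["Model Y"]),
]

print(key_entities(log, min_count=1, min_weight=0.5))
```

Both comparisons are strict, following the word "exceeds" in the claims. "Model Y" is dropped here because it occurs only once, which does not exceed the count threshold of 1.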
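Claims 4 and 5 derive the relationship among records from search times in a search log without fixing a formula. One possible reading, offered purely as an assumption, is to count how often two different search items are issued by the same user within a short time window. The log layout `(user_id, timestamp_seconds, item)`, the per-user grouping, and the window length are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def association_by_time(log, window_seconds):
    """Count pairs of distinct search items issued by one user within the window.

    log is a list of (user_id, timestamp_seconds, item) tuples; the result maps
    frozenset({item_a, item_b}) to the number of co-occurrences.
    """
    by_user = {}
    for user, timestamp, item in log:
        by_user.setdefault(user, []).append((timestamp, item))

    pair_counts = Counter()
    for entries in by_user.values():
        entries.sort()  # chronological order, so t2 >= t1 below
        for (t1, a), (t2, b) in combinations(entries, 2):
            if a != b and t2 - t1 <= window_seconds:
                pair_counts[frozenset((a, b))] += 1
    return pair_counts


# Hypothetical search log.
search_log = [
    ("u1", 0, "Model X price"),
    ("u1", 60, "Model X fuel consumption"),
    ("u2", 10, "Model X price"),
    ("u2", 50, "Model X fuel consumption"),
    ("u2", 5000, "Model Y price"),
]

counts = association_by_time(search_log, window_seconds=300)
print(counts)
```

A higher count can then be treated as a higher degree of association between the two items, or between their categories when the counts are aggregated per category.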
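Claim 16 orders extended results and target search results by priorities derived from two relevance scores. A minimal sketch, assuming that the two scores are directly comparable and that priority is simply descending score (neither of which the claim states), could look like this; the document IDs and score values are made up.

```python
def merge_by_priority(target, extended):
    """Merge (doc_id, score) lists into one list ordered by descending score.

    Each output entry is (doc_id, source), where source is "target" or
    "extended". Python's sort is stable, so on equal scores target results
    keep their place ahead of extended results.
    """
    tagged = [(score, "target", doc) for doc, score in target]
    tagged += [(score, "extended", doc) for doc, score in extended]
    tagged.sort(key=lambda entry: entry[0], reverse=True)
    return [(doc, source) for _, source, doc in tagged]


# Hypothetical scores: second score for target results, first score for extended results.
target_results = [("doc-3", 0.91), ("doc-8", 0.55)]
extended_results_scored = [("doc-17", 0.82), ("doc-21", 0.40)]

ranking = merge_by_priority(target_results, extended_results_scored)
print(ranking)
```

Keeping the source label on each entry lets a result page mark which entries answer the original query and which come from the associated historical search record.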
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP20929634.2A EP4113329A4 (en) | 2020-04-01 | 2020-09-23 | METHOD, APPARATUS AND DEVICE FOR SEARCHING CONTENT, AND COMPUTER-READABLE STORAGE MEDIUM |
KR1020227027825A KR20220119745A (ko) | 2020-04-01 | 2020-09-23 | Method, apparatus, device, and computer-readable storage medium for searching content |
JP2022553192A JP7451747B2 (ja) | 2020-04-01 | 2020-09-23 | Method, apparatus, device, and computer-readable storage medium for searching content |
US17/914,557 US20230147941A1 (en) | 2020-04-01 | 2020-09-23 | Method, apparatus and device used to search for content |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010252907.2 | 2020-04-01 | ||
CN202010252907.2A CN111475725B (zh) | 2020-04-01 | 2020-04-01 | Method, apparatus, device, and computer-readable storage medium for searching content |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2021196541A1 (zh) | 2021-10-07 |
Family
ID=71749483
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/CN2020/117129 WO2021196541A1 (zh) | 2020-04-01 | 2020-09-23 | Method, apparatus, device, and computer-readable storage medium for searching content |
Country Status (6)
Country | Link |
---|---|
US (1) | US20230147941A1 (zh) |
EP (1) | EP4113329A4 (zh) |
JP (1) | JP7451747B2 (zh) |
KR (1) | KR20220119745A (zh) |
CN (1) | CN111475725B (zh) |
WO (1) | WO2021196541A1 (zh) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
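CN111475725B (zh) * | 2020-04-01 | 2023-11-07 | 百度在线网络技术(北京)有限公司 | Method, apparatus, device, and computer-readable storage medium for searching content |
CN112053688B (zh) * | 2020-08-27 | 2024-03-08 | 海信视像科技股份有限公司 | Voice interaction method, interaction device, and server |
CN112528144A (zh) * | 2020-12-08 | 2021-03-19 | 北京百度网讯科技有限公司 | Search recommendation method and apparatus, smart device, electronic device, and storage medium |
CN113051485B (zh) * | 2021-03-26 | 2023-08-22 | 北京达佳互联信息技术有限公司 | Group search method and apparatus, terminal, and storage medium |
CN117972222B (zh) * | 2024-04-02 | 2024-06-21 | 紫金诚征信有限公司 | Artificial-intelligence-based enterprise information retrieval method and apparatus |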
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103577489A (zh) * | 2012-08-08 | 2014-02-12 | 百度在线网络技术(北京)有限公司 | Web page browsing history query method and apparatus |
CN103617266A (zh) * | 2013-12-03 | 2014-03-05 | 北京奇虎科技有限公司 | Personalized extended search method, apparatus, and system |
CN106096003A (zh) * | 2014-12-26 | 2016-11-09 | 奇飞翔艺(北京)软件有限公司 | Data search method and client |
CN106372231A (zh) * | 2016-09-08 | 2017-02-01 | 乐视控股(北京)有限公司 | Search method and apparatus |
CN111475725A (zh) * | 2020-04-01 | 2020-07-31 | 百度在线网络技术(北京)有限公司 | Method, apparatus, device, and computer-readable storage medium for searching content |
Family Cites Families (31)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH01233517A (ja) * | 1988-03-14 | 1989-09-19 | Mitsubishi Electric Corp | Database search device |
US7562069B1 (en) * | 2004-07-01 | 2009-07-14 | Aol Llc | Query disambiguation |
US20060224583A1 (en) * | 2005-03-31 | 2006-10-05 | Google, Inc. | Systems and methods for analyzing a user's web history |
CN101192223A (zh) * | 2006-11-27 | 2008-06-04 | 北京三星通信技术研究有限公司 | Yellow pages search method and yellow pages search system |
KR101348598B1 (ko) * | 2007-12-21 | 2014-01-07 | 삼성전자주식회사 | Digital TV broadcast providing system, digital TV, and control method thereof |
JP2010122932A (ja) * | 2008-11-20 | 2010-06-03 | Nippon Telegr & Teleph Corp <Ntt> | Document search device, document search method, and document search program |
JP5220659B2 (ja) * | 2009-02-27 | 2013-06-26 | ヤフー株式会社 | Search device and method |
US20100332493A1 (en) * | 2009-06-25 | 2010-12-30 | Yahoo! Inc. | Semantic search extensions for web search engines |
CN102012900B (zh) * | 2009-09-04 | 2013-01-30 | 阿里巴巴集团控股有限公司 | Information retrieval method and system |
JP5493845B2 (ja) * | 2009-12-28 | 2014-05-14 | 富士通株式会社 | Search support program, search support device, and search support method |
CN101840420B (zh) * | 2010-04-02 | 2011-12-28 | 清华大学 | Search assistance system and search assistance method |
US20140358971A1 (en) * | 2010-10-19 | 2014-12-04 | Google Inc. | Techniques for identifying chain businesses and queries |
US20120203751A1 (en) * | 2011-02-07 | 2012-08-09 | International Business Machines Corporation | Capture, Aggregate, and Use Search Activities as a Source of Social Data Within an Enterprise |
KR101818717B1 (ko) * | 2011-09-27 | 2018-01-15 | 네이버 주식회사 | Search method and apparatus using a concept keyword expansion data set, and computer-readable recording medium |
AU2012247097B2 (en) * | 2011-11-14 | 2015-04-16 | Google Inc. | Visual search history |
CN102419776A (zh) * | 2011-12-31 | 2012-04-18 | 北京百度网讯科技有限公司 | Method and device for meeting users' multi-dimensional search needs |
CN102929966B (zh) * | 2012-10-12 | 2016-03-09 | 合一网络技术(北京)有限公司 | Method and system for providing a personalized search list |
CN103049495A (zh) * | 2012-12-07 | 2013-04-17 | 百度在线网络技术(北京)有限公司 | Method, apparatus, and device for providing search suggestions corresponding to a query sequence |
CN103593410B (zh) * | 2013-10-22 | 2017-04-12 | 上海交通大学 | System for search recommendation by replacing conceptual words |
US20160306887A1 (en) * | 2013-12-03 | 2016-10-20 | Beijing Qihoo Technology Company Limited | Methods, apparatuses and systems for linked and personalized extended search |
CN105335391B (zh) * | 2014-07-09 | 2019-02-15 | 阿里巴巴集团控股有限公司 | Search-engine-based method and apparatus for processing search requests |
CN104462325B (zh) * | 2014-12-02 | 2019-05-03 | 百度在线网络技术(北京)有限公司 | Search recommendation method and apparatus |
CN105893397B (zh) * | 2015-06-30 | 2019-03-15 | 北京爱奇艺科技有限公司 | Video recommendation method and apparatus |
JP6664599B2 (ja) * | 2015-08-25 | 2020-03-13 | ヤフー株式会社 | Ambiguity evaluation device, ambiguity evaluation method, and ambiguity evaluation program |
CN105898423A (zh) * | 2015-12-08 | 2016-08-24 | 乐视网信息技术(北京)股份有限公司 | Video push method, system, and server |
CN106095819A (zh) * | 2016-05-31 | 2016-11-09 | 北京奇艺世纪科技有限公司 | Video recommendation method and apparatus |
CN109446402B (zh) * | 2017-08-29 | 2022-04-01 | 阿里巴巴集团控股有限公司 | Search method and apparatus |
CN108399232A (zh) * | 2018-02-13 | 2018-08-14 | 北京奇虎科技有限公司 | Information push method and apparatus, and electronic device |
CN109101658B (zh) * | 2018-08-31 | 2022-05-10 | 优视科技新加坡有限公司 | Information search method and apparatus, and device/terminal/server |
EP3918486A4 (en) * | 2019-02-01 | 2022-10-19 | Ancestry.com Operations Inc. | SEARCH AND RANK RECORDS ACROSS DIFFERENT DATABASES |
CN110245357B (zh) * | 2019-06-26 | 2023-05-02 | 北京百度网讯科技有限公司 | Main entity recognition method and apparatus |
2020
- 2020-04-01 CN CN202010252907.2A patent/CN111475725B/zh active Active
- 2020-09-23 US US17/914,557 patent/US20230147941A1/en active Pending
- 2020-09-23 JP JP2022553192A patent/JP7451747B2/ja active Active
- 2020-09-23 EP EP20929634.2A patent/EP4113329A4/en active Pending
- 2020-09-23 KR KR1020227027825A patent/KR20220119745A/ko unknown
- 2020-09-23 WO PCT/CN2020/117129 patent/WO2021196541A1/zh unknown
Non-Patent Citations (1)
Title |
---|
See also references of EP4113329A4 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116628129A (zh) * | 2023-07-21 | 2023-08-22 | 南京爱福路汽车科技有限公司 | Automobile parts search method and system |
CN116628129B (zh) * | 2023-07-21 | 2024-02-27 | 南京爱福路汽车科技有限公司 | Automobile parts search method and system |
Also Published As
Publication number | Publication date |
---|---|
EP4113329A1 (en) | 2023-01-04 |
EP4113329A4 (en) | 2024-04-24 |
CN111475725B (zh) | 2023-11-07 |
KR20220119745A (ko) | 2022-08-30 |
US20230147941A1 (en) | 2023-05-11 |
JP7451747B2 (ja) | 2024-03-18 |
CN111475725A (zh) | 2020-07-31 |
JP2023516209A (ja) | 2023-04-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2021196541A1 (zh) | Method, apparatus, device, and computer-readable storage medium for searching content | |
KR102075833B1 (ko) | Artwork recommendation curation method and system | |
US8095539B2 (en) | Taxonomy-based object classification | |
US20190205472A1 (en) | Ranking Entity Based Search Results Based on Implicit User Interactions | |
CN109271574A (zh) | Hot word recommendation method and apparatus | |
Xu et al. | Web content mining | |
Hindle et al. | Clustering web video search results based on integration of multiple features | |
US20150120720A1 (en) | Method and system of identifying relevant content snippets that include additional information | |
US20100042610A1 (en) | Rank documents based on popularity of key metadata | |
Makvana et al. | A novel approach to personalize web search through user profiling and query reformulation | |
US20170185672A1 (en) | Rank aggregation based on a markov model | |
Saoud et al. | Integrating social profile to improve the source selection and the result merging process in distributed information retrieval | |
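CN115248839A (zh) | Knowledge-system-based long text retrieval method and apparatus | |
CN115391479A (zh) | Ranking method and apparatus for document search, electronic medium, and storage medium | |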
Rajkumar et al. | Users’ click and bookmark based personalization using modified agglomerative clustering for web search engine | |
Wang et al. | An efficient refinement algorithm for multi-label image annotation with correlation model | |
Ren et al. | Role-explicit query extraction and utilization for quantifying user intents | |
Rashmi et al. | Deep web crawler: exploring and re-ranking of web forms | |
Ullah et al. | Query subtopic mining for search result diversification | |
Kambau et al. | Unified concept-based multimedia information retrieval technique | |
Hung et al. | Reorganization of search results based on semantic clustering | |
Cheng et al. | Learning To Rank Relevant Documents for Information Retrieval in Bioengineering Text Corpora | |
Sheth et al. | Ontology Based Semantic Web Information Retrieval Enhancing Search Significance | |
Wang et al. | GSR: A Resource Model and Semantics-based API Recommendation Algorithm | |
Nechakhin et al. | Similar Papers Recommendation for Research Comparisons. |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | EP: The EPO has been informed by WIPO that EP was designated in this application | Ref document number: 20929634; Country of ref document: EP; Kind code of ref document: A1 |
ENP | Entry into the national phase | Ref document number: 20227027825; Country of ref document: KR; Kind code of ref document: A |
ENP | Entry into the national phase | Ref document number: 2022553192; Country of ref document: JP; Kind code of ref document: A |
ENP | Entry into the national phase | Ref document number: 2020929634; Country of ref document: EP; Effective date: 20220930 |
NENP | Non-entry into the national phase | Ref country code: DE |