CN115455274A - Method, device and equipment for recommending candidate search terms and storage medium - Google Patents

Method, device and equipment for recommending candidate search terms and storage medium

Info

Publication number
CN115455274A
Authority
CN
China
Prior art keywords
data
occurrence
relation data
occurrence relation
determining
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211133958.9A
Other languages
Chinese (zh)
Inventor
楚振江
黄川�
杨文博
吴永巍
范彪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu com Times Technology Beijing Co Ltd
Original Assignee
Baidu com Times Technology Beijing Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Baidu com Times Technology Beijing Co Ltd
Priority to CN202211133958.9A
Publication of CN115455274A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9532 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/955 Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
    • G06F16/9566 URL specific, e.g. using aliases, detecting broken or misspelled links

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The disclosure provides a method, a device, equipment and a storage medium for recommending candidate search terms, and relates to the technical fields of big data, search, recommendation and the like. The specific implementation scheme is as follows: in response to detecting that a target web address is clicked, determining, among a plurality of co-occurrence relationship data, at least one target co-occurrence relationship data corresponding to the target web address, wherein each of the plurality of co-occurrence relationship data comprises a web address and a candidate search term that have a co-occurrence relationship with each other; acquiring a click rate evaluation value of the at least one target co-occurrence relationship data; determining, according to the click rate evaluation value of the at least one target co-occurrence relationship data, co-occurrence relationship data to be recommended among the at least one target co-occurrence relationship data; and recommending the candidate search terms in the co-occurrence relationship data to be recommended.

Description

Method, device and equipment for recommending candidate search terms and storage medium
Technical Field
The present disclosure relates to the field of artificial intelligence technology, and more particularly, to the field of big data, search, recommendation, and the like.
Background
With the development of search engines, today's search engines no longer stop at providing users with search results that match their search terms, but pay increasing attention to how to better meet users' needs. For example, after providing search results that match a search term, a search engine may further recommend candidate search terms for the user's subsequent searches.
Disclosure of Invention
The disclosure provides a method, a device, equipment, a storage medium and a program product for recommending candidate search terms.
According to an aspect of the present disclosure, there is provided a method for recommending candidate search terms, including: in response to detecting that a target web address is clicked, determining, among a plurality of co-occurrence relationship data, at least one target co-occurrence relationship data corresponding to the target web address, wherein each of the plurality of co-occurrence relationship data comprises a web address and a candidate search term that have a co-occurrence relationship with each other; acquiring a click rate evaluation value of the at least one target co-occurrence relationship data; determining, according to the click rate evaluation value of the at least one target co-occurrence relationship data, co-occurrence relationship data to be recommended among the at least one target co-occurrence relationship data; and recommending the candidate search terms in the co-occurrence relationship data to be recommended.
According to another aspect of the present disclosure, there is provided an apparatus for recommending candidate search terms, including: a first determining module configured to determine, in response to detecting that a target web address is clicked, at least one target co-occurrence relationship data corresponding to the target web address among a plurality of co-occurrence relationship data, wherein each of the plurality of co-occurrence relationship data comprises a web address and a candidate search term that have a co-occurrence relationship with each other; an acquisition module configured to acquire a click rate evaluation value of the at least one target co-occurrence relationship data; a second determining module configured to determine, according to the click rate evaluation value of the at least one target co-occurrence relationship data, co-occurrence relationship data to be recommended among the at least one target co-occurrence relationship data; and a recommending module configured to recommend the candidate search terms in the co-occurrence relationship data to be recommended.
Another aspect of the present disclosure provides an electronic device including: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of the embodiments of the present disclosure.
According to another aspect of the disclosed embodiments, there is provided a non-transitory computer readable storage medium storing computer instructions for causing a computer to perform the method shown in the disclosed embodiments.
According to another aspect of the embodiments of the present disclosure, there is provided a computer program product including a computer program/instructions which, when executed by a processor, implement the steps of the method shown in the embodiments of the present disclosure.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present disclosure, nor do they limit the scope of the present disclosure. Other features of the present disclosure will become apparent from the following description.
Drawings
The drawings are included to provide a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
FIG. 1 schematically illustrates an exemplary system architecture to which a method and apparatus for recommending candidate search terms may be applied, according to an embodiment of the present disclosure;
FIG. 2 schematically shows a flow diagram of a method for recommending candidate search terms according to an embodiment of the present disclosure;
FIG. 3 schematically illustrates a flow chart of a method of determining a click rate estimate for each co-occurrence data, in accordance with an embodiment of the present disclosure;
FIG. 4 schematically illustrates a method of obtaining multiple co-occurrence relationship data, in accordance with an embodiment of the disclosure;
FIG. 5 schematically illustrates a method of obtaining multiple co-occurrence relationship data according to another embodiment of the disclosure;
FIG. 6 is a schematic diagram illustrating a method of recommending candidate search terms according to an embodiment of the present disclosure;
FIG. 7 schematically shows a block diagram of an apparatus for recommending candidate search terms according to an embodiment of the present disclosure; and
FIG. 8 schematically shows a block diagram of an example electronic device that may be used to implement embodiments of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below with reference to the accompanying drawings, in which various details of the embodiments of the disclosure are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
With the development of search engines, search engines of today can not only provide users with search results matching with search terms, but also recommend candidate search terms based on click behaviors of the users after retrieval.
For example, if a user enters the search term "blessing words" in a search engine, the search engine returns search results matching that term. However, the search term may be imprecise, and the user may need to search further. According to embodiments of the present disclosure, the user's next search need can be predicted, and corresponding candidate search terms can be recommended, that is, candidate search terms for the user's likely next search request, such as "wedding blessing", "birthday blessing", "mid-autumn blessing", "blessings for elders", and the like. The user can then search further using the candidate search terms, which improves both search efficiency and search experience.
The system architecture of the method and apparatus for recommending candidate search terms provided by the present disclosure will be described below with reference to fig. 1.
Fig. 1 schematically illustrates an exemplary system architecture 100 to which the method and apparatus for recommendation of candidate search terms may be applied, according to an embodiment of the present disclosure. It should be noted that fig. 1 is only an example of a system architecture to which the embodiments of the present disclosure may be applied to help those skilled in the art understand the technical content of the present disclosure, and does not mean that the embodiments of the present disclosure may not be applied to other devices, systems, environments or scenarios.
As shown in fig. 1, the system architecture 100 according to this embodiment may include terminal devices 101, 102, 103, a network 104 and a server 105. The network 104 serves as a medium for providing communication links between the terminal devices 101, 102, 103 and the server 105. Network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
The user may use the terminal devices 101, 102, 103 to interact with the server 105 via the network 104 to receive or send messages or the like. The terminal devices 101, 102, 103 may have installed thereon various communication client applications, such as a search-type application, a shopping-type application, a web browser application, an instant messaging tool, a mailbox client, social platform software, etc. (by way of example only).
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 105 may be a server that provides various services, such as a search engine server, for providing a search service to a user using the terminal device 101, 102, 103. The search engine server may perform processing such as searching on the received data such as search terms, and feed back the search result (e.g., web page, information, or data obtained or generated according to a user request) to the terminal device.
It should be noted that the recommendation method for candidate search terms provided by the embodiments of the present disclosure may be generally executed by the server 105. Accordingly, the recommendation device for candidate search terms provided by the embodiments of the present disclosure may be generally disposed in the server 105. The recommendation method for candidate search terms provided by the embodiment of the present disclosure may also be executed by a server or a server cluster that is different from the server 105 and is capable of communicating with the terminal devices 101, 102, 103 and/or the server 105. Accordingly, the recommendation device for candidate search terms provided by the embodiment of the present disclosure may also be disposed in a server or a server cluster different from the server 105 and capable of communicating with the terminal devices 101, 102, 103 and/or the server 105.
It should be understood that the number of terminal devices, networks, and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure, application and other handling of users' personal information comply with the relevant laws and regulations, necessary confidentiality measures are taken, and public order and good customs are not violated.
In the technical scheme of the disclosure, before the personal information of the user is acquired or collected, the authorization or the consent of the user is acquired.
The method for recommending candidate search terms provided by the present disclosure will be described below with reference to FIG. 2.
FIG. 2 schematically shows a flow diagram of a method for recommending candidate search terms according to an embodiment of the disclosure.
As shown in FIG. 2, the method 200 includes, in operation S210, in response to detecting that a target web address is clicked, determining at least one target co-occurrence relationship data corresponding to the target web address from among a plurality of co-occurrence relationship data.
Wherein each of the plurality of co-occurrence relationship data includes a web address and a candidate search term having a co-occurrence relationship with each other.
According to the embodiment of the disclosure, the target web address may be, for example, a web address of any page in the search result.
According to embodiments of the present disclosure, a search engine may record, for example, users' search behavior data and click behavior data for web pages in search results. The search behavior data may include, for example, the search term and the time at which the search occurred, and the click behavior data may include, for example, the web address of the clicked web page and the time at which the click occurred. If a search behavior and a click behavior occur close together in time, the web address of the click behavior and the search term of the search behavior have a co-occurrence relationship. On this basis, a plurality of co-occurrence relationship data can be determined from the search behavior data and click behavior data recorded by the search engine.
Then, in operation S220, a click rate evaluation value of at least one target co-occurrence relation data is acquired.
According to the embodiment of the disclosure, for example, click rate evaluation may be performed on the plurality of co-occurrence relationship data in advance, and the resulting click rate evaluation values may be recorded. On this basis, the click rate evaluation value of the at least one target co-occurrence relationship data can be obtained by reading the recorded values.
In operation S230, co-occurrence relationship data to be recommended in the at least one target co-occurrence relationship data is determined according to the click rate evaluation value of the at least one target co-occurrence relationship data.
According to the embodiment of the present disclosure, a larger click rate evaluation value may indicate a higher probability of being clicked. On this basis, for example, the target co-occurrence relationship data whose click rate evaluation value is greater than a click rate threshold may be determined as the co-occurrence relationship data to be recommended. The click rate threshold can be set according to actual needs.
In operation S240, candidate search terms in the to-be-recommended co-occurrence relationship data are recommended.
According to the embodiment of the disclosure, recommending candidate search terms selected by click rate evaluation value can improve the accuracy and efficiency of search term recommendation and help the user quickly find the required content.
According to the embodiment of the disclosure, for example, the candidate search terms corresponding to the target web address may be displayed around the display area of the target web address in the web page. A user may initiate a search for a candidate search term by clicking on it.
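Operations S210 to S240 can be sketched as follows. This is only an illustration under stated assumptions: the in-memory `index` mapping, the function name `recommend_candidates`, the example URL and the threshold of 0.5 are all hypothetical and do not come from the disclosure.

```python
from typing import Dict, List, Tuple

# Hypothetical store: URL -> list of (candidate search term, recorded click rate evaluation value).
CoOccurrenceIndex = Dict[str, List[Tuple[str, float]]]


def recommend_candidates(
    index: CoOccurrenceIndex,
    target_url: str,
    ctr_threshold: float = 0.5,  # assumed value; the disclosure leaves it to actual needs
) -> List[str]:
    """Return candidate search terms for a clicked URL, highest evaluation value first."""
    # S210: select the co-occurrence records whose URL is the clicked target URL.
    targets = index.get(target_url, [])
    # S220 and S230: read the recorded evaluation values and keep those above the threshold.
    selected = [(term, score) for term, score in targets if score > ctr_threshold]
    # S240: recommend the surviving terms (sorting for display is an assumption).
    selected.sort(key=lambda pair: pair[1], reverse=True)
    return [term for term, _ in selected]


index = {
    "example.com/blessings": [
        ("birthday blessing", 0.64),
        ("wedding blessing", 0.82),
        ("unrelated term", 0.10),
    ]
}
print(recommend_candidates(index, "example.com/blessings"))
```

A URL with no recorded co-occurrence data simply yields an empty list, so nothing is recommended for it.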
According to an embodiment of the present disclosure, the click rate evaluation value of each co-occurrence relationship data may be predetermined. A method of determining a click rate evaluation value for each co-occurrence relationship data provided by the present disclosure will be described below with reference to fig. 3.
Fig. 3 schematically shows a flowchart of a method of determining a click rate evaluation value for each co-occurrence relationship data according to an embodiment of the present disclosure.
As shown in fig. 3, the method 300 includes acquiring a plurality of co-occurrence relationship data in operation S310.
According to the embodiment of the disclosure, for example, object behavior data of a search engine may be acquired, wherein the object behavior data records at least one search behavior and at least one click behavior. The object behavior data may then be segmented according to behavior occurrence time to obtain a plurality of data slices, each containing the behavior records within a predetermined time period. The predetermined time period can be set according to actual needs, and may be 15 minutes, for example. Next, co-occurrence relationship data may be determined based on the search behaviors and click behaviors recorded in each data slice.
According to the embodiment of the disclosure, segmenting the behavior data into data slices allows different search processes to be distinguished, which improves the validity of the data.
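The time-based segmentation can be sketched as below. The `Behavior` record and the rule "open a new slice once 15 minutes have passed since the slice's first record" are assumptions made for illustration; the disclosure only states that each slice covers a predetermined time period.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple


class Behavior(NamedTuple):
    time: datetime
    kind: str   # "se" for a search, "click" for a click
    value: str  # the query for a search, the URL for a click


def slice_behaviors(
    behaviors: List[Behavior],
    length: timedelta = timedelta(minutes=15),
) -> List[List[Behavior]]:
    """Split one user's behaviors into consecutive slices, each spanning less than `length`."""
    slices: List[List[Behavior]] = []
    for b in sorted(behaviors, key=lambda x: x.time):
        # Open a new slice when none exists yet or the current slice has run out of time.
        if not slices or b.time - slices[-1][0].time >= length:
            slices.append([])
        slices[-1].append(b)
    return slices


t0 = datetime(2022, 9, 1, 10, 0)
log = [
    Behavior(t0, "se", "q1"),
    Behavior(t0 + timedelta(minutes=5), "click", "u1"),
    Behavior(t0 + timedelta(minutes=20), "se", "q2"),
]
slices = slice_behaviors(log)
print([len(s) for s in slices])
```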
In operation S320, a search term feature, a web site feature, and a cross feature corresponding to each co-occurrence relationship data are determined.
According to the embodiment of the disclosure, for example, for each co-occurrence relationship data, the search term feature may be determined according to the candidate search term in the co-occurrence relationship data, the web address feature may be determined according to the web address in the co-occurrence relationship data, and the cross feature may be determined according to the correlation between the candidate search term and the web address in the co-occurrence relationship data. For example, feature extraction may be performed on the candidate search term to obtain the search term feature, and on the web address to obtain the web address feature. The correlation between the search term feature and the web address feature is then calculated to obtain the cross feature.
In operation S330, a click rate evaluation value of each co-occurrence relationship data is determined according to the search term feature, the website feature, and the cross feature corresponding to each co-occurrence relationship data.
According to the embodiment of the disclosure, for example, for each co-occurrence relationship data, the search term feature, the web address feature and the cross feature corresponding to the co-occurrence relationship data may be merged to obtain candidate data. The candidate data is then evaluated using a click-through rate model to obtain the click rate evaluation value of the co-occurrence relationship data. For example, the search term feature and the cross feature may be merged first to obtain an intermediate feature, and the intermediate feature and the web address feature may then be merged to obtain the candidate data.
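The merging order described above can be illustrated with a small sketch. Several choices here are assumptions, not part of the disclosure: features are plain float vectors, the cross feature is the cosine similarity of the two vectors, and a logistic function with hand-set weights stands in for the click-through rate model.

```python
import math
from typing import List, Sequence


def cross_feature(term_vec: Sequence[float], url_vec: Sequence[float]) -> List[float]:
    """Cosine similarity of the two feature vectors, returned as a one-element cross feature."""
    dot = sum(a * b for a, b in zip(term_vec, url_vec))
    norm = math.sqrt(sum(a * a for a in term_vec)) * math.sqrt(sum(b * b for b in url_vec))
    return [dot / norm if norm else 0.0]


def merge_features(term_vec: Sequence[float], url_vec: Sequence[float]) -> List[float]:
    # Merge the search term feature with the cross feature to get the intermediate feature,
    # then append the web address feature to get the candidate data.
    intermediate = list(term_vec) + cross_feature(term_vec, url_vec)
    return intermediate + list(url_vec)


def ctr_score(candidate: Sequence[float], weights: Sequence[float], bias: float = 0.0) -> float:
    """Stand-in click-through rate model: logistic function of a weighted sum."""
    z = bias + sum(w * x for w, x in zip(weights, candidate))
    return 1.0 / (1.0 + math.exp(-z))


candidate = merge_features([1.0, 0.0], [1.0, 0.0])
score = ctr_score(candidate, weights=[0.2, 0.0, 1.5, 0.1, 0.0], bias=-1.0)
print(candidate, round(score, 3))
```

In practice the weights would come from a trained model rather than being set by hand.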
According to another embodiment of the present disclosure, for example, the relevance between the candidate search term in the plurality of co-occurrence relationship data and the web address may be evaluated to obtain a relevance evaluation value. Wherein, the relevance evaluation value can be used to represent the relevance between the candidate search term and the web address. And then deleting the co-occurrence relationship data of which the correlation evaluation value is smaller than the correlation threshold value from the plurality of co-occurrence relationship data. Wherein, the correlation threshold can be set by actual requirements. By deleting the co-occurrence relation data of which the correlation evaluation value is smaller than the correlation threshold, unreasonable data can be deleted, the recommendation effect is improved, and the data volume of subsequent processing is reduced.
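A minimal sketch of this relevance filter follows. The scoring function is a toy word-overlap measure against hypothetical page titles (`PAGE_TITLES`); the disclosure does not specify how the relevance evaluation value is computed, so any scorer with the same signature could be substituted.

```python
from typing import Callable, Dict, Iterable, List, Tuple

Pair = Tuple[str, str]  # (url, candidate search term)

# Hypothetical page metadata used only by the toy scorer below.
PAGE_TITLES: Dict[str, str] = {"example.com/a": "wedding blessing messages"}


def title_overlap(url: str, term: str) -> float:
    """Toy relevance: fraction of the term's words that appear in the page title."""
    title_words = set(PAGE_TITLES.get(url, "").lower().split())
    term_words = set(term.lower().split())
    return len(title_words & term_words) / len(term_words) if term_words else 0.0


def filter_by_relevance(
    pairs: Iterable[Pair],
    relevance: Callable[[str, str], float],
    threshold: float,
) -> List[Pair]:
    """Delete pairs whose relevance evaluation value is smaller than the threshold."""
    return [(url, term) for url, term in pairs if relevance(url, term) >= threshold]


pairs = [("example.com/a", "wedding blessing"), ("example.com/a", "stock prices")]
kept = filter_by_relevance(pairs, title_overlap, threshold=0.5)
print(kept)
```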
According to another embodiment of the present disclosure, the object behavior data may include, for example, a user behavior log of a search engine, and the analysis may be performed on that log. The user behavior log may include sessions, for example.
In this embodiment, when a user uses a search engine, the user initiates a search for a search term query, and the search engine may record the search behavior. For example, in this embodiment, the type of a search behavior in the user behavior log may be marked as se. When a user clicks on a web page in the search results, the click behavior is also recorded. For example, in this embodiment, the type of a click behavior in the user behavior log may be marked as click. In this way, during the search process, the user behavior log records each search behavior and click behavior of the user together with the time at which it occurred. Illustratively, the user behavior log may take the form of sequence data.
For example, in chronological order, a user first searches for the search term query "Liu Mou". The search engine provides related web pages as search results. Among these results, the user clicks the web page whose web address url is "Liu Mou encyclopedia", views it, returns to the search engine, and initiates a new search with the search term query "Four Heavenly Kings". After browsing the search results for "Four Heavenly Kings", the user initiates a search with the search term query "Hong Mou". In the search results for "Hong Mou", the user clicks the web page whose web address url is "ent.xxx.com" and browses it. After browsing that page, the user initiates a search with the search term query "Wu Mou" in the search engine. On this basis, records such as those shown in Table 1 below may be formed in the user behavior log.
[Table 1 appears as an image in the original publication and is not reproduced here.]
TABLE 1
Personal information desensitization is first performed on the user behavior log of the search engine. The desensitized behavior data of each user is then sorted by time to obtain the search behavior and click behavior data of each individual user.
The time-ordered behavior data of each individual user may then be divided into multiple data slices according to a predetermined time period, and the search behaviors and click behaviors recorded in the same data slice are determined to have a co-occurrence relationship. In this embodiment, a data slice may be a session, for example, and the predetermined time period may be 15 minutes, for example. In this way, co-occurrence relationship data in the user dimension can be obtained. Each piece of co-occurrence relationship data may include a web address url and a search term query that have a co-occurrence relationship, which indicates that a user who clicks the web address url is more likely to initiate a search for the search term query. By way of example, in this embodiment the co-occurrence relationship data may be as shown in Table 2, where the web address url serves as the key field and the search term query serves as the candidate search term field.
key                           candidate
url = Liu Mou encyclopedia    query = Four Heavenly Kings
url = Liu Mou encyclopedia    query = Hong Mou
url = ent.xxx.com             query = Wu Mou
TABLE 2
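The rows of Table 2 can be reproduced from the example session with a simple pairing rule. The rule used here, pairing each clicked url with the searches that follow it until the next click within the same session, is inferred from the example and is an assumption; the disclosure only states that behaviors in the same data slice co-occur. The placeholder names follow the example session.

```python
from typing import List, Optional, Tuple

Event = Tuple[str, str]  # (behavior type, value): ("se", query) or ("click", url)


def extract_pairs(session: List[Event]) -> List[Tuple[str, str]]:
    """Pair each clicked URL with the searches that follow it, up to the next click."""
    pairs: List[Tuple[str, str]] = []
    current_url: Optional[str] = None
    for kind, value in session:
        if kind == "click":
            current_url = value
        elif kind == "se" and current_url is not None:
            # A search made after a click co-occurs with the clicked URL.
            pairs.append((current_url, value))
    return pairs


session = [
    ("se", "Liu Mou"),
    ("click", "Liu Mou encyclopedia"),
    ("se", "Four Heavenly Kings"),
    ("se", "Hong Mou"),
    ("click", "ent.xxx.com"),
    ("se", "Wu Mou"),
]
for url, query in extract_pairs(session):
    print(f"url = {url}\tquery = {query}")
```

The first search has no preceding click, so it produces no pair under this rule.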
Association counting is then performed on the web address url and the search term query in the data slices of each user. The co-occurrence relationship data obtained from all user behavior logs are aggregated, and the occurrences of identical co-occurrence relationship data are accumulated to give the co-occurrence count of that data. The search term queries corresponding to the same web address url are sorted by co-occurrence count from high to low, and the top n search term queries are kept as candidate search terms. Here n is a positive integer that may be set according to actual needs, for example 1000.
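The counting and truncation step can be sketched with the standard library. The function name `top_candidates` and the alphabetical tie-break between equal counts are assumptions added for determinism.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, List, Tuple


def top_candidates(
    pairs: Iterable[Tuple[str, str]],
    n: int = 1000,
) -> Dict[str, List[Tuple[str, int]]]:
    """Count identical (url, query) pairs and keep the n most frequent queries per URL."""
    counts = Counter(pairs)  # co-occurrence count of each (url, query) pair
    by_url: Dict[str, List[Tuple[str, int]]] = defaultdict(list)
    for (url, query), count in counts.items():
        by_url[url].append((query, count))
    # Sort by count from high to low; ties are broken alphabetically (an assumption).
    return {
        url: sorted(items, key=lambda qc: (-qc[1], qc[0]))[:n]
        for url, items in by_url.items()
    }


pairs = [("u", "a"), ("u", "b"), ("u", "a"), ("v", "c")]
print(top_candidates(pairs, n=1))
```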
The method for acquiring data of a plurality of co-occurrence relationships shown above is further described with reference to fig. 4 and 5 in conjunction with a specific embodiment. Those skilled in the art will appreciate that the following example embodiments are only for the understanding of the present disclosure, and the present disclosure is not limited thereto.
Fig. 4 schematically shows a schematic diagram of a method of acquiring multiple co-occurrence relationship data according to an embodiment of the present disclosure.
FIG. 4 shows that, when processing search engine user behavior logs, PC (personal computer) side behavior logs and mobile side behavior logs may be imported in units of days. The PC side behavior log is the user behavior log recorded by the PC side web search engine, and the mobile side behavior log is the user behavior log recorded by the mobile side web search engine. Data is extracted from the PC side behavior log and the mobile side behavior log respectively to obtain user behavior characterization data for the two search engine scenarios, and the two sets of characterization data are then merged. The merged behavior data is grouped by user, with the data corresponding to each user forming one group. Each group of data is then segmented, and co-occurrence relationship data is mined from the resulting data slices to obtain day-level co-occurrence relationship data.
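The merge-and-group step might look like the following sketch. The record layout (anonymized user id, timestamp in seconds, behavior type, value) is a hypothetical simplification of the extracted characterization data.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

Record = Tuple[str, int, str, str]  # (anonymized user id, timestamp in seconds, type, value)
Event = Tuple[int, str, str]        # (timestamp in seconds, type, value)


def group_by_user(pc_log: Iterable[Record], mobile_log: Iterable[Record]) -> Dict[str, List[Event]]:
    """Merge the two daily logs and return each user's events sorted by time."""
    grouped: Dict[str, List[Event]] = defaultdict(list)
    for user, ts, kind, value in list(pc_log) + list(mobile_log):
        grouped[user].append((ts, kind, value))
    for events in grouped.values():
        events.sort(key=lambda e: e[0])  # time order is needed before slicing
    return dict(grouped)


pc_log = [("u1", 20, "click", "a.example")]
mobile_log = [("u1", 10, "se", "q"), ("u2", 5, "se", "z")]
print(group_by_user(pc_log, mobile_log))
```

Each per-user list can then be segmented into data slices and mined for co-occurrence pairs.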
Fig. 5 schematically illustrates a method of acquiring multiple co-occurrence relationship data according to another embodiment of the present disclosure.
FIG. 5 shows that month-level co-occurrence relationship data can be mined on the basis of the day-level co-occurrence relationship data. The day-level co-occurrence relationship data of the PC side search engine and of the mobile side search engine within one month are obtained and merged, the co-occurrence counts are tallied, and the source of each count is marked. For the merged data, the PC side web address url and the mobile side web address url may be adapted to each other, so that the PC side and mobile side web addresses corresponding to the same web page are merged. For example, for a PC side web address url_A and a mobile side web address url_X, it is identified whether the two correspond to the same web page; if they do, the search term queries corresponding to the two web addresses can be merged and used to complement each other. The subsequent processing of month-level co-occurrence is similar to that of day-level co-occurrence: the occurrence counts of each co-occurrence relationship data are added to obtain the total co-occurrence count, and all search term queries corresponding to each web address url are then sorted by total co-occurrence count from high to low. The sorted search term queries for each web address url are truncated, and the top m search term queries are kept as candidate search terms. Here m is a positive integer that may be set according to actual needs, for example 2000.
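A rough sketch of the month-level aggregation is given below. The adaptation rule in `canonical_url` (stripping an "m." host prefix) is purely hypothetical, since the disclosure does not say how PC and mobile web addresses are matched, and the source marking of counts is omitted for brevity.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple

DailyCounts = Dict[Tuple[str, str], int]  # (url, query) -> co-occurrence count for one day


def canonical_url(url: str) -> str:
    """Hypothetical adaptation rule: treat an 'm.' host prefix as the mobile form of a PC URL."""
    return url[2:] if url.startswith("m.") else url


def monthly_top(daily_counts: Iterable[DailyCounts], m: int = 2000) -> Dict[str, List[Tuple[str, int]]]:
    """Sum day-level counts per (canonical url, query) and keep the top m queries per URL."""
    total: Counter = Counter()
    for day in daily_counts:
        for (url, query), count in day.items():
            total[(canonical_url(url), query)] += count
    ranked: Dict[str, List[Tuple[str, int]]] = {}
    for (url, query), count in total.items():
        ranked.setdefault(url, []).append((query, count))
    # Sort by total count from high to low; ties are broken alphabetically (an assumption).
    return {url: sorted(items, key=lambda qc: (-qc[1], qc[0]))[:m] for url, items in ranked.items()}


pc_day = {("ent.xxx.com", "Wu Mou"): 3}
mobile_day = {("m.ent.xxx.com", "Wu Mou"): 2, ("m.ent.xxx.com", "other"): 1}
print(monthly_top([pc_day, mobile_day], m=1))
```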
The method for recommending candidate search terms shown above is further described with reference to fig. 6. Those skilled in the art will appreciate that the following example embodiments are only for the understanding of the present disclosure, and the present disclosure is not limited thereto.
Fig. 6 schematically shows a recommendation method of a candidate search term according to an embodiment of the present disclosure.
As shown in FIG. 6, in one aspect, samples may be collected, where each sample may include a web address url and a search term query. Then, the sample can be subjected to characteristic engineering to obtain sample characteristics. The machine learning model may then be trained using the sample features, resulting in a Click Through Rate (CTR) model.
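As an illustration of the training step, the sketch below fits a logistic regression with plain stochastic gradient descent. The choice of model, the click label attached to each sample, and the hyperparameters are all assumptions; the disclosure only says that a machine learning model is trained on sample features.

```python
import math
from typing import List, Sequence, Tuple

Sample = Tuple[Sequence[float], int]  # (sample features, 1 if clicked else 0); the label is assumed


def predict(weights: Sequence[float], bias: float, features: Sequence[float]) -> float:
    """Predicted click probability under the logistic model."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train_ctr_model(samples: Sequence[Sample], epochs: int = 200, lr: float = 0.5) -> Tuple[List[float], float]:
    """Fit logistic regression weights with stochastic gradient descent on log loss."""
    weights = [0.0] * len(samples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for features, clicked in samples:
            error = predict(weights, bias, features) - clicked  # gradient of log loss w.r.t. z
            weights = [w - lr * error * x for w, x in zip(weights, features)]
            bias -= lr * error
    return weights, bias


samples = [([1.0, 0.9], 1), ([0.9, 1.0], 1), ([0.1, 0.0], 0), ([0.0, 0.2], 0)]
weights, bias = train_ctr_model(samples)
print(predict(weights, bias, [1.0, 0.9]) > predict(weights, bias, [0.1, 0.0]))
```

A production system would more likely use an established library and far richer features; the point here is only the shape of the training loop.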
On the other hand, a basic recall may be performed. According to an embodiment of the present disclosure, the basic recall may include, for example, mining co-occurrence relationship data of the web address url and the search term query. Feature engineering may then be performed on the co-occurrence relationship data. According to embodiments of the present disclosure, feature engineering may include, for example, mining the search term features, website features, and cross features corresponding to each co-occurrence relationship data. Then, the features may be merged. For example, the search term features, the website features, and the cross features may be combined to obtain candidate data.
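A minimal sketch of this feature merging step is given below. The concrete features (query length, an https flag, a token-overlap score between the query and the page title) are hypothetical stand-ins; the disclosure only states that search term features, website features, and cross features are determined and merged.

```python
def search_term_features(query):
    """Features derived from the candidate search term alone (illustrative)."""
    return {"q_chars": len(query), "q_tokens": len(query.lower().split())}


def url_features(url, page_title):
    """Features derived from the website alone (illustrative)."""
    return {
        "u_is_https": int(url.startswith("https://")),
        "u_title_tokens": len(page_title.lower().split()),
    }


def cross_features(query, page_title):
    """Features from the correlation between query and page (Jaccard overlap)."""
    q = set(query.lower().split())
    t = set(page_title.lower().split())
    union = q | t
    return {"x_jaccard": len(q & t) / len(union) if union else 0.0}


def merge_features(query, url, page_title):
    """Combine the three feature groups into one candidate record."""
    candidate = {"url": url, "query": query}
    candidate.update(search_term_features(query))
    candidate.update(url_features(url, page_title))
    candidate.update(cross_features(query, page_title))
    return candidate
```

Prefixing the keys by group (`q_`, `u_`, `x_`) avoids name collisions when the three dictionaries are merged.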
Next, the co-occurrence relationship data may be screened through algorithmic strategies. For example, the relevance between the website url and the search term query can be evaluated to obtain a relevance evaluation value, and the co-occurrence relationship data whose relevance evaluation value is smaller than a relevance threshold is then deleted from the plurality of co-occurrence relationship data. As another example, for each piece of co-occurrence relationship data, intent analysis may be performed on the search term query to obtain an intent analysis result; it is then determined whether the intent analysis result matches the web page content corresponding to the web address url in that co-occurrence relationship data, and if not, the co-occurrence relationship data is deleted. It can be understood that other algorithmic strategies may also be adopted to screen the co-occurrence relationship data, for example, deleting co-occurrence relationship data whose search term query contains too few words, whose association count is too low, whose search term query includes sensitive terms, or whose quality score is low, which is not specifically limited in this disclosure.
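The screening strategies listed above might be combined as in the following sketch. The record fields (`relevance`, `count`, `query`), the default thresholds, and the substring test for sensitive terms are assumptions for illustration; the intent analysis check is omitted because the disclosure does not describe how it is computed.

```python
def screen_cooccurrence(records, relevance_threshold=0.3, min_query_tokens=1,
                        min_count=2, sensitive_terms=frozenset()):
    """Drop co-occurrence records that fail any of the screening rules.

    records: list of dicts with keys "url", "query", "relevance", "count".
    """
    kept = []
    for record in records:
        query = record["query"]
        if record["relevance"] < relevance_threshold:
            continue  # relevance evaluation value below the threshold
        if len(query.split()) < min_query_tokens:
            continue  # search term has too few words
        if record["count"] < min_count:
            continue  # too few co-occurrences
        if any(term in query for term in sensitive_terms):
            continue  # search term contains a sensitive term
        kept.append(record)
    return kept
```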
The candidate data can then be evaluated using the click through rate model to obtain a click rate evaluation value for each co-occurrence relationship data. According to an embodiment of the present disclosure, for example, feature engineering can be performed on the samples to obtain sample features, and the click through rate model is then trained on the sample features. The sample collection and the training may be carried out using machine learning techniques.
Finally, the search term queries are sorted according to the click rate evaluation value, and recommendations are made according to the sorted order. In addition, if there are many search terms, the list may be truncated so that only the top k search terms are selected and output as recommended candidate search terms, where k is a positive integer that can be set according to actual needs. Subsequently, the recommended candidate search terms may be formatted for use by an online query service.
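The final ranking, truncation, and formatting steps can be illustrated with the short sketch below. The JSON output layout and the function name are assumptions; the disclosure does not specify the format consumed by the online query service.

```python
import json


def rank_and_format(url, scored_queries, k=3):
    """Sort queries by click rate evaluation value, keep the top k, emit JSON.

    scored_queries: list of (query, click_rate_evaluation_value) pairs.
    """
    ranked = sorted(scored_queries, key=lambda item: item[1], reverse=True)
    top_k = [query for query, _ in ranked[:k]]
    # Hypothetical output format for the online query service.
    return json.dumps({"url": url, "candidates": top_k}, ensure_ascii=False)
```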
The recommendation device of candidate search terms provided by the present disclosure will be described below with reference to fig. 7.
Fig. 7 schematically shows a block diagram of a recommendation apparatus of a candidate search term according to an embodiment of the present disclosure.
As shown in fig. 7, the apparatus 700 for recommending candidate search terms includes a first determining module 710, an obtaining module 720, a second determining module 730, and a recommending module 740.
The first determining module 710 is configured to determine, in response to detecting that the target website is clicked, at least one target co-occurrence relationship data corresponding to the target website from among the plurality of co-occurrence relationship data, where each co-occurrence relationship data from among the plurality of co-occurrence relationship data includes a website and a candidate search term that have a co-occurrence relationship with each other.
An obtaining module 720, configured to obtain a click rate evaluation value of at least one target co-occurrence relation data.
The second determining module 730 is configured to determine co-occurrence relation data to be recommended in the at least one target co-occurrence relation data according to the click rate evaluation value of the at least one target co-occurrence relation data.
And the recommending module 740 is configured to recommend the candidate search terms in the co-occurrence relation data to be recommended.
The present disclosure also provides an electronic device, a readable storage medium, and a computer program product according to embodiments of the present disclosure.
According to an embodiment of the present disclosure, the apparatus for recommending a candidate search term may further include a co-occurrence relationship data acquisition module, a feature determination module, and a click rate evaluation value determination module. The co-occurrence relationship data acquisition module is configured to acquire a plurality of co-occurrence relationship data. The feature determination module is configured to determine the search term features, website features, and cross features corresponding to each co-occurrence relationship data. The click rate evaluation value determination module is configured to determine the click rate evaluation value of each co-occurrence relationship data according to the search term features, website features, and cross features corresponding to that co-occurrence relationship data.
According to an embodiment of the present disclosure, the co-occurrence relationship data acquisition module may include an object behavior data acquisition sub-module, a segmentation sub-module, and a co-occurrence relationship data determination sub-module. The object behavior data acquisition sub-module is configured to acquire object behavior data of a search engine, where the object behavior data records at least one search behavior and at least one click behavior. The segmentation sub-module is configured to segment the object behavior data according to behavior occurrence time to obtain a plurality of data segments, where each data segment includes the behavior records of a preset duration. The co-occurrence relationship data determination sub-module is configured to determine the co-occurrence relationship data according to the search behavior and the click behavior recorded in each data segment.
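One plausible reading of these three sub-modules is sketched below: behavior records are bucketed into fixed-length time windows, and every search term and clicked url that fall into the same window are counted as co-occurring. The record layout `(timestamp, kind, value)`, the 1800-second window, and the assumption that the log belongs to a single object (user) are illustrative choices, not details from the disclosure.

```python
from collections import Counter


def mine_cooccurrence(behaviors, segment_seconds=1800):
    """Derive (url, query) co-occurrence counts from a behavior log.

    behaviors: list of (timestamp_seconds, kind, value) tuples, where kind is
    "search" (value is the query) or "click" (value is the clicked url).
    """
    # Segment the log by behavior occurrence time into fixed-length windows.
    segments = {}
    for ts, kind, value in sorted(behaviors):
        segments.setdefault(ts // segment_seconds, []).append((kind, value))

    counts = Counter()
    for records in segments.values():
        queries = {value for kind, value in records if kind == "search"}
        urls = {value for kind, value in records if kind == "click"}
        # Each query and url seen in the same segment co-occur once.
        for url in urls:
            for query in queries:
                counts[(url, query)] += 1
    return counts
```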
According to an embodiment of the present disclosure, the feature determination module may include a first feature determination sub-module, a second feature determination sub-module, and a third feature determination sub-module. The first feature determination sub-module is configured to determine, for each co-occurrence relationship data, the search term features according to the candidate search term in the co-occurrence relationship data. The second feature determination sub-module is configured to determine the website features according to the website in the co-occurrence relationship data. The third feature determination sub-module is configured to determine the cross features according to the correlation between the candidate search term and the website in the co-occurrence relationship data.
According to an embodiment of the present disclosure, the click rate evaluation value determination module may include a merging sub-module and an evaluation sub-module. The merging sub-module is configured to merge, for each co-occurrence relationship data, the search term features, website features, and cross features corresponding to the co-occurrence relationship data to obtain candidate data. The evaluation sub-module is configured to evaluate the candidate data using a click through rate model to obtain the click rate evaluation value of the co-occurrence relationship data.
According to an embodiment of the present disclosure, the apparatus for recommending a candidate search term may further include an evaluation module and a deletion module. The evaluation module is configured to evaluate the correlation between the candidate search term and the website in each of the plurality of co-occurrence relationship data to obtain a correlation evaluation value. The deletion module is configured to delete, from the plurality of co-occurrence relationship data, the co-occurrence relationship data whose correlation evaluation value is smaller than a correlation threshold.
According to an embodiment of the present disclosure, the second determining module may include a co-occurrence relation data to be recommended determining sub-module, configured to determine, as co-occurrence relation data to be recommended, target co-occurrence relation data in the at least one target co-occurrence relation data, where the click rate evaluation value is greater than the click rate threshold.
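To make the division of work among modules 710 to 740 concrete, the sketch below maps each module to one method of a small class. The class name, the in-memory data structures, and the default threshold are assumptions for illustration only; the precomputed `ctr_scores` dictionary stands in for whatever store holds the click rate evaluation values.

```python
class CandidateRecommender:
    """Illustrative counterpart of apparatus 700 (names are hypothetical)."""

    def __init__(self, cooccurrence, ctr_scores, ctr_threshold=0.1):
        self.cooccurrence = cooccurrence    # list of (url, query) pairs
        self.ctr_scores = ctr_scores        # dict: (url, query) -> float
        self.ctr_threshold = ctr_threshold

    def determine_targets(self, clicked_url):
        """First determining module 710: pairs matching the clicked website."""
        return [(u, q) for u, q in self.cooccurrence if u == clicked_url]

    def obtain_ctr(self, targets):
        """Obtaining module 720: look up click rate evaluation values."""
        return {pair: self.ctr_scores.get(pair, 0.0) for pair in targets}

    def select(self, scored):
        """Second determining module 730: keep pairs above the threshold."""
        return [pair for pair, s in scored.items() if s > self.ctr_threshold]

    def recommend(self, clicked_url):
        """Recommending module 740: return queries, highest score first."""
        scored = self.obtain_ctr(self.determine_targets(clicked_url))
        selected = self.select(scored)
        selected.sort(key=lambda pair: scored[pair], reverse=True)
        return [query for _, query in selected]
```

A call such as `CandidateRecommender(pairs, scores).recommend("url_a")` would be triggered when a click on the target website is detected.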
Fig. 8 schematically illustrates a block diagram of an example electronic device 800 that may be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital assistants, cellular phones, smart phones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be examples only, and are not intended to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 8, the device 800 includes a computing unit 801 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 802 or a computer program loaded from a storage unit 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data necessary for the operation of the device 800 can also be stored. The computing unit 801, the ROM 802, and the RAM 803 are connected to each other by a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
A number of components in the device 800 are connected to the I/O interface 805, including: an input unit 806, such as a keyboard, a mouse, or the like; an output unit 807 such as various types of displays, speakers, and the like; a storage unit 808, such as a magnetic disk, optical disk, or the like; and a communication unit 809 such as a network card, modem, wireless communication transceiver, etc. The communication unit 809 allows the device 800 to exchange information/data with other devices via a computer network such as the internet and/or various telecommunication networks.
The computing unit 801 may be any of a variety of general and/or special purpose processing components with processing and computing capabilities. Some examples of the computing unit 801 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various dedicated Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, and so forth. The computing unit 801 executes the respective methods and processes described above, such as the method for recommending candidate search terms. For example, in some embodiments, the method for recommending candidate search terms may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as the storage unit 808. In some embodiments, part or all of a computer program may be loaded onto and/or installed onto the device 800 via the ROM 802 and/or the communication unit 809. When the computer program is loaded into the RAM 803 and executed by the computing unit 801, one or more steps of the method for recommending candidate search terms described above may be performed. Alternatively, in other embodiments, the computing unit 801 may be configured to perform the method for recommending candidate search terms by any other suitable means (e.g., by means of firmware).
Various implementations of the systems and techniques described above may be realized in digital electronic circuitry, integrated circuitry, Field Programmable Gate Arrays (FPGAs), Application Specific Integrated Circuits (ASICs), Application Specific Standard Products (ASSPs), Systems on a Chip (SOCs), Complex Programmable Logic Devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various implementations may include being implemented in one or more computer programs that are executable and/or interpretable on a programmable system including at least one programmable processor, which may be special or general purpose, receiving data and instructions from, and transmitting data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for implementing the methods of the present disclosure may be written in any combination of one or more programming languages. The program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus, such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be performed. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package partly on the machine and partly on a remote machine, or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. A machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and a pointing device (e.g., a mouse or a trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user can be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user can be received in any form, including acoustic, speech, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include Local Area Networks (LANs), Wide Area Networks (WANs), and the Internet.
The computer system may include clients and servers. A client and server are generally remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other.
The Server may be a cloud Server, which is also called a cloud computing Server or a cloud host, and is a host product in a cloud computing service system, so as to solve the defects of high management difficulty and weak service extensibility in a traditional physical host and a VPS service ("Virtual Private Server", or "VPS" for short). The server may also be a server of a distributed system, or a server incorporating a blockchain.
It should be understood that various forms of the flows shown above may be used, with steps reordered, added, or deleted. For example, the steps described in the present disclosure may be executed in parallel, sequentially, or in different orders, as long as the desired results of the technical solutions disclosed in the present disclosure can be achieved, and the present disclosure is not limited herein.
The above detailed description should not be construed as limiting the scope of the disclosure. It should be understood by those skilled in the art that various modifications, combinations, sub-combinations and substitutions may be made in accordance with design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present disclosure should be included in the scope of protection of the present disclosure.

Claims (17)

1. A recommendation method of candidate search terms comprises the following steps:
determining at least one target co-occurrence relationship data corresponding to a target website in a plurality of co-occurrence relationship data in response to detecting that the target website is clicked, wherein each co-occurrence relationship data in the plurality of co-occurrence relationship data comprises the website and a candidate search word which have a co-occurrence relationship with each other;
acquiring a click rate evaluation value of the at least one target co-occurrence relation data;
determining co-occurrence relation data to be recommended in the at least one target co-occurrence relation data according to the click rate evaluation value of the at least one target co-occurrence relation data; and
recommending the candidate search words in the co-occurrence relation data to be recommended.
2. The method of claim 1, further comprising:
acquiring a plurality of co-occurrence relation data;
determining search term characteristics, website characteristics and cross characteristics corresponding to each co-occurrence relation data; and
determining the click rate evaluation value of each co-occurrence relation data according to the search term characteristics, the website characteristics and the cross characteristics corresponding to each co-occurrence relation data.
3. The method of claim 2, wherein the obtaining a plurality of co-occurrence relationship data comprises:
acquiring object behavior data of a search engine, wherein the object behavior data records at least one search behavior and at least one click behavior;
segmenting the object behavior data according to behavior occurrence time to obtain a plurality of data segments, wherein each data segment comprises a behavior record with preset duration; and
determining co-occurrence relation data according to the searching behavior and the clicking behavior recorded in each data segment.
4. The method of claim 2, wherein said determining search term features, web site features, and cross-features corresponding to said each co-occurrence relationship data comprises:
for each of the co-occurrence relationship data,
determining the characteristics of the search terms according to the candidate search terms in the co-occurrence relation data;
determining website characteristics according to the websites in the co-occurrence relation data; and
determining cross features according to the correlation between the candidate search terms and the websites in the co-occurrence relation data.
5. The method according to claim 2, wherein the determining the click rate evaluation value of each co-occurrence relation data according to the search term feature, the website feature and the cross feature corresponding to each co-occurrence relation data comprises:
for each of the co-occurrence relationship data,
merging the search term characteristics, the website characteristics and the cross characteristics corresponding to the co-occurrence relation data to obtain candidate data; and
evaluating the candidate data by using a click through rate model to obtain a click rate evaluation value of the co-occurrence relation data.
6. The method of claim 2, further comprising:
evaluating the correlation between the candidate search terms and the websites in the plurality of co-occurrence relation data to obtain a correlation evaluation value; and
deleting the co-occurrence relation data of which the correlation evaluation value is smaller than a correlation threshold value from the plurality of co-occurrence relation data.
7. The method according to any one of claims 1 to 6, wherein the determining co-occurrence relation data to be recommended in the at least one target co-occurrence relation data according to click rate evaluation values of the at least one target co-occurrence relation data comprises:
determining the target co-occurrence relation data of which the click rate evaluation value is greater than the click rate threshold value in the at least one target co-occurrence relation data as the co-occurrence relation data to be recommended.
8. An apparatus for recommending candidate search terms, comprising:
a first determining module, configured to determine, in response to detecting that a target website is clicked, at least one target co-occurrence relation data corresponding to the target website in a plurality of co-occurrence relation data, wherein each co-occurrence relation data in the plurality of co-occurrence relation data comprises a website and a candidate search term which have a co-occurrence relationship with each other;
an obtaining module, configured to obtain the click rate evaluation value of the at least one target co-occurrence relation data;
a second determining module, configured to determine co-occurrence relation data to be recommended in the at least one target co-occurrence relation data according to the click rate evaluation value of the at least one target co-occurrence relation data; and
a recommending module, configured to recommend the candidate search terms in the co-occurrence relation data to be recommended.
9. The apparatus of claim 8, further comprising:
the co-occurrence relation data acquisition module is used for acquiring a plurality of co-occurrence relation data;
the characteristic determining module is used for determining search term characteristics, website characteristics and cross characteristics corresponding to each co-occurrence relation data; and
the click rate evaluation value determining module is used for determining the click rate evaluation value of each co-occurrence relation data according to the search word characteristics, the website characteristics and the cross characteristics corresponding to each co-occurrence relation data.
10. The apparatus of claim 9, wherein the co-occurrence data acquisition module comprises:
the object behavior data acquisition sub-module is used for acquiring object behavior data of a search engine, wherein the object behavior data records at least one search behavior and at least one click behavior;
the segmentation submodule is used for segmenting the object behavior data according to behavior occurrence time to obtain a plurality of data segments, wherein each data segment comprises a behavior record with preset duration; and
the co-occurrence relation data determining submodule is used for determining the co-occurrence relation data according to the searching behavior and the clicking behavior recorded in each data segment.
11. The apparatus of claim 9, wherein the feature determination module comprises:
the first characteristic determining submodule is used for determining the characteristics of search words according to candidate search words in the co-occurrence relation data aiming at each co-occurrence relation data;
the second characteristic determining sub-module is used for determining website characteristics according to the websites in the co-occurrence relation data; and
the third characteristic determining submodule is used for determining the cross characteristic according to the correlation between the candidate search word and the website in the co-occurrence relation data.
12. The apparatus of claim 9, wherein the click rate assessment value determining module comprises:
the merging submodule is used for merging the search term characteristics, the website characteristics and the cross characteristics corresponding to the co-occurrence relation data aiming at each co-occurrence relation data to obtain candidate data; and
the evaluation submodule is used for evaluating the candidate data by utilizing a click through rate model to obtain a click rate evaluation value of the co-occurrence relation data.
13. The apparatus of claim 9, further comprising:
the evaluation module is used for evaluating the correlation between the candidate search terms in the plurality of co-occurrence relation data and the websites to obtain a correlation evaluation value; and
the deleting module is used for deleting the co-occurrence relation data of which the correlation evaluation value is smaller than a correlation threshold value in the plurality of co-occurrence relation data.
14. The apparatus of any of claims 8 to 13, wherein the second determining module comprises:
a co-occurrence relation data to be recommended determining submodule, used for determining the target co-occurrence relation data of which the click rate evaluation value is greater than the click rate threshold value in the at least one target co-occurrence relation data as the co-occurrence relation data to be recommended.
15. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-7.
16. A non-transitory computer readable storage medium having stored thereon computer instructions for causing the computer to perform the method of any one of claims 1-7.
17. A computer program product comprising computer program/instructions, characterized in that the computer program/instructions, when executed by a processor, implement the steps of the method according to any of claims 1-7.
CN202211133958.9A 2022-09-16 2022-09-16 Method, device and equipment for recommending candidate search terms and storage medium Pending CN115455274A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211133958.9A CN115455274A (en) 2022-09-16 2022-09-16 Method, device and equipment for recommending candidate search terms and storage medium

Publications (1)

Publication Number Publication Date
CN115455274A true CN115455274A (en) 2022-12-09

Family

ID=84303926

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211133958.9A Pending CN115455274A (en) 2022-09-16 2022-09-16 Method, device and equipment for recommending candidate search terms and storage medium

Country Status (1)

Country Link
CN (1) CN115455274A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination