CN115129922A - Search term generation method, model training method, medium, device and equipment

Publication number
CN115129922A
Authority
CN
China
Prior art keywords
historical
resources
user
resource
interest
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210799138.7A
Other languages
Chinese (zh)
Inventor
刘卉芸
解忠乾
罗川江
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Netease Cloud Music Technology Co Ltd
Original Assignee
Hangzhou Netease Cloud Music Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hangzhou Netease Cloud Music Technology Co Ltd
Priority to CN202210799138.7A
Publication of CN115129922A

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/68 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/686 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using information manually generated, e.g. tags, keywords, comments, title or artist information, time, location or usage information, user ratings
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/60 Information retrieval; Database structures therefor; File system structures therefor of audio data
    • G06F16/63 Querying
    • G06F16/635 Filtering based on additional data, e.g. user or group profiles

Abstract

The embodiment of the disclosure provides a search term generation method, a model training method, a medium, a device and equipment. The method comprises the following steps: recalling interest resources according to the historical behaviors of the user; the historical behavior is for a plurality of historical resources; confirming candidate search terms according to the interest resources; inputting resource characteristics of historical resources corresponding to historical behaviors of the user and the characteristics of the candidate search terms into a prediction model, and outputting a prediction value corresponding to each candidate search term; the predicted value represents the probability value of the candidate search word predicted by the prediction model being clicked by the user; and screening the candidate search words and recommending and displaying the candidate search words according to the predicted values corresponding to the candidate search words. According to the method, the probability value of the candidate search words clicked by the user is predicted through the prediction model, the candidate search words with high probability values are screened and recommended to the user, so that the search words according with the user interests are recommended, and the use experience of the user is improved.

Description

Search term generation method, model training method, medium, device and equipment
Technical Field
Embodiments of the present disclosure relate to the field of computer vision, and more particularly, to a search term generation method, a model training method, a medium, an apparatus, and a device.
Background
This section is intended to provide a background or context to the embodiments of the disclosure recited in the claims. The description herein is not admitted to be prior art by inclusion in this section.
Currently, in order to facilitate a user to search for a required music resource, a search box is set in a music website or music software, and the user can input a keyword in the search box to search for the music resource related to the keyword.
To further improve the user experience, search terms displayed as light-colored placeholder text in the search box recommend music resources that may interest the user, and the user can directly select the search option to search for the music resources related to those search terms. Therefore, improving how well the search terms match the user's interests is key to raising the user's click rate on the search terms, and is of great significance for improving the user experience.
Disclosure of Invention
The present disclosure provides a search term generation method, a model training method, a medium, an apparatus, and a device for recommending a search term that meets user interest.
In a first aspect of the disclosed embodiments, a method for generating search terms is provided, including: recalling interest resources according to the historical behaviors of the user; the historical behavior is for a plurality of historical resources; confirming candidate search terms according to the interest resources; inputting resource characteristics of historical resources corresponding to historical behaviors of the user and the characteristics of the candidate search terms into a prediction model, and outputting a prediction value corresponding to each candidate search term; the predicted value represents the probability value of the candidate search word predicted by the prediction model being clicked by the user; and screening the candidate search words and recommending and displaying the candidate search words according to the predicted values corresponding to the candidate search words.
In an embodiment of the present disclosure, the inputting resource features of historical resources corresponding to historical behaviors of a user and the features of the candidate search terms into a prediction model, and outputting a prediction value corresponding to each candidate search term includes: classifying historical resources corresponding to the historical behaviors of the user according to resource types to obtain basic characteristics under each type, and respectively splicing the basic characteristics under each type of historical resources to obtain splicing characteristics of each type of historical resources; inputting the splicing characteristics and outputting the sequence characteristics of each type of historical resources based on an attention mechanism; splicing the sequence features to obtain the historical behavior features of the user; respectively splicing the user historical behavior features and the candidate search word features to obtain a plurality of first combined features; and inputting each first merging feature into a prediction model to obtain a prediction value of each candidate search term.
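The feature construction described above can be illustrated with a minimal sketch. It assumes softmax dot-product attention pooling with one query vector per resource type; the disclosure does not specify the form of the attention mechanism, so the query vectors, the function names and the toy feature values below are all hypothetical.

```python
import math


def splice(parts):
    """Concatenate several feature vectors into one flat vector."""
    return [value for part in parts for value in part]


def attention_pool(sequence, query):
    """Softmax dot-product attention pooling over a sequence of vectors."""
    logits = [sum(a * b for a, b in zip(item, query)) for item in sequence]
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [
        sum(w * item[d] for w, item in zip(weights, sequence))
        for d in range(len(query))
    ]


def first_merged_features(history_by_type, queries, candidate_features):
    """Return one merged feature vector per candidate search term."""
    sequence_features = []
    for resource_type in sorted(history_by_type):
        # Splice the basic features of every historical resource of this type.
        spliced = [splice(basic) for basic in history_by_type[resource_type]]
        # Hypothetical per-type query vector; a real model would learn it.
        sequence_features.append(attention_pool(spliced, queries[resource_type]))
    # Splice the per-type sequence features into the user historical behavior features.
    user_history = splice(sequence_features)
    return [user_history + list(candidate) for candidate in candidate_features]


# Toy data: each historical resource is a list of basic feature vectors.
history_by_type = {
    "song": [[[1.0], [0.0]], [[0.0], [1.0]]],
    "playlist": [[[0.5], [0.5]]],
}
queries = {"song": [1.0, 0.0], "playlist": [0.0, 1.0]}
candidate_features = [[1.0, 2.0], [3.0, 4.0]]
merged = first_merged_features(history_by_type, queries, candidate_features)
```

Each entry of `merged` would then be passed to the prediction model to obtain the predicted value of the corresponding candidate search term.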
In another embodiment of the present disclosure, the method further comprises: splicing the first combination characteristics, the user attribute characteristics and the current environment characteristics to obtain a plurality of second combination characteristics; and inputting each second combined feature into the prediction model to obtain the predicted value of each candidate search term.
In yet another embodiment of the present disclosure, the confirming the candidate search term according to the interest resource includes: extracting title information of the interest resources; taking the title information as candidate search words; or, taking the historical search terms of the user corresponding to the title information as candidate search terms; or inputting the title information into a strategy template to match the title information with the resources under various types, and obtaining candidate search terms corresponding to the interest resources output by the strategy template.
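The three alternatives for confirming candidate search terms can be sketched as follows. The disclosure presents them as alternatives; this sketch applies all three and merges the results purely for illustration. The substring matching, the template string and every name below are assumptions that do not come from the disclosure.

```python
def candidate_terms(title, search_history, catalog_by_type, template="{title} {type}"):
    """Generate candidate search terms for the title of one interest resource."""
    # Alternative 1: the title information itself.
    candidates = [title]
    # Alternative 2: historical search terms of the user that correspond to the
    # title. Case-insensitive substring matching is an assumption.
    candidates += [q for q in search_history if title.lower() in q.lower()]
    # Alternative 3: a hypothetical strategy template that pairs the title with
    # every resource type under which the title matches a resource.
    for resource_type, titles in catalog_by_type.items():
        if any(title.lower() in t.lower() for t in titles):
            candidates.append(template.format(title=title, type=resource_type))
    # Remove duplicates while preserving order.
    return list(dict.fromkeys(candidates))


terms = candidate_terms(
    "Blue Sky",
    search_history=["blue sky live", "rock"],
    catalog_by_type={"playlist": ["Blue Sky Mix"], "artist": ["Someone"]},
)
```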
In yet another embodiment of the present disclosure, the historical behavior includes historical operation records and real-time operation records; the recalling interest resources according to the historical behaviors of the user comprises the following steps: calculating a first recall score of the historical resources corresponding to the historical operation records according to the historical operation records of the user; determining a first interest resource according to the first recall score; calculating a second recall score of the historical resource corresponding to the real-time operation record according to the real-time operation record of the user; determining a second interest resource according to the second recall score; classifying the first interest resource and the second interest resource according to resource types; constructing a similarity matrix according to the pre-configured similarity between the resources of each type; calculating the product of the recall scores of the resources under each type and the similarity matrix, sorting the product results of the resources under each type, and taking a preset number of top-ranked resources as a third interest resource; and taking the first interest resource, the second interest resource and the third interest resource as the interest resources.
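The determination of the third interest resource can be sketched for a single resource type as a sparse product of the recall scores and the similarity matrix. Skipping resources that were already recalled is an assumption, as are the function name, the data layout and the example values.

```python
def third_interest_resources(recall_scores, similarity, top_n=2):
    """Expand the recalled resources of one type through a similarity matrix.

    recall_scores: {resource_id: recall score} for one resource type.
    similarity: {resource_id: {similar_id: pre-configured similarity}}.
    Returns the top_n similar resources ranked by the product result.
    """
    expanded = {}
    for resource_id, score in recall_scores.items():
        for similar_id, sim in similarity.get(resource_id, {}).items():
            if similar_id in recall_scores:
                continue  # assumption: skip resources that are already recalled
            # One term of the score-vector by similarity-matrix product.
            expanded[similar_id] = expanded.get(similar_id, 0.0) + score * sim
    ranked = sorted(expanded.items(), key=lambda item: item[1], reverse=True)
    return [resource_id for resource_id, _ in ranked[:top_n]]


song_scores = {"song_a": 2.0, "song_b": 1.0}
song_similarity = {
    "song_a": {"song_c": 0.9, "song_d": 0.2},
    "song_b": {"song_d": 0.5, "song_e": 0.4},
}
third = third_interest_resources(song_scores, song_similarity, top_n=2)
```

In this toy example `song_c` accumulates 1.8 and `song_d` accumulates 0.9, so they are taken as the third interest resources ahead of `song_e` at 0.4.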
In another embodiment of the present disclosure, the calculating a first recall score of a historical resource corresponding to a historical operation record according to the historical operation record of the user includes: and for the historical operation records, according to the preset weight of the resource type, the preset weight of the operation type and the time attenuation coefficient, carrying out weighted calculation to obtain a first recall score of each historical resource under the historical operation records.
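A minimal sketch of this weighted calculation is given below. The disclosure names only the three factors, so the multiplicative combination, the exponential form of the time decay and all weight values are assumptions made for illustration.

```python
import math
from collections import defaultdict

# Hypothetical preset weights; the disclosure does not give concrete values.
RESOURCE_TYPE_WEIGHT = {"song": 1.0, "playlist": 0.8, "artist": 0.6}
OPERATION_TYPE_WEIGHT = {"play": 1.0, "favorite": 2.0, "share": 1.5}


def first_recall_scores(records, decay_rate=0.1):
    """Accumulate a time-decayed weighted score per historical resource.

    Each record is (resource_id, resource_type, operation_type, age_days).
    The exponential decay exp(-decay_rate * age_days) is an assumed form.
    """
    scores = defaultdict(float)
    for resource_id, resource_type, operation_type, age_days in records:
        decay = math.exp(-decay_rate * age_days)
        scores[resource_id] += (
            RESOURCE_TYPE_WEIGHT[resource_type]
            * OPERATION_TYPE_WEIGHT[operation_type]
            * decay
        )
    return dict(scores)


records = [
    ("song_a", "song", "play", 0),
    ("song_a", "song", "favorite", 10),
    ("list_b", "playlist", "play", 0),
]
scores = first_recall_scores(records)
```

Under the same assumptions, a second recall score for real-time operation records would follow by dropping the decay factor, for example by calling the function with `decay_rate=0`.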
In another embodiment of the present disclosure, the calculating, according to the real-time operation record of the user, a second recall score of the historical resource corresponding to the real-time operation record includes: and for the real-time operation record, according to the weight of the preset resource type and the weight of the operation type, performing weighted calculation to obtain a second recall score of each historical resource under the real-time operation record.
In an embodiment of the present disclosure, the recommending and displaying after the candidate search term is filtered includes: sorting and screening the candidate search words according to the predicted values; and displaying the screened candidate search words in a carousel mode according to the preset stay time.
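The sorting, screening and carousel display can be sketched as follows. The cut-off rule (top-k with an optional probability threshold) and the schedule representation are assumptions; the disclosure states only that the candidates are sorted and screened by predicted value and displayed in a carousel with a preset stay time.

```python
from itertools import cycle, islice


def screen_candidates(predicted, top_k=3, min_probability=0.0):
    """Sort candidate search terms by predicted value and keep the best ones."""
    ranked = sorted(predicted.items(), key=lambda item: item[1], reverse=True)
    return [term for term, p in ranked[:top_k] if p >= min_probability]


def carousel_schedule(terms, stay_seconds, slots):
    """Return (start_time, term) pairs for the first `slots` carousel slots."""
    rotation = islice(cycle(terms), slots)
    return [(index * stay_seconds, term) for index, term in enumerate(rotation)]


predicted = {"a": 0.2, "b": 0.9, "c": 0.5, "d": 0.1}
shown = screen_candidates(predicted, top_k=2)
schedule = carousel_schedule(shown, stay_seconds=3, slots=3)
```

A client would display each term in the search box at its start time and wrap around to the first term after the last one.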
In a second aspect of the disclosed embodiments, there is provided a predictive model training method, comprising: recalling sample interest resources according to the historical behaviors of the user; the historical behavior is for a plurality of historical resources; confirming a sample search word according to the sample interest resource; acquiring sample search word characteristics and an operation label of a sample search word; the operation label comprises that a sample search word is clicked or not clicked; and taking the resource characteristics of the historical resources corresponding to the historical behaviors of the user and the characteristics of the sample search words as input, taking the operation labels of the sample search words as labels, and training to obtain a prediction model.
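The training step can be illustrated with a minimal sketch that fits a logistic regression by batch gradient descent on clicked / not clicked labels. The disclosure does not fix the model family at this point, so logistic regression is a stand-in chosen for brevity, and the function names, hyperparameters and toy data are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_click_model(features, labels, epochs=200, learning_rate=0.5):
    """Fit a logistic regression by batch gradient descent on the log loss."""
    dim = len(features[0])
    weights, bias = [0.0] * dim, 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * dim, 0.0
        for x, y in zip(features, labels):
            # Prediction error for one (merged feature, operation label) pair.
            error = sigmoid(sum(w * v for w, v in zip(weights, x)) + bias) - y
            for d in range(dim):
                grad_w[d] += error * x[d]
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def predict_click_probability(model, x):
    """Predicted probability that a search term with features x is clicked."""
    weights, bias = model
    return sigmoid(sum(w * v for w, v in zip(weights, x)) + bias)


# Toy merged features with operation labels: 1 = clicked, 0 = not clicked.
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [1, 1, 0, 0]
model = train_click_model(features, labels)
p_clicked = predict_click_probability(model, [1.0, 0.0])
p_not_clicked = predict_click_probability(model, [0.0, 1.0])
```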
In an embodiment of the present disclosure, the training, with the resource features of the historical resources corresponding to the historical behaviors of the user and the features of the sample search terms as inputs and the operation labels of the sample search terms as labels, to obtain the prediction model includes: classifying historical resources corresponding to the historical behaviors of the user according to resource types to obtain basic characteristics under each type, and respectively splicing the basic characteristics under each type of historical resources to obtain splicing characteristics of each type of historical resources; inputting the splicing characteristics based on an attention mechanism, and outputting sequence characteristics of each type of historical resources; splicing the sequence features to obtain the historical behavior features of the user; splicing the user historical behavior characteristics and the sample search word characteristics to obtain a plurality of third combined characteristics; and taking each third combination characteristic as input, taking the operation label of the sample search word corresponding to the third combination characteristic as a label, and training to obtain the prediction model.
In another embodiment of the present disclosure, the method further comprises: splicing the third combination characteristics, the user attribute characteristics and the historical environment characteristics to obtain fourth combination characteristics; and taking each fourth merging feature as input, taking the operation label of the sample search word corresponding to the fourth merging feature as a label, and training to obtain a prediction model.
In yet another embodiment of the present disclosure, identifying sample search terms according to the sample interest resource includes: extracting title information of the sample interest resources; taking the title information of the sample interest resources as sample search terms; or taking a user history search word corresponding to the title information of the sample interest resource as a sample search word; or inputting the title information of the sample interest resources into a strategy template so as to match the title information of the sample interest resources with the resources under various types and obtain sample search terms corresponding to the sample interest resources output by the strategy template.
In yet another embodiment of the present disclosure, the historical behavior includes historical operation records and real-time operation records; the recalling sample interest resources according to the historical behaviors of the user comprises the following steps: calculating a third recall score of the historical resources corresponding to the historical operation records according to the historical operation records of the user; determining a first sample interest resource according to the third recall score; calculating a fourth recall score of the historical resources corresponding to the real-time operation record according to the real-time operation record of the user; determining a second sample interest resource according to the fourth recall score; classifying the first sample interest resource and the second sample interest resource according to resource types; constructing a similarity matrix according to the pre-configured similarity between the resources of each type; calculating the product of the recall scores of the resources under each type and the similarity matrix, sorting the product results of the resources under each type, and taking a preset number of top-ranked resources as a third sample interest resource; and taking the first sample interest resource, the second sample interest resource and the third sample interest resource as sample interest resources.
In another embodiment of the present disclosure, calculating a third recall score of a historical resource corresponding to a historical operation record according to the historical operation record of the user includes: and for the historical operation records, according to the preset weight of the resource type, the preset weight of the operation type and the time attenuation coefficient, carrying out weighted calculation to obtain a third recall score of each historical resource under the historical operation records.
In another embodiment of the present disclosure, calculating a fourth recall score of a historical resource corresponding to the real-time operation record according to the real-time operation record of the user includes: for the real-time operation record, performing weighted calculation according to the preset weight of the resource type and the preset weight of the operation type to obtain a fourth recall score of each historical resource under the real-time operation record.
In a third aspect of embodiments of the present disclosure, there is provided a computer-readable storage medium comprising: the computer-readable storage medium has stored therein computer-executable instructions for implementing the search term generation method according to any one of the first aspect when executed by a processor.
In a fourth aspect of the disclosed embodiments, there is provided a search term generation apparatus comprising: the recall module is used for recalling interest resources according to the historical behaviors of the user; the historical behavior is for a plurality of historical resources; the generating module is used for confirming the candidate search terms according to the interest resources; the prediction module is used for inputting the resource characteristics of the historical resources corresponding to the historical behaviors of the user and the characteristics of the candidate search words into a prediction model and outputting a prediction value corresponding to each candidate search word; the predicted value represents the probability value of the candidate search word predicted by the prediction model being clicked by the user; and the display module is used for screening the candidate search words and recommending and displaying the candidate search words according to the predicted values corresponding to the candidate search words.
In an embodiment of the present disclosure, the prediction module is specifically configured to classify historical resources corresponding to historical behaviors of a user according to resource types to obtain basic features under each type, and splice the basic features under each type of historical resources to obtain splicing features of each type of historical resources; the prediction module is specifically used for inputting the splicing characteristics and outputting the sequence characteristics of each type of historical resources based on an attention mechanism; the prediction module is specifically used for splicing the sequence features to obtain the historical behavior features of the user; the prediction module is specifically further configured to splice the user historical behavior features and the candidate search term features respectively to obtain a plurality of first combined features; and inputting each first merging characteristic into a prediction model to obtain a prediction value of each candidate search term.
In another embodiment of the present disclosure, the prediction module is further configured to splice the first merged features, the user attribute features, and the current environment features to obtain a plurality of second merged features; and inputting each second combined feature into the prediction model to obtain the prediction value of each candidate search term.
In another embodiment of the present disclosure, the generating module is specifically configured to extract title information of the interest resource; the generating module is specifically further configured to use the title information as a candidate search term; or, the generating module is specifically further configured to use the user history search term corresponding to the title information as a candidate search term; or, the generating module is specifically further configured to input the title information into a strategy template, so as to match the title information with resources of various types, and obtain a candidate search term corresponding to the interest resource output by the strategy template.
In yet another embodiment of the present disclosure, the historical behavior includes historical operation records and real-time operation records; the recall module is specifically used for calculating a first recall score of the historical resources corresponding to the historical operation records according to the historical operation records of the user; determining a first interest resource according to the first recall score; the recall module is specifically further used for calculating a second recall score of the historical resource corresponding to the real-time operation record according to the real-time operation record of the user; determining a second interest resource according to the second recall score; the recall module is specifically further configured to classify the first interest resource and the second interest resource according to resource types; constructing a similarity matrix according to the pre-configured similarity between the resources of each type; calculating the product of the recall scores of the resources under each type and the similarity matrix, sorting the product results of the resources under each type, and taking a preset number of top-ranked resources as a third interest resource; the recall module is specifically further configured to use the first interest resource, the second interest resource, and the third interest resource as interest resources.
In another embodiment of the present disclosure, the recall module is specifically configured to, for the historical operation records, perform weighted calculation according to preset weights of resource types, weights of operation types, and a time decay coefficient to obtain a first recall score of each historical resource under the historical operation records.
In another embodiment of the present disclosure, the recall module is specifically configured to, for the real-time operation record, perform weighted calculation according to a preset weight of the resource type and a preset weight of the operation type to obtain a second recall score of each historical resource under the real-time operation record.
In yet another embodiment of the present disclosure, the presentation module is specifically configured to perform rank screening on the candidate search terms according to the predicted values; the display module is specifically further configured to display the screened candidate search terms in a carousel manner according to a preset retention time.
In a fifth aspect of the disclosed embodiments, there is provided a predictive model training apparatus comprising: the computing module is used for recalling the sample interest resources according to the historical behaviors of the user; the historical behavior is for a plurality of historical resources; the acquisition module is used for confirming a sample search word according to the sample interest resource; acquiring sample search word characteristics and an operation label of a sample search word; the operation label comprises that a sample search word is clicked or not clicked; and the training module is used for taking the resource characteristics of the historical resources corresponding to the historical behaviors of the user and the characteristics of the sample search terms as input, taking the operation labels of the sample search terms as labels, and training to obtain the prediction model.
In an embodiment of the disclosure, the training module is specifically configured to classify historical resources corresponding to historical behaviors of a user according to resource types to obtain basic features under each type, and splice the basic features under each type of historical resources to obtain spliced features of each type of historical resources; the training module is specifically used for inputting the splicing characteristics and outputting the sequence characteristics of each type of historical resources based on an attention mechanism; the training module is specifically used for splicing the sequence features to obtain the historical behavior features of the user; the training module is specifically further configured to splice the user historical behavior features and the sample search term features to obtain a plurality of third combined features; and taking each third combination characteristic as input, taking the operation label of the sample search word corresponding to the third combination characteristic as a label, and training to obtain the prediction model.
In another embodiment of the present disclosure, the training module is further configured to splice the third merged features, the user attribute features, and the historical environment features to obtain fourth merged features; and taking the fourth combined features as input, taking the operation labels of the sample search terms corresponding to the fourth combined features as labels, and training to obtain the prediction model.
In another embodiment of the present disclosure, the obtaining module is specifically configured to extract title information of the sample interest resource; the obtaining module is specifically further configured to use the title information of the sample interest resource as a sample search term; or, the obtaining module is specifically further configured to use a user history search term corresponding to the title information of the sample interest resource as a sample search term; or, the obtaining module is specifically further configured to input the title information of the sample interest resource into a strategy template, so as to match the title information of the sample interest resource with resources of various types, and obtain a sample search term corresponding to the sample interest resource output by the strategy template.
In yet another embodiment of the present disclosure, the historical behavior includes historical operation records and real-time operation records; the calculation module is specifically used for calculating a third recall score of the historical resources corresponding to the historical operation record according to the historical operation record of the user; determining a first sample interest resource according to the third recall score; the calculation module is specifically further configured to calculate a fourth recall score of the historical resource corresponding to the real-time operation record according to the real-time operation record of the user; determining a second sample interest resource according to the fourth recall score; the calculation module is specifically configured to classify the first sample interest resource and the second sample interest resource according to resource types; constructing a similarity matrix according to the pre-configured similarity between the resources of each type; calculating the product of the recall scores of the resources under each type and the similarity matrix, sorting the product results of the resources under each type, and taking a preset number of top-ranked resources as a third sample interest resource; the calculation module is specifically further configured to use the first sample interest resource, the second sample interest resource, and the third sample interest resource as sample interest resources.
In another embodiment of the present disclosure, the calculating module is specifically configured to, for the historical operation records, perform weighted calculation according to preset weights of resource types and operation types and a time decay coefficient to obtain a third recall score of each historical resource under the historical operation records.
In another embodiment of the present disclosure, the calculating module is specifically configured to, for the real-time operation record, perform weighted calculation according to a preset weight of the resource type and a preset weight of the operation type to obtain a fourth recall score of each historical resource under the real-time operation record.
In a sixth aspect of embodiments of the present disclosure, there is provided a computing device comprising: at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor to cause the computing device to perform the search term generation method of any one of the first aspects of the embodiments of the present disclosure.
According to the method and the device for recommending the search words, the probability value that the candidate search words can be clicked by the user is predicted, the candidate search words with high probability values are screened and recommended to the user, so that the search words according with the user interests are recommended, and the use experience of the user is improved.
Drawings
The above and other objects, features and advantages of exemplary embodiments of the present disclosure will become readily apparent from the following detailed description read in conjunction with the accompanying drawings. Several embodiments of the present disclosure are illustrated by way of example, and not by way of limitation, in the figures of the accompanying drawings and in which:
FIG. 1 schematically shows a diagram of an application scenario according to an embodiment of the present disclosure;
FIG. 2 schematically illustrates a flowchart of a search term generation method provided by an embodiment of the present disclosure;
FIG. 3 schematically illustrates an example diagram for determining a third interest resource provided by an embodiment of the disclosure;
FIG. 4 schematically illustrates an example diagram of confirming candidate search terms provided by an embodiment of the disclosure;
FIG. 5 schematically illustrates an example diagram of predicting candidate search terms provided by an embodiment of the disclosure;
FIG. 6 schematically illustrates a flowchart of presenting candidate search terms according to an embodiment of the disclosure;
FIG. 7 schematically illustrates an example diagram showing candidate search terms provided by an embodiment of the disclosure;
FIG. 8 schematically illustrates a flowchart of a predictive model training method provided by an embodiment of the present disclosure;
FIG. 9 schematically illustrates a structural diagram of a storage medium provided by an embodiment of the present disclosure;
FIG. 10 schematically shows a structural diagram of a search term generation apparatus provided by an embodiment of the present disclosure;
FIG. 11 schematically shows a structural diagram of a predictive model training apparatus provided by an embodiment of the present disclosure;
FIG. 12 schematically shows a structural diagram of a computing device provided by an embodiment of the present disclosure.
In the drawings, the same or corresponding reference numerals indicate the same or corresponding parts.
Detailed Description
The principles and spirit of the present disclosure will be described with reference to a number of exemplary embodiments. It is understood that these embodiments are presented merely to enable those skilled in the art to better understand and to practice the disclosure, and are not intended to limit the scope of the disclosure in any way. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.
As will be appreciated by one skilled in the art, embodiments of the present disclosure may be embodied as a system, apparatus, device, method, or computer program product. Accordingly, the present disclosure may be embodied in the form of: entirely hardware, entirely software (including firmware, resident software, micro-code, etc.), or a combination of hardware and software. The data related to the present disclosure may be data authorized by a user or fully authorized by each party, and the collection, transmission, use, and the like of the data all meet the requirements of relevant national laws and regulations, and the embodiments/examples of the present disclosure may be combined with each other.
According to embodiments of the present disclosure, a search term generation method, a model training method, a medium, an apparatus, and a device are provided.
In this context, it is to be understood that the terms referred to have the following meanings:
Attention mechanism: a mechanism for extracting the characteristic parts of a text.
Moreover, any number of elements in the drawings are by way of example and not by way of limitation, and any nomenclature is used solely for differentiation and not by way of limitation.
The principles and spirit of the present disclosure are explained in detail below with reference to several representative embodiments thereof.
Summary of The Invention
The inventors found that, in the related art, music resources are matched by an algorithm according to the user's history and search terms are generated from them and recommended directly to the user, or trending search terms are recommended directly to the user. This approach has the following drawbacks: the generated search terms are not adapted to each individual user and may not match that user's interests; similarly, trending search terms are derived from the terms searched most frequently over a period of time and likewise may not match each user's interests. As a result, users make less use of the recommended search terms, and the user experience suffers.
To solve the above problems, the present disclosure uses a pre-built prediction model to predict the probability that each candidate search term will be clicked by the user, screens out the candidate search terms with high probability values, and recommends them to the user. Search terms that match the user's interests are thereby recommended, and the user experience is improved.
Having described the general principles of the present disclosure, various non-limiting embodiments of the present disclosure are described in detail below.
Application Scenario Overview
Referring first to fig. 1, fig. 1 is a schematic view of an application scenario provided in an embodiment of the present disclosure.
When search terms need to be generated, reference may be made to the flow shown in FIG. 1. The historical behaviors of a user are obtained, and the user's interest resources are obtained from those behaviors; the interest resources include music resources the user has operated on and music resources the user has not operated on. Candidate search terms are generated from the interest resources and input into the prediction model, which outputs a predicted value for each candidate search term; the higher the predicted value, the higher the probability that the user will click the candidate search term. The candidate search terms are then screened according to the predicted values and recommended to the user.
Exemplary method
A search term generation method provided according to an exemplary embodiment of the present disclosure is described below with reference to fig. 2 to 8 in conjunction with the application scenario of fig. 1. It should be noted that the above application scenarios are merely illustrated for the convenience of understanding the spirit and principles of the present disclosure, and the embodiments of the present disclosure are not limited in this respect. Rather, embodiments of the present disclosure may be applied to any scenario where applicable.
The execution subject of the embodiments of the present disclosure may be a search term generation apparatus, which can be implemented in various ways. For example, the apparatus may be software, or a medium storing the relevant computer program, such as a USB flash drive; alternatively, the apparatus may be a physical device in which the relevant computer program is integrated or installed, such as a chip, an intelligent terminal, a computer, or a server.
Referring to fig. 2, fig. 2 is a schematic flowchart of a method for generating a search term according to an embodiment of the present disclosure. As shown in fig. 2, the search term generation method includes:
S201, recalling interest resources according to historical behaviors of the user; the historical behaviors are directed at a plurality of historical resources.
Wherein the interest resource comprises any one or combination of the following: songs, singers, albums, song lists, videos, podcasts, and the like. It will be appreciated that the search terms generated by recalling multiple types of resources of interest may increase the user's selection range.
S202, confirming candidate search terms according to the interest resources.
The obtained candidate search terms may not accord with the interests of the user, so that the candidate search terms are temporarily reserved for waiting for the next screening, and are not directly recommended to the user.
S203, inputting the resource characteristics of the historical resources corresponding to the historical behaviors of the user and the characteristics of the candidate search words into a prediction model, and outputting a prediction value corresponding to each candidate search word; and the predicted value represents the probability value of the candidate search word predicted by the prediction model being clicked by the user.
From the resource features of the historical resources corresponding to the user's historical behaviors, the prediction model learns the user's habits in selecting and clicking search terms, and from the candidate search term features it predicts the probability that each candidate search term matches the user's interests. It can be understood that the prediction model can thus predict, for each user, how well a candidate search term matches that user's interests, which provides a basis for recommending search terms.
S204, screening the candidate search terms according to the predicted values corresponding to the candidate search terms, and recommending them for display.
The candidate search words with high predicted values are screened out and recommended to the user, and the click rate of the user on the search words can be improved.
In one example, the historical behavior includes historical operation records and real-time operation records, and S201 includes: calculating, according to the historical operation records of the user, a first recall score for each historical resource corresponding to the historical operation records; determining first interest resources according to the first recall scores; calculating, according to the real-time operation records of the user, a second recall score for each historical resource corresponding to the real-time operation records; determining second interest resources according to the second recall scores; classifying the first interest resources and the second interest resources by resource type; constructing a similarity matrix from the pre-configured similarities between the resources of each type; calculating the product of the recall scores of the resources of each type and the similarity matrix, sorting the product results of the resources of each type, and taking a preset number of top-ranked resources as third interest resources; and taking the first interest resources, the second interest resources, and the third interest resources as the interest resources.
As an optional implementation, the historical behavior includes any one or a combination of the following: play, effective play, like (red heart), collect, forward, or comment. The first interest resources and the second interest resources recalled from the user's historical behaviors are resources the user has operated on. Referring to FIG. 3, FIG. 3 is an example diagram for determining a third interest resource according to an embodiment of the present disclosure. A similarity matrix is constructed among the resources of each type, and the third interest resources are determined from the recall scores of the resources the user has operated on and the similarity matrix. Taking the song resource type as an example, each cell of the similarity matrix holds the similarity between two resources; for example, the similarity between song A and song C is 0.4. The product of the recall score of song A and the similarity between song A and song C is calculated, i.e., 80 × 0.4 = 32. For each resource the user has not operated on, the corresponding products are summed: song C has two products, 32 and 49, which sum to 81, giving the product result of song C. Similarly, the product result of song D is 53, that of song E is 38, and that of song F is 104, which indicates that, among the songs the user has not operated on, song F best matches the user's interests. The other resource types are calculated in the same way, and a preset number of resources whose product results rank highest within each type are taken as the third interest resources.
For example, taking song resources as an example, songs with similar styles, different songs by the same singer, or songs with similar tempos have high similarity; a user who is interested in a song is likely to be interested in songs similar to it.
Based on the above embodiment, by calculating the recall score and calculating the result of multiplying the recall score by the similarity between the resources, the interest resources operated by the user and the interest resources not operated by the user can be recalled accurately, thereby enriching the range of the search terms.
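The song example above can be reproduced with a short sketch. Only the recall score of song A (80), the similarity between song A and song C (0.4), and the four product results (81, 53, 38, 104) come from the text; the second operated song B, its score of 70, and every other similarity value are assumptions chosen so that the sums match. The function name `third_interest_resources` is hypothetical.

```python
from collections import defaultdict

# Recall scores of songs the user has operated on.
# A = 80 is from the text; B = 70 is an assumed value.
recall_scores = {"A": 80.0, "B": 70.0}

# Pre-configured similarities (operated song, non-operated song).
# Only ("A", "C") = 0.4 is from the text; the rest are assumed values.
similarity = {
    ("A", "C"): 0.4, ("B", "C"): 0.7,
    ("A", "D"): 0.4, ("B", "D"): 0.3,
    ("A", "E"): 0.3, ("B", "E"): 0.2,
    ("A", "F"): 0.6, ("B", "F"): 0.8,
}


def third_interest_resources(scores, sim, top_n):
    """Sum score * similarity for each non-operated resource, keep the top_n."""
    totals = defaultdict(float)
    for (operated, candidate), value in sim.items():
        if operated in scores and candidate not in scores:
            totals[candidate] += scores[operated] * value
    ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)
    return ranked[:top_n]


# Keep the two best-matching songs the user has not operated on.
top = third_interest_resources(recall_scores, similarity, top_n=2)
print([name for name, _ in top])
```

With these assumed inputs, song F ranks first and song C second, in line with the product results quoted in the text.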
Specifically, in an example, calculating the first recall score of the historical resource corresponding to the historical operation record according to the historical operation record of the user includes: for the historical operation records, performing a weighted calculation according to the preset weight of the resource type, the preset weight of the operation type, and the time decay coefficient to obtain the first recall score of each historical resource under the historical operation records.
As an optional implementation, the first recall score of the historical resource corresponding to the historical operation record is calculated with reference to formula (1):
[Formula (1): image BDA0003736811770000121 in the original publication]
where resource_i denotes any resource, type is the resource type, w_type is the weight of the resource type, action is an operation on the resource, w_action is the weight of the operation type, t is the timestamp of the operation, time is the current timestamp, and decay is the time decay coefficient.
For example, different operation behaviors reflect different degrees of user interest in a resource. A collect operation indicates that the user intends to use the resource again later, which shows a higher degree of interest, so a higher weight is set. An effective play indicates that the user played the resource in full, which also shows a higher degree of interest, so a higher weight is set. The user's interests may change over time, and operations closer to the current time better reflect the user's current interests, so the time decay coefficient is set to reflect this change.
As one example, a preset number of historical resources ranked top by first recall score are considered first interest resources.
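A minimal sketch of such a time-decayed weighted score follows. Since formula (1) is not reproduced in the text, the decay form used here (`decay ** elapsed_days`) is an assumption, as are the weight values, the record layout, and the function names; each recorded operation is counted once.

```python
SECONDS_PER_DAY = 86400

# Hypothetical preset weights; the text only states that they are preset.
TYPE_WEIGHTS = {"song": 1.0, "album": 0.8}
ACTION_WEIGHTS = {"play": 1.0, "effective_play": 2.0, "collect": 3.0}


def first_recall_scores(records, now, decay=0.9):
    """records: iterable of (resource_id, resource_type, action, timestamp)."""
    action_sums = {}
    resource_types = {}
    for resource_id, resource_type, action, t in records:
        elapsed_days = (now - t) / SECONDS_PER_DAY
        # ASSUMPTION: exponential decay per elapsed day; the exact form of
        # formula (1) is not given in the text.
        contribution = ACTION_WEIGHTS[action] * decay ** elapsed_days
        action_sums[resource_id] = action_sums.get(resource_id, 0.0) + contribution
        resource_types[resource_id] = resource_type
    return {
        rid: TYPE_WEIGHTS[resource_types[rid]] * total
        for rid, total in action_sums.items()
    }


def top_n(scores, n):
    """Return the n resource ids with the highest recall scores."""
    return sorted(scores, key=scores.get, reverse=True)[:n]


now = 10 * SECONDS_PER_DAY
records = [
    ("song_1", "song", "collect", now),                        # today
    ("song_2", "song", "collect", now - 5 * SECONDS_PER_DAY),  # five days ago
    ("song_2", "song", "play", now - 5 * SECONDS_PER_DAY),
]
scores = first_recall_scores(records, now)
first_interest = top_n(scores, 1)
```

In this example the older operations on `song_2` are discounted, so the single collect operation performed today ranks `song_1` first.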
Specifically, in another example, calculating the second recall score of the historical resource corresponding to the real-time operation record according to the real-time operation record of the user includes: for the real-time operation records, performing a weighted calculation according to the preset weight of the resource type and the preset weight of the operation type to obtain the second recall score of each historical resource under the real-time operation records.
As an optional implementation, the second recall score of the historical resource corresponding to the real-time operation record is calculated with reference to formula (2):
$$\mathrm{score}(resource_i) = w_{type} \cdot \sum_{action_t} action \cdot w_{action} \qquad (2)$$
where resource_i denotes any resource, type is the resource type, w_type is the weight of the resource type, action is an operation on the resource, w_action is the weight of the operation type, and t is the timestamp of the operation.
For example, if a user operation record of a period of time close to the current time is taken as the real-time operation record, it can be considered that the interest of the user does not change during the period of time close to the current time, and therefore the formula (2) does not set the time attenuation coefficient. If the user does not have an operational record during this time, a second recall score is not calculated.
As an example, a preset number of historical resources ranked top by the second recall score are used as the second interest resource.
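Formula (2) can be sketched as follows, treating each recorded operation as contributing its operation-type weight once. The weight values, the record layout, and the function name `second_recall_scores` are assumptions made for illustration.

```python
def second_recall_scores(records, type_weights, action_weights):
    """records: iterable of (resource_id, resource_type, action) taken from
    the real-time window. No time decay is applied, as in formula (2)."""
    action_sums = {}
    resource_types = {}
    for resource_id, resource_type, action in records:
        # Sum the operation-type weights over all operations on the resource.
        action_sums[resource_id] = (
            action_sums.get(resource_id, 0.0) + action_weights[action]
        )
        resource_types[resource_id] = resource_type
    # Scale each sum by the weight of the resource type.
    return {
        rid: type_weights[resource_types[rid]] * total
        for rid, total in action_sums.items()
    }


# Hypothetical weights and real-time records.
type_weights = {"song": 1.0, "album": 0.5}
action_weights = {"play": 1.0, "collect": 3.0}
realtime_records = [
    ("s1", "song", "play"),
    ("s1", "song", "collect"),
    ("a1", "album", "play"),
]
realtime_scores = second_recall_scores(realtime_records, type_weights, action_weights)
```

When the real-time window contains no records, the function returns an empty mapping, which corresponds to not calculating a second recall score.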
It should be noted that the present disclosure does not limit the specific forms of formula (1) and formula (2), nor the time range of the real-time operation records.
Based on the above embodiment, by setting the weights of different resource types and the weight of an operation type, the recall score can be accurately calculated, so that the interest resource is determined according to the recall score. The factor of user interest change can be introduced in the process of calculating the recall score through the time attenuation coefficient, and the accuracy of calculating the recall score is further improved.
In another example, S202 includes: extracting title information of the interest resources; taking the title information as candidate search words; or, taking the historical search terms of the user corresponding to the title information as candidate search terms; or inputting the title information into a strategy template to match the title information with the resources under various types, and obtaining candidate search terms corresponding to the interest resources output by the strategy template.
For example, taking songs as an example, the song names of the interest resources can be used directly as candidate search terms. Taking a song list as an example, referring to FIG. 4, FIG. 4 is an example diagram of confirming candidate search terms according to an example of the present disclosure: the song list among the interest resources is named "some songs touching the heart", and the user previously selected this song list after searching for "warm songs", so "warm songs" can be used as a candidate search term for the song list. Taking a singer as an example, the singer among the interest resources is named Zhang San; "Zhang San" is input into a strategy template and matched against the resources of each type, so that matching against song resources generates the candidate search term "Zhang San new songs" and matching against album resources generates the candidate search term "Zhang San new albums".
It should be noted that the present disclosure does not limit the specific manner of generating candidate search terms.
Based on the above embodiments, a rich set of candidate search terms can be generated in various ways.
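The three ways of producing candidate search terms can be sketched as below. The text presents them as alternatives; this sketch simply collects the output of all three and removes duplicates. The data layout, the template strings, the singer name, and the function name `candidate_search_terms` are assumptions made for illustration.

```python
def candidate_search_terms(resource, history_terms, templates):
    """resource: dict with 'title' and 'type'.
    history_terms: maps a title to past search terms that led the user to it.
    templates: maps a resource type to format strings with a {title} slot."""
    title = resource["title"]
    candidates = [title]                        # 1) the title itself
    candidates += history_terms.get(title, [])  # 2) historical search terms
    candidates += [                             # 3) strategy templates
        template.format(title=title)
        for template in templates.get(resource["type"], [])
    ]
    # Remove duplicates while keeping the original order.
    return list(dict.fromkeys(candidates))


# Hypothetical inputs modeled on the song-list and singer examples.
history_terms = {"some songs touching the heart": ["warm songs"]}
templates = {"singer": ["{title} new songs", "{title} new albums"]}

song_list = {"title": "some songs touching the heart", "type": "song_list"}
singer = {"title": "Singer X", "type": "singer"}

song_list_terms = candidate_search_terms(song_list, history_terms, templates)
singer_terms = candidate_search_terms(singer, history_terms, templates)
```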
In yet another example, S203 includes: classifying the historical resources corresponding to the user's historical behaviors by resource type to obtain the basic features under each type, and splicing the basic features of each historical resource to obtain the spliced features of the historical resources of each type; inputting the spliced features into an attention mechanism and outputting the sequence feature of the historical resources of each type; splicing the sequence features to obtain the user historical behavior feature; splicing the user historical behavior feature with each candidate search term feature to obtain a plurality of first combined features; and inputting each first combined feature into the prediction model to obtain the predicted value of each candidate search term.
As an alternative implementation, referring to fig. 5, fig. 5 is an exemplary diagram of a predicted candidate search term according to an example of the present disclosure. The resources under different resource types comprise different basic characteristics, all the basic characteristics of the single resources under the same type are spliced to obtain the splicing characteristics of the single resources, and the splicing characteristics represent the characteristics of the single resources. Based on the attention mechanism, all splicing characteristics under the same resource type are input, and sequence characteristics of the resource under the resource type are output, wherein the sequence characteristics represent the characteristics of a single resource type. And splicing the sequence features of each resource type to obtain the historical behavior features of the user, wherein the historical behavior features of the user represent the overall features of all resources operated by the user.
For example, resources of different types have different basic features. The basic features of a song resource include at least one of: the average of the word vectors obtained after segmenting the song name, the recent number of clicks on the song, the recent number of effective clicks on the song, the recent number of times the song was collected, the recent number of times the song was forwarded, and the like. The basic features of a historical search term include at least one of: the average of the word vectors obtained after segmenting the text of the historical search term, the recent number of searches and number of searching users for the historical search term, the number of resource clicks and clicking users, the click-through rate, the number of effective resource plays and playing users, and the effective click-through rate.
It is to be noted that the present disclosure does not limit the details of the underlying features.
Based on the above embodiment, the first merged feature obtained by the abundant basic features is used for prediction, so that the prediction accuracy can be improved.
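The feature construction described above might look like the following sketch. The text does not specify the form of the attention mechanism, so the dot-product attention with one query vector per resource type used here is an assumption, as are the vector sizes and function names. In a trained model the query vectors would be learned parameters; here they are fixed inputs.

```python
import math


def attention_pool(vectors, query):
    """Softmax-weighted sum of the vectors; weights come from dot products
    with the query. ASSUMPTION: this attention form is not from the text."""
    scores = [sum(v * q for v, q in zip(vec, query)) for vec in vectors]
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]  # numerically stable
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(vectors[0])
    return [sum(w * vec[d] for w, vec in zip(weights, vectors)) for d in range(dim)]


def first_combined_features(history_by_type, queries, candidate_features):
    """history_by_type: {resource type: [spliced feature vector per resource]}.
    Returns one combined vector per candidate search term."""
    user_history = []
    for resource_type in sorted(history_by_type):
        # One sequence feature per resource type, spliced in a fixed order.
        user_history += attention_pool(
            history_by_type[resource_type], queries[resource_type]
        )
    return [user_history + candidate for candidate in candidate_features]


# Hypothetical spliced features: two songs (2-d) and one search term (1-d).
history_by_type = {"song": [[1.0, 2.0], [3.0, 4.0]], "search_term": [[5.0]]}
queries = {"song": [0.0, 0.0], "search_term": [0.0]}
combined = first_combined_features(history_by_type, queries, [[0.1], [0.2]])
```

With all-zero queries the attention weights are uniform, so each sequence feature reduces to the mean of the spliced features for that type, which makes the result easy to check by hand.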
Preferably, in another example, the search term generation method further includes: splicing the first combination characteristics, the user attribute characteristics and the current environment characteristics to obtain a plurality of second combination characteristics; and inputting each second combined feature into the prediction model to obtain the predicted value of each candidate search term.
As an optional implementation manner, on the basis of the first combined feature, the user attribute feature and the current environment feature are added. The user attribute characteristics include any one or a combination of: the age of the user, the gender of the user, the province of the user, the number of the user, etc. The current environmental characteristics include any one or combination of the following: current timestamp, current season, and device type, etc.
It should be noted that the present disclosure does not limit the content included in the second combination feature.
Based on the above embodiment, on the basis of the first combined feature, the user attribute feature and the current environment feature are added, so that the candidate search terms can be predicted from more dimensions, and the accuracy of the predicted value is improved.
In yet another example, S204 includes: sorting and screening the candidate search terms according to the predicted values; and displaying the screened candidate search terms in a carousel according to a preset dwell time.
For example, referring to FIG. 6, FIG. 6 is an example diagram showing candidate search terms according to an example of the present disclosure. The candidate search terms whose predicted values rank in the top 5 are selected, giving, in order, candidate search terms 1, 2, 3, 4, and 5. With a preset dwell time of 6 seconds, the candidate search terms are displayed in turn, either sequentially or in random order, switching every 6 seconds.
Based on the above embodiment, the candidate search terms obtained through screening according to the predicted values and displayed in carousel manner better accord with the interests of the user, so that the user experience is improved.
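The screening and carousel steps can be sketched as follows, assuming the predicted values are available as a mapping from candidate search term to click probability. The function names, the sample values, and the representation of the carousel as a list of start times are assumptions made for illustration.

```python
import random


def screen_candidates(predictions, top_k=5):
    """predictions: {candidate search term: predicted click probability}.
    Returns the top_k terms, highest predicted value first."""
    return sorted(predictions, key=predictions.get, reverse=True)[:top_k]


def carousel_schedule(terms, dwell_seconds=6, shuffle=False, seed=None):
    """Return (start_time_in_seconds, term) pairs for one carousel pass."""
    order = list(terms)
    if shuffle:
        # Random display order; a seed makes the order reproducible.
        random.Random(seed).shuffle(order)
    return [(index * dwell_seconds, term) for index, term in enumerate(order)]


# Hypothetical predicted values for six candidate search terms.
predictions = {"t1": 0.9, "t2": 0.8, "t3": 0.7, "t4": 0.6, "t5": 0.5, "t6": 0.1}
selected = screen_candidates(predictions)
schedule = carousel_schedule(selected)
```

Here `schedule` starts each of the five selected terms 6 seconds after the previous one; passing `shuffle=True` gives the random display order mentioned in the text.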
As an example, referring to FIG. 7, FIG. 7 is an example diagram showing candidate search terms according to an example of the present disclosure. The user may click the currently displayed candidate search term, at which point the search box expands and displays all candidate search terms, from which the user may select.
Based on the above implementation mode, through the method for completely displaying all the candidate search terms, the user can select from the candidate search terms more quickly, and the user experience is improved.
Referring to fig. 8, fig. 8 is a schematic flowchart of a predictive model training method according to an embodiment of the present disclosure. As shown in fig. 8, the predictive model training method includes:
S801, recalling sample interest resources according to historical behaviors of the user; the historical behaviors are directed at a plurality of historical resources.
S802, confirming a sample search word according to the sample interest resource; acquiring sample search word characteristics and an operation label of a sample search word; the operation label comprises that the sample search word is clicked or not clicked.
S803, taking the resource features of the historical resources corresponding to the user's historical behaviors and the sample search term features as input, taking the operation labels of the sample search terms as labels, and training to obtain the prediction model.
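The training step can be illustrated with a small sketch. The text does not state what kind of model the prediction model is, so the logistic regression trained by batch gradient descent below is a stand-in assumption, as are the toy features, the hyperparameters, and the function names. Each input vector stands for a combined feature, and each label is 1 for a clicked sample search term and 0 for one that was not clicked.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, x):
    """Predicted probability that the search term with features x is clicked."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


def train_click_model(features, labels, lr=0.5, epochs=500):
    """Fit logistic regression with batch gradient descent on log loss."""
    dim = len(features[0])
    weights = [0.0] * dim
    bias = 0.0
    n = len(features)
    for _ in range(epochs):
        grad_w = [0.0] * dim
        grad_b = 0.0
        for x, y in zip(features, labels):
            error = predict(weights, bias, x) - y
            for d in range(dim):
                grad_w[d] += error * x[d]
            grad_b += error
        weights = [w - lr * g / n for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / n
    return weights, bias


# Toy one-dimensional features with click labels (1 = clicked, 0 = not clicked).
features = [[0.0], [0.2], [0.8], [1.0]]
labels = [0, 0, 1, 1]
weights, bias = train_click_model(features, labels)
```

After training, `predict(weights, bias, x)` plays the role of the predicted value: in this toy data it is higher for feature values that were associated with clicks.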
In one example, the historical behavior includes historical operation records and real-time operation records, and S801 includes: calculating, according to the historical operation records of the user, a third recall score for each historical resource corresponding to the historical operation records; determining first sample interest resources according to the third recall scores; calculating, according to the real-time operation records of the user, a fourth recall score for each historical resource corresponding to the real-time operation records; determining second sample interest resources according to the fourth recall scores; classifying the first sample interest resources and the second sample interest resources by resource type; constructing a similarity matrix from the pre-configured similarities between the resources of each type; calculating the product of the recall scores of the resources of each type and the similarity matrix, sorting the product results of the resources of each type, and taking a preset number of top-ranked resources as third sample interest resources; and taking the first sample interest resources, the second sample interest resources, and the third sample interest resources as the sample interest resources.
Specifically, in an example, calculating the third recall score of the historical resource corresponding to the historical operation record according to the historical operation record of the user includes: for the historical operation records, performing a weighted calculation according to the preset weight of the resource type, the preset weight of the operation type, and the time decay coefficient to obtain the third recall score of each historical resource under the historical operation records.
Specifically, in another example, calculating the fourth recall score of the historical resource corresponding to the real-time operation record according to the real-time operation record of the user includes: for the real-time operation records, performing a weighted calculation according to the preset weight of the resource type and the preset weight of the operation type to obtain the fourth recall score of each historical resource under the real-time operation records.
In another example, S802 includes: extracting title information of the sample interest resources; taking the title information of the sample interest resources as sample search terms; or taking a user history search word corresponding to the title information of the sample interest resource as a sample search word; or inputting the title information of the sample interest resources into a strategy template so as to match the title information of the sample interest resources with the resources under various types and obtain sample search terms corresponding to the sample interest resources output by the strategy template.
In yet another example, S803 includes: classifying historical resources corresponding to the historical behaviors of the user according to resource types to obtain basic characteristics under each type, and splicing the basic characteristics under each type of historical resources to obtain splicing characteristics of each type of historical resources; inputting the splicing characteristics based on an attention mechanism, and outputting sequence characteristics of each type of historical resources; splicing the sequence features to obtain the historical behavior features of the user; splicing the user historical behavior characteristics and the sample search word characteristics to obtain a plurality of third combined characteristics; and taking each third combination characteristic as input, taking the operation label of the sample search word corresponding to the third combination characteristic as a label, and training to obtain the prediction model.
Preferably, in another example, the method for training the prediction model further includes: splicing the third combination characteristics, the user attribute characteristics and the historical environment characteristics to obtain fourth combination characteristics; and taking each fourth merging feature as input, taking the operation label of the sample search word corresponding to the fourth merging feature as a label, and training to obtain a prediction model.
According to the search term generation method provided by the embodiment, interest resources are recalled according to the historical behaviors of the user; the historical behavior is for a plurality of historical resources; confirming candidate search terms according to the interest resources; inputting resource characteristics of historical resources corresponding to user historical behaviors and the characteristics of the candidate search words into a prediction model, and outputting a prediction value corresponding to each candidate search word; the predicted value represents the probability value of the candidate search word predicted by the prediction model being clicked by the user; and screening the candidate search words and recommending and displaying the candidate search words according to the predicted values corresponding to the candidate search words. According to the method, the probability value of the candidate search words clicked by the user is predicted through the prediction model, the candidate search words with high probability values are screened and recommended to the user, so that the search words according with the user interests are recommended, and the use experience of the user is improved.
Exemplary Medium
Having described the method of the exemplary embodiment of the present disclosure, next, a storage medium of the exemplary embodiment of the present disclosure will be described with reference to fig. 9.
Referring to fig. 9, a storage medium 90 stores therein a program product for implementing the above method according to an embodiment of the present disclosure, which may employ a portable compact disc read only memory (CD-ROM) and include program code, and may be run on a terminal device, such as a personal computer. However, the program product of the present disclosure is not limited thereto.
The program product may employ any combination of one or more readable media. The readable medium may be a readable signal medium or a readable storage medium. A readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the readable storage medium include: an electrical connection having one or more wires, a portable diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
A readable signal medium may include a propagated data signal with readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. The readable signal medium may also be any readable medium other than a readable storage medium.
Program code for carrying out operations of the present disclosure may be written in any combination of one or more programming languages, including an object-oriented programming language such as Java, C++, or the like, and conventional procedural programming languages, such as the "C" programming language or similar programming languages. The program code may execute entirely on the user computing device, partly on the user device, partly on a remote computing device, or entirely on the remote computing device or server. In the case of a remote computing device, the remote computing device may be connected to the user computing device through any kind of network, including a Local Area Network (LAN) or a Wide Area Network (WAN).
Exemplary devices
Having described the medium of the exemplary embodiment of the present disclosure, a search term generation apparatus of the exemplary embodiment of the present disclosure is described next with reference to FIG. 10. The apparatus is used to implement the method of any of the above method embodiments; its implementation principles and technical effects are similar and are not repeated here.
Referring to fig. 10, fig. 10 is a schematic structural diagram of a search term generation apparatus according to an embodiment of the present disclosure. As shown in fig. 10, the search term generation apparatus includes:
A recall module 101, configured to recall interest resources according to historical behaviors of the user; the historical behaviors are directed at a plurality of historical resources.
A generating module 102, configured to determine candidate search terms according to the interest resources.
The prediction module 103 is configured to input resource features of historical resources corresponding to user historical behaviors and the candidate search term features into a prediction model, and output a prediction value corresponding to each candidate search term; and the predicted value represents the probability value of clicking the candidate search word predicted by the prediction model by the user.
And the display module 104 is configured to filter the candidate search terms and recommend display according to the predicted values corresponding to the candidate search terms.
In one example, the historical behavior includes historical operation records and real-time operation records. The recall module 101 is specifically configured to calculate, according to the historical operation records of the user, a first recall score of each historical resource corresponding to the historical operation records, and to determine first interest resources according to the first recall score. The recall module 101 is further specifically configured to calculate, according to the real-time operation records of the user, a second recall score of each historical resource corresponding to the real-time operation records, and to determine second interest resources according to the second recall score. The recall module 101 is further specifically configured to classify the first interest resources and the second interest resources by resource type; construct a similarity matrix from the pre-configured similarities between resources of each type; compute the product of the recall scores of the resources of each type and the similarity matrix; sort the products for each type; and take a preset number of top-ranked resources as third interest resources. The recall module 101 is further specifically configured to use the first, second, and third interest resources together as the interest resources.
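The similarity-based expansion described above can be sketched as follows. This is only an illustrative reading, not the disclosed implementation: it assumes the "product" is a row vector of recall scores multiplied by a seed-by-candidate similarity matrix for each resource type, and that resources already recalled are excluded from the third interest resources. The function name, the nested-dictionary layout of the similarity matrix, and the tie-breaking rule are all hypothetical.

```python
from collections import defaultdict


def third_interest_resources(seeds, similarity, top_n):
    """Expand recalled resources into third interest resources, per type.

    seeds:      list of (resource_id, resource_type, recall_score) tuples,
                i.e. the first and second interest resources (assumed layout).
    similarity: {resource_type: {seed_id: {candidate_id: similarity}}},
                a sparse stand-in for the pre-configured similarity matrix.
    top_n:      preset number of top-ranked candidates kept per type.
    """
    # Classify the recalled resources by resource type.
    by_type = defaultdict(list)
    for resource_id, resource_type, score in seeds:
        by_type[resource_type].append((resource_id, score))

    result = {}
    for resource_type, items in by_type.items():
        seed_ids = {resource_id for resource_id, _ in items}
        expanded = defaultdict(float)
        # Recall-score vector times similarity matrix, accumulated sparsely.
        for resource_id, score in items:
            row = similarity.get(resource_type, {}).get(resource_id, {})
            for candidate_id, sim in row.items():
                if candidate_id not in seed_ids:  # assumption: skip known seeds
                    expanded[candidate_id] += score * sim
        # Sort by product (descending), then by id for a deterministic order.
        ranked = sorted(expanded.items(), key=lambda kv: (-kv[1], kv[0]))
        result[resource_type] = [cid for cid, _ in ranked[:top_n]]
    return result
```

A dense matrix product would give the same ranking; the sparse dictionary form is used here only to keep the sketch dependency-free.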
Specifically, in an example, the recall module 101 is configured to perform, for the historical operation records, a weighted calculation according to a preset weight of the resource type, a preset weight of the operation type, and a time decay coefficient, to obtain a first recall score of each historical resource in the historical operation records.
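As a minimal sketch of this weighted calculation, the snippet below multiplies the resource-type weight, the operation-type weight, and an exponential time-decay factor, and sums the contributions per resource. The source does not specify how the three factors are combined or the form of the decay, so the multiplicative form, the exponential decay, the record layout, and all weight values are assumptions for illustration.

```python
from collections import defaultdict


def first_recall_scores(records, type_weight, op_weight, now_day, decay=0.9):
    """Score each historical resource from the historical operation records.

    records:     list of dicts with keys "resource_id", "type", "op", "day"
                 (hypothetical record layout).
    type_weight: preset weight per resource type.
    op_weight:   preset weight per operation type.
    decay:       per-day time decay coefficient in (0, 1]; assumed exponential.
    """
    scores = defaultdict(float)
    for record in records:
        age_days = now_day - record["day"]
        scores[record["resource_id"]] += (
            type_weight[record["type"]]
            * op_weight[record["op"]]
            * decay ** age_days  # older operations contribute less
        )
    return dict(scores)
```

For instance, with `decay=0.5`, a "favorite" (weight 3.0) made one day ago on a song (weight 1.0) contributes 1.5, while a "play" (weight 1.0) made today contributes 1.0.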
Specifically, in another example, the recall module 101 is specifically configured to, for the real-time operation record, perform weighted calculation according to a preset weight of the resource type and a preset weight of the operation type to obtain a second recall score of each historical resource under the real-time operation record.
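The real-time variant can be sketched in the same way, without a time-decay term, followed by selection of a preset number of top-scoring resources. The tuple layout of the records, the multiplicative combination of the two weights, and the helper names are assumptions made for this illustration.

```python
def second_recall_scores(records, type_weight, op_weight):
    """Score resources from real-time operation records (no time decay).

    records: list of (resource_id, resource_type, operation) tuples
             (hypothetical layout).
    """
    scores = {}
    for resource_id, resource_type, operation in records:
        weight = type_weight[resource_type] * op_weight[operation]
        scores[resource_id] = scores.get(resource_id, 0.0) + weight
    return scores


def top_n(scores, n):
    """Return the n highest-scoring resource ids (ties broken by id)."""
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [resource_id for resource_id, _ in ranked[:n]]
```

The same `top_n` helper could be applied to the first recall scores to pick the first interest resources.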
In another example, the generating module 102 is specifically configured to extract title information of the interest resources. The generating module 102 is further specifically configured to use the title information as a candidate search term; or to use a historical search term of the user that corresponds to the title information as a candidate search term; or to input the title information into a policy template, so as to match the title information against resources of each type and obtain the candidate search term, output by the policy template, that corresponds to the interest resource.
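The three alternatives above can be illustrated with the sketch below, which simply collects the output of all three. The source does not define what "corresponding" means for historical search terms or how the policy template works, so the substring match and the per-type format strings used here are hypothetical stand-ins.

```python
def candidate_search_terms(title, history_queries, titles_by_type, templates):
    """Generate candidate search terms for one interest resource title.

    history_queries: the user's past search terms.
    titles_by_type:  {resource_type: [resource titles]}, a hypothetical catalog.
    templates:       {resource_type: format string containing "{title}"},
                     a hypothetical stand-in for the policy template.
    """
    needle = title.lower()
    candidates = [title]  # option 1: the title itself

    # Option 2: past queries that contain the title (assumed matching rule).
    candidates += [q for q in history_queries if needle in q.lower()]

    # Option 3: fill the template of every type that has a matching resource.
    for resource_type, titles in titles_by_type.items():
        matched = any(needle in t.lower() for t in titles)
        if matched and resource_type in templates:
            candidates.append(templates[resource_type].format(title=title))

    # Remove duplicates while preserving order.
    seen = set()
    unique = []
    for term in candidates:
        if term not in seen:
            seen.add(term)
            unique.append(term)
    return unique
```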
In another example, the prediction module 103 is specifically configured to classify the historical resources corresponding to the user's historical behaviors by resource type to obtain the basic features under each type, and to splice the basic features of the historical resources of each type to obtain a spliced feature for each type of historical resource. The prediction module 103 is further specifically configured to feed the spliced features through an attention mechanism and output a sequence feature for each type of historical resource; to splice the sequence features to obtain the user's historical behavior features; to splice the user's historical behavior features with each of the candidate search term features to obtain a plurality of first merged features; and to input each first merged feature into the prediction model to obtain the predicted value of each candidate search term.
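The feature flow above (splice, attend, splice again, merge with each candidate) can be sketched in plain Python as follows. The source does not specify the attention variant, so this example assumes scaled dot-product attention pooling with one query vector per resource type; the query vectors, the fixed type ordering, and all function names are hypothetical.

```python
import math


def splice(*vectors):
    """Concatenate feature vectors ("splicing")."""
    out = []
    for vector in vectors:
        out.extend(vector)
    return out


def attention_pool(sequence, query):
    """Pool a sequence of vectors into one vector with softmax attention."""
    dim = len(query)
    logits = [
        sum(q * x for q, x in zip(query, item)) / math.sqrt(dim)
        for item in sequence
    ]
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(logit - peak) for logit in logits]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [
        sum(w * item[k] for w, item in zip(weights, sequence))
        for k in range(dim)
    ]


def first_merged_features(history_by_type, queries, candidate_features):
    """Build one merged feature vector per candidate search term.

    history_by_type: {type: [[basic_feature_vector, ...] per resource]}.
    queries:         {type: query vector}, assumed to be learned parameters.
    """
    behavior = []
    for resource_type in sorted(history_by_type):  # fixed type order
        spliced = [splice(*basic) for basic in history_by_type[resource_type]]
        behavior.extend(attention_pool(spliced, queries[resource_type]))
    return [splice(behavior, cand) for cand in candidate_features]
```

Because the behavior vector is computed once and then spliced with every candidate, the per-candidate cost is a single concatenation before the prediction model is applied.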
Preferably, in another example, the prediction module 103 is further configured to splice the first merged features, the user attribute features, and the current environment features to obtain a plurality of second merged features; and inputting each second combined feature into the prediction model to obtain the predicted value of each candidate search term.
In yet another example, the display module 104 is specifically configured to rank and screen the candidate search terms according to the predicted values; the display module 104 is further configured to display the screened candidate search terms in a carousel, each for a preset dwell time.
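A minimal sketch of the ranking, screening, and carousel steps is given below. The cut-off rule (keep a fixed number of top-scoring terms) and the schedule representation are assumptions; the source only states that terms are screened by predicted value and rotated with a preset dwell time.

```python
from itertools import cycle, islice


def screen_candidates(predictions, keep):
    """Keep the `keep` candidate terms with the highest predicted values.

    predictions: {search_term: predicted click probability}.
    """
    ranked = sorted(predictions.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:keep]]


def carousel_schedule(terms, dwell_seconds, slots):
    """Return (start_time_in_seconds, term) pairs for the first `slots` slots."""
    rotation = islice(cycle(terms), slots)  # repeat the terms in order
    return [(i * dwell_seconds, term) for i, term in enumerate(rotation)]
```

For example, screening `{"a": 0.2, "b": 0.9, "c": 0.5}` down to two terms and rotating them every 3 seconds would show "b" at 0 s, "c" at 3 s, "b" again at 6 s, and so on.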
Referring to fig. 11, fig. 11 is a schematic structural diagram of a prediction model training apparatus according to an embodiment of the present disclosure. As shown in fig. 11, the prediction model training apparatus includes:
The computing module 111 is configured to recall sample interest resources according to the historical behaviors of the user, where the historical behaviors are directed at a plurality of historical resources.
The obtaining module 112 is configured to determine sample search terms according to the sample interest resources, and to acquire the sample search term features and an operation label of each sample search term; the operation label indicates whether the sample search term was clicked or not clicked.
The training module 113 is configured to train a prediction model by using the resource features of the historical resources corresponding to the user's historical behaviors and the sample search term features as inputs and using the operation labels of the sample search terms as labels.
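The training setup described above, merged features as input and clicked / not-clicked as the label, is a binary classification problem. The sketch below uses plain logistic regression trained by stochastic gradient descent purely as a stand-in; the source does not specify the model architecture, loss, or optimizer, so all of these, along with the function names and hyperparameters, are assumptions.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_click_model(features, labels, epochs=200, lr=0.5):
    """Fit logistic regression with SGD on the log loss.

    features: list of merged feature vectors (history + sample search term).
    labels:   1 if the sample search term was clicked, 0 otherwise.
    """
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(features, labels):
            p = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            grad = p - y  # derivative of the log loss w.r.t. the logit
            weights = [w - lr * grad * xi for w, xi in zip(weights, x)]
            bias -= lr * grad
    return weights, bias


def predict(model, x):
    """Predicted probability that the search term is clicked."""
    weights, bias = model
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
```

At inference time, `predict` plays the role of the predicted value that the search term generation apparatus uses to rank candidate search terms.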
In one example, the historical behavior includes historical operation records and real-time operation records. The computing module 111 is specifically configured to calculate, according to the historical operation records of the user, a third recall score of each historical resource corresponding to the historical operation records, and to determine first sample interest resources according to the third recall score. The computing module 111 is further specifically configured to calculate, according to the real-time operation records of the user, a fourth recall score of each historical resource corresponding to the real-time operation records, and to determine second sample interest resources according to the fourth recall score. The computing module 111 is further configured to classify the first sample interest resources and the second sample interest resources by resource type; construct a similarity matrix from the pre-configured similarities between resources of each type; compute the product of the recall scores of the resources of each type and the similarity matrix; sort the products for each type; and take a preset number of top-ranked resources as third sample interest resources. The computing module 111 is further configured to use the first, second, and third sample interest resources together as the sample interest resources.
Specifically, in an example, the computing module 111 is configured to perform, for the historical operation records, a weighted calculation according to a preset weight of the resource type, a preset weight of the operation type, and a time decay coefficient, to obtain a third recall score of each historical resource in the historical operation records.
As one example, a preset number of historical resources ranked top in the third recall score are considered first sample interest resources.
Specifically, in another example, the computing module 111 is configured to perform, for the real-time operation records, a weighted calculation according to a preset weight of the resource type and a preset weight of the operation type, to obtain a fourth recall score of each historical resource in the real-time operation records.
As one example, a preset number of historical resources ranked highest by the fourth recall score are taken as the second sample interest resources.
In another example, the obtaining module 112 is specifically configured to extract title information of the sample interest resources. The obtaining module 112 is further specifically configured to use the title information of a sample interest resource as a sample search term; or to use a historical search term of the user that corresponds to the title information of the sample interest resource as a sample search term; or to input the title information of the sample interest resource into a policy template, so as to match the title information against resources of each type and obtain the sample search term, output by the policy template, that corresponds to the sample interest resource.
In another example, the training module 113 is specifically configured to classify the historical resources corresponding to the user's historical behaviors by resource type to obtain the basic features under each type, and to splice the basic features of the historical resources of each type to obtain a spliced feature for each type of historical resource. The training module 113 is further specifically configured to feed the spliced features through an attention mechanism and output a sequence feature for each type of historical resource; to splice the sequence features to obtain the user's historical behavior features; to splice the user's historical behavior features with each of the sample search term features to obtain a plurality of third merged features; and to train the prediction model by taking each third merged feature as input and the operation label of the corresponding sample search term as the label.
Preferably, in another example, the training module 113 is further configured to splice the third merged features, the user attribute features, and the historical environment features to obtain fourth merged features; and taking each fourth merging feature as input, taking the operation label of the sample search word corresponding to the fourth merging feature as a label, and training to obtain a prediction model.
The search term generation apparatus provided by this embodiment comprises a recall module, a generating module, a prediction module, and a display module. The recall module is configured to recall interest resources according to the historical behaviors of the user, the historical behaviors being directed at a plurality of historical resources. The generating module is configured to determine candidate search terms according to the interest resources. The prediction module is configured to input the resource features of the historical resources corresponding to the user's historical behaviors and the candidate search term features into a prediction model and to output a predicted value corresponding to each candidate search term; the predicted value represents the probability, as predicted by the prediction model, that the user clicks the candidate search term. The display module is configured to screen the candidate search terms according to their corresponding predicted values and to recommend and display them. In this way, the prediction model predicts the probability that the user clicks each candidate search term, and candidate search terms with high predicted probabilities are screened and recommended to the user, so that search terms matching the user's interests are recommended and the user experience is improved.
Exemplary computing device
Having described the methods, media, and apparatus of the exemplary embodiments of the present disclosure, a computing device of the exemplary embodiments of the present disclosure is described next with reference to fig. 12.
The computing device 120 shown in fig. 12 is only one example and should not impose any limitations on the functionality or scope of use of embodiments of the disclosure.
As shown in fig. 12, computing device 120 is embodied in the form of a general purpose computing device. Components of computing device 120 may include, but are not limited to: at least one processing unit 1201, at least one storage unit 1202, and a bus 1203 that couples the various system components, including the processing unit 1201 and the storage unit 1202.
The bus 1203 includes a data bus, a control bus, and an address bus.
The storage unit 1202 may include readable media in the form of volatile memory, such as Random Access Memory (RAM) 12021 and/or cache memory 12022, and may further include readable media in the form of non-volatile memory, such as Read Only Memory (ROM) 12023.
The storage unit 1202 may also include a program/utility 12025 having a set (at least one) of program modules 12024, such program modules 12024 including, but not limited to: an operating system, one or more application programs, other program modules, and program data, each of which, or some combination thereof, may comprise an implementation of a network environment.
Computing device 120 may also communicate with one or more external devices 1204 (e.g., keyboard, pointing device, etc.). Such communication may occur via input/output (I/O) interfaces 1205. Also, computing device 120 may communicate with one or more networks (e.g., a Local Area Network (LAN), a Wide Area Network (WAN), and/or a public network, such as the internet) through network adapter 1206. As shown in FIG. 12, network adapter 1206 communicates with the other modules of computing device 120 over bus 1203. It should be understood that although not shown in the figures, other hardware and/or software modules may be used in conjunction with computing device 120, including but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, and data backup storage systems, among others.
It should be noted that although several units/modules or sub-units/modules of the search term generation apparatus are mentioned in the above detailed description, such division is merely exemplary and not mandatory. Indeed, in accordance with embodiments of the present disclosure, the features and functionality of two or more of the units/modules described above may be embodied in one unit/module. Conversely, the features and functions of one unit/module described above may be further divided so as to be embodied by a plurality of units/modules.
Further, while the operations of the disclosed methods are depicted in the drawings in a particular order, this does not require or imply that these operations must be performed in this particular order, or that all of the illustrated operations must be performed, to achieve desirable results. Additionally or alternatively, certain steps may be omitted, multiple steps combined into one step execution, and/or one step broken down into multiple step executions.
While the spirit and principles of the present disclosure have been described with reference to several particular embodiments, it is to be understood that the present disclosure is not limited to the particular embodiments disclosed, nor does the division into aspects imply that features of those aspects cannot be combined to advantage; that division is made merely for convenience of presentation. The disclosure is intended to cover various modifications and equivalent arrangements included within the spirit and scope of the appended claims.

Claims (10)

1. A search term generation method, comprising:
recalling interest resources according to the historical behaviors of the user; the historical behavior is for a plurality of historical resources;
confirming candidate search terms according to the interest resources;
inputting resource characteristics of historical resources corresponding to user historical behaviors and the characteristics of the candidate search words into a prediction model, and outputting a prediction value corresponding to each candidate search word; the predicted value represents the probability value of the candidate search word predicted by the prediction model being clicked by the user;
and screening the candidate search words and recommending and displaying the candidate search words according to the predicted values corresponding to the candidate search words.
2. The method of claim 1, wherein the inputting resource features of historical resources corresponding to historical behaviors of the user and the candidate search term features into a prediction model and outputting a prediction value corresponding to each candidate search term comprises:
classifying historical resources corresponding to the historical behaviors of the user according to resource types to obtain basic characteristics under each type, and respectively splicing the basic characteristics under each type of historical resources to obtain splicing characteristics of each type of historical resources;
inputting the splicing characteristics based on an attention mechanism, and outputting sequence characteristics of each type of historical resources;
splicing the sequence characteristics to obtain the historical behavior characteristics of the user;
respectively splicing the user historical behavior features and the candidate search word features to obtain a plurality of first combined features; and inputting each first merging characteristic into a prediction model to obtain a prediction value of each candidate search term.
3. The method of claim 2, further comprising:
splicing the first combination characteristics, the user attribute characteristics and the current environment characteristics to obtain a plurality of second combination characteristics; and inputting each second combined feature into the prediction model to obtain the prediction value of each candidate search term.
4. The method of claim 1, wherein the confirming candidate search terms according to the interest resources comprises:
extracting title information of the interest resources;
taking the title information as candidate search words;
or, taking the historical search terms of the user corresponding to the title information as candidate search terms;
or inputting the title information into a strategy template to match the title information with the resources under various types, and obtaining candidate search terms corresponding to the interest resources output by the strategy template.
5. The method of claim 1, the historical behavior comprising historical operational records and real-time operational records; the recalling interest resources according to the historical behaviors of the user comprises the following steps:
calculating a first recall score of the historical resources corresponding to the historical operation records according to the historical operation records of the user; determining a first interest resource according to the first recall score;
calculating a second recall score of the historical resource corresponding to the real-time operation record according to the real-time operation record of the user; determining a second interest resource according to the second recall score;
classifying the first interest resource and the second interest resource according to resource types; constructing a similarity matrix according to the pre-configured similarity between the resources of each type; calculating the product of the recall scores of the resources under each type and the similarity matrix, sorting the products for the resources under each type, and taking a preset number of top-ranked resources as a third interest resource;
and taking the first interest resource, the second interest resource and the third interest resource as the interest resources.
6. A predictive model training method, comprising:
recalling sample interest resources according to the historical behaviors of the user; the historical behavior is for a plurality of historical resources;
confirming a sample search word according to the sample interest resource; acquiring sample search word characteristics and an operation label of a sample search word; the operation label comprises that the sample search word is clicked or not clicked;
and taking the resource characteristics of the historical resources corresponding to the historical behaviors of the user and the characteristics of the sample search words as input, taking the operation labels of the sample search words as labels, and training to obtain a prediction model.
7. A computer-readable storage medium, comprising: the computer-readable storage medium has stored therein computer-executable instructions for implementing the search term generation method of any one of claims 1 to 6 when executed by a processor.
8. A search term generation apparatus comprising:
the recall module is used for recalling interest resources according to the historical behaviors of the user; the historical behavior is for a plurality of historical resources;
the generating module is used for confirming the candidate search terms according to the interest resources;
the prediction module is used for inputting the resource characteristics of the historical resources corresponding to the historical behaviors of the user and the characteristics of the candidate search words into a prediction model and outputting a prediction value corresponding to each candidate search word; the predicted value represents the probability value of the candidate search word predicted by the prediction model being clicked by the user;
and the display module is used for screening the candidate search terms and then recommending and displaying the candidate search terms according to the predicted values corresponding to the candidate search terms.
9. A predictive model training apparatus comprising:
the computing module is used for recalling the sample interest resources according to the historical behaviors of the user; the historical behavior is for a plurality of historical resources;
the acquisition module is used for confirming a sample search word according to the sample interest resource; acquiring sample search word characteristics and an operation label of a sample search word; the operation label comprises that a sample search word is clicked or not clicked;
and the training module is used for taking the resource characteristics of the historical resources corresponding to the historical behaviors of the user and the characteristics of the sample search words as input, taking the operation labels of the sample search words as labels, and training to obtain a prediction model.
10. A computing device, comprising:
at least one processor;
and a memory communicatively coupled to the at least one processor;
wherein the memory stores instructions executable by the at least one processor to cause the computing device to perform the search term generation method of any of claims 1 to 6.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210799138.7A CN115129922A (en) 2022-07-08 2022-07-08 Search term generation method, model training method, medium, device and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210799138.7A CN115129922A (en) 2022-07-08 2022-07-08 Search term generation method, model training method, medium, device and equipment

Publications (1)

Publication Number Publication Date
CN115129922A true CN115129922A (en) 2022-09-30

Family

ID=83381959

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210799138.7A Pending CN115129922A (en) 2022-07-08 2022-07-08 Search term generation method, model training method, medium, device and equipment

Country Status (1)

Country Link
CN (1) CN115129922A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115587261A (en) * 2022-12-09 2023-01-10 思创数码科技股份有限公司 Government affair resource catalog recommendation method and system

Similar Documents

Publication Publication Date Title
US9201959B2 (en) Determining importance of scenes based upon closed captioning data
CN108319723B (en) Picture sharing method and device, terminal and storage medium
US9626159B2 (en) Automatic generation of task scripts from web browsing interaction history
US8321414B2 (en) Hybrid audio-visual categorization system and method
WO2017096877A1 (en) Recommendation method and device
US8239412B2 (en) Recommending a media item by using audio content from a seed media item
CN102262647B (en) Signal conditioning package, information processing method and program
CN109275047B (en) Video information processing method and device, electronic equipment and storage medium
CN110704674A (en) Video playing integrity prediction method and device
CN113239173B (en) Question-answer data processing method and device, storage medium and electronic equipment
CN112819099B (en) Training method, data processing method, device, medium and equipment for network model
CN109857901B (en) Information display method and device, and method and device for information search
CN111754278A (en) Article recommendation method and device, computer storage medium and electronic equipment
CN111723235B (en) Music content identification method, device and equipment
CN115129922A (en) Search term generation method, model training method, medium, device and equipment
CN111680218B (en) User interest identification method and device, electronic equipment and storage medium
CN110569447B (en) Network resource recommendation method and device and storage medium
CN115618024A (en) Multimedia recommendation method and device and electronic equipment
CN114580790A (en) Life cycle stage prediction and model training method, device, medium and equipment
CN115080856A (en) Recommendation method and device and training method and device of recommendation model
KR102081553B1 (en) Big Data-Based Monitoring System of Promotional Content for Cultural Media
CN112989102A (en) Audio playing control method and device, storage medium and terminal equipment
CN113672758B (en) Song list generation method, device, medium and computing equipment
CN110598040B (en) Album recall method, device, equipment and storage medium
CN114676341B (en) Determination method, medium, device and computing equipment of recommended object

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination