CN108763332A - Method and device for generating search prompt words - Google Patents

Method and device for generating search prompt words

Info

Publication number
CN108763332A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201810442164.8A
Other languages
Chinese (zh)
Inventor
刘维伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN201810442164.8A
Publication of CN108763332A

Abstract

An embodiment of the present invention provides a method and device for generating search prompt words. The method includes: obtaining a user's search keyword; generating a candidate prompt word set for the search keyword according to the search keyword; generating data features for each candidate prompt word using the search keyword and each candidate prompt word in the set; inputting the data features into a pre-trained score prediction model to obtain a score for each candidate prompt word; and determining, according to the scores, the target prompt words for the search keyword from the candidate prompt word set. Because the score of each candidate prompt word is produced by the score prediction model, no weights need to be set manually. This avoids the inaccurate candidate scores that result from unreasonable weight settings and lets the scores be computed objectively from users' historical behavior, so that the prompt words better match the user's intent, the user can pick the desired prompt word from those shown, and information search becomes more efficient.

Description

Method and device for generating search prompt words
Technical field
The present invention relates to the technical field of data processing, and in particular to a method and device for generating search prompt words.
Background
With the rapid development of the Internet, networks have become an essential part of people's daily life, study, and work. Information spreads over networks quickly and in large volume, so it is crucial for users to retrieve useful information quickly from this mass of information. User input prompting, also called search prompting, offers complete prompt words based on the partial query the user has typed into the search box, and is one way to improve retrieval efficiency.
At present, search prompt words are mainly generated from different sources such as the pinyin prefix of a prompt word, its abbreviated-pinyin prefix, and the prefix of the prompt word itself. Specifically, the system first computes the degree of match between the search keyword and the prompt word, counts the prompt word's search volume, determines whether the prompt word is an on-site album, and measures the prompt word's click-through rate and novelty (whether it appeared recently or has existed for a long time). It then assigns empirically chosen weights to these components and sums them to obtain a score for each prompt word, from which the prompt words are generated.
However, this approach requires weights to be assigned manually to the match between the search keyword and the prompt word, the prompt word's search volume, whether the prompt word is an on-site album, the prompt word's click-through rate, and its novelty. This often leads to prompt words with large search volumes receiving excessive weight, to popular prompt words being down-weighted for anti-cheating purposes, or to two prompt words that cannot be ranked because the empirical formula gives them the same match score. As a result, the prompt words generated from the ranking do not match the user's search keyword or intent, the user cannot find a suitable choice among them, and the prompt words lose their purpose.
Summary of the invention
In view of the above problems, embodiments of the present invention are proposed to provide a method for generating search prompt words and a device for generating search prompt words that overcome the above problems or at least partially solve them.
To solve the above problems, an embodiment of the present invention discloses a method for generating search prompt words, including:
Obtaining the search keyword of a user;
Generating a candidate prompt word set for the search keyword according to the search keyword;
Generating data features for each candidate prompt word using the search keyword and each candidate prompt word in the candidate prompt word set;
Inputting the data features of each candidate prompt word into a pre-trained score prediction model to obtain a score for each candidate prompt word;
Determining, according to the score of each candidate prompt word, the target prompt words for the search keyword from the candidate prompt word set.
Optionally, the step of generating the candidate prompt word set for the search keyword according to the search keyword includes:
Looking up, according to the search keyword, multiple matching candidate prompt words in a preset candidate prompt word lexicon;
Generating the candidate prompt word set from the multiple candidate prompt words.
Optionally, the step of generating data features for each candidate prompt word using the search keyword and each candidate prompt word in the candidate prompt word set includes:
Generating a search keyword pinyin string and a candidate prompt word pinyin string from the search keyword and each candidate prompt word, respectively;
Generating the pinyin string similarity features of each candidate prompt word using the search keyword pinyin string and the candidate prompt word pinyin string;
Generating the Chinese character string similarity features of each candidate prompt word using the Chinese character strings of the search keyword and the candidate prompt word;
Obtaining the historical behavior operations performed on each candidate prompt word within a preset time period;
Generating the popularity features of each candidate prompt word from the historical behavior operations performed on it within the preset time period;
Generating the album feature of each candidate prompt word according to whether the candidate prompt word is an on-site album.
Optionally, the step of inputting the data features of each candidate prompt word into the pre-trained score prediction model to obtain the score of each candidate prompt word includes:
Inputting the pinyin string similarity features, the Chinese character string similarity features, the popularity features, and the album feature of each candidate prompt word into the pre-trained score prediction model to obtain the score of each candidate prompt word.
Optionally, the score prediction model is trained in the following manner:
Obtaining training samples, each of which includes one training search keyword, multiple training candidate prompt words corresponding to the training search keyword, and the click count of each training candidate prompt word;
Determining the label of each candidate prompt word according to its click count;
Generating the training data features of each training candidate prompt word using the training search keyword and the multiple training candidate prompt words corresponding to it;
Training the score prediction model based on a preset algorithm using the data features and the label of each training candidate prompt word.
Optionally, the score prediction model is a LambdaMart model, the data features include the pinyin string similarity features, Chinese character string similarity features, popularity features, and album feature of each candidate prompt word, and the step of training the score prediction model based on a preset algorithm using the data features and the label of each training candidate prompt word includes:
Training the LambdaMart model based on the GBDT algorithm using the pinyin string similarity features, the Chinese character string similarity features, the popularity features, and the album feature of each training candidate prompt word, together with the label of each training candidate prompt word.
To solve the above problems, an embodiment of the present invention also discloses a device for generating search prompt words, including:
A search keyword acquisition module, configured to obtain the search keyword of a user;
A candidate prompt word set generation module, configured to generate a candidate prompt word set for the search keyword according to the search keyword;
A data feature generation module, configured to generate data features for each candidate prompt word using the search keyword and each candidate prompt word in the candidate prompt word set;
A score extraction module, configured to input the data features of each candidate prompt word into a pre-trained score prediction model to obtain the score of each candidate prompt word;
A target prompt word determination module, configured to determine, according to the score of each candidate prompt word, the target prompt words for the search keyword from the candidate prompt words.
Optionally, the candidate prompt word set generation module includes:
A candidate prompt word lookup submodule, configured to look up, according to the search keyword, multiple matching candidate prompt words in a preset candidate prompt word lexicon;
A candidate prompt word set generation submodule, configured to generate the candidate prompt word set from the multiple candidate prompt words.
Optionally, the data feature generation module includes:
A pinyin string generation submodule, configured to generate a search keyword pinyin string and a candidate prompt word pinyin string from the search keyword and each candidate prompt word, respectively;
A pinyin string similarity feature generation submodule, configured to generate the pinyin string similarity features of each candidate prompt word using the search keyword pinyin string and the candidate prompt word pinyin string;
A Chinese character string similarity feature generation submodule, configured to generate the Chinese character string similarity features of each candidate prompt word using the Chinese character strings of the search keyword and the candidate prompt word;
A behavior operation acquisition submodule, configured to obtain the behavior operations performed on each candidate prompt word within a preset time period;
A popularity feature generation submodule, configured to generate the popularity features of each candidate prompt word from the behavior operations performed on it within the preset time period;
An album feature generation submodule, configured to generate the album feature of each candidate prompt word according to whether the candidate prompt word is an on-site album.
Optionally, the score extraction module includes:
A feature input submodule, configured to input the pinyin string similarity features, the Chinese character string similarity features, the popularity features, and the album feature of each candidate prompt word into the pre-trained score prediction model to obtain the score of each candidate prompt word.
Optionally, the device further includes a model training module, and the model training module includes:
A training sample acquisition submodule, configured to obtain training samples, each of which includes one training search keyword, multiple training candidate prompt words corresponding to the training search keyword, and the click count of each training candidate prompt word;
A label determination submodule, configured to determine the label of each candidate prompt word according to its click count;
A training data feature generation submodule, configured to generate the training data features of each training candidate prompt word using the training search keyword and the multiple training candidate prompt words corresponding to it;
A training submodule, configured to train the score prediction model based on a preset algorithm using the data features and the label of each training candidate prompt word.
Optionally, the score prediction model is a LambdaMart model, the data features include the pinyin string similarity features, Chinese character string similarity features, popularity features, and album feature of each candidate prompt word, and the training submodule includes:
A training unit, configured to train the LambdaMart model based on the GBDT algorithm using the pinyin string similarity features, the Chinese character string similarity features, the popularity features, and the album feature of each training candidate prompt word, together with the label of each training candidate prompt word.
The embodiments of the present invention have the following advantages:
In the embodiment of the present invention, after the user's search keyword is obtained, a candidate prompt word set for the search keyword is generated according to the search keyword; data features for each candidate prompt word are generated using the search keyword and each candidate prompt word in the set; the data features of each candidate prompt word are then input into the pre-trained score prediction model to obtain the score of each candidate prompt word; and the target prompt words for the search keyword are determined from the candidate prompt words according to those scores. Because the score of each candidate prompt word is produced by the score prediction model after its data features are extracted, no weights need to be set manually. This avoids the unreasonable weights, and the resulting inaccurate candidate scores, that come from setting weights by hand or by rule. The scores can be computed objectively from users' historical behavior, so that the prompt words better match the user's intent, the user can pick the desired prompt word from those shown, and information search becomes more efficient.
Description of the drawings
Fig. 1 is a flow chart of the steps of Embodiment 1 of a method for generating search prompt words according to the present invention;
Fig. 2 is a schematic diagram of an example of search prompt words according to the present invention;
Fig. 3 is a schematic diagram of another example of search prompt words according to the present invention;
Fig. 4 is a flow chart of the steps of Embodiment 2 of a method for generating search prompt words according to the present invention;
Fig. 5 is an example diagram of part of a decision tree of the score prediction model of the present invention;
Fig. 6 is a structural block diagram of an embodiment of a device for generating search prompt words according to the present invention.
Detailed description
To make the above objects, features, and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Referring to Fig. 1, a flow chart of the steps of Embodiment 1 of a method for generating search prompt words according to the present invention is shown, which may specifically include the following steps:
Step 101, obtain the search keyword of the user.
In the embodiment of the present invention, the search keyword may be the characters that a user types into an input box of an application. For example, when a user searches for media such as films or music on a video web page, the user may type one or more characters of the media item to be searched into the input box, and prompt words are displayed below the input box.
Fig. 2 shows an example of prompt words. In this example, "Sun Wu" (the first two characters of "Sun Wukong") is typed into the input box, so the characters "Sun Wu" are the search keyword, and prompt words including "Sun Wukong" and "Sun Wukong Big Film" are displayed below the input box.
Step 102, generate the candidate prompt word set for the search keyword according to the search keyword.
In practice, a single search keyword can have multiple candidate prompt words. For example, a candidate prompt word may be formed by concatenating the search keyword as a prefix, suffix, or middle part; it may be a word containing the search keyword found in a preset lexicon; or it may be a word that is pronounced the same as the search keyword or looks similar to it.
Fig. 3 shows an example of a prompt word set. In Fig. 3, the user meant to type the search keyword "Sai Er" written with the character 赛 ("match") but instead typed "Sai Er" written with the similar-looking character 塞 ("plug"). The candidate prompt words may then include words that share the typed characters, such as "The Legend of Zelda: Breath of the Wild", "The Legend of Zelda", and "Sai Erda", as well as the candidate prompt word "No. Sai Er", whose characters look similar to the typed "Sai Er". This provides an automatic error-correcting prompt function.
Step 103, generate data features for each candidate prompt word using the search keyword and each candidate prompt word in the candidate prompt word set.
In the embodiment of the present invention, one search keyword corresponds to multiple candidate prompt words. The search keyword and one candidate prompt word can be treated as a pair, and corresponding data features can be generated for each pair. The data features may include the similarity between the search keyword and the candidate prompt word in pronunciation, their similarity in written form, popularity features of the candidate prompt word formed from the historical behavior of a large number of users, and an album feature indicating whether the candidate prompt word is an on-site album.
Step 104, input the data features of each candidate prompt word into the pre-trained score prediction model to obtain the score of each candidate prompt word.
After the data features of the search keyword and each candidate prompt word are obtained, they are input into the score prediction model to obtain the score of each candidate prompt word. Specifically, the score prediction model is trained on the historical data of a large number of users. For example, the historical search keywords of a large number of users and the click data of the candidate prompt words corresponding to those keywords are collected as training samples, the training data features of the samples are extracted, and the score prediction model is then trained with a preset algorithm, for example a LambdaMart model trained based on the GBDT algorithm. The LambdaMart model predicts the score of each candidate prompt word, and all candidate prompt words can be sorted by score so that their order better matches the user's intent.
Step 105, determine, according to the score of each candidate prompt word, the target prompt words for the search keyword from the candidate prompt word set.
After the scores of the candidate prompt words corresponding to the search keyword are obtained, the candidate prompt words are sorted by score and the target prompt words are determined from them. Specifically, the candidate prompt words ranked within a certain range can be determined as target prompt words and displayed, so that the user can select one to fill into the input box.
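The sorting and selection in step 105 can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `select_target_prompts`, the `top_k` cutoff, and the toy scoring function (a plain sum standing in for the trained score prediction model) are all assumptions.

```python
from typing import Callable, Dict, List, Sequence


def select_target_prompts(
    features_by_candidate: Dict[str, Sequence[float]],
    score_fn: Callable[[Sequence[float]], float],
    top_k: int = 3,
) -> List[str]:
    """Score every candidate, sort by descending score, and keep the top_k."""
    scored = [(score_fn(feats), cand) for cand, feats in features_by_candidate.items()]
    # Sort by score descending; ties fall back to the candidate text for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [cand for _, cand in scored[:top_k]]


# Hypothetical two-dimensional features; the real model takes 56 dimensions.
candidates = {
    "Sun Wukong": [0.9, 0.8],
    "Sun Wukong Big Film": [0.6, 0.5],
    "Sun Wukong Cartoon": [0.7, 0.3],
}
# `sum` is only a stand-in for the trained score prediction model (assumption).
targets = select_target_prompts(candidates, sum, top_k=2)
```

Passing the model as a callable keeps the ranking logic independent of how the scores are produced.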
In the embodiment of the present invention, the score of each candidate prompt word is produced by the score prediction model after its data features are extracted, so no weights need to be set manually. This avoids the unreasonable weights, and the resulting inaccurate candidate scores, that come from setting weights by hand or by rule. The scores can be computed objectively from users' historical behavior, so that the prompt words better match the user's intent, the user can pick the desired prompt word from those shown, and information search becomes more efficient.
Referring to Fig. 4, a flow chart of the steps of Embodiment 2 of a method for generating search prompt words according to the present invention is shown, which may specifically include the following steps:
Step 201, obtain the search keyword of the user.
In the embodiment of the present invention, the search keyword may be the characters that a user types into an input box of an application. For example, when a user searches for media such as films or music on a video web page, the user may type one or more characters of the media item to be searched into the input box, and prompt words are displayed below the input box.
Step 202, generate the candidate prompt word set for the search keyword according to the search keyword.
In the embodiment of the present invention, step 202 may include the following sub-steps:
Sub-step S11, look up, according to the search keyword, multiple matching candidate prompt words in a preset candidate prompt word lexicon.
Sub-step S12, generate the candidate prompt word set from the multiple candidate prompt words.
In practice, the candidate prompt word lexicon can be built from users' input history, real-world entities (such as video titles and other names), and manually constructed candidate words. For example, when users type search information into the input box of a video web page, the server can collect that input as candidate prompt words and store it in the candidate prompt word lexicon. Currently popular words can also be obtained and stored in the lexicon as candidate prompt words, and candidate prompt words can be generated according to word-combination rules and stored in the lexicon as well.
After the user's search keyword is obtained, the candidate prompt word lexicon can be searched for candidate prompt words that contain the search keyword, and the candidate prompt words containing the search keyword are used as the candidate prompt word set of the search keyword.
For example, if the user enters the search keyword "Sun Wu" (the first two characters of "Sun Wukong"), the candidate prompt word set may be {Sun Wukong, Sun Wukong Big Film, Sun Wukong Loves You for Ten Thousand Years, Sun Wukong Creates a Tremendous Uproar, Sun Wukong Cartoon, Sun Wukong Thrice Beats the White Bone Demon}.
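The lexicon lookup in sub-steps S11 and S12 can be sketched as a simple containment filter. This is an illustrative sketch only: the function name `find_candidates` and the small romanized lexicon are assumptions, and a production system would more likely use an index such as a trie rather than a linear scan.

```python
from typing import Iterable, List


def find_candidates(keyword: str, lexicon: Iterable[str]) -> List[str]:
    """Return the lexicon entries that contain the keyword, in lexicon order."""
    if not keyword:
        return []  # an empty keyword would otherwise match every entry
    return [entry for entry in lexicon if keyword in entry]


# Hypothetical lexicon, romanized for readability.
lexicon = [
    "Sun Wukong",
    "Sun Wukong Big Film",
    "The Legend of Zelda",
    "Sun Wukong Cartoon",
]
candidate_set = find_candidates("Sun Wu", lexicon)
```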
Of course, the above uses Chinese character input only as an example; in practice, the search keyword may also be entered in pinyin or in another language.
Step 203, generate a search keyword pinyin string and a candidate prompt word pinyin string from the search keyword and each candidate prompt word, respectively.
In practice, a search keyword and a candidate prompt word can form a pair, and pinyin strings can be generated for the pair. For example, with the search keyword "Sun Wu" (the first two characters of "Sun Wukong") and the candidate prompt words "Sun Wukong" and "Sun Wukong Big Film", the search keyword pinyin string is "sun wu" and the candidate prompt word pinyin strings are "sun wu kong" and "sun wu kong da dian ying".
Of course, those skilled in the art can also generate the pinyin strings according to other pinyin conventions, such as abbreviated pinyin or full pinyin; the embodiment of the present invention places no limit on this.
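The pinyin conversion in step 203 can be sketched as follows. The tiny `PINYIN_TABLE` is a hypothetical stand-in for a real pinyin dictionary or library, the Chinese characters are reconstructed from the pinyin strings given in the text, and the function names are assumptions made for illustration.

```python
# Hypothetical lookup table covering only the characters in the running example.
PINYIN_TABLE = {
    "孙": "sun", "悟": "wu", "空": "kong",
    "大": "da", "电": "dian", "影": "ying",
}


def to_full_pinyin(text: str) -> str:
    """Full pinyin: one syllable per character, separated by spaces."""
    # Characters missing from the table are passed through unchanged.
    return " ".join(PINYIN_TABLE.get(ch, ch) for ch in text)


def to_abbreviated_pinyin(text: str) -> str:
    """Abbreviated pinyin: the first letter of each syllable."""
    return "".join(PINYIN_TABLE.get(ch, ch)[0] for ch in text)


keyword_pinyin = to_full_pinyin("孙悟")
candidate_pinyin = to_full_pinyin("孙悟空大电影")
candidate_abbreviation = to_abbreviated_pinyin("孙悟空")
```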
Step 204, generate the pinyin string similarity features of each candidate prompt word using the search keyword pinyin string and the candidate prompt word pinyin string.
The pinyin string similarity features may include multiple features, as shown in Table 1 (where token is the search keyword and query is the candidate prompt word):
Table 1:
By applying the preset calculation method for each feature number in Table 1, multiple pinyin string similarity features of each candidate prompt word can be obtained.
Step 205, generate the Chinese character string similarity features of each candidate prompt word using the Chinese character strings of the search keyword and the candidate prompt word.
In practice, a search keyword and a candidate prompt word can form a pair, and Chinese character strings can be generated for the pair. For example, with the search keyword "Sun Wu" (the first two characters of "Sun Wukong") and the candidate prompt words "Sun Wukong" and "Sun Wukong Big Film", the search keyword Chinese character string is "Sun Wu" and the candidate prompt word Chinese character strings are "Sun Wukong", "Sun Wukong Big Film", and so on. As shown in Fig. 3, when the search keyword Chinese character string is "Sai Er" (typed with the character 塞, "plug"), the candidate prompt word Chinese character strings can be "The Legend of Zelda: Breath of the Wild", "No. Sai Er", "The Legend of Zelda", "Sai Erda", and so on, where the character 赛 ("match") in "No. Sai Er" looks similar to the character 塞 ("plug") in the typed search keyword "Sai Er".
The Chinese character string similarity features may include multiple features, as shown in Table 2 (where token is the search keyword and query is the candidate prompt word):
By applying the preset calculation methods for feature numbers 10-19 in Table 2, multiple Chinese character string similarity features of each candidate prompt word can be obtained.
Of course, the above uses Chinese only as an example; those skilled in the art can also apply the approach to other languages, and the embodiment of the present invention places no limit on this.
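Since the contents of Tables 1 and 2 are not reproduced here, the following sketch only illustrates the kind of string similarity features that could be computed in steps 204 and 205. The three features chosen (prefix match, length ratio, and a `difflib` matching ratio) are assumptions and are not the patent's feature definitions; `token` and `query` follow the naming used in the text.

```python
from difflib import SequenceMatcher
from typing import Dict, Sequence


def similarity_features(token: Sequence[str], query: Sequence[str]) -> Dict[str, float]:
    """Illustrative similarity features between a keyword (token) and a candidate (query).

    Both arguments are sequences of units: pinyin syllables for step 204,
    or individual Chinese characters for step 205.
    """
    token, query = list(token), list(query)
    is_prefix = float(len(token) > 0 and query[: len(token)] == token)
    length_ratio = len(token) / len(query) if query else 0.0
    match_ratio = SequenceMatcher(None, token, query).ratio()
    return {"is_prefix": is_prefix, "length_ratio": length_ratio, "match_ratio": match_ratio}


# Pinyin strings are split into syllables; a Chinese string is already a sequence of characters.
pinyin_features = similarity_features("sun wu".split(), "sun wu kong".split())
chinese_features = similarity_features("孙悟", "孙悟空")
```

Working on units rather than raw characters means one function serves both the pinyin and the Chinese character string features.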
Step 206, obtain the behavior operations performed on each candidate prompt word within a preset time period.
In the embodiment of the present invention, the behavior operations of a large number of users on each candidate prompt word include the number of searches, number of IPs, number of clicks, and click-through rate for the candidate prompt word. User behavior over a past period, such as the past month, can be analyzed by counting the number of searches, number of IPs, number of clicks, and click-through rate 1 day ago, 2 days ago, 4 days ago, 7 days ago, and 30 days ago.
Step 207, generate the popularity features of each candidate prompt word from the behavior operations performed on it within the preset time period.
After the number of searches, number of IPs, number of clicks, and click-through rate for 1 day ago, 2 days ago, 4 days ago, 7 days ago, and 30 days ago are obtained, the popularity features of the candidate prompt word can be calculated, as shown in Table 3:
Table 3:
The above only computes the popularity features of a candidate prompt word from the data of 1 day ago; for the popularity features of 2, 4, 7, and 30 days ago, refer to Table 3, which is not repeated here.
The popularity features of a candidate prompt word for 2 days ago may include the same-day popularity features of the day 2 days ago and the combined popularity features of the preceding 2 days. For example, if the current date is July 30, the popularity features for 2 days ago may include the popularity features computed from the data of July 28 and the popularity features computed from the combined data of July 28 and July 29. The popularity features for 4 days ago may include the popularity features computed from the data of July 26 and those computed from the combined data of July 26, 27, 28, and 29. The popularity features for 7 days ago and 30 days ago can be computed in the same way. Using the preset algorithm in Table 3, a total of 36 popularity feature dimensions (feature numbers 20-55) are obtained for 1, 2, 4, 7, and 30 days ago.
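One layout that reproduces the stated total of 36 popularity dimensions is sketched below: four values for 1 day ago, plus same-day and combined values for each of the 2, 4, 7, and 30 day windows. The exact formulas of Table 3 are not reproduced here, so the raw counts, the click-through rate defined as clicks divided by searches, and the summing of daily IP counts are all assumptions.

```python
from typing import Dict, List, Tuple

DayStats = Tuple[int, int, int]  # (searches, IPs, clicks) for one day
WINDOWS = (2, 4, 7, 30)


def _day_features(searches: int, ips: int, clicks: int) -> List[float]:
    """Four values per slice: searches, IPs, clicks, and click-through rate."""
    ctr = clicks / searches if searches else 0.0  # assumed CTR definition
    return [float(searches), float(ips), float(clicks), ctr]


def popularity_features(daily: Dict[int, DayStats]) -> List[float]:
    """`daily` maps days_ago (1 = yesterday) to that day's stats; missing days count as zeros."""

    def stats(days_ago: int) -> DayStats:
        return daily.get(days_ago, (0, 0, 0))

    features = _day_features(*stats(1))  # 1 day ago: same-day values only
    for window in WINDOWS:
        features += _day_features(*stats(window))  # the single day `window` days ago
        # Combined values over the most recent `window` days (days_ago 1..window).
        totals = [sum(col) for col in zip(*(stats(k) for k in range(1, window + 1)))]
        features += _day_features(*totals)
    return features


# Hypothetical counts for the last two days.
example = popularity_features({1: (100, 80, 20), 2: (50, 40, 5)})
```

With the current date at July 30, `days_ago` 2 is July 28 and the combined 2-day window covers July 28 and July 29, matching the example in the text.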
Step 208, generate the album feature of each candidate prompt word according to whether the candidate prompt word is an on-site album.
In the embodiment of the present invention, a list of on-site albums can be maintained, recording the candidate prompt words that are on-site albums. Depending on whether a candidate prompt word is in the on-site album list, its album feature can be generated according to Table 4:
Table 4:
The album feature of a candidate prompt word, i.e., 0 or 1, can be obtained from Table 4.
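The album feature in step 208 amounts to a membership test. In this sketch the mapping of 1 to "in the on-site album list" and 0 to "not in the list" is an assumption, since Table 4 is not reproduced here; the function name and the sample list are also hypothetical.

```python
from typing import AbstractSet


def album_feature(candidate: str, on_site_albums: AbstractSet[str]) -> int:
    """Return 1 if the candidate is a known on-site album, else 0 (assumed encoding)."""
    return 1 if candidate in on_site_albums else 0


# Hypothetical on-site album list; a set gives constant-time lookups.
on_site_albums = {"Sun Wukong Big Film", "Sun Wukong Cartoon"}
in_list = album_feature("Sun Wukong Big Film", on_site_albums)
not_in_list = album_feature("Sun Wukong", on_site_albums)
```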
Step 209, input the pinyin string similarity features, the Chinese character string similarity features, the popularity features, and the album feature of each candidate prompt word into the pre-trained score prediction model to obtain the score of each candidate prompt word.
After the pinyin string similarity features (feature numbers 1-9), the Chinese character string similarity features (feature numbers 10-19), the popularity features (feature numbers 20-55), and the album feature (feature number 56) of each candidate prompt word are generated, the 56 feature dimensions in total are input into the pre-trained score prediction model to obtain the score of each candidate prompt word.
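Assembling the 56-dimensional input in feature-number order can be sketched as follows. The group sizes follow from the feature numbers in the text (1-9, 10-19, 20-55, 56); the function name, the dictionary keys, and the length check are assumptions added for illustration.

```python
from typing import Dict, List, Sequence

# Group sizes implied by the feature numbers in the text, in feature-number order.
GROUP_SIZES: Dict[str, int] = {"pinyin": 9, "chinese": 10, "popularity": 36, "album": 1}


def build_feature_vector(groups: Dict[str, Sequence[float]]) -> List[float]:
    """Concatenate the four feature groups into one 56-dimensional vector."""
    vector: List[float] = []
    for name, size in GROUP_SIZES.items():
        values = groups[name]
        if len(values) != size:
            raise ValueError(f"{name}: expected {size} values, got {len(values)}")
        vector.extend(float(v) for v in values)
    return vector


# Placeholder values only; real features come from steps 204 to 208.
example_vector = build_feature_vector(
    {"pinyin": [0.0] * 9, "chinese": [0.0] * 10, "popularity": [0.0] * 36, "album": [1]}
)
```

Checking the group lengths up front catches a misaligned feature before it silently shifts every later dimension.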
In a preferred embodiment of the present invention, the score value prediction model is trained in the following manner:
Sub-step S21: obtain training samples, where each training sample includes one training search keyword, multiple training candidate prompt words corresponding to the training search keyword, and the click count for each training candidate prompt word.
The training samples are click data from a large number of users on candidate prompt words. For example, different users click different candidate prompt words under the same search keyword, and the click count of each candidate prompt word is counted. Specifically, the click counts of the different candidate prompt words under the same search keyword can be counted per day across users. When the search keyword is "Sun Wu" (孙悟), the data of each candidate prompt word is as follows:
In practical applications, if a candidate prompt word receives no clicks on a given day, its click count is recorded as 0. The click share of each candidate prompt word is also calculated, i.e., the ratio of the click count of the candidate prompt word to the total click count of all candidate prompt words.
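The click share calculation described above can be sketched as follows. The function name is hypothetical, and returning a share of 0 for every candidate when no clicks were recorded at all is an assumption, since the text does not cover that case.

```python
def click_shares(clicks: dict[str, int]) -> dict[str, float]:
    """Return each candidate's click count divided by the total click count."""
    total = sum(clicks.values())
    if total == 0:
        # Assumption: with no clicks at all, every candidate gets share 0.
        return {word: 0.0 for word in clicks}
    return {word: count / total for word, count in clicks.items()}


# Hypothetical daily click counts under one search keyword.
shares = click_shares({"w1": 60, "w2": 30, "w3": 10, "w4": 0})
```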
Sub-step S22: determine the label of each candidate prompt word according to the click count of each candidate prompt word.
The click counts reflect the intent of a large number of users: when a user enters a search keyword, the candidate prompt words are expected to appear among the prompt options in order of click count. A label can therefore be assigned to each candidate prompt word. Specifically, the click share of each candidate prompt word can be multiplied by a coefficient and rounded to an integer, and the result is used as the label of the candidate prompt word. For example, multiplying by a coefficient of 20 and rounding gives the following data:
As shown above, the label of the candidate prompt word "Sun Wukong" is 4, and the labels of the candidate prompt words "Sun Wukong cartoon" and "Sun Wukong big film" are both 3. This establishes an ordinal relationship: "Sun Wukong" should be ranked ahead of "Sun Wukong cartoon" and "Sun Wukong big film", while "Sun Wukong cartoon" and "Sun Wukong big film" should be ranked close to each other. This ordering serves as the standard that the score value prediction model is expected to learn.
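The labeling rule can be sketched as follows. The text does not say whether rounding is down or to the nearest integer, so rounding down is an assumption here; the function name and the click counts are hypothetical. The label is computed from integer click counts to avoid floating-point rounding artifacts.

```python
def click_labels(clicks: dict[str, int], coefficient: int = 20) -> dict[str, int]:
    """Label = floor(click share * coefficient), computed in integer arithmetic.

    Assumption: "rounding" is taken as rounding down; the source does not
    specify the rounding mode.
    """
    total = sum(clicks.values())
    if total == 0:
        return {word: 0 for word in clicks}
    # count * coefficient // total equals floor(count / total * coefficient).
    return {word: count * coefficient // total for word, count in clicks.items()}


# Hypothetical click counts; w3 and w4 receive the same label and are
# therefore treated as roughly equally relevant during training.
labels = click_labels({"w1": 40, "w2": 20, "w3": 16, "w4": 15, "w5": 9})
```

Coarse integer labels of this kind let candidates with similar click shares tie, which matches the ordinal relationship described above.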
Sub-step S23: generate the training data features of each training candidate prompt word using the training search keyword and the multiple training candidate prompt words corresponding to the training search keyword.
Specifically, this may include: generating a training search keyword pinyin string and a training candidate prompt word pinyin string from the training search keyword and each training candidate prompt word, respectively; generating the pinyin string similarity feature of each training candidate prompt word using the training search keyword pinyin string and the training candidate prompt word pinyin string; generating the Chinese character string similarity feature of each training candidate prompt word using the Chinese character strings in the training search keyword and the training candidate prompt word; obtaining the behavior operations performed on each training candidate prompt word within a preset time period; generating the temperature feature of each training candidate prompt word from those behavior operations; and generating the album feature of each training candidate prompt word according to whether the training candidate prompt word is an in-station album.
For the step of generating the training data features of the training candidate prompt words, refer to steps 203 to 208; the details are not repeated here.
The training data features can be as follows:
The data features "1:1 27:0.006 28:0.005" of the candidate prompt word "Sun Wukong big film" above mean that feature No. 1 has the value 1, feature No. 27 has the value 0.006, and feature No. 28 has the value 0.005, where the index before each colon is the feature serial number in Tables 1 to 4.
The above uses only some of the feature values for illustration; in practical applications each sample can have 56 feature values in total.
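The "serial:value" notation above can be expanded into a dense 56-dimensional vector as sketched below. The function name is hypothetical, and treating features that are not listed as 0 is an assumption borrowed from common sparse formats; the text does not state it.

```python
def parse_sparse_features(text: str, dim: int = 56) -> list[float]:
    """Expand "serial:value" pairs into a dense vector of length dim.

    Assumption: feature serial numbers that are not listed have the value 0.
    """
    vector = [0.0] * dim
    for token in text.split():
        index, value = token.split(":")
        serial = int(index)
        if not 1 <= serial <= dim:
            raise ValueError(f"feature serial number out of range: {serial}")
        # Serial number n is stored at list index n - 1.
        vector[serial - 1] = float(value)
    return vector


vec = parse_sparse_features("1:1 27:0.006 28:0.005")
```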
Sub-step S24: train the score value prediction model based on a preset algorithm, using the data features and the label of each training candidate prompt word.
In the embodiment of the present invention, the score value prediction model is a LambdaMart model, and sub-step S24 may include the following sub-step:
Train the LambdaMart model based on the GBDT algorithm, using the pinyin string similarity feature, the Chinese character string similarity feature, the temperature feature and the album feature of each training candidate prompt word.
LambdaMart is based on the LambdaRank algorithm and the MART (Multiple Additive Regression Tree) algorithm, and converts the ranking problem into a regression tree problem. The MART algorithm can be the GBDT (Gradient Boosting Decision Tree) algorithm.
GBDT is a machine learning algorithm for regression, classification and ranking tasks, and belongs to the Boosting family of algorithms. Boosting is a family of algorithms that can promote weak learners into a strong learner, and falls within the scope of ensemble learning. Boosting methods are based on the following idea: for a complex task, a judgment obtained by appropriately combining the judgments of multiple weak learners is better than the judgment of any single one of them. Like other boosting methods, gradient boosting builds the final prediction model by integrating multiple decision trees. In practice, the open-source tool LightGBM can be used to train the LambdaMart model, i.e., the pinyin string similarity features, Chinese character string similarity features, temperature features and album features of the candidate prompt words are used as the data set for LightGBM to train the LambdaMart model.
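One way such training could be set up with LightGBM is sketched below. This is an illustration under stated assumptions, not the configuration used in the embodiment: the helper names, the sample layout and the number of trees are hypothetical, and no hyperparameters are given in the text. The sketch uses the `LGBMRanker` class of LightGBM's scikit-learn interface with the `lambdarank` objective, and imports LightGBM lazily so that the data-preparation helper can be used on its own.

```python
from itertools import groupby


def build_ranking_dataset(samples):
    """Turn (keyword, label, feature_vector) samples into ranking inputs.

    Returns (X, y, group), where group[i] is the number of consecutive rows
    that belong to the i-th search keyword. Ranking objectives need the rows
    of one keyword to be contiguous, so the samples are sorted by keyword.
    """
    ordered = sorted(samples, key=lambda s: s[0])
    X = [features for _, _, features in ordered]
    y = [label for _, label, _ in ordered]
    group = [len(list(rows)) for _, rows in groupby(ordered, key=lambda s: s[0])]
    return X, y, group


def train_lambdamart(samples, n_trees=100):
    """Fit a LambdaMART-style ranker; n_trees is an assumed setting."""
    # Imported here so build_ranking_dataset works without these packages.
    import numpy as np
    import lightgbm as lgb

    X, y, group = build_ranking_dataset(samples)
    ranker = lgb.LGBMRanker(objective="lambdarank", n_estimators=n_trees)
    ranker.fit(np.asarray(X), np.asarray(y), group=group)
    return ranker
```

Grouping by search keyword matters because the ranking loss only compares candidate prompt words that belong to the same keyword.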
Fig. 5 shows an example of the LambdaMart model. The values of the 56-dimensional features generated in steps 203 to 208 are input into the model and compared with the threshold of each node to select the next branch. Each branch ends at a leaf node, and each leaf node corresponds to a node value. If there are 100 trees, the values of the leaf nodes reached in the 100 trees are summed to obtain the score value of each candidate prompt word.
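The traversal and summation just described can be sketched with a small tree structure. The `Node` class, the function names and the convention that the left branch is taken when the feature value is less than or equal to the threshold are assumptions for illustration; Fig. 5 is not reproduced here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    value: float = 0.0               # leaf value, used only at leaf nodes
    feature: Optional[int] = None    # 1-based feature serial number; None marks a leaf
    threshold: float = 0.0
    left: Optional["Node"] = None    # taken when feature value <= threshold (assumed)
    right: Optional["Node"] = None   # taken otherwise


def tree_value(node: Node, features: list[float]) -> float:
    """Walk one tree from the root to a leaf and return the leaf value."""
    while node.feature is not None:
        go_left = features[node.feature - 1] <= node.threshold
        node = node.left if go_left else node.right
    return node.value


def score(trees: list[Node], features: list[float]) -> float:
    """Sum the leaf values reached in every tree of the ensemble."""
    return sum(tree_value(tree, features) for tree in trees)
```

With 100 trees, `score` would add 100 leaf values, one per tree, as in the description above.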
Step 210: determine the target prompt word of the search keyword from the candidate prompt word set according to the score value of each candidate prompt word.
Specifically, all candidate prompt words can first be sorted by score value: after the score value of each candidate prompt word is obtained, the candidate prompt words corresponding to the search keyword are sorted in descending order of score value. The candidate prompt words whose rank falls within a preset range are then determined to be target prompt words. For example, the top 10 candidate prompt words can be determined to be the target prompt words of the search keyword and displayed in ranked order for the user to choose from.
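Sorting and truncation can be sketched as follows. The function name is hypothetical, and `top_n` stands for the preset range, with 10 taken from the example in the text.

```python
def select_target_prompts(scores: dict[str, float], top_n: int = 10) -> list[str]:
    """Return the top_n candidate prompt words in descending order of score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return ranked[:top_n]


# Hypothetical score values for three candidate prompt words.
targets = select_target_prompts({"a": 0.2, "b": 0.9, "c": 0.5}, top_n=2)
```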
In the embodiment of the present invention, after the data features of the candidate prompt words are generated, the score value of each candidate prompt word is obtained through the score value prediction model, so weights need not be set manually. This avoids the problem that weights set manually or by rule are unreasonable and lead to inaccurate score values. The score value of a candidate prompt word can be calculated objectively from users' historical behavior, so the prompt words better match the user's intent, the user can select the required prompt word among them, and the efficiency of information search is improved.
In the embodiment of the present invention, the score value prediction model is trained on the pinyin string similarity features, the Chinese character string similarity features, and the temperature features and album features generated from users' historical behavior data. It fully combines the temperature information of the prompt words with the similarity between the search keyword and the candidate prompt word at the pinyin and character-string levels to produce a comprehensive score, so the score value of each candidate prompt word is more objective, and the ranking of the candidate prompt words matches users' click behavior and better meets users' needs.
In the embodiment of the present invention, the scoring strategy for candidate prompt words is simple: it is only necessary to generate the multi-dimensional data features according to the prediction algorithm and input them into the score value prediction model to obtain the score value, without manually specifying weights.
Referring to Fig. 6, a structural block diagram of an embodiment of a device for generating search prompt words according to the present invention is shown. The device includes:
Search keyword acquisition module 301, configured to obtain the search keyword of a user;
Candidate prompt word set generation module 302, configured to generate the candidate prompt word set of the search keyword according to the search keyword;
Data feature generation module 303, configured to generate the data features of each candidate prompt word using the search keyword and each candidate prompt word in the candidate prompt word set;
Score value extraction module 304, configured to input the data features of each candidate prompt word into the pre-trained score value prediction model to obtain the score value of each candidate prompt word;
Target prompt word determination module 305, configured to determine the target prompt word of the search keyword from the candidate prompt word set according to the score value of each candidate prompt word.
Optionally, the candidate prompt word set generation module 302 includes:
Candidate prompt word lookup submodule, configured to search a preset candidate prompt word lexicon for multiple matching candidate prompt words according to the search keyword;
Candidate prompt word set generation submodule, configured to generate the candidate prompt word set using the multiple candidate prompt words.
Optionally, the data feature generation module 303 includes:
Pinyin string generation submodule, configured to generate a search keyword pinyin string and a candidate prompt word pinyin string from the search keyword and each candidate prompt word, respectively;
Pinyin string similarity feature generation submodule, configured to generate the pinyin string similarity feature of each candidate prompt word using the search keyword pinyin string and the candidate prompt word pinyin string;
Chinese character string similarity feature generation submodule, configured to generate the Chinese character string similarity feature of each candidate prompt word using the Chinese character strings in the search keyword and the candidate prompt word;
Behavior operation acquisition submodule, configured to obtain the behavior operations performed on each candidate prompt word within a preset time period;
Temperature feature generation submodule, configured to generate the temperature feature of each candidate prompt word using the behavior operations performed on each candidate prompt word within the preset time period;
Album feature generation submodule, configured to generate the album feature of each candidate prompt word according to whether the candidate prompt word is an in-station album.
Optionally, the score value extraction module 304 includes:
Feature input submodule, configured to input the pinyin string similarity feature, the Chinese character string similarity feature, the temperature feature and the album feature of each candidate prompt word into the pre-trained score value prediction model to obtain the score value of each candidate prompt word.
Optionally, the device further includes a model training module, and the model training module includes:
Training sample acquisition submodule, configured to obtain training samples, where each training sample includes one training search keyword, multiple training candidate prompt words corresponding to the training search keyword, and the click count for each training candidate prompt word;
Label determination submodule, configured to determine the label of each candidate prompt word according to the click count of each candidate prompt word;
Training data feature generation submodule, configured to generate the training data features of each training candidate prompt word using the training search keyword and the multiple training candidate prompt words corresponding to the training search keyword;
Training submodule, configured to train the score value prediction model based on a preset algorithm, using the data features and the label of each training candidate prompt word.
Optionally, the training data feature generation submodule includes:
Pinyin string generation unit, configured to generate a training search keyword pinyin string and a training candidate prompt word pinyin string from the training search keyword and each training candidate prompt word, respectively;
Pinyin string feature generation unit, configured to generate the pinyin string similarity feature of each training candidate prompt word using the training search keyword pinyin string and the training candidate prompt word pinyin string;
Chinese character string feature generation unit, configured to generate the Chinese character string similarity feature of each training candidate prompt word using the Chinese character strings in the training search keyword and the training candidate prompt word;
Behavior operation acquisition unit, configured to obtain the behavior operations performed on each training candidate prompt word within a preset time period;
Temperature feature generation unit, configured to generate the temperature feature of each training candidate prompt word using the behavior operations performed on each training candidate prompt word within the preset time period;
Album feature generation unit, configured to generate the album feature of each training candidate prompt word according to whether the training candidate prompt word is an in-station album.
Optionally, the score value prediction model is a LambdaMart model, and the training submodule includes:
Training unit, configured to train the LambdaMart model based on the GBDT algorithm, using the pinyin string similarity feature, the Chinese character string similarity feature, the temperature feature, the album feature and the label of each training candidate prompt word.
Optionally, the target prompt word determination module 305 includes:
Sorting submodule, configured to sort all candidate prompt words by score value;
Target prompt word determination submodule, configured to determine the candidate prompt words whose rank falls within a preset range to be target prompt words.
As the device embodiment is substantially similar to the method embodiment, it is described relatively briefly; for relevant details, refer to the description of the method embodiment.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and the same or similar parts of the embodiments can be referred to each other.
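How modules 301 to 305 could fit together in software is sketched below. This is one possible arrangement, not the structure of Fig. 6: the class name, the field names and the choice of passing each module as a callable are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class PromptGenerator:
    lookup_candidates: Callable[[str], list[str]]     # role of module 302
    make_features: Callable[[str, str], list[float]]  # role of module 303
    predict_score: Callable[[list[float]], float]     # role of module 304
    top_n: int = 10                                   # preset range used by module 305

    def generate(self, keyword: str) -> list[str]:
        """keyword is what module 301 would obtain from the user."""
        candidates = self.lookup_candidates(keyword)
        scores = {
            c: self.predict_score(self.make_features(keyword, c))
            for c in candidates
        }
        # Module 305: sort by score value and keep the preset range.
        return sorted(scores, key=scores.get, reverse=True)[: self.top_n]
```

Passing the lexicon lookup, feature generation and scoring steps as callables keeps the sketch independent of any particular lexicon or trained model.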
Those skilled in the art should understand that embodiments of the present invention can be provided as a method, a device or a computer program product. Therefore, the embodiments of the present invention can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the embodiments of the present invention can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) containing computer-usable program code.
The embodiments of the present invention are described with reference to flowcharts and/or block diagrams of the method, terminal device (system) and computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or other programmable data processing terminal device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing terminal device produce a means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions can also be stored in a computer-readable memory capable of directing a computer or other programmable data processing terminal device to operate in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including an instruction means, where the instruction means implements the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions can also be loaded onto a computer or other programmable data processing terminal device, so that a series of operational steps are executed on the computer or other programmable terminal device to produce computer-implemented processing, whereby the instructions executed on the computer or other programmable terminal device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present invention.
Finally, it should also be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" or any other variant thereof are intended to cover a non-exclusive inclusion, so that a process, method, article or terminal device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or terminal device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or terminal device that includes the element.
The method and the device for generating search prompt words provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is only intended to help understand the method of the present invention and its core idea. Meanwhile, for those of ordinary skill in the art, changes may be made to the specific implementation and the scope of application according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (12)

1. A method for generating search prompt words, characterized in that the method includes:
Obtaining the search keyword of a user;
Generating the candidate prompt word set of the search keyword according to the search keyword;
Generating the data features of each candidate prompt word using the search keyword and each candidate prompt word in the candidate prompt word set;
Inputting the data features of each candidate prompt word into a pre-trained score value prediction model to obtain the score value of each candidate prompt word;
Determining the target prompt word of the search keyword from the candidate prompt word set according to the score value of each candidate prompt word.
2. The generation method according to claim 1, characterized in that the step of generating the candidate prompt word set of the search keyword according to the search keyword includes:
Searching a preset candidate prompt word lexicon for multiple matching candidate prompt words according to the search keyword;
Generating the candidate prompt word set using the multiple candidate prompt words.
3. The generation method according to claim 1, characterized in that the step of generating the data features of each candidate prompt word using the search keyword and each candidate prompt word in the candidate prompt word set includes:
Generating a search keyword pinyin string and a candidate prompt word pinyin string from the search keyword and each candidate prompt word, respectively;
Generating the pinyin string similarity feature of each candidate prompt word using the search keyword pinyin string and the candidate prompt word pinyin string;
Generating the Chinese character string similarity feature of each candidate prompt word using the Chinese character strings in the search keyword and the candidate prompt word;
Obtaining the historical behavior operations performed on each candidate prompt word within a preset time period;
Generating the temperature feature of each candidate prompt word using the historical behavior operations performed on each candidate prompt word within the preset time period;
Generating the album feature of each candidate prompt word according to whether the candidate prompt word is an in-station album.
4. The generation method according to claim 3, characterized in that the step of inputting the data features of each candidate prompt word into the pre-trained score value prediction model to obtain the score value of each candidate prompt word includes:
Inputting the pinyin string similarity feature, the Chinese character string similarity feature, the temperature feature and the album feature of each candidate prompt word into the pre-trained score value prediction model to obtain the score value of each candidate prompt word.
5. The generation method according to any one of claims 1 to 4, characterized in that the score value prediction model is trained in the following manner:
Obtaining training samples, where each training sample includes one training search keyword, multiple training candidate prompt words corresponding to the training search keyword, and the click count for each training candidate prompt word;
Determining the label of each candidate prompt word according to the click count of each candidate prompt word;
Generating the training data features of each training candidate prompt word using the training search keyword and the multiple training candidate prompt words corresponding to the training search keyword;
Training the score value prediction model based on a preset algorithm, using the data features and the label of each training candidate prompt word.
6. The generation method according to claim 5, characterized in that the score value prediction model is a LambdaMart model, the data features include the pinyin string similarity feature, the Chinese character string similarity feature, the temperature feature and the album feature of each candidate prompt word, and the step of training the score value prediction model based on a preset algorithm, using the data features and the label of each training candidate prompt word, includes:
Training the LambdaMart model based on the GBDT algorithm, using the pinyin string similarity feature, the Chinese character string similarity feature, the temperature feature, the album feature and the label of each training candidate prompt word.
7. A device for generating search prompt words, characterized in that the device includes:
Search keyword acquisition module, configured to obtain the search keyword of a user;
Candidate prompt word set generation module, configured to generate the candidate prompt word set of the search keyword according to the search keyword;
Data feature generation module, configured to generate the data features of each candidate prompt word using the search keyword and each candidate prompt word in the candidate prompt word set;
Score value extraction module, configured to input the data features of each candidate prompt word into a pre-trained score value prediction model to obtain the score value of each candidate prompt word;
Target prompt word determination module, configured to determine the target prompt word of the search keyword from the candidate prompt word set according to the score value of each candidate prompt word.
8. The generating device according to claim 7, characterized in that the candidate prompt word set generation module includes:
Candidate prompt word lookup submodule, configured to search a preset candidate prompt word lexicon for multiple matching candidate prompt words according to the search keyword;
Candidate prompt word set generation submodule, configured to generate the candidate prompt word set using the multiple candidate prompt words.
9. The generating device according to claim 7, characterized in that the data feature generation module includes:
Pinyin string generation submodule, configured to generate a search keyword pinyin string and a candidate prompt word pinyin string from the search keyword and each candidate prompt word, respectively;
Pinyin string similarity feature generation submodule, configured to generate the pinyin string similarity feature of each candidate prompt word using the search keyword pinyin string and the candidate prompt word pinyin string;
Chinese character string similarity feature generation submodule, configured to generate the Chinese character string similarity feature of each candidate prompt word using the Chinese character strings in the search keyword and the candidate prompt word;
Behavior operation acquisition submodule, configured to obtain the behavior operations performed on each candidate prompt word within a preset time period;
Temperature feature generation submodule, configured to generate the temperature feature of each candidate prompt word using the behavior operations performed on each candidate prompt word within the preset time period;
Album feature generation submodule, configured to generate the album feature of each candidate prompt word according to whether the candidate prompt word is an in-station album.
10. The generating device according to claim 9, characterized in that the score value extraction module includes:
Feature input submodule, configured to input the pinyin string similarity feature, the Chinese character string similarity feature, the temperature feature and the album feature of each candidate prompt word into the pre-trained score value prediction model to obtain the score value of each candidate prompt word.
11. The generating device according to any one of claims 7 to 10, characterized in that the device further includes a model training module, and the model training module includes:
Training sample acquisition submodule, configured to obtain training samples, where each training sample includes one training search keyword, multiple training candidate prompt words corresponding to the training search keyword, and the click count for each training candidate prompt word;
Label determination submodule, configured to determine the label of each candidate prompt word according to the click count of each candidate prompt word;
Training data feature generation submodule, configured to generate the training data features of each training candidate prompt word using the training search keyword and the multiple training candidate prompt words corresponding to the training search keyword;
Training submodule, configured to train the score value prediction model based on a preset algorithm, using the data features and the label of each training candidate prompt word.
12. The generating device according to claim 11, characterized in that the score value prediction model is a LambdaMart model, the data features include the pinyin string similarity feature, the Chinese character string similarity feature, the temperature feature and the album feature of each candidate prompt word, and the training submodule includes:
Training unit, configured to train the LambdaMart model based on the GBDT algorithm, using the pinyin string similarity feature, the Chinese character string similarity feature, the temperature feature, the album feature and the label of each training candidate prompt word.
CN201810442164.8A 2018-05-10 2018-05-10 A kind of generation method and device of Search Hints word Pending CN108763332A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810442164.8A CN108763332A (en) 2018-05-10 2018-05-10 A kind of generation method and device of Search Hints word

Publications (1)

Publication Number Publication Date
CN108763332A true CN108763332A (en) 2018-11-06

Family

ID=64009543

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810442164.8A Pending CN108763332A (en) 2018-05-10 2018-05-10 A kind of generation method and device of Search Hints word

Country Status (1)

Country Link
CN (1) CN108763332A (en)

Citations (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20090056273A (en) * 2007-11-30 2009-06-03 주식회사 다음커뮤니케이션 Keyword searching method
CN103136213A (en) * 2011-11-23 2013-06-05 阿里巴巴集团控股有限公司 Method and device for providing related words
CN103631794A (en) * 2012-08-22 2014-03-12 百度在线网络技术(北京)有限公司 Method, device and equipment for sorting search results
CN105224554A (en) * 2014-06-11 2016-01-06 阿里巴巴集团控股有限公司 Search word is recommended to carry out method, system, server and the intelligent terminal searched for
CN105653697A (en) * 2015-12-30 2016-06-08 北京奇艺世纪科技有限公司 Recommended word retrieval method and system
CN105843850A (en) * 2016-03-15 2016-08-10 北京百度网讯科技有限公司 Searching optimization method and device
CN107169010A (en) * 2017-03-31 2017-09-15 北京奇艺世纪科技有限公司 A kind of determination method and device of recommendation search keyword
CN107463704A (en) * 2017-08-16 2017-12-12 北京百度网讯科技有限公司 Searching method and device based on artificial intelligence
CN107885889A (en) * 2017-12-13 2018-04-06 聚好看科技股份有限公司 Feedback method, methods of exhibiting and the device of search result

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
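Xie Bin et al.: "Hybrid recommendation algorithm based on learning to rank", Journal of Heilongjiang University of Science and Technology *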

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111639255A (en) * 2019-03-01 2020-09-08 北京字节跳动网络技术有限公司 Search keyword recommendation method and device, storage medium and electronic equipment
CN111639255B (en) * 2019-03-01 2023-12-29 北京字节跳动网络技术有限公司 Recommendation method and device for search keywords, storage medium and electronic equipment
CN111125310A (en) * 2019-12-24 2020-05-08 北京百度网讯科技有限公司 Voice query method and device
CN111931500B (en) * 2020-09-21 2023-06-23 北京百度网讯科技有限公司 Search information processing method and device
CN111931500A (en) * 2020-09-21 2020-11-13 北京百度网讯科技有限公司 Search information processing method and device
CN112541076A (en) * 2020-11-09 2021-03-23 北京百度网讯科技有限公司 Method and device for generating extended corpus of target field and electronic equipment
CN112541076B (en) * 2020-11-09 2024-03-29 北京百度网讯科技有限公司 Method and device for generating expanded corpus in target field and electronic equipment
WO2022134355A1 (en) * 2020-12-25 2022-06-30 平安科技(深圳)有限公司 Keyword prompt-based search method and apparatus, and electronic device and storage medium
CN113343082A (en) * 2021-05-25 2021-09-03 北京字节跳动网络技术有限公司 Hot field prediction model generation method and device, storage medium and equipment
CN113343147A (en) * 2021-06-18 2021-09-03 北京百度网讯科技有限公司 Information processing method, apparatus, device, medium, and program product
CN113343147B (en) * 2021-06-18 2024-01-19 北京百度网讯科技有限公司 Information processing method, apparatus, device, medium, and program product
CN113553398A (en) * 2021-07-15 2021-10-26 杭州网易云音乐科技有限公司 Search word correcting method and device, electronic equipment and computer storage medium
CN113553398B (en) * 2021-07-15 2024-01-26 杭州网易云音乐科技有限公司 Search word correction method, search word correction device, electronic equipment and computer storage medium
CN113312523A (en) * 2021-07-30 2021-08-27 北京达佳互联信息技术有限公司 Dictionary generation and search keyword recommendation method and device and server
TWI823242B (en) * 2022-01-27 2023-11-21 中國信託商業銀行股份有限公司 Text generation method and device
CN116304277B (en) * 2023-03-01 2023-12-15 张素愿 Intelligent matching method, system and storage medium based on AI
CN116304277A (en) * 2023-03-01 2023-06-23 深圳一资源网络平台有限公司 Intelligent matching method, system and storage medium based on AI
CN117349400A (en) * 2023-12-04 2024-01-05 环球数科集团有限公司 Prompt word construction method based on AIGC
CN117349400B (en) * 2023-12-04 2024-02-27 环球数科集团有限公司 Prompt word construction method based on AIGC

Similar Documents

Publication Publication Date Title
CN108763332A (en) A kind of generation method and device of Search Hints word
CN108021616B (en) Community question-answer expert recommendation method based on recurrent neural network
CN107491531B (en) Chinese network comment sentiment classification method based on ensemble learning framework
CN106815252B (en) Searching method and device
CN106709040B (en) Application search method and server
CN105302810B (en) A kind of information search method and device
CN111221962B (en) Text emotion analysis method based on new word expansion and complex sentence pattern expansion
CN107608960B (en) Method and device for linking named entities
CN106649818A (en) Recognition method and device for application search intentions and application search method and server
CN108304373B (en) Semantic dictionary construction method and device, storage medium and electronic device
CN103869998B (en) A kind of method and device being ranked up to candidate item caused by input method
CN110134792B (en) Text recognition method and device, electronic equipment and storage medium
CN110390006A (en) Question and answer corpus generation method, device and computer readable storage medium
CN107247751B (en) LDA topic model-based content recommendation method
Chen et al. Automatic key term extraction from spoken course lectures using branching entropy and prosodic/semantic features
CN111738002A (en) Ancient text field named entity identification method and system based on Lattice LSTM
CN113505204A (en) Recall model training method, search recall device and computer equipment
US20220366282A1 (en) Systems and Methods for Active Curriculum Learning
CN112417119A (en) Open domain question-answer prediction method based on deep learning
CN115526590A (en) Efficient person-job matching and re-pushing method combining expert knowledge and algorithm
Popa et al. Bart-tl: Weakly-supervised topic label generation
Reddy et al. N-gram approach for gender prediction
Monti et al. An ensemble approach of recurrent neural networks using pre-trained embeddings for playlist completion
CN111666374A (en) Method for integrating additional knowledge information into deep language model
CN112597768B (en) Text auditing method, device, electronic equipment, storage medium and program product

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20181106