CN108763332A - Method and device for generating search prompt words - Google Patents
- Publication number
- CN108763332A (CN201810442164.8A)
- Authority
- CN
- China
- Prior art keywords
- word
- candidate
- candidate prompt
- prompt word
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
An embodiment of the present invention provides a method and device for generating search prompt words. The method includes: obtaining a user's search keyword; generating a candidate prompt word set for the search keyword according to the search keyword; generating data features for each candidate prompt word using the search keyword and each candidate prompt word in the set; inputting the data features into a pre-trained score prediction model to obtain a score for each candidate prompt word; and determining, according to the scores, the target prompt words for the search keyword from the candidate prompt word set. Because the score of each candidate prompt word is produced by the score prediction model, no weights need to be set manually. This avoids inaccurate candidate scores caused by unreasonable weight settings and allows the scores to be computed objectively from users' historical behavior, so that the prompt words better match the user's intent, the user can pick the desired prompt word from those shown, and information search becomes more efficient.
Description
Technical field
The present invention relates to the technical field of data processing, and in particular to a method and device for generating search prompt words.
Background
With the rapid development of the Internet, the network has become an indispensable part of people's daily life, study, and work. Information spreads quickly over the network and in great volume, so it is crucial for users to be able to retrieve useful information quickly from this mass of data. User input prompting, also called search prompting, offers complete prompt words based on the partial query a user has typed into the search box, and is one way to improve retrieval efficiency.
At present, search prompt words are mainly generated from different sources, such as the pinyin prefix, the abbreviated-pinyin prefix, and the text prefix of a prompt word. Specifically, the system first computes the matching degree between the search keyword and the prompt word, counts the prompt word's search volume, determines whether the prompt word is an in-site album, and counts the prompt word's click-through rate and freshness (whether it appeared recently or has existed for a long time). Different weights are then assigned to these components based on experience, and the score of each prompt word is computed as their weighted sum in order to generate the prompt words.
However, the current method requires weights to be assigned manually to the matching degree between the search keyword and the prompt word, the prompt word's search volume, whether the prompt word is an in-site album, and the prompt word's click-through rate and freshness. This often causes problems: a prompt word with a large search volume receives too much weight; the weight of a popular prompt word is suppressed for anti-cheating purposes; or two prompt words cannot be ordered because the empirical formula gives them the same matching degree. As a result, the prompt words generated from the ranking do not match the user's search keyword or intent, the user cannot find a suitable choice among them, and the prompt words lose their purpose.
Summary of the invention
In view of the above problems, embodiments of the present invention are proposed to provide a method for generating search prompt words and a device for generating search prompt words that overcome the above problems or at least partially solve them.
To solve the above problems, an embodiment of the present invention discloses a method for generating search prompt words, including:
obtaining a user's search keyword;
generating a candidate prompt word set for the search keyword according to the search keyword;
generating data features for each candidate prompt word using the search keyword and each candidate prompt word in the candidate prompt word set;
inputting the data features of each candidate prompt word into a pre-trained score prediction model to obtain a score for each candidate prompt word;
determining, according to the score of each candidate prompt word, the target prompt words for the search keyword from the candidate prompt word set.
Optionally, the step of generating a candidate prompt word set for the search keyword according to the search keyword includes:
searching a preset candidate prompt word lexicon for multiple candidate prompt words that match the search keyword;
generating the candidate prompt word set from the multiple candidate prompt words.
Optionally, the step of generating data features for each candidate prompt word using the search keyword and each candidate prompt word in the candidate prompt word set includes:
generating a search keyword pinyin string and candidate prompt word pinyin strings according to the search keyword and each candidate prompt word, respectively;
generating pinyin string similarity features for each candidate prompt word using the search keyword pinyin string and the candidate prompt word pinyin string;
generating Chinese character string similarity features for each candidate prompt word using the Chinese character strings in the search keyword and the candidate prompt word;
obtaining the historical behavior operations performed on each candidate prompt word within a preset time period;
generating popularity features for each candidate prompt word from the historical behavior operations performed on that candidate prompt word within the preset time period;
generating an album feature for each candidate prompt word according to whether the candidate prompt word is an in-site album.
Optionally, the step of inputting the data features of each candidate prompt word into a pre-trained score prediction model to obtain a score for each candidate prompt word includes:
inputting the pinyin string similarity features, the Chinese character string similarity features, the popularity features, and the album feature of each candidate prompt word into the pre-trained score prediction model to obtain the score of each candidate prompt word.
Optionally, the score prediction model is trained in the following manner:
obtaining training samples, each of which includes one training search keyword, multiple training candidate prompt words corresponding to that training search keyword, and the click count for each training candidate prompt word;
determining a label for each candidate prompt word according to its click count;
generating training data features for each training candidate prompt word using the training search keyword and the multiple training candidate prompt words corresponding to it;
training the score prediction model based on a preset algorithm, using the data features and the label of each training candidate prompt word.
Optionally, the score prediction model is a LambdaMART model, the data features include the pinyin string similarity features, Chinese character string similarity features, popularity features, and album feature of each candidate prompt word, and the step of training the score prediction model based on a preset algorithm using the data features and the label of each training candidate prompt word includes:
training the LambdaMART model based on the GBDT algorithm, using the pinyin string similarity features, the Chinese character string similarity features, the popularity features, the album feature, and the label of each training candidate prompt word.
To solve the above problems, an embodiment of the present invention further discloses a device for generating search prompt words, including:
a search keyword acquisition module, configured to obtain a user's search keyword;
a candidate prompt word set generation module, configured to generate a candidate prompt word set for the search keyword according to the search keyword;
a data feature generation module, configured to generate data features for each candidate prompt word using the search keyword and each candidate prompt word in the candidate prompt word set;
a score extraction module, configured to input the data features of each candidate prompt word into a pre-trained score prediction model to obtain a score for each candidate prompt word;
a target prompt word determination module, configured to determine, according to the score of each candidate prompt word, the target prompt words for the search keyword from the candidate prompt words.
Optionally, the candidate prompt word set generation module includes:
a candidate prompt word search submodule, configured to search a preset candidate prompt word lexicon for multiple candidate prompt words that match the search keyword;
a candidate prompt word set generation submodule, configured to generate the candidate prompt word set from the multiple candidate prompt words.
Optionally, the data feature generation module includes:
a pinyin string generation submodule, configured to generate a search keyword pinyin string and candidate prompt word pinyin strings according to the search keyword and each candidate prompt word, respectively;
a pinyin string similarity feature generation submodule, configured to generate pinyin string similarity features for each candidate prompt word using the search keyword pinyin string and the candidate prompt word pinyin string;
a Chinese character string similarity feature generation submodule, configured to generate Chinese character string similarity features for each candidate prompt word using the Chinese character strings in the search keyword and the candidate prompt word;
a behavior operation acquisition submodule, configured to obtain the behavior operations performed on each candidate prompt word within a preset time period;
a popularity feature generation submodule, configured to generate popularity features for each candidate prompt word from the behavior operations performed on that candidate prompt word within the preset time period;
an album feature generation submodule, configured to generate an album feature for each candidate prompt word according to whether the candidate prompt word is an in-site album.
Optionally, the score extraction module includes:
a feature input submodule, configured to input the pinyin string similarity features, the Chinese character string similarity features, the popularity features, and the album feature of each candidate prompt word into the pre-trained score prediction model to obtain the score of each candidate prompt word.
Optionally, the device further includes a model training module, and the model training module includes:
a training sample acquisition submodule, configured to obtain training samples, each of which includes one training search keyword, multiple training candidate prompt words corresponding to that training search keyword, and the click count for each training candidate prompt word;
a label determination submodule, configured to determine a label for each candidate prompt word according to its click count;
a training data feature generation submodule, configured to generate training data features for each training candidate prompt word using the training search keyword and the multiple training candidate prompt words corresponding to it;
a training submodule, configured to train the score prediction model based on a preset algorithm, using the data features and the label of each training candidate prompt word.
Optionally, the score prediction model is a LambdaMART model, the data features include the pinyin string similarity features, Chinese character string similarity features, popularity features, and album feature of each candidate prompt word, and the training submodule includes:
a training unit, configured to train the LambdaMART model based on the GBDT algorithm, using the pinyin string similarity features, the Chinese character string similarity features, the popularity features, the album feature, and the label of each training candidate prompt word.
The embodiments of the present invention have the following advantages:
In the embodiments of the present invention, after the user's search keyword is obtained, a candidate prompt word set is generated for the search keyword; data features are generated for each candidate prompt word using the search keyword and each candidate prompt word in the set; the data features of each candidate prompt word are then input into a pre-trained score prediction model to obtain its score; and the target prompt words for the search keyword are determined from the candidate prompt words according to these scores. Because the score of each candidate prompt word is produced by the score prediction model after the data features are extracted, no weights need to be set manually. This avoids inaccurate candidate scores caused by unreasonable manually set or rule-based weights and allows the scores to be computed objectively from users' historical behavior, so that the prompt words better match the user's intent, the user can pick the desired prompt word from those shown, and information search becomes more efficient.
Brief description of the drawings
Fig. 1 is a flowchart of the steps of Embodiment 1 of a method for generating search prompt words according to the present invention;
Fig. 2 is a schematic diagram of an example of search prompt words according to the present invention;
Fig. 3 is a schematic diagram of another example of search prompt words according to the present invention;
Fig. 4 is a flowchart of the steps of Embodiment 2 of a method for generating search prompt words according to the present invention;
Fig. 5 is an example diagram of part of a decision tree of the score prediction model of the present invention;
Fig. 6 is a structural block diagram of an embodiment of a device for generating search prompt words according to the present invention.
Detailed description
To make the above objects, features, and advantages of the present invention clearer and easier to understand, the present invention is described in further detail below with reference to the accompanying drawings and specific embodiments.
Referring to Fig. 1, a flowchart of the steps of Embodiment 1 of a method for generating search prompt words according to the present invention is shown. The method may specifically include the following steps:
Step 101, obtain the user's search keyword.
In the embodiment of the present invention, the search keyword may be the characters that a user types into an input box of an application. For example, when a user searches a video website for media such as films or music, the user can type one or more characters of the desired media information into the input box, and prompt words are displayed below the input box.
Fig. 2 shows an example of prompt words. In this example, "Sun Wu" (孙悟) is typed into the input box, so the characters "Sun Wu" are the search keyword, and prompt words such as "Sun Wukong" (孙悟空) and "Sun Wukong the Movie" (孙悟空大电影) are displayed below the input box.
Step 102, generate a candidate prompt word set for the search keyword according to the search keyword.
In practical applications, one search keyword can have multiple candidate prompt words. For example, a candidate prompt word may be formed by using the search keyword as a prefix, a suffix, or a middle part; it may be a word containing the search keyword that is found by looking the keyword up in a preset lexicon; or it may be a word whose pronunciation is identical, or whose written form is similar, to that of the search keyword.
Fig. 3 shows an example of a prompt word set. In Fig. 3, the user intended to enter the search keyword "Sai Er" written as 赛尔 but typed 塞尔 instead. The candidate prompt words may then include "The Legend of Zelda: Breath of the Wild", "The Legend of Zelda", and "Zelda" (塞尔达), whose Chinese titles contain the same characters as the typed 塞尔, and may also include "Sai Er Hao" (赛尔号), whose characters are similar in shape to the typed ones. In this way an automatic error-correcting prompt function can be realized.
Step 103, generate data features for each candidate prompt word using the search keyword and each candidate prompt word in the candidate prompt word set.
In the embodiment of the present invention, one search keyword corresponds to multiple candidate prompt words. The search keyword and one candidate prompt word can be treated as a pair, and corresponding data features can be generated for each pair. The data features may include the similarity between the search keyword and the candidate prompt word in pronunciation, their similarity in written form, popularity features formed from the historical behavior of a large number of users toward the candidate prompt word, and an album feature indicating whether the candidate prompt word is an in-site album.
Step 104, input the data features of each candidate prompt word into a pre-trained score prediction model to obtain a score for each candidate prompt word.
After the data features of the search keyword and each candidate prompt word are obtained, they are input into the score prediction model to obtain the score of each candidate prompt word. Specifically, the score prediction model is trained on the historical data of a large number of users. For example, the historical search keywords of many users and the click data for the candidate prompt words corresponding to those keywords are collected as training samples, training data features are extracted from the samples, and the score prediction model is then trained by a preset algorithm, for example by training a LambdaMART model based on the GBDT algorithm. The LambdaMART model can predict the score of each candidate prompt word, and all candidate prompt words can be sorted by score so that their order better matches the user's intent.
Step 105, determine, according to the score of each candidate prompt word, the target prompt words for the search keyword from the candidate prompt word set.
After the scores of the multiple candidate prompt words corresponding to the search keyword are obtained, the candidate prompt words are sorted by score and the target prompt words are determined from among them. Specifically, the candidate prompt words ranked within a certain range can be determined as target prompt words and displayed, so that the user can select one to fill into the input box.
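Step 105 can be pictured with a minimal Python sketch. It assumes the model is available as a plain scoring callable; the names `select_target_prompts`, `score_fn`, and `top_k`, and the toy scores, are hypothetical and do not come from the patent, which only says that candidates "ranked within a certain range" are kept.

```python
from typing import Callable, List, Sequence


def select_target_prompts(
    candidates: Sequence[str],
    score_fn: Callable[[str], float],
    top_k: int = 5,
) -> List[str]:
    """Rank candidates by descending score and keep the first top_k."""
    scored = [(score_fn(c), c) for c in candidates]
    # sorted() is stable, so candidates with equal scores keep their input order.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return [c for _, c in ranked[:top_k]]


# Toy scores standing in for the output of the score prediction model.
toy_scores = {
    "sun wukong": 0.9,
    "sun wukong movie": 0.7,
    "sun wukong cartoon": 0.7,
    "sun quan": 0.1,
}
targets = select_target_prompts(list(toy_scores), toy_scores.__getitem__, top_k=3)
```

Keeping the scorer as a parameter means the same selection step works regardless of which trained model produces the scores.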
In the embodiment of the present invention, after the data features of the candidate prompt words are extracted, the score of each candidate prompt word is produced by the score prediction model, so no weights need to be set manually. This avoids inaccurate candidate scores caused by unreasonable manually set or rule-based weights and allows the scores to be computed objectively from users' historical behavior, so that the prompt words better match the user's intent, the user can pick the desired prompt word from those shown, and information search becomes more efficient.
Referring to Fig. 4, a flowchart of the steps of Embodiment 2 of a method for generating search prompt words according to the present invention is shown. The method may specifically include the following steps:
Step 201, obtain the user's search keyword.
In the embodiment of the present invention, the search keyword may be the characters that a user types into an input box of an application. For example, when a user searches a video website for media such as films or music, the user can type one or more characters of the desired media information into the input box, and prompt words are displayed below the input box.
Step 202, generate a candidate prompt word set for the search keyword according to the search keyword.
In the embodiment of the present invention, step 202 may include the following sub-steps:
Sub-step S11, search a preset candidate prompt word lexicon for multiple candidate prompt words that match the search keyword.
Sub-step S12, generate the candidate prompt word set from the multiple candidate prompt words.
In practical applications, the candidate prompt word lexicon can be built from users' input history, the names of real-world entities (such as video titles and other names), and manually constructed candidate words. For example, when a user types search information into the input box of a video website, the server can collect the user's input as a candidate prompt word and store it in the lexicon. Words that are currently popular can of course also be obtained and stored in the lexicon as candidate prompt words, and candidate prompt words can also be generated according to word-combination rules and stored in the lexicon.
After the user's search keyword is obtained, the lexicon can be searched for candidate prompt words that contain the search keyword, and the multiple candidate prompt words containing the search keyword form the candidate prompt word set of the search keyword.
For example, if the user enters the search keyword "Sun Wu" (孙悟), the candidate prompt word set may be {Sun Wukong, Sun Wukong the Movie, Sun Wukong Loves You for Ten Thousand Years, Sun Wukong Creates a Tremendous Uproar, Sun Wukong cartoon, Sun Wukong Thrice Defeats the White Bone Demon}.
Of course, the above uses Chinese character input only as an example; in practical applications the search keyword may also be entered in pinyin or in other languages.
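The lexicon lookup of sub-steps S11 and S12 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the lexicon is an in-memory list, matching is plain substring containment, and the prefix-first ordering, the function name `find_candidates`, and the romanized toy entries are all hypothetical rather than taken from the patent.

```python
from typing import Iterable, List


def find_candidates(keyword: str, lexicon: Iterable[str]) -> List[str]:
    """Return lexicon entries that contain the keyword, prefix matches first."""
    keyword = keyword.strip().lower()
    if not keyword:
        return []
    matches = [w for w in lexicon if keyword in w.lower()]
    # Prefix matches sort before infix or suffix matches; ties break alphabetically.
    return sorted(matches, key=lambda w: (not w.lower().startswith(keyword), w))


# Toy lexicon with romanized entries, for illustration only.
toy_lexicon = ["sun wukong movie", "sun wukong", "the great sun wukong", "zhu bajie"]
candidate_set = find_candidates("sun wu", toy_lexicon)
```

A production lexicon would more likely sit behind a trie or an inverted index, but the containment test is the part the text describes.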
Step 203, generate a search keyword pinyin string and candidate prompt word pinyin strings according to the search keyword and each candidate prompt word, respectively.
In practical applications, a search keyword and a candidate prompt word form a pair, and the pinyin strings of the pair can be generated. For example, with the search keyword "Sun Wu" (孙悟) and the candidate prompt words "Sun Wukong" (孙悟空) and "Sun Wukong the Movie" (孙悟空大电影), the search keyword pinyin string is "sun wu" and the candidate prompt word pinyin strings are "sun wu kong" and "sun wu kong da dian ying".
Of course, those skilled in the art can also generate the pinyin strings according to other pinyin conventions, such as abbreviated pinyin or full pinyin; the embodiment of the present invention does not limit this.
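As a rough illustration of step 203, the sketch below converts a Chinese string into a full pinyin string and an abbreviated (first-letter) pinyin string. The character table `TOY_PINYIN` covers only the characters of the running example and is an assumption made for this sketch; a real system would rely on a complete pronunciation dictionary and would have to handle characters with several readings.

```python
from typing import Dict, Tuple

# Toy character-to-pinyin table covering only the running example (an assumption).
TOY_PINYIN: Dict[str, str] = {
    "孙": "sun", "悟": "wu", "空": "kong",
    "大": "da", "电": "dian", "影": "ying",
}


def to_pinyin(text: str, table: Dict[str, str] = TOY_PINYIN) -> Tuple[str, str]:
    """Return (full pinyin, abbreviated pinyin); unknown characters pass through."""
    syllables = [table.get(ch, ch) for ch in text]
    full = " ".join(syllables)
    abbreviated = "".join(s[0] for s in syllables)
    return full, abbreviated


keyword_full, keyword_abbr = to_pinyin("孙悟")
candidate_full, candidate_abbr = to_pinyin("孙悟空大电影")
```

The full form reproduces the strings quoted in the text ("sun wu", "sun wu kong da dian ying"), while the abbreviated form illustrates the first-letter convention mentioned as an alternative.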
Step 204, generate pinyin string similarity features for each candidate prompt word using the search keyword pinyin string and the candidate prompt word pinyin string.
The pinyin string similarity features may include multiple features, as shown in Table 1 (token is the search keyword, query is the candidate prompt word):
Table 1:
By applying the preset calculation method that corresponds to each feature number in Table 1, the multiple pinyin string similarity features of each candidate prompt word can be obtained.
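The contents of Table 1 are not reproduced in this text, so the nine features it defines are unknown here. Purely as an illustration of what pinyin string similarity features can look like, the sketch below computes three hypothetical ones (a prefix flag, a length ratio, and a normalized edit-distance similarity); none of them should be read as the patent's actual feature definitions.

```python
from typing import Dict


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance computed with a rolling one-row dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def pinyin_similarity_features(keyword_py: str, candidate_py: str) -> Dict[str, float]:
    """Hypothetical features; spaces between syllables are ignored."""
    k = keyword_py.replace(" ", "")
    c = candidate_py.replace(" ", "")
    longest = max(len(k), len(c)) or 1
    return {
        "is_prefix": float(c.startswith(k)),
        "length_ratio": len(k) / (len(c) or 1),
        "edit_similarity": 1.0 - edit_distance(k, c) / longest,
    }


pinyin_feats = pinyin_similarity_features("sun wu", "sun wu kong")
```

Comparing pinyin rather than characters is what lets a candidate with the same pronunciation but different characters still score as a close match.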
Step 205, generate Chinese character string similarity features for each candidate prompt word using the Chinese character strings in the search keyword and the candidate prompt word.
In practical applications, a search keyword and a candidate prompt word form a pair, and the Chinese character strings of the pair can be generated. For example, with the search keyword "Sun Wu" (孙悟) and the candidate prompt words "Sun Wukong" (孙悟空) and "Sun Wukong the Movie" (孙悟空大电影), the search keyword Chinese character string is 孙悟 and the candidate prompt word Chinese character strings are 孙悟空, 孙悟空大电影, and so on. Likewise, as shown in Fig. 3, when the search keyword Chinese character string is "Sai Er" (塞尔), the candidate prompt word Chinese character strings may be those of "The Legend of Zelda: Breath of the Wild", "Sai Er Hao" (赛尔号), "The Legend of Zelda", "Zelda" (塞尔达), and so on, where the character 赛 in 赛尔号 is similar in shape to the character 塞 in the typed keyword 塞尔.
The Chinese character string similarity features may include multiple features, as shown in Table 2 (token is the search keyword, query is the candidate prompt word):
By applying the preset calculation methods for feature numbers 10-19 in Table 2, the multiple Chinese character string similarity features of each candidate prompt word can be obtained.
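Table 2 is likewise not reproduced in this text, so the sketch below shows only hypothetical character-level features: a containment flag, the match position, the share of keyword characters that also occur in the candidate, and the ratio from Python's standard `difflib.SequenceMatcher`. The feature names are assumptions. Shape similarity between characters such as 赛 and 塞 would need a table of confusable characters, which is not shown.

```python
from difflib import SequenceMatcher
from typing import Dict


def hanzi_similarity_features(keyword: str, candidate: str) -> Dict[str, float]:
    """Hypothetical character-string features for one keyword/candidate pair."""
    position = candidate.find(keyword)  # -1 when the keyword does not occur verbatim
    shared = set(keyword) & set(candidate)
    return {
        "contains_keyword": float(position >= 0),
        "match_position": float(position),
        "char_overlap": len(shared) / (len(set(keyword)) or 1),
        "sequence_ratio": SequenceMatcher(None, keyword, candidate).ratio(),
    }


exact_feats = hanzi_similarity_features("塞尔", "塞尔达传说")
near_feats = hanzi_similarity_features("塞尔", "赛尔号")
```

In the second pair the keyword does not occur verbatim, yet half of its characters are shared, which is the kind of signal that lets a near-miss candidate stay in the ranking.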
Of course, the above uses Chinese only as an example; those skilled in the art can also apply the approach to other languages, and the embodiment of the present invention does not limit this.
Step 206, obtain the behavior operations performed on each candidate prompt word within a preset time period.
In the embodiment of the present invention, the behavior operations of a large number of users on each candidate word include the number of searches, the number of IPs, the number of clicks, and the click-through rate for the candidate prompt word. They can be obtained by counting user behavior over a past period; for example, over the past month, the number of searches, the number of IPs, the number of clicks, and the click-through rate are counted for 1 day, 2 days, 4 days, 7 days, and 30 days before the current date.
Step 207, generate popularity features for each candidate prompt word from the behavior operations performed on that candidate prompt word within the preset time period.
After the number of searches, the number of IPs, the number of clicks, and the click-through rate for 1 day, 2 days, 4 days, 7 days, and 30 days before are obtained, the popularity features of the candidate prompt word can be computed, as shown in Table 3:
Table 3:
The above shows only the popularity features computed from the data of 1 day before; the popularity features for 2 days, 4 days, 7 days, and 30 days before are computed in the same way as in Table 3 and are not repeated here.
The popularity features of a candidate prompt word for 2 days before may include the single-day popularity features of the day 2 days before and the combined popularity features of the preceding 2 days. For example, if the current date is July 30, the popularity features for 2 days before may include those computed from the data of July 28 alone and those computed from the combined data of July 28 and July 29. The popularity features for 4 days before may include those computed from the data of July 26 alone and those computed from the combined data of July 26, July 27, July 28, and July 29. The popularity features for 7 days and 30 days before can be computed similarly. Using the preset algorithms in Table 3, a total of 36 popularity feature dimensions (feature numbers 20-55) are obtained for 1 day, 2 days, 4 days, 7 days, and 30 days before.
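The count of 36 dimensions is consistent with four statistics (searches, IPs, clicks, click-through rate) taken once for the 1-day window and twice (single day plus cumulative) for each of the 2-, 4-, 7- and 30-day windows: 4 + 4 × 8 = 36. The sketch below reproduces that layout. Since Table 3 is not reproduced in this text, the exact feature definitions are assumptions, including the click-through rate being clicks divided by searches and cumulative IPs being a plain sum; the names `DailyStats` and `popularity_features` are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List

WINDOWS = (1, 2, 4, 7, 30)  # days before the current date, as listed in the text


@dataclass
class DailyStats:
    searches: int = 0
    ips: int = 0
    clicks: int = 0


def _metrics(searches: int, ips: int, clicks: int) -> List[float]:
    # Assumption: click-through rate = clicks / searches.
    ctr = clicks / searches if searches else 0.0
    return [float(searches), float(ips), float(clicks), ctr]


def popularity_features(history: Dict[int, DailyStats]) -> List[float]:
    """history maps 'days ago' (1 = yesterday) to that day's statistics."""
    features: List[float] = []
    for n in WINDOWS:
        day = history.get(n, DailyStats())
        features += _metrics(day.searches, day.ips, day.clicks)  # single day
        if n > 1:  # for n == 1 the cumulative window equals the single day
            span = [history.get(d, DailyStats()) for d in range(1, n + 1)]
            features += _metrics(
                sum(s.searches for s in span),
                sum(s.ips for s in span),  # simple sum; may double-count repeat IPs
                sum(s.clicks for s in span),
            )
    return features


toy_history = {1: DailyStats(100, 80, 25), 2: DailyStats(50, 40, 5)}
pop_feats = popularity_features(toy_history)
```

With the current date taken as July 30, key 2 corresponds to July 28 and the cumulative 2-day window covers July 28 and July 29, matching the example in the text.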
Step 208, generate an album feature for each candidate prompt word according to whether the candidate prompt word is an in-site album.
In the embodiment of the present invention, an in-site album list can be maintained, which records the candidate prompt words that are in-site albums. Depending on whether a candidate prompt word is in the in-site album list, its album feature can be generated according to Table 4:
Table 4:
Table 4 gives the album feature of the candidate prompt word, namely 0 or 1.
Step 209, input the pinyin string similarity features, the Chinese character string similarity features, the popularity features, and the album feature of each candidate prompt word into the pre-trained score prediction model to obtain the score of each candidate prompt word.
After the pinyin string similarity features (feature numbers 1-9), the Chinese character string similarity features (feature numbers 10-19), the popularity features (feature numbers 20-55), and the album feature (feature number 56) of each candidate prompt word are generated, these 56 feature dimensions in total are input into the pre-trained score prediction model to obtain the score of each candidate prompt word.
In a preferred embodiment of the present invention, the score prediction model is trained in the following manner:
Sub-step S21, obtain training samples, each of which includes one training search keyword, multiple training candidate prompt words corresponding to that training search keyword, and the click count for each training candidate prompt word.
The training samples are the click data of a large number of users on candidate prompt words. For example, different users click different candidate prompt words for the same search keyword, and the click count of each candidate prompt word is tallied. Specifically, the click counts of different users on the different candidate prompt words under the same search keyword can be counted for a given day. The following shows the data for each candidate prompt word when the search keyword is "Sun Wu" (孙悟):
In practical applications, if a candidate prompt word received no clicks on that day, its click count is recorded as 0. The click share of each candidate prompt word is also computed, that is, the ratio of its click count to the total click count of all candidate prompt words.
Sub-step S22, determine a label for each candidate prompt word according to its click count.
The click counts reflect the intent of a large number of users: when a user enters a search keyword, the user expects the candidate prompt words to appear among the prompt options in order of click count. Each candidate prompt word can therefore be given a label. Specifically, the click share of each candidate prompt word can be multiplied by a coefficient and rounded to an integer to serve as its label; for example, multiplying by a coefficient of 20 and rounding gives the following data:
As shown above, the label of the candidate prompt word "Sun Wukong" is 4, while the labels of the candidate prompt words "Sun Wukong cartoon" and "Sun Wukong big film" are both 3. This establishes an ordinal relation: the candidate prompt word "Sun Wukong" should be ranked ahead of "Sun Wukong cartoon" and "Sun Wukong big film", whereas "Sun Wukong cartoon" and "Sun Wukong big film" should be ranked close to each other. This ordering serves as the standard that the score value prediction model is expected to learn.
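The label rule of sub-step S22 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `click_labels`, the click counts and the extra "other candidates" entry are hypothetical, and "rounding" is taken here to mean rounding down, which the text does not specify.

```python
def click_labels(hits, coefficient=20):
    """Map per-candidate click counts to integer relevance labels.

    label = floor(click_share * coefficient), computed with integer
    arithmetic so that no floating-point rounding is involved.
    Rounding down is an assumption; the source only says "rounding".
    """
    total = sum(hits.values())
    if total == 0:
        # No clicks at all: every candidate gets the lowest label.
        return {word: 0 for word in hits}
    return {word: (count * coefficient) // total for word, count in hits.items()}


# Hypothetical daily click counts for one search keyword (not from the source).
example_hits = {
    "Sun Wukong": 20,
    "Sun Wukong cartoon": 15,
    "Sun Wukong big film": 15,
    "other candidates": 50,
}
labels = click_labels(example_hits)
```

With these invented counts the click shares are 0.20, 0.15 and 0.15 for the three named candidates, so a coefficient of 20 reproduces the labels 4, 3 and 3 mentioned in the text.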
Sub-step S23: using the training search keyword and the multiple training candidate prompt words corresponding to the training search keyword, generate the training data features of each training candidate prompt word.
Specifically, this may include: generating a training search keyword pinyin string and training candidate prompt word pinyin strings from the training search keyword and each training candidate prompt word, respectively; generating the pinyin string similarity feature of each training candidate prompt word from the training search keyword pinyin string and the training candidate prompt word pinyin string; generating the Chinese character string similarity feature of each training candidate prompt word from the Chinese character strings in the training search keyword and the training candidate prompt word; obtaining the behavior operations performed on each training candidate prompt word within a preset time period; generating the temperature feature of each training candidate prompt word from the behavior operations performed on that training candidate prompt word within the preset time period; and generating the album feature of each training candidate prompt word according to whether the training candidate prompt word is an in-station album.
For the step of generating the training data features of the training candidate prompt words, refer to steps 203 to 208; the details are not repeated here.
The training data features can be as follows:
The data features "1:1 27:0.006 28:0.005" of the above candidate prompt word "Sun Wukong big film" mean that feature No. 1 has the value 1, feature No. 27 has the value 0.006 and feature No. 28 has the value 0.005, where each index is a feature serial number in Tables 1 to 4.
Only some of the feature values are shown above for illustration; in practical applications there can be 56 feature values in total.
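To make the "serial number:value" notation concrete, the following minimal sketch expands such a string into a 56-dimensional vector. The function name `parse_sparse_features` is hypothetical, and treating every unlisted feature as 0 is an assumption that the text does not state.

```python
NUM_FEATURES = 56  # total feature dimensions stated in the description


def parse_sparse_features(text, num_features=NUM_FEATURES):
    """Expand 'index:value' pairs (1-based indices) into a dense list.

    Features that are not listed are assumed to be 0.0.
    """
    dense = [0.0] * num_features
    for token in text.split():
        index_str, value_str = token.split(":")
        index = int(index_str)
        if not 1 <= index <= num_features:
            raise ValueError(f"feature serial number out of range: {index}")
        dense[index - 1] = float(value_str)
    return dense


vector = parse_sparse_features("1:1 27:0.006 28:0.005")
```

The resulting list has 56 entries, with non-zero values only at positions 0, 26 and 27 (feature serial numbers 1, 27 and 28).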
Sub-step S24: using the data features and the label of each training candidate prompt word, train the score value prediction model based on a preset algorithm.
In the embodiment of the present invention, the score value prediction model is a LambdaMART model, and sub-step S24 may include the following sub-step:
Using the pinyin string similarity feature, the Chinese character string similarity feature, the temperature feature and the album feature of each training candidate prompt word, together with its label, train the LambdaMART model based on the GBDT algorithm.
LambdaMART is based on the LambdaRank algorithm and the MART (Multiple Additive Regression Tree) algorithm, and converts the ranking problem into a regression tree problem. Here, the MART algorithm can be the GBDT (Gradient Boosting Decision Tree) algorithm.
GBDT is a machine learning algorithm for regression, classification and ranking tasks, and belongs to the Boosting family of algorithms. Boosting is a family of algorithms that can promote weak learners to a strong learner, and falls within the scope of ensemble learning. Boosting methods are based on the following idea: for a complex task, a judgment obtained by appropriately combining the judgments of multiple individuals is better than the separate judgment of any one of them. Like other boosting methods, gradient boosting builds the final prediction model by integrating multiple decision trees. In practice, the LambdaMART model can be trained with the open-source tool LightGBM, i.e. the pinyin string similarity feature, Chinese character string similarity feature, temperature feature and album feature of the candidate prompt words are used as the data set of LightGBM to train the LambdaMART model.
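A learning-to-rank trainer needs to know, besides the label and features of each candidate, which rows belong to the same search keyword. The sketch below shows one possible way to lay out such training data as text rows plus per-keyword group sizes. It is an assumption about data preparation, not the patent's stated procedure; the function name `build_ranking_rows` and all feature values other than the quoted "1:1 27:0.006 28:0.005" are hypothetical.

```python
def build_ranking_rows(samples):
    """Turn per-keyword training samples into text rows plus group sizes.

    samples: list of (keyword, candidates), where candidates is a list of
             (label, {feature_serial_number: value}) tuples.
    Returns (rows, group_sizes): one 'label idx:val ...' row per candidate,
    and the number of consecutive rows belonging to each keyword.
    """
    rows, group_sizes = [], []
    for _keyword, candidates in samples:
        for label, features in candidates:
            pairs = " ".join(f"{i}:{v}" for i, v in sorted(features.items()))
            rows.append(f"{label} {pairs}".rstrip())
        group_sizes.append(len(candidates))
    return rows, group_sizes


# Hypothetical sample: one training search keyword with two candidates.
samples = [
    ("Sun Wu", [
        (4, {1: 1, 20: 0.31}),                # invented feature values
        (3, {1: 1, 27: 0.006, 28: 0.005}),    # values quoted in the text
    ]),
]
rows, group_sizes = build_ranking_rows(samples)
```

Rows of this shape, together with the group sizes, are the kind of input a ranking objective such as LightGBM's `lambdarank` works on; the exact file layout and parameters used in the embodiment are not given in the text.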
Fig. 5 shows an example of the LambdaMART model. The values of the 56-dimensional data features generated in steps 203 to 208 are input into the model and compared with the threshold of each node to select the next branch; each branch ends at a leaf node, and each leaf node corresponds to a node value. If there are 100 trees, the score value of each candidate prompt word is obtained by summing the leaf node values reached in the 100 trees.
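The scoring walk described above can be sketched as follows. The tree layout (nested dictionaries), the convention that values less than or equal to the threshold go to the left branch, and all thresholds and leaf values are assumptions made for illustration; a trained model would supply its own trees.

```python
def tree_value(node, features):
    """Walk one regression tree and return the value of the leaf reached."""
    while "leaf" not in node:
        # Feature serial numbers are 1-based; the feature list is 0-based.
        value = features[node["feature"] - 1]
        # Assumed convention: value <= threshold goes to the left branch.
        node = node["left"] if value <= node["threshold"] else node["right"]
    return node["leaf"]


def ensemble_score(trees, features):
    """Score of a candidate = sum of the leaf values reached in every tree."""
    return sum(tree_value(tree, features) for tree in trees)


# Two tiny hypothetical trees (thresholds and leaf values are invented).
trees = [
    {"feature": 1, "threshold": 0.5,
     "left": {"leaf": -0.2}, "right": {"leaf": 0.4}},
    {"feature": 27, "threshold": 0.01,
     "left": {"leaf": 0.1},
     "right": {"feature": 56, "threshold": 0.5,
               "left": {"leaf": 0.3}, "right": {"leaf": 0.6}}},
]
features = [0.0] * 56
features[0] = 1.0      # feature No. 1
features[26] = 0.006   # feature No. 27
score = ensemble_score(trees, features)
```

Here the first tree ends in the leaf 0.4 and the second in the leaf 0.1, so the candidate's score is their sum; with 100 trees the same loop simply adds 100 leaf values.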
Step 210: according to the score value of each candidate prompt word, determine the target prompt words of the search keyword from the candidate prompt word set.
Specifically, all candidate prompt words can first be sorted by score value: after the score value of each candidate prompt word is obtained, the candidate prompt words corresponding to the search keyword are sorted by score value in descending order to obtain the ranking of all candidate prompt words. The candidate prompt words whose rank falls within a preset range are then determined as the target prompt words. For example, the 10 top-ranked candidate prompt words can be determined as the target prompt words of the search keyword and displayed in ranked order for the user to select.
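Step 210 amounts to a descending sort followed by a cut-off. A minimal sketch, in which the function name `select_target_prompt_words` and the score values are hypothetical:

```python
def select_target_prompt_words(scores, top_n=10):
    """Sort candidates by score (descending) and keep the first top_n."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [word for word, _score in ranked[:top_n]]


# Hypothetical scores for illustration only.
scores = {
    "Sun Wukong cartoon": 1.4,
    "Sun Wukong": 2.1,
    "Sun Wukong big film": 1.3,
}
top_two = select_target_prompt_words(scores, top_n=2)
```

The default `top_n=10` mirrors the "top 10" example in the text; when fewer candidates exist, all of them are returned in ranked order.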
After extracting the data features of the candidate prompt words, the embodiment of the present invention obtains the score value of each candidate prompt word through the score value prediction model, so no weights need to be set manually. This avoids the problem that weights set manually or by rule are unreasonable and make the calculated score values of the candidate prompt words inaccurate. The score values of the candidate prompt words can be calculated objectively from the historical behavior of users, so that the prompt words better match the user's intention, the user can select the required prompt word among them, and the efficiency of information search is improved.
In the embodiment of the present invention, the score value prediction model is trained on the pinyin string similarity features and Chinese character string similarity features together with the temperature features and album features generated from users' historical behavior data. It thus fully combines the temperature information of the prompt words with the similarity between the search keyword and the candidate prompt words in both pinyin and character strings to produce a comprehensive score, so that the score values of the candidate prompt words are more objective, and the ranking of the candidate prompt words matches users' click behavior and better meets users' needs.
In the embodiment of the present invention, the scoring strategy for candidate prompt words is simple: it is only necessary to generate the multi-dimensional data features required by the prediction algorithm and input them into the score value prediction model to obtain the score value, without specifying weights manually.
Referring to Fig. 6, a structural block diagram of an embodiment of an apparatus for generating search prompt words according to the present invention is shown. The apparatus includes:
a search keyword acquisition module 301, configured to obtain a search keyword of a user;
a candidate prompt word set generation module 302, configured to generate a candidate prompt word set of the search keyword according to the search keyword;
a data feature generation module 303, configured to generate data features of each candidate prompt word using the search keyword and each candidate prompt word in the candidate prompt word set;
a score value extraction module 304, configured to input the data features of each candidate prompt word into a pre-trained score value prediction model to obtain a score value of each candidate prompt word;
a target prompt word determination module 305, configured to determine target prompt words of the search keyword from the candidate prompt word set according to the score value of each candidate prompt word.
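The way modules 301 to 305 hand data to one another can be sketched as a small pipeline class. Everything in this sketch is hypothetical: the class name `SearchPromptGenerator`, the callable signatures, and the toy stand-ins for candidate lookup, feature generation and scoring are not taken from the patent and only illustrate the order of the stages.

```python
from typing import Callable, Dict, List


class SearchPromptGenerator:
    """Chains the stages described for modules 301-305."""

    def __init__(
        self,
        generate_candidates: Callable[[str], List[str]],
        generate_features: Callable[[str, str], List[float]],
        predict_score: Callable[[List[float]], float],
        top_n: int = 10,
    ) -> None:
        self.generate_candidates = generate_candidates
        self.generate_features = generate_features
        self.predict_score = predict_score
        self.top_n = top_n

    def suggest(self, keyword: str) -> List[str]:
        # Module 301 corresponds to receiving `keyword` from the user.
        candidates = self.generate_candidates(keyword)            # module 302
        scores: Dict[str, float] = {}
        for candidate in candidates:
            feats = self.generate_features(keyword, candidate)    # module 303
            scores[candidate] = self.predict_score(feats)         # module 304
        ranked = sorted(candidates, key=lambda c: scores[c], reverse=True)
        return ranked[: self.top_n]                               # module 305


# Toy stand-ins (invented): prefix lookup, one length-ratio feature,
# and a "model" that returns that feature as the score.
dictionary = ["sun wukong cartoon", "sun wukong", "moon"]
generator = SearchPromptGenerator(
    generate_candidates=lambda kw: [w for w in dictionary if w.startswith(kw)],
    generate_features=lambda kw, cand: [len(kw) / len(cand)],
    predict_score=lambda feats: feats[0],
)
suggestions = generator.suggest("sun")
```

Passing the stages in as callables keeps the sketch independent of how candidates are matched and how the 56 features are computed, both of which are defined elsewhere in the description.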
Optionally, the candidate prompt word set generation module 302 includes:
a candidate prompt word lookup submodule, configured to look up multiple matching candidate prompt words in a preset candidate prompt word dictionary according to the search keyword;
a candidate prompt word set generation submodule, configured to generate the candidate prompt word set from the multiple candidate prompt words.
Optionally, the data feature generation module 303 includes:
a pinyin string generation submodule, configured to generate a search keyword pinyin string and candidate prompt word pinyin strings from the search keyword and each candidate prompt word, respectively;
a pinyin string similarity feature generation submodule, configured to generate the pinyin string similarity feature of each candidate prompt word from the search keyword pinyin string and the candidate prompt word pinyin string;
a Chinese character string similarity feature generation submodule, configured to generate the Chinese character string similarity feature of each candidate prompt word from the Chinese character strings in the search keyword and the candidate prompt word;
a behavior operation acquisition submodule, configured to obtain the behavior operations performed on each candidate prompt word within a preset time period;
a temperature feature generation submodule, configured to generate the temperature feature of each candidate prompt word from the behavior operations performed on that candidate prompt word within the preset time period;
an album feature generation submodule, configured to generate the album feature of each candidate prompt word according to whether the candidate prompt word is an in-station album.
Optionally, the score value extraction module 304 includes:
a feature input submodule, configured to input the pinyin string similarity feature, the Chinese character string similarity feature, the temperature feature and the album feature of each candidate prompt word into the pre-trained score value prediction model to obtain the score value of each candidate prompt word.
Optionally, the apparatus further includes a model training module, and the model training module includes:
a training sample acquisition submodule, configured to obtain training samples, where each training sample includes one training search keyword, multiple training candidate prompt words corresponding to the training search keyword, and the hits for each training candidate prompt word;
a label determination submodule, configured to determine the label of each candidate prompt word according to the hits of each candidate prompt word;
a training data feature generation submodule, configured to generate the training data features of each training candidate prompt word using the training search keyword and the multiple training candidate prompt words corresponding to the training search keyword;
a training submodule, configured to train the score value prediction model based on a preset algorithm using the data features and the label of each training candidate prompt word.
Optionally, the training data feature generation submodule includes:
a pinyin string generation unit, configured to generate a training search keyword pinyin string and training candidate prompt word pinyin strings from the training search keyword and each training candidate prompt word, respectively;
a pinyin string feature generation unit, configured to generate the pinyin string similarity feature of each training candidate prompt word from the training search keyword pinyin string and the training candidate prompt word pinyin string;
a Chinese character string feature generation unit, configured to generate the Chinese character string similarity feature of each training candidate prompt word from the Chinese character strings in the training search keyword and the training candidate prompt word;
a behavior operation acquisition unit, configured to obtain the behavior operations performed on each training candidate prompt word within a preset time period;
a temperature feature generation unit, configured to generate the temperature feature of each training candidate prompt word from the behavior operations performed on that training candidate prompt word within the preset time period;
an album feature generation unit, configured to generate the album feature of each training candidate prompt word according to whether the training candidate prompt word is an in-station album.
Optionally, the score value prediction model is a LambdaMART model, and the training submodule includes:
a training unit, configured to train the LambdaMART model based on the GBDT algorithm using the pinyin string similarity feature, the Chinese character string similarity feature, the temperature feature and the album feature of each training candidate prompt word, together with the label of each training candidate prompt word.
Optionally, the target prompt word determination module 305 includes:
a sorting submodule, configured to sort all candidate prompt words by score value;
a target prompt word determination submodule, configured to determine the candidate prompt words whose rank falls within a preset range as the target prompt words.
As the apparatus embodiment is basically similar to the method embodiment, it is described relatively briefly; for relevant details, refer to the corresponding description of the method embodiment.
The embodiments in this specification are described in a progressive manner. Each embodiment focuses on its differences from the other embodiments, and for the same or similar parts the embodiments can be referred to one another.
Those skilled in the art should understand that the embodiments of the present invention can be provided as a method, an apparatus or a computer program product. Therefore, the embodiments of the present invention can take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the embodiments of the present invention can take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical memory, etc.) that contain computer-usable program code.
The embodiments of the present invention are described with reference to flowcharts and/or block diagrams of the method, the terminal device (system) and the computer program product according to the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions can be provided to the processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing terminal device to produce a machine, so that the instructions executed by the processor of the computer or other programmable data processing terminal device produce means for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing terminal device to operate in a specific manner, so that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, and the instruction means implement the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
These computer program instructions may also be loaded onto a computer or other programmable data processing terminal device, so that a series of operation steps are executed on the computer or other programmable terminal device to produce computer-implemented processing, and the instructions executed on the computer or other programmable terminal device thus provide steps for implementing the functions specified in one or more flows of the flowchart and/or one or more blocks of the block diagram.
Although preferred embodiments of the present invention have been described, those skilled in the art can make additional changes and modifications to these embodiments once they learn of the basic inventive concept. Therefore, the appended claims are intended to be interpreted as including the preferred embodiments and all changes and modifications that fall within the scope of the embodiments of the present invention.
Finally, it should also be noted that, in this document, relational terms such as "first" and "second" are used only to distinguish one entity or operation from another, and do not necessarily require or imply any actual relationship or order between these entities or operations. Moreover, the terms "include", "comprise" and any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or terminal device that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or terminal device. Without further limitation, an element defined by the phrase "including a ..." does not exclude the existence of other identical elements in the process, method, article or terminal device that includes the element.
The method for generating search prompt words and the apparatus for generating search prompt words provided by the present invention have been described in detail above. Specific examples are used herein to explain the principle and implementation of the present invention, and the description of the above embodiments is only intended to help understand the method of the present invention and its core idea. Meanwhile, for those of ordinary skill in the art, the specific implementation and the scope of application may change according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.
Claims (12)
1. A method for generating search prompt words, characterized in that the method comprises:
obtaining a search keyword of a user;
generating a candidate prompt word set of the search keyword according to the search keyword;
generating data features of each candidate prompt word using the search keyword and each candidate prompt word in the candidate prompt word set;
inputting the data features of each candidate prompt word into a pre-trained score value prediction model to obtain a score value of each candidate prompt word;
determining target prompt words of the search keyword from the candidate prompt word set according to the score value of each candidate prompt word.
2. The generation method according to claim 1, characterized in that the step of generating the candidate prompt word set of the search keyword according to the search keyword comprises:
looking up multiple matching candidate prompt words in a preset candidate prompt word dictionary according to the search keyword;
generating the candidate prompt word set from the multiple candidate prompt words.
3. The generation method according to claim 1, characterized in that the step of generating the data features of each candidate prompt word using the search keyword and each candidate prompt word in the candidate prompt word set comprises:
generating a search keyword pinyin string and candidate prompt word pinyin strings from the search keyword and each candidate prompt word, respectively;
generating the pinyin string similarity feature of each candidate prompt word from the search keyword pinyin string and the candidate prompt word pinyin string;
generating the Chinese character string similarity feature of each candidate prompt word from the Chinese character strings in the search keyword and the candidate prompt word;
obtaining the historical behavior operations performed on each candidate prompt word within a preset time period;
generating the temperature feature of each candidate prompt word from the historical behavior operations performed on that candidate prompt word within the preset time period;
generating the album feature of each candidate prompt word according to whether the candidate prompt word is an in-station album.
4. The generation method according to claim 3, characterized in that the step of inputting the data features of each candidate prompt word into the pre-trained score value prediction model to obtain the score value of each candidate prompt word comprises:
inputting the pinyin string similarity feature, the Chinese character string similarity feature, the temperature feature and the album feature of each candidate prompt word into the pre-trained score value prediction model to obtain the score value of each candidate prompt word.
5. The generation method according to any one of claims 1-4, characterized in that the score value prediction model is trained in the following manner:
obtaining training samples, where each training sample includes one training search keyword, multiple training candidate prompt words corresponding to the training search keyword, and the hits for each training candidate prompt word;
determining the label of each candidate prompt word according to the hits of each candidate prompt word;
generating the training data features of each training candidate prompt word using the training search keyword and the multiple training candidate prompt words corresponding to the training search keyword;
training the score value prediction model based on a preset algorithm using the data features and the label of each training candidate prompt word.
6. The generation method according to claim 5, characterized in that the score value prediction model is a LambdaMART model, the data features include the pinyin string similarity feature, the Chinese character string similarity feature, the temperature feature and the album feature of each candidate prompt word, and the step of training the score value prediction model based on a preset algorithm using the data features and the label of each training candidate prompt word comprises:
training the LambdaMART model based on the GBDT algorithm using the pinyin string similarity feature, the Chinese character string similarity feature, the temperature feature and the album feature of each training candidate prompt word, together with the label of each training candidate prompt word.
7. An apparatus for generating search prompt words, characterized in that the apparatus comprises:
a search keyword acquisition module, configured to obtain a search keyword of a user;
a candidate prompt word set generation module, configured to generate a candidate prompt word set of the search keyword according to the search keyword;
a data feature generation module, configured to generate data features of each candidate prompt word using the search keyword and each candidate prompt word in the candidate prompt word set;
a score value extraction module, configured to input the data features of each candidate prompt word into a pre-trained score value prediction model to obtain a score value of each candidate prompt word;
a target prompt word determination module, configured to determine target prompt words of the search keyword from the candidate prompt word set according to the score value of each candidate prompt word.
8. The generation apparatus according to claim 7, characterized in that the candidate prompt word set generation module comprises:
a candidate prompt word lookup submodule, configured to look up multiple matching candidate prompt words in a preset candidate prompt word dictionary according to the search keyword;
a candidate prompt word set generation submodule, configured to generate the candidate prompt word set from the multiple candidate prompt words.
9. The generation apparatus according to claim 7, characterized in that the data feature generation module comprises:
a pinyin string generation submodule, configured to generate a search keyword pinyin string and candidate prompt word pinyin strings from the search keyword and each candidate prompt word, respectively;
a pinyin string similarity feature generation submodule, configured to generate the pinyin string similarity feature of each candidate prompt word from the search keyword pinyin string and the candidate prompt word pinyin string;
a Chinese character string similarity feature generation submodule, configured to generate the Chinese character string similarity feature of each candidate prompt word from the Chinese character strings in the search keyword and the candidate prompt word;
a behavior operation acquisition submodule, configured to obtain the behavior operations performed on each candidate prompt word within a preset time period;
a temperature feature generation submodule, configured to generate the temperature feature of each candidate prompt word from the behavior operations performed on that candidate prompt word within the preset time period;
an album feature generation submodule, configured to generate the album feature of each candidate prompt word according to whether the candidate prompt word is an in-station album.
10. The generation apparatus according to claim 9, characterized in that the score value extraction module comprises:
a feature input submodule, configured to input the pinyin string similarity feature, the Chinese character string similarity feature, the temperature feature and the album feature of each candidate prompt word into the pre-trained score value prediction model to obtain the score value of each candidate prompt word.
11. The generation apparatus according to any one of claims 7-10, characterized in that the apparatus further comprises a model training module, and the model training module comprises:
a training sample acquisition submodule, configured to obtain training samples, where each training sample includes one training search keyword, multiple training candidate prompt words corresponding to the training search keyword, and the hits for each training candidate prompt word;
a label determination submodule, configured to determine the label of each candidate prompt word according to the hits of each candidate prompt word;
a training data feature generation submodule, configured to generate the training data features of each training candidate prompt word using the training search keyword and the multiple training candidate prompt words corresponding to the training search keyword;
a training submodule, configured to train the score value prediction model based on a preset algorithm using the data features and the label of each training candidate prompt word.
12. The generation apparatus according to claim 11, characterized in that the score value prediction model is a LambdaMART model, the data features include the pinyin string similarity feature, the Chinese character string similarity feature, the temperature feature and the album feature of each candidate prompt word, and the training submodule comprises:
a training unit, configured to train the LambdaMART model based on the GBDT algorithm using the pinyin string similarity feature, the Chinese character string similarity feature, the temperature feature and the album feature of each training candidate prompt word, together with the label of each training candidate prompt word.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810442164.8A CN108763332A (en) | 2018-05-10 | 2018-05-10 | A kind of generation method and device of Search Hints word |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810442164.8A CN108763332A (en) | 2018-05-10 | 2018-05-10 | A kind of generation method and device of Search Hints word |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108763332A true CN108763332A (en) | 2018-11-06 |
Family
ID=64009543
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810442164.8A Pending CN108763332A (en) | 2018-05-10 | 2018-05-10 | A kind of generation method and device of Search Hints word |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108763332A (en) |
Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR20090056273A (en) * | 2007-11-30 | 2009-06-03 | 주식회사 다음커뮤니케이션 | Keyword searching method |
CN103136213A (en) * | 2011-11-23 | 2013-06-05 | 阿里巴巴集团控股有限公司 | Method and device for providing related words |
CN103631794A (en) * | 2012-08-22 | 2014-03-12 | 百度在线网络技术(北京)有限公司 | Method, device and equipment for sorting search results |
CN105224554A (en) * | 2014-06-11 | 2016-01-06 | 阿里巴巴集团控股有限公司 | Method, system, server and intelligent terminal for recommending search words for searching |
CN105653697A (en) * | 2015-12-30 | 2016-06-08 | 北京奇艺世纪科技有限公司 | Recommended word retrieval method and system |
CN105843850A (en) * | 2016-03-15 | 2016-08-10 | 北京百度网讯科技有限公司 | Searching optimization method and device |
CN107169010A (en) * | 2017-03-31 | 2017-09-15 | 北京奇艺世纪科技有限公司 | A kind of determination method and device of recommendation search keyword |
CN107463704A (en) * | 2017-08-16 | 2017-12-12 | 北京百度网讯科技有限公司 | Searching method and device based on artificial intelligence |
CN107885889A (en) * | 2017-12-13 | 2018-04-06 | 聚好看科技股份有限公司 | Feedback method, display method and device for search results |
Non-Patent Citations (1)
Title |
---|
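Xie Bin et al.: "Hybrid recommendation algorithm based on learning to rank", Journal of Heilongjiang University of Science and Technology * |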
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111639255B (en) * | 2019-03-01 | 2023-12-29 | 北京字节跳动网络技术有限公司 | Recommendation method and device for search keywords, storage medium and electronic equipment |
CN111639255A (en) * | 2019-03-01 | 2020-09-08 | 北京字节跳动网络技术有限公司 | Search keyword recommendation method and device, storage medium and electronic equipment |
CN111125310A (en) * | 2019-12-24 | 2020-05-08 | 北京百度网讯科技有限公司 | Voice query method and device |
CN111563361B (en) * | 2020-04-01 | 2024-05-14 | 北京小米松果电子有限公司 | Text label extraction method and device and storage medium |
CN111931500A (en) * | 2020-09-21 | 2020-11-13 | 北京百度网讯科技有限公司 | Search information processing method and device |
CN111931500B (en) * | 2020-09-21 | 2023-06-23 | 北京百度网讯科技有限公司 | Search information processing method and device |
CN112541076A (en) * | 2020-11-09 | 2021-03-23 | 北京百度网讯科技有限公司 | Method and device for generating extended corpus of target field and electronic equipment |
CN112541076B (en) * | 2020-11-09 | 2024-03-29 | 北京百度网讯科技有限公司 | Method and device for generating expanded corpus in target field and electronic equipment |
WO2022134355A1 (en) * | 2020-12-25 | 2022-06-30 | 平安科技(深圳)有限公司 | Keyword prompt-based search method and apparatus, and electronic device and storage medium |
CN113343082A (en) * | 2021-05-25 | 2021-09-03 | 北京字节跳动网络技术有限公司 | Hot field prediction model generation method and device, storage medium and equipment |
CN113343147B (en) * | 2021-06-18 | 2024-01-19 | 北京百度网讯科技有限公司 | Information processing method, apparatus, device, medium, and program product |
CN113343147A (en) * | 2021-06-18 | 2021-09-03 | 北京百度网讯科技有限公司 | Information processing method, apparatus, device, medium, and program product |
CN113553398A (en) * | 2021-07-15 | 2021-10-26 | 杭州网易云音乐科技有限公司 | Search word correcting method and device, electronic equipment and computer storage medium |
CN113553398B (en) * | 2021-07-15 | 2024-01-26 | 杭州网易云音乐科技有限公司 | Search word correction method, search word correction device, electronic equipment and computer storage medium |
CN113312523A (en) * | 2021-07-30 | 2021-08-27 | 北京达佳互联信息技术有限公司 | Dictionary generation and search keyword recommendation method and device and server |
TWI823242B (en) * | 2022-01-27 | 2023-11-21 | 中國信託商業銀行股份有限公司 | Text generation method and device |
CN116304277B (en) * | 2023-03-01 | 2023-12-15 | 张素愿 | Intelligent matching method, system and storage medium based on AI |
CN116304277A (en) * | 2023-03-01 | 2023-06-23 | 深圳一资源网络平台有限公司 | Intelligent matching method, system and storage medium based on AI |
CN117349400A (en) * | 2023-12-04 | 2024-01-05 | 环球数科集团有限公司 | Prompt word construction method based on AIGC |
CN117349400B (en) * | 2023-12-04 | 2024-02-27 | 环球数科集团有限公司 | Prompt word construction method based on AIGC |
CN117744753A (en) * | 2024-02-19 | 2024-03-22 | 浙江同花顺智能科技有限公司 | Method, device, equipment and medium for determining prompt word of large language model |
CN117744753B (en) * | 2024-02-19 | 2024-05-03 | 浙江同花顺智能科技有限公司 | Method, device, equipment and medium for determining prompt word of large language model |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108763332A (en) | | A kind of generation method and device of Search Hints word |
CN108021616B (en) | | Community question-answer expert recommendation method based on recurrent neural network |
CN107491531B (en) | | Chinese network comment sensibility classification method based on integrated study frame |
CN106815252B (en) | | Searching method and device |
CN106709040B (en) | | Application search method and server |
CN105302810B (en) | | A kind of information search method and device |
CN103631961B (en) | | Method for identifying relationship between sentiment words and evaluation objects |
CN107608960B (en) | | Method and device for linking named entities |
CN106649818A (en) | | Recognition method and device for application search intentions and application search method and server |
Ma et al. | | Course recommendation based on semantic similarity analysis |
CN103869998B (en) | | A kind of method and device being ranked up to candidate item caused by input method |
CN110134792B (en) | | Text recognition method and device, electronic equipment and storage medium |
CN107247751B (en) | | LDA topic model-based content recommendation method |
CN113505204B (en) | | Recall model training method, search recall device and computer equipment |
CN111221962A (en) | | Text emotion analysis method based on new word expansion and complex sentence pattern expansion |
Chen et al. | | Automatic key term extraction from spoken course lectures using branching entropy and prosodic/semantic features |
CN108304373A (en) | | Construction method, device, storage medium and the electronic device of semantic dictionary |
CN111738002A (en) | | Ancient text field named entity identification method and system based on Lattice LSTM |
US20220366282A1 (en) | | Systems and Methods for Active Curriculum Learning |
CN112417119A (en) | | Open domain question-answer prediction method based on deep learning |
Popa et al. | | Bart-tl: Weakly-supervised topic label generation |
CN115526590A (en) | | Efficient human-sentry matching and re-pushing method combining expert knowledge and algorithm |
Monti et al. | | An ensemble approach of recurrent neural networks using pre-trained embeddings for playlist completion |
CN111666374A (en) | | Method for integrating additional knowledge information into deep language model |
CN112597768B (en) | | Text auditing method, device, electronic equipment, storage medium and program product |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | PB01 | Publication | |
 | SE01 | Entry into force of request for substantive examination | |
 | RJ01 | Rejection of invention patent application after publication | Application publication date: 20181106 |