CN110390052A - Search recommendation method, training method for CTR prediction model, apparatus, and device - Google Patents

Search recommendation method, training method for CTR prediction model, apparatus, and device

Info

Publication number
CN110390052A
Authority
CN
China
Prior art keywords
word
search word
words
candidate word
candidate
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910675882.4A
Other languages
Chinese (zh)
Other versions
CN110390052B (en)
Inventor
周智毅
梁旭磊
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201910675882.4A
Publication of CN110390052A
Application granted
Publication of CN110390052B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application relates to the field of machine learning technologies, and in particular to a search recommendation method, a training method for a CTR prediction model, an apparatus, and a device. The method comprises: obtaining a search term and a candidate word set of the search term; generating at least one group of model input parameters according to an order feature of the search term, where each group of model input parameters corresponds to one pairing of the search term and a candidate word, and the order feature of the search term characterizes the input order of the words contained in the search term; calling a CTR prediction model to calculate the CTR value of each candidate word according to the model input parameters; and selecting, according to the CTR values of the candidate words, recommended candidate words corresponding to the search term from the candidate word set. By taking the order feature of the search term into account during CTR prediction, the application makes the predicted CTR value of each candidate word more accurate, so that the recommended candidate words better match the user's needs and the accuracy of the recommended candidate words provided to the user is improved.

Description

Search recommendation method, training method for CTR prediction model, apparatus, and device
Technical field
The embodiments of the present application relate to the field of machine learning technologies, and in particular to a search recommendation method, a training method for a CTR prediction model, an apparatus, and a device.
Background
After a user enters a search term in the search box of an application, the application can recommend to the user some recommended candidate words related to that search term, so that the user can directly select one of the recommended candidate words to search for content.
In the related art, a CTR (Click Through Rate) prediction model is used to predict the CTR value of each candidate word related to the search term, and several recommended candidate words are then chosen based on those CTR values. For example, the several candidate words with the largest CTR values are selected as the recommended candidate words. When the CTR prediction model predicts the CTR value of a candidate word related to the search term, it combines the feature information of the search term with the feature information of the candidate word and finally outputs the CTR value of that candidate word.
However, the related art does not consider a sufficiently comprehensive set of features during CTR prediction, so the accuracy of the search-term-related recommended candidate words finally recommended to the user is low.
Summary of the invention
The embodiments of the present application provide a search recommendation method, a training method for a CTR prediction model, an apparatus, and a device, which can be used to solve the problem in the related art that, because the features considered are not comprehensive enough, the accuracy of the search-term-related recommended candidate words finally recommended to the user is low. The technical solutions are as follows:
In one aspect, an embodiment of the present application provides a search recommendation method, the method comprising:
obtaining a search term and a candidate word set of the search term, the candidate word set including at least one candidate word related to the search term;
generating at least one group of model input parameters according to an order feature of the search term, where each group of model input parameters corresponds to one pairing of the search term and a candidate word, and the order feature of the search term characterizes the input order of the words contained in the search term;
calling a click-through rate (CTR) prediction model to calculate the CTR value of the candidate word according to the model input parameters;
selecting, according to the CTR value of the candidate word, a recommended candidate word corresponding to the search term from the candidate word set.
In another aspect, an embodiment of the present application provides a training method for a CTR prediction model, the method comprising:
obtaining a historical search term and historically exposed candidate words corresponding to the historical search term;
constructing at least one group of training samples according to an order feature of the historical search term, where each group of training samples corresponds to one pairing of the historical search term and a historically exposed candidate word, and the order feature of the historical search term characterizes the input order of the words contained in the historical search term;
training the CTR prediction model with the training samples to obtain a trained CTR prediction model.
In another aspect, an embodiment of the present application provides a search recommendation apparatus, the apparatus comprising:
a search term obtaining module, configured to obtain a search term;
a candidate word obtaining module, configured to obtain a candidate word set of the search term, the candidate word set including at least one candidate word related to the search term;
a parameter generation module, configured to generate at least one group of model input parameters according to an order feature of the search term, where each group of model input parameters corresponds to one pairing of the search term and a candidate word, and the order feature of the search term characterizes the input order of the words contained in the search term;
a model calling module, configured to call a click-through rate (CTR) prediction model to calculate the CTR value of the candidate word according to the model input parameters;
a recommendation selection module, configured to select, according to the CTR value of the candidate word, a recommended candidate word corresponding to the search term from the candidate word set.
In some possible designs, the apparatus further comprises:
a phonetic annotation module, configured to perform phonetic annotation on the words contained in the search term to obtain the full spelling of those words;
a spelling feature generation module, configured to generate a spelling feature of the search term according to the full spelling of the words contained in the search term, where the model input parameters further include the spelling feature of the search term.
In some possible designs, the apparatus further comprises:
an initial obtaining module, configured to perform phonetic annotation on the words contained in the search term to obtain the initial letters of those words;
an initial feature generation module, configured to generate an initial-letter feature of the search term according to the initial letters of the words contained in the search term, where the model input parameters further include the initial-letter feature of the search term.
In some possible designs, the model input parameters include n feature slots for holding the order feature of the search term, where n is an integer greater than 1. When the number m of words contained in the search term is greater than or equal to n, the n feature slots hold the first n words of the search term, where m is an integer greater than 1. When m is less than n, the n feature slots hold the m words of the search term, and the n-m feature slots after the m-th feature slot are left empty.
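The truncate-or-pad rule for the n feature slots can be illustrated with a minimal sketch. The function name `fill_feature_slots` and the use of `None` to mark an empty slot are assumptions made for illustration; the application does not specify how an empty slot is encoded.

```python
from typing import List, Optional


def fill_feature_slots(words: List[str], n: int) -> List[Optional[str]]:
    """Place the words of a search term into exactly n feature slots.

    Hypothetical helper: None stands in for an "empty" slot.
    """
    if n <= 1:
        raise ValueError("n is assumed to be an integer greater than 1")
    slots: List[Optional[str]] = list(words[:n])   # m >= n: keep the first n words
    slots += [None] * (n - len(slots))             # m < n: leave the last n-m slots empty
    return slots


# A five-word term is truncated to four slots; a two-word term is padded.
print(fill_feature_slots(["I", "only", "like", "you", "today"], 4))
print(fill_feature_slots(["I", "only"], 4))
```

Keeping the slot count fixed gives the model an input of constant width regardless of how long the search term is.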
In some possible designs, the candidate word obtaining module is further configured to generate keywords corresponding to the search term, the keywords including at least one of the following: the words contained in the search term, the full spelling of the words contained in the search term, and the initial letters of the words contained in the search term; and to query an inverted index table with the keywords to obtain the candidate word set of the search term, where the inverted index table contains correspondences between keywords and candidate words.
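A minimal sketch of the keyword lookup follows. The index contents, the function name `lookup_candidates`, and the choice to take the union of the per-keyword results are all assumptions for illustration; the application does not state how results for several keywords are combined.

```python
from typing import Dict, List, Set

# Hypothetical inverted index: keyword -> candidate words associated with it.
# Keywords may be words, full spellings, or initial letters.
INVERTED_INDEX: Dict[str, Set[str]] = {
    "like": {"I only like you", "like a song"},
    "only": {"I only like you", "only child"},
    "xihuan": {"I only like you"},
}


def lookup_candidates(keywords: List[str], index: Dict[str, Set[str]]) -> Set[str]:
    """Collect every candidate word reachable from any of the keywords."""
    candidates: Set[str] = set()
    for keyword in keywords:
        # Unknown keywords simply contribute nothing.
        candidates |= index.get(keyword, set())
    return candidates


print(sorted(lookup_candidates(["only", "xihuan"], INVERTED_INDEX)))
```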
In some possible designs, the apparatus further comprises:
a document set construction module, configured to construct a candidate document set that includes at least one document, the documents including at least one of the following: historical search terms whose click volume is greater than a threshold, titles of documents in the search database, and titles of trending search events;
a correspondence generation module, configured to generate, for each document, correspondences related to that document, where the keywords in the correspondences include at least one of the following: the words contained in the document, the full spelling of the words contained in the document, and the initial letters of the words contained in the document, and the candidate word in the correspondences is the document itself;
an index table generation module, configured to integrate the correspondences related to each document to generate the inverted index table.
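The index construction described by these three modules can be sketched as below. Whitespace tokenization and the use of each word's first letter as its "initial" are stand-ins chosen so the example is self-contained; the application works with phonetic spellings and initials, and the names `keywords_for` and `build_inverted_index` are hypothetical.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set


def keywords_for(document: str) -> Set[str]:
    """Derive keywords for one document: its words plus each word's first letter.

    Stand-in for the words / full spellings / initials named in the text.
    """
    words = document.lower().split()
    initials = {word[0] for word in words}
    return set(words) | initials


def build_inverted_index(documents: Iterable[str]) -> Dict[str, Set[str]]:
    """Merge the per-document correspondences into one keyword -> documents table."""
    index: Dict[str, Set[str]] = defaultdict(set)
    for document in documents:
        for keyword in keywords_for(document):
            index[keyword].add(document)   # the document itself is the candidate word
    return dict(index)


index = build_inverted_index(["red apple", "red car"])
print(sorted(index["red"]))
```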
In some possible designs, the apparatus further comprises:
a similarity calculation module, configured to calculate the similarity between the search term and each candidate word;
a word set update module, configured to determine the candidate words whose similarity is greater than a preset similarity as an updated candidate word set, where the CTR prediction model calculates the CTR value of each candidate word in the updated candidate word set.
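The similarity filter can be sketched as follows. The application does not name a similarity measure here, so the standard-library `difflib.SequenceMatcher` ratio is swapped in purely as a placeholder; the function name and the default threshold of 0.5 are likewise assumptions.

```python
from difflib import SequenceMatcher
from typing import List


def filter_by_similarity(search_term: str,
                         candidates: List[str],
                         min_similarity: float = 0.5) -> List[str]:
    """Keep only candidates whose similarity to the search term exceeds the threshold."""
    kept = []
    for candidate in candidates:
        # Placeholder measure: character-level matching ratio in [0, 1].
        score = SequenceMatcher(None, search_term, candidate).ratio()
        if score > min_similarity:
            kept.append(candidate)
    return kept


print(filter_by_similarity("abc", ["abc", "xyz"]))
```

Filtering before prediction means the CTR prediction model only has to score candidates that are plausibly related to the search term.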
In yet another aspect, an embodiment of the present application provides a training apparatus for a CTR prediction model, the apparatus comprising:
a word obtaining module, configured to obtain a historical search term and historically exposed candidate words corresponding to the historical search term;
a sample construction module, configured to construct at least one group of training samples according to an order feature of the historical search term, where each group of training samples corresponds to one pairing of the historical search term and a historically exposed candidate word, and the order feature of the historical search term characterizes the input order of the words contained in the historical search term;
a model training module, configured to train the CTR prediction model with the training samples to obtain a trained CTR prediction model.
In some possible designs, the word processing module is configured to split the historical search term at the level of individual words to obtain each word contained in the historical search term, and to obtain the order feature of the historical search term according to the order of those words.
In some possible designs, the apparatus further comprises:
a phonetic annotation module, configured to perform phonetic annotation on the words contained in the historical search term to obtain the full spelling of those words;
a spelling feature generation module, configured to generate a spelling feature of the historical search term according to the full spelling of the words contained in the historical search term, where the training samples further include the spelling feature of the historical search term.
In some possible designs, the apparatus further comprises:
an initial obtaining module, configured to perform phonetic annotation on the words contained in the historical search term to obtain the initial letters of those words;
an initial feature generation module, configured to generate an initial-letter feature of the historical search term according to the initial letters of the words contained in the historical search term, where the training samples further include the initial-letter feature of the historical search term.
In still another aspect, an embodiment of the present application provides a computer device comprising a processor and a memory, the memory storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by the processor to implement the search recommendation method described in the above aspect, or to implement the training method for the CTR prediction model described in the above aspect.
In still another aspect, an embodiment of the present application provides a computer-readable storage medium storing at least one instruction, at least one program, a code set, or an instruction set, which is loaded and executed by a processor to implement the search recommendation method described in the above aspect, or to implement the training method for the CTR prediction model described in the above aspect.
In yet another aspect, an embodiment of the present application provides a computer program product which, when executed by a processor, implements the above search recommendation method or the above training method for the CTR prediction model.
The technical solutions provided by the embodiments of the present application may bring the following beneficial effects:
When the CTR value of each candidate word of a search term is predicted by the CTR prediction model, the feature information of the search term contained in the model input parameters includes the order feature of the search term, which characterizes the input order of the words contained in the search term. Because the order feature of the search term accurately reflects the user's actual search intent, taking it into account during CTR prediction makes the CTR value predicted for each candidate word more accurate, so that the recommended candidate words better match the user's needs and the accuracy of the recommended candidate words provided to the user is improved.
Brief description of the drawings
Fig. 1 is a flowchart of the search recommendation method provided by the present application;
Fig. 2 is a flowchart of the search recommendation method provided by one embodiment of the present application;
Fig. 3 is a flowchart of the search recommendation method provided by another embodiment of the present application;
Fig. 4 is a schematic diagram of one model input parameter tag setting;
Fig. 5 is a schematic diagram of another model input parameter tag setting;
Fig. 6 is a schematic diagram of displaying recommended candidate words;
Fig. 7 is a flowchart of the search recommendation method provided by one embodiment of the present application;
Fig. 8 is a schematic diagram of a Wide&Deep model;
Fig. 9 is a schematic diagram of a Wide&Deep model training process;
Fig. 10 is a flowchart of a CTR prediction model training method;
Fig. 11 is a block diagram of the search recommendation apparatus provided by one embodiment of the present application;
Fig. 12 is a block diagram of the search recommendation apparatus provided by another embodiment of the present application;
Fig. 13 is a block diagram of the training apparatus for a CTR prediction model provided by one embodiment of the present application;
Fig. 14 is a block diagram of the training apparatus for a CTR prediction model provided by another embodiment of the present application;
Fig. 15 is a structural block diagram of the computer device provided by one embodiment of the present application.
Detailed description
To make the objectives, technical solutions, and advantages of the present application clearer, the embodiments of the present application are described in further detail below with reference to the accompanying drawings.
First, the terms involved in the embodiments of the present application are briefly introduced.
CTR (Click Through Rate): in general, CTR refers to the click arrival rate of online advertisements (such as image ads, text ads, and video ads) and is an important indicator of the effectiveness of Internet advertising. For example, after a keyword is entered in a search engine, the relevant web pages are listed in order according to factors such as bidding, and the user can then click into whichever page is of interest. Taking the number of times a website appears in search results as the total, the proportion of those times in which the user clicks into the website is the CTR. In the embodiments of the present application, CTR refers to the CTR value that the CTR prediction model predicts for each candidate word in the candidate word set; the higher the CTR value, the more likely the candidate word is to be clicked by the user.
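The ratio in this definition can be written out directly. The function name and the decision to return 0.0 when there are no impressions are assumptions for illustration.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """Observed CTR: the share of impressions that led to a click."""
    if impressions <= 0:
        return 0.0   # assumption: treat "never shown" as a CTR of zero
    return clicks / impressions


# Hypothetical counts: shown 1000 times, clicked 30 times.
print(click_through_rate(30, 1000))
```

Note that this is the observed ratio; in the embodiments the CTR value of a candidate word is a model prediction rather than a count-based statistic.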
Click volume: also commonly called PV (Page View), this is the number of times web pages are viewed and is used to measure how many pages users visit. Each page a user opens is recorded as 1 PV, and opening the same page multiple times accumulates page views.
In the method provided by the embodiments of the present application, each step may be executed by a computer device, that is, an electronic device with data computing, processing, and storage capabilities, such as a PC (Personal Computer) or a server.
The technical solutions provided by the embodiments of the present application implement search recommendation by means of machine learning techniques in the field of AI (Artificial Intelligence), so that the recommended candidate words better match the user's needs and the accuracy of the recommended candidate words provided to the user is improved.
Artificial intelligence is the theory, methods, technologies, and application systems that use a digital computer, or a machine controlled by a digital computer, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use knowledge to obtain optimal results. In other words, artificial intelligence is a comprehensive technology of computer science that attempts to understand the essence of intelligence and to produce a new kind of intelligent machine that can respond in a way similar to human intelligence. Artificial intelligence studies the design principles and implementation methods of various intelligent machines so that the machines have the capabilities of perception, reasoning, and decision-making.
Artificial intelligence technology is a comprehensive discipline that covers a wide range of fields and includes both hardware-level and software-level technologies. Basic artificial intelligence technologies generally include sensors, dedicated artificial intelligence chips, cloud computing, distributed storage, big data processing, operating/interaction systems, and mechatronics. Artificial intelligence software technologies mainly include computer vision, speech processing, natural language processing, and machine learning/deep learning.
ML (Machine Learning) is a multidisciplinary field that draws on probability theory, statistics, approximation theory, convex analysis, algorithmic complexity theory, and other subjects. It studies how computers can simulate or implement human learning behavior in order to acquire new knowledge or skills and reorganize existing knowledge structures so as to continuously improve their own performance. Machine learning is the core of artificial intelligence and the fundamental way to make computers intelligent, and it is applied throughout all areas of artificial intelligence. Machine learning and deep learning generally include techniques such as artificial neural networks, belief networks, reinforcement learning, transfer learning, inductive learning, and learning from demonstration.
With the research and progress of artificial intelligence technology, it has been studied and applied in many fields, such as smart homes, smart wearable devices, virtual assistants, smart speakers, smart marketing, unmanned driving, autonomous driving, drones, robots, smart healthcare, and smart customer service. As the technology develops, artificial intelligence is expected to be applied in more fields and to deliver increasingly important value.
The solutions provided by the embodiments of the present application involve technologies such as machine learning in artificial intelligence and are described in detail through the following embodiments.
Refer to FIG. 1, which shows a flowchart of the search recommendation method provided by the present application.
After obtaining the search term 11, the computer device can query the inverted index table 12 to obtain the candidate word set 13 of the search term. The computer device can then generate model input parameters 14 that include the feature information of the search term and the feature information of one candidate word, where the feature information of the search term includes the order feature of the search term. The computer device can call the CTR prediction model 15 to calculate the CTR value 16 of each candidate word, sort the candidate words by CTR value 16 from high to low, and select at least one of the top-ranked candidate words as the recommended candidate words corresponding to the search term. Finally, the recommended candidate word list 17 is displayed in a drop-down box.
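The final ranking step in this flow, sorting candidates by predicted CTR from high to low and keeping the top ones, can be sketched as follows. The function name, the parameter `k`, and the CTR values are made up for illustration.

```python
from typing import Dict, List


def select_recommended(ctr_by_candidate: Dict[str, float], k: int) -> List[str]:
    """Return the k candidate words with the highest predicted CTR values."""
    ranked = sorted(ctr_by_candidate.items(), key=lambda item: item[1], reverse=True)
    return [candidate for candidate, _ in ranked[:k]]


# Hypothetical predictions for three candidate words.
predicted = {"candidate A": 0.12, "candidate B": 0.31, "candidate C": 0.05}
print(select_recommended(predicted, 2))
```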
To generate the inverted index table, the computer device can obtain the historical search log 18 of users and filter and clean the data in the historical search log 18 to remove candidate words that were exposed many times without being clicked, as well as some abnormal data. Historical search terms 19 whose click volume is greater than a threshold can then be selected from the filtered and cleaned historical search log and added to the candidate document set 20. In addition, the titles 21 of documents in the search database and the titles 22 of trending search events can also be added to the candidate document set 20. After the candidate document set 20 is obtained, it can also be standardized to obtain the standard candidate document set 23. For each document in the standard candidate document set 23, correspondences related to that document are generated, and the correspondences related to each document are integrated to generate the inverted index table 12.
To train the CTR prediction model, the computer device can obtain the exposure log 24 of candidate words and obtain from the exposure log 24 the historically exposed candidate words corresponding to historical search terms. It then constructs training samples 25 based on the feature information of the historical search terms and the feature information of the historically exposed candidate words, where the feature information of a historical search term includes the order feature of the historical search term, and trains the CTR prediction model offline with the training samples 25 (26) to obtain the trained CTR prediction model 15. After training is complete, the CTR prediction model 15 can be pushed online so that other computer devices can call it for prediction when needed.
In the search recommendation method provided by the embodiments of the present application, each step may be executed by a terminal device such as a mobile phone, tablet computer, or PC, or by a server. For example, when the CTR prediction model is deployed on a terminal device, each step of the search recommendation method may be executed by that terminal device; when the CTR prediction model is deployed on a server, each step may be executed by that server. For ease of description, the following embodiments of the search recommendation method are described with a computer device as the executing entity of each step.
The technical solutions of the present application are described below through several embodiments.
Refer to FIG. 2, which shows a flowchart of the search recommendation method provided by one embodiment of the present application. In this embodiment, the method is described as applied to the computer device introduced above. The method may include the following steps:
Step 201: obtain a search term and a candidate word set of the search term.
The candidate word set may include at least one candidate word related to the search term, so that the computer device can select recommended candidate words from the candidate word set of the search term.
Step 202: generate at least one group of model input parameters according to the order feature of the search term, where each group of model input parameters corresponds to one pairing of the search term and a candidate word, and the order feature of the search term characterizes the input order of the words contained in the search term.
After the search term is obtained, at least one group of model input parameters can be generated according to the order feature of the search term, with each group corresponding to one pairing of the search term and a candidate word. For example, if p candidate words related to the search term are obtained, p groups of model input parameters can be generated, each group including the search term and one of the p candidate words, where p is a positive integer.
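The one-group-per-candidate construction can be sketched as below. The dictionary layout and the function name `build_model_inputs` are assumptions for illustration; the application does not prescribe a concrete data structure for a group of model input parameters.

```python
from typing import Dict, List


def build_model_inputs(search_term_words: List[str],
                       candidates: List[str]) -> List[Dict[str, object]]:
    """Create one group of model input parameters per candidate word.

    Every group shares the same ordered search-term words and differs
    only in the candidate word it is paired with.
    """
    return [
        {"search_term_order": list(search_term_words), "candidate": candidate}
        for candidate in candidates
    ]


groups = build_model_inputs(["I", "only", "like", "you"], ["cand 1", "cand 2", "cand 3"])
print(len(groups))  # p candidate words yield p groups
```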
After obtaining the search term, the computer device can extract the order feature of the search term. The order feature characterizes the input order of the words contained in the search term, that is, which words the search term contains and in what order they were entered. For example, if the search term is "I only like you", the input order of its words is "I", "only", "like", "you".
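A minimal sketch of why an order feature carries more information than an unordered set of words follows. Representing the order feature as a tuple is an assumption for illustration, and the helper names are hypothetical.

```python
from typing import FrozenSet, List, Tuple


def order_feature(words: List[str]) -> Tuple[str, ...]:
    """Keep the words in the order they were entered."""
    return tuple(words)


def bag_of_words(words: List[str]) -> FrozenSet[str]:
    """Order-free view of the same words, shown only for contrast."""
    return frozenset(words)


first = ["I", "only", "like", "you"]
second = ["you", "only", "like", "I"]

# Same words, so the unordered views are identical...
print(bag_of_words(first) == bag_of_words(second))
# ...but the order features differ, which lets a model tell the two inputs apart.
print(order_feature(first) == order_feature(second))
```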
By the input sequence difference for the words that search term is included, the search intention of representative user can not yet Together.Therefore, in order to accurately know that the actual search of user is intended to, the sequence that computer equipment can extract the search term is special Sign, is based further on the ordinal characteristics and scans for recommending.
Optionally, the ordinal characteristics of said extracted search term need first to carry out word segmentation processing to search term, obtain the search The words that word is included.Wherein, above-mentioned word segmentation processing can be is divided with words rank, is also possible to the rank of single word To divide.
After extracting the ordinal characteristics of search term, computer equipment can be generated according to the ordinal characteristics of the search term At least one set of mode input parameter.Above-mentioned mode input parameter is for being input to CTR prediction model, to estimate each of search term The CTR value of candidate word.
In an exemplary embodiment, the model input parameters may include feature information of the search term and feature information of the candidate word.
The feature information of the search term characterizes the attributes of the search term. It may include the ordinal feature of the search term.
In one example, the feature information of the search term may also include a spelling feature, which characterizes the sequence of the full pinyin spelling corresponding to the search term.
In this case, the following steps may also be performed before the at least one group of model input parameters is generated:
(1) perform phonetic annotation on the words contained in the search term to obtain the full spelling of those words;
(2) generate the spelling feature of the search term according to the full spelling of the words it contains.
Performing phonetic annotation on the words contained in the search term means annotating each word with its pronunciation, so as to obtain the full spelling of the words. For example, if the words contained in the search term are "I", "only", "like", "you", their full spellings are "wo", "zhi", "xihuan", "ni". Based on the full spelling of the words contained in the search term, the spelling feature of the search term can then be generated through further processing.
Optionally, the spelling feature of the search term can be expressed as a one-hot feature. In this case, once the full spelling of the words contained in the search term is obtained, one-hot encoding can be applied to it to generate the spelling feature of the search term in one-hot form.
In another example, the feature information of the search term may also include an initial feature, which characterizes the sequence of the spelling initials corresponding to the search term.
In this case, the following steps may also be performed before the at least one group of model input parameters is generated:
(1) perform phonetic annotation on the words contained in the search term to obtain the initials of those words;
(2) generate the initial feature of the search term according to the initials of the words it contains.
After phonetic annotation is performed on the words contained in the search term, the full spelling of those words is available, and the initials of the words can then be extracted from the full spelling.
Optionally, when segmentation is performed at the level of single characters, the initials of the words contained in the search term are the initials of the full spelling of each character. When segmentation is performed at the word level and the words contained in the search term include a multi-character word, the initials include the initial of the full spelling of the first character of that word.
For example, if the full spellings of the words contained in the search term are "wo", "zhi", "xi", "huan", "ni", the initials are "w", "z", "x", "h", "n". As another example, if the full spellings are "wo", "zhi", "xihuan", "ni", the initials are "w", "z", "x", "n". Based on the initials of the words contained in the search term, the initial feature of the search term can then be generated through further processing.
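The two examples above can be reproduced with a short sketch. It assumes the full spellings have already been produced by some phonetic annotation step, which is outside this snippet; the function name `initials_of` is hypothetical.

```python
from typing import List


def initials_of(spellings: List[str]) -> List[str]:
    """Return the first letter of each full spelling, skipping empty strings."""
    return [s[0] for s in spellings if s]


# Character-level segmentation: one spelling per character.
print(initials_of(["wo", "zhi", "xi", "huan", "ni"]))

# Word-level segmentation: "xihuan" is one word, so only its first letter is kept.
print(initials_of(["wo", "zhi", "xihuan", "ni"]))
```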
Optionally, the initial feature of the search term can also be expressed as a one-hot feature. In this case, once the initials of the words contained in the search term are obtained, one-hot encoding can be applied to them to generate the initial feature of the search term in one-hot form.
In addition, the feature information of the search term may also include a part-of-speech feature, which characterizes the part of speech of the search term, such as verb, noun, or adjective.
The feature information of the candidate word characterizes the attributes of the candidate word. It may include the ordinal feature, spelling feature, initial feature, part-of-speech feature, and category feature of the candidate word, among others.
Optionally, in addition to the feature information of the search term and of one candidate word, the model input parameters may include association feature information between the search term and the candidate word, which characterizes the association between them. The association feature information may include a correlation feature, a cross feature, a statistical feature, and so on. The correlation feature characterizes how the search term and the candidate word are related, for example the correlation between the words each contains. The cross feature characterizes a combination of two or more features of the search term and the candidate word. The statistical feature characterizes numerical values relating the search term and the candidate word, such as the number of candidate words corresponding to a given search term, the number of those candidate words that were clicked, and the number of times the search term is searched per day.
In addition, the model input parameters may include user features, which characterize the attributes of the searching user. For example, the user features may include an age feature, a gender feature, and a region feature. User features are not limited to these and may cover any attribute related to the user.
Step 203: call the CTR prediction model and calculate the CTR value of the candidate word according to the model input parameters.
The CTR prediction model is used to calculate the CTR value of a candidate word. The higher the CTR value, the more likely the candidate word is to be clicked by the user.
The CTR prediction model may be a Wide&Deep model, a DeepFM (Deep Factorization Machine) model, an AFM (Attentional Factorization Machine, a factorization machine with an attention mechanism) model, or a DCN (Deep & Cross Network) model. The CTR prediction model may also be another machine learning model, which is not limited in the embodiments of this application.
The CTR prediction model can be trained offline and pushed online after training is complete, for use by the computer device. For a detailed description of the CTR prediction model, see the embodiment of Fig. 7 below; details are not repeated here.
Step 204: according to the CTR value of the candidate word, select the recommended candidate word corresponding to the search term from the candidate word set.
After obtaining the CTR values of the candidate words, the computer device can select the recommended candidate word corresponding to the search term according to those CTR values and recommend it to the user.
Optionally, the computer device can sort the candidate words by CTR value from high to low and select at least one top-ranked candidate word as the recommended candidate word corresponding to the search term, for example the 10 candidate words with the highest CTR values.
Optionally, the computer device can take the candidate words whose CTR value is greater than a threshold as the recommended candidate words corresponding to the search term.
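Both selection strategies can be sketched in a few lines. The CTR values below are made up for illustration, and the function names `top_k` and `above_threshold` are hypothetical.

```python
from typing import Dict, List


def top_k(ctr: Dict[str, float], k: int) -> List[str]:
    """Sort candidates by CTR value, high to low, and keep the first k."""
    ranked = sorted(ctr, key=lambda word: ctr[word], reverse=True)
    return ranked[:k]


def above_threshold(ctr: Dict[str, float], threshold: float) -> List[str]:
    """Keep candidates whose CTR value is strictly greater than the threshold."""
    ranked = sorted(ctr, key=lambda word: ctr[word], reverse=True)
    return [word for word in ranked if ctr[word] > threshold]


# Illustrative CTR values, not taken from any real model.
ctr_values = {"a": 0.12, "b": 0.40, "c": 0.05, "d": 0.27}
print(top_k(ctr_values, 2))
print(above_threshold(ctr_values, 0.1))
```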
In conclusion technical solution provided by the embodiments of the present application, is estimating each of search term by CTR prediction model When the CTR value of candidate word, in the characteristic information for the search term that mode input parameter is included, the ordinal characteristics including search term, The ordinal characteristics of the search term characterize the input sequence for the words that search term is included, since the ordinal characteristics of search term can The actual search for accurately reflecting user is intended to, therefore when being estimated by CTR prediction model progress CTR, in conjunction with the suitable of search term The CTR value of sequence characteristics, each candidate word for enabling to CTR prediction model to estimate is more acurrate, to be more in line with The recommended candidate word of user demand promotes the accuracy of the recommended candidate word provided a user.
Referring to FIG. 3, the flow chart of the search recommended method provided it illustrates another embodiment of the application.In this reality It applies in example, is mainly applied to illustrate in computer equipment described above in this way.This method may include as follows Several steps:
Step 301, search term is obtained.
This step is same or similar with step 201 in above-mentioned Fig. 2 embodiment, and details are not described herein again.
Step 302: generate the keywords corresponding to the search term.
The keywords include at least one of the following: the words contained in the search term, the full spelling of the words contained in the search term, and the initials of the words contained in the search term. The computer device generates the keywords corresponding to the search term so that, in subsequent steps, the candidate word set of the search term can be obtained based on the keywords.
Optionally, the computer device can segment the search term to obtain the words it contains, and can perform phonetic annotation on the search term to obtain the full spelling of those words. Further, the computer device can extract the initials of the words contained in the search term from their full spelling.
For word segmentation and phonetic annotation, refer to the embodiment of Fig. 2; details are not repeated here.
Step 303: query an inverted index table using the keywords to obtain the candidate word set of the search term.
The inverted index table contains the correspondence between keywords and candidate words. Each keyword can correspond to at least one candidate word.
For example, if the search term is "I only like you", the keywords corresponding to the search term may include "I", "only", "happiness", "joyous", "you", "wo", "zhi", "xi", "huan", "ni", "w", "z", "x", "h", "n", "woz", "wozh", "wozhixi", "wozhixihuan", "wozhixihuanni", "liking you", "xihuanni", and so on. Taking the keywords "I", "only", "happiness", "joyous", "you" as an example, each keyword corresponds to at least one candidate word, as shown in Table 1 below:

| Keyword | Candidate words |
| --- | --- |
| I | Candidate word 1, candidate word 2, candidate word 3 |
| Only | Candidate word 1, candidate word 3 |
| Happiness | Candidate word 2 |
| Joyous | Candidate word 3 |
| You | Candidate word 2, candidate word 3 |

Table 1

After the keywords are obtained, the candidate word set of the search term can be determined from the inverted index table.
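A lookup against the table above can be sketched as follows. The source does not state how the per-keyword results are combined; taking their union is an assumption made here for illustration, and the names `INVERTED_INDEX` and `lookup` are hypothetical.

```python
from typing import Dict, List, Set

# Keyword -> candidate words, mirroring the example table (keys are English glosses).
INVERTED_INDEX: Dict[str, Set[str]] = {
    "I": {"candidate word 1", "candidate word 2", "candidate word 3"},
    "only": {"candidate word 1", "candidate word 3"},
    "happiness": {"candidate word 2"},
    "joyous": {"candidate word 3"},
    "you": {"candidate word 2", "candidate word 3"},
}


def lookup(keywords: List[str], index: Dict[str, Set[str]]) -> Set[str]:
    """Collect candidates for every keyword (assumption: results are unioned)."""
    result: Set[str] = set()
    for keyword in keywords:
        result |= index.get(keyword, set())  # unknown keywords contribute nothing
    return result


print(sorted(lookup(["only", "happiness"], INVERTED_INDEX)))
```

An intersection, or a ranking by the number of matching keywords, would be equally plausible choices; the later similarity filtering step narrows the set either way.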
Optionally, before the inverted index table is queried using the keywords to obtain the candidate word set of the search term, the following steps may also be performed:
(1) Construct a candidate document set that includes at least one document.
The documents include at least one of the following: historical search terms whose click volume is greater than a threshold, titles of documents in a search database, and titles of popular search events.
Optionally, the computer device can obtain users' historical search logs and filter and clean the data in the logs, to remove candidate words that were exposed many times but never clicked, as well as abnormal data. Historical search terms whose click volume is greater than the threshold can then be selected from the cleaned logs and added to the candidate document set.
The computer device can also obtain the titles of documents in a search database. The search database includes at least one document, and each document has its own title. Taking a user searching in a video application as an example, the search database is the video database of that application; the video database includes multiple videos, each with its own title.
The computer device can also obtain the titles of popular search events. Optionally, the computer device can crawl the titles of popular search events from other databases, such as the database of a search engine or of a microblog.
Optionally, because the content of the candidate document set comes from different channels, that content is rather disorganized. Therefore, after the candidate document set is obtained, it can be standardized so that its content is unified, which facilitates subsequent queries.
(2) For each document, generate the correspondences related to that document.
The keyword in each correspondence includes at least one of the following: the words contained in the document, the full spelling of the words contained in the document, and the initials of the words contained in the document. The candidate word in each correspondence includes the document.
Optionally, the computer device can perform word segmentation and phonetic annotation on each document to obtain the words, full spellings, and initials contained in the document.
(3) Integrate the correspondences related to each document to generate the inverted index table.
For example, suppose the candidate document set includes 3 documents, namely document 1, document 2 and document 3, and the keywords in the correspondences are the words contained in the documents. Assume the words contained in the 3 documents are as follows:
Document 1: word 1, word 2, word 3;
Document 2: word 1, word 4, word 5;
Document 3: word 2, word 3, word 5, word 6.
Integrating the correspondences related to the 3 documents, the generated inverted index table can be expressed as:

| Keyword | Candidate words |
| --- | --- |
| Word 1 | Document 1, document 2 |
| Word 2 | Document 1, document 3 |
| Word 3 | Document 1, document 3 |
| Word 4 | Document 2 |
| Word 5 | Document 2, document 3 |
| Word 6 | Document 3 |

Table 2
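The integration step for the three example documents can be sketched as follows. Only words are used as keywords here, as in the example; the function name `build_inverted_index` is hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

# The three example documents and the words each contains.
documents: Dict[str, List[str]] = {
    "document 1": ["word 1", "word 2", "word 3"],
    "document 2": ["word 1", "word 4", "word 5"],
    "document 3": ["word 2", "word 3", "word 5", "word 6"],
}


def build_inverted_index(docs: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Map each keyword to the documents that contain it, in document order."""
    index: Dict[str, List[str]] = defaultdict(list)
    for doc_id, words in docs.items():
        for word in dict.fromkeys(words):  # drop repeated words, keep order
            index[word].append(doc_id)
    return dict(index)


inverted_index = build_inverted_index(documents)
for keyword in sorted(inverted_index):
    print(keyword, "->", inverted_index[keyword])
```

Adding full spellings and initials as further keywords would only require extending the list of keys generated per document.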
It should be noted that the order in which steps 301, 302 and 303 are executed is not limited: step 301 may be executed first, followed by steps 302 and 303; steps 302 and 303 may be executed first, followed by step 301; or steps 301 and 302 may be executed simultaneously, with step 303 executed after step 302.
Step 304: calculate the similarity between the search term and each candidate word.
The computer device can also calculate the similarity between the search term and each candidate word. The similarity characterizes how similar the search term is to each candidate word.
Optionally, the similarity can be calculated in any of the following ways: the TF-IDF (Term Frequency-Inverse Document Frequency) algorithm, cosine similarity, Euclidean distance, or Hamming distance. The calculation of similarity is not limited to these algorithms and may use any other algorithm capable of calculating similarity, which is not limited in the embodiments of this application.
Step 305: determine the candidate words whose similarity is greater than a preset similarity as the updated candidate word set.
Further, the computer device can take the candidate words whose similarity is greater than the preset similarity as the candidate words in the updated candidate word set.
Optionally, the computer device can sort the similarities from largest to smallest and take the candidate words corresponding to the top several similarities as the candidate words in the updated candidate word set.
Taking the candidate words whose similarity is greater than the preset similarity as the candidate words in the updated candidate word set ensures the correlation between the candidate words and the search term, which can further improve the accuracy of candidate word recommendation.
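Steps 304 and 305 can be sketched with cosine similarity, one of the options listed above. Representing each string as a bag of characters is an assumption made for this illustration; the function names and the threshold value are hypothetical.

```python
import math
from collections import Counter
from typing import List


def cosine_similarity(a: str, b: str) -> float:
    """Cosine similarity between character-count vectors of two strings."""
    va, vb = Counter(a), Counter(b)
    dot = sum(va[ch] * vb[ch] for ch in va)  # Counter returns 0 for missing keys
    norm = math.sqrt(sum(v * v for v in va.values())) * math.sqrt(
        sum(v * v for v in vb.values())
    )
    return dot / norm if norm else 0.0


def filter_candidates(search_term: str, candidates: List[str], preset: float) -> List[str]:
    """Keep candidates whose similarity to the search term exceeds the preset value."""
    return [c for c in candidates if cosine_similarity(search_term, c) > preset]


# "abd" shares two of three characters with "abc"; "xyz" shares none.
print(filter_candidates("abc", ["abd", "xyz", "abc"], 0.5))
```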
Step 306: extract the ordinal feature of the search term.
The ordinal feature of the search term characterizes the input order of the words contained in the search term.
This step is the same as or similar to step 202 in the embodiment of Fig. 2 above; details are not repeated here.
Step 307: generate at least one group of model input parameters according to the ordinal feature of the search term. Each group of model input parameters corresponds to one pairing of the search term with a candidate word, and the ordinal feature of the search term characterizes the input order of the words contained in the search term.
The model input parameters may also include the spelling feature and/or the initial feature of the search term. For the other model input parameters, refer to the description of step 202 in the embodiment of Fig. 2; details are not repeated here.
Optionally, in addition to the feature information of the search term and of one candidate word, the model input parameters may include association feature information between the search term and the candidate word, which characterizes the association between them. The association feature information may include a correlation feature, a cross feature, a statistical feature, and so on. The correlation feature characterizes how the search term and the candidate word are related, for example the correlation between the words each contains. The cross feature characterizes a combination of two or more features of the search term and the candidate word. The statistical feature characterizes numerical values relating the search term and the candidate word, such as the number of candidate words corresponding to a given search term, the number of those candidate words that were clicked, and the number of times the search term is searched per day.
In addition, the model input parameters may include user features, which characterize the attributes of the searching user. For example, the user features may include an age feature, a gender feature, and a region feature. User features are not limited to these and may cover any attribute related to the user.
Optionally, the model input parameters include n feature positions for holding the ordinal feature of the search term, where n is an integer greater than 1. If the number m of words contained in the search term is greater than or equal to n, the n feature positions hold the first n of those words, where m is an integer greater than 1; when m is greater than n, the words after the n-th are truncated and are not written to any feature position. If m is less than n, the n feature positions hold the m words contained in the search term, and the n-m feature positions after the m-th feature position are left empty.
For example, refer to Fig. 4, which shows a schematic diagram of how the feature positions of the model input parameters are set. Assume n is 10 and the search term is "I only like you". Segmenting the search term at the character level yields five characters, rendered here as "I", "only", "happiness", "joyous", "you". The five characters are written to the first 5 feature positions in order: "I" to the first feature position, "only" to the second, "happiness" to the third, "joyous" to the fourth, and "you" to the fifth.
Still referring to Fig. 4, when the model input parameters also include the spelling feature and the initial feature of the search term, the full spelling corresponding to the search term can be entered in order starting from the 11th feature position, and the spelling initials corresponding to the search term can then be entered in order starting from the 21st feature position.
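The truncate-or-pad rule and the layout of Fig. 4 can be sketched as follows, with n = 10 as in the example. Using `None` for an empty feature position and giving the spelling and initial blocks the same width n are assumptions consistent with the 11th and 21st starting positions; the function name `fill_positions` is hypothetical.

```python
from typing import List, Optional


def fill_positions(tokens: List[str], n: int) -> List[Optional[str]]:
    """Keep the first n tokens; if there are fewer, leave the remaining positions empty."""
    kept = tokens[:n]  # tokens after the n-th are truncated
    return kept + [None] * (n - len(kept))


N = 10
order_block = fill_positions(["I", "only", "happiness", "joyous", "you"], N)
spelling_block = fill_positions(["wo", "zhi", "xi", "huan", "ni"], N)
initial_block = fill_positions(["w", "z", "x", "h", "n"], N)

# Positions 1-10: ordinal feature; 11-20: full spelling; 21-30: spelling initials.
feature_positions = order_block + spelling_block + initial_block
print(feature_positions)
```

A fixed width per block keeps every group of model input parameters the same length regardless of how long the search term is.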
Optionally, referring to Fig. 5, when the model input parameters include the feature information of the search term, the feature information of the candidate word, and the association feature information between the search term and the candidate word, these three kinds of information are added to the model input parameters in that order.
Step 308: call the CTR prediction model and calculate the CTR value of the candidate word according to the model input parameters.
After obtaining the updated candidate word set, the computer device can call the CTR prediction model and, from each group of model input parameters, calculate the CTR value of each candidate word in the updated candidate word set.
Step 309: according to the CTR value of the candidate word, select the recommended candidate word corresponding to the search term from the candidate word set.
After obtaining the CTR value of each candidate word, the computer device can select the recommended candidate word corresponding to the search term according to those CTR values and recommend it to the user.
Optionally, when the CTR prediction model is deployed on a terminal device, after obtaining the CTR value of each candidate word the terminal device can sort the candidate words in descending order of CTR value and select at least one top-ranked candidate word to display to the user as the recommended candidate word corresponding to the search term.
Optionally, when the CTR prediction model is deployed on a server device, after filtering out the recommended candidate words the server device can send the terminal device not only each recommended candidate word but also the ranking result of the recommended candidate words or the CTR value of each recommended candidate word. The ranking result can be the result of sorting in descending order of CTR value. When displaying the recommended candidate words, the terminal device sorts them by CTR value, for example in descending order: the larger the CTR value, the higher the position, and the smaller the CTR value, the lower the position. For example, as shown in Fig. 6, take a user searching in a video application installed on the terminal device. The user enters the search term "I only like you" in the search box 61 of the video application, and the terminal device sends the search term to the server device. The server device receives the search term and filters out the recommended candidate words using the CTR prediction model. The server device then sends each recommended candidate word, together with the ranking result of the recommended candidate words or the CTR value of each recommended candidate word, to the terminal device. The video application on the terminal device sorts the recommended candidate words by CTR value and displays the 5 candidate words with the highest CTR values to the user as the recommended candidate word list 62.
Because the embodiments of this application take the ordinal feature of the search term into account when calculating CTR values, the calculated CTR value of each candidate word is more accurate, and so is the ranking of the recommended candidate words.
In conclusion technical solution provided by the embodiments of the present application, is estimating each of search term by CTR prediction model When the CTR value of candidate word, in the characteristic information for the search term that mode input parameter is included, the ordinal characteristics including search term, The ordinal characteristics of the search term characterize the input sequence for the words that search term is included, since the ordinal characteristics of search term can The actual search for accurately reflecting user is intended to, therefore when being estimated by CTR prediction model progress CTR, in conjunction with the suitable of search term The CTR value of sequence characteristics, each candidate word for enabling to CTR prediction model to estimate is more acurrate, to be more in line with The recommended candidate word of user demand promotes the accuracy of the recommended candidate word provided a user.
In addition, the characteristic information of search term is in addition to the ordinal characteristics including search term, it can also include the spelling of search term Feature and/or initial feature, compensate for the missing of phonetic, in user's input Pinyin, Chinese or Chinese and pinyin mixing Scene can guarantee final recommendation effect.
In addition, by the candidate word that similarity is greater than to default similarity, as the time in updated candidate word set Word is selected, ensure that the correlation between candidate word and search term, the accuracy of candidate word recommendation may further be improved.
In foregoing embodiments, scans for recommending to based on CTR prediction model, CTR is estimated below by embodiment Explanation is introduced in the training process of model.
Referring to FIG. 7, it illustrates the processes of the training method of the CTR prediction model of the application one embodiment offer Figure.In the present embodiment, mainly it is applied to illustrate in computer equipment described above in this way.This method can be with It comprises the following steps:
Step 701: obtain a historical search term and the historical exposed candidate words corresponding to the historical search term.
The computer device can obtain the user's search log and exposure log, and obtain from these logs the user's historical search terms and the historical exposed candidate words corresponding to them.
Optionally, the computer device can extract the search log and the exposure click log directly from its running logs, or obtain them from other electronic devices, which is not limited in the embodiments of this application.
Optionally, after the user's search log and exposure log are obtained, the data in the logs can be cleaned and filtered to remove abnormal data.
Step 702: construct at least one group of training samples according to the ordinal feature of the historical search term. Each group of training samples corresponds to one pairing of the historical search term with a historical exposed candidate word, and the ordinal feature of the historical search term characterizes the input order of the words contained in the historical search term.
After the historical search term is obtained, at least one group of training samples can be constructed according to its ordinal feature. Each group of training samples corresponds to one pairing of the historical search term with a historical exposed candidate word. For example, if q historical exposed candidate words corresponding to the historical search term are obtained, q groups of training samples can be constructed, each group including the historical search term and one of the q historical exposed candidate words, where q is a positive integer.
The training samples may include positive samples and may also include negative samples. In addition, each training sample has a label value. In the embodiments of this application, whether the user clicked is used as the label of the training sample; for example, the label value is 1 when the user clicked and 0 when the user did not click.
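Constructing labeled samples from one historical search term can be sketched as follows. The function name `build_training_samples`, the dictionary keys, and the example data are hypothetical; a real sample would hold feature information rather than raw strings.

```python
from typing import Dict, List, Set, Union

Sample = Dict[str, Union[str, int]]


def build_training_samples(
    search_term: str, exposed: List[str], clicked: Set[str]
) -> List[Sample]:
    """One sample per exposed candidate: label 1 if clicked, otherwise 0."""
    return [
        {"search_term": search_term, "candidate": c, "label": 1 if c in clicked else 0}
        for c in exposed
    ]


# q = 3 exposed candidates, of which one was clicked (illustrative data).
samples = build_training_samples(
    "I only like you",
    ["candidate word 1", "candidate word 2", "candidate word 3"],
    {"candidate word 2"},
)
print([s["label"] for s in samples])
```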
Optionally, before the at least one group of training samples is constructed according to the ordinal feature of the historical search term, the historical search term can be segmented to extract its ordinal feature. The ordinal feature of the historical search term characterizes the input order of the words contained in the historical search term.
Optionally, segmenting the historical search term and extracting its ordinal feature may include the following sub-steps:
(1) divide the historical search term at the character level to obtain each character contained in the historical search term;
(2) obtain the ordinal feature of the historical search term according to the order of the characters it contains.
Dividing the historical search term at the character level means splitting everything in the historical search term into individual, independent characters. When the historical search term is in English, each English word is treated as one character.
Because a different input order of the characters represents a different search intent of the user, the ordinal feature of the historical search term can also be obtained in order to determine accurately the actual search intent of the user corresponding to the historical search term.
Optionally, after the order of the characters contained in the historical search term is obtained, one-hot encoding can be applied to each of those characters to obtain the ordinal feature of the historical search term in one-hot form.
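The one-hot form of the ordinal feature can be sketched as a list of one-hot vectors, one per character, kept in input order. The vocabulary mapping below is a toy assumption, and the function name `one_hot_sequence` is hypothetical; characters outside the vocabulary would need separate handling.

```python
from typing import Dict, List


def one_hot_sequence(tokens: List[str], vocab: Dict[str, int]) -> List[List[int]]:
    """Encode each token as a one-hot vector; row order preserves input order."""
    vectors = []
    for token in tokens:
        vector = [0] * len(vocab)
        vector[vocab[token]] = 1  # raises KeyError for out-of-vocabulary tokens
        vectors.append(vector)
    return vectors


# Toy vocabulary built from the pinyin syllables used earlier as stand-ins for characters.
vocab = {"wo": 0, "zhi": 1, "xi": 2, "huan": 3, "ni": 4}
encoded = one_hot_sequence(["wo", "zhi", "xi", "huan", "ni"], vocab)
reordered = one_hot_sequence(["zhi", "wo", "xi", "huan", "ni"], vocab)
print(encoded != reordered)
```

The same characters entered in a different order produce a different sequence of vectors, which is what lets the feature carry input order.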
In some other embodiments, the order feature of the historical search term may also be expressed in other forms, which the embodiments of this application do not limit.
In the embodiments of this application, the historical search term is divided at the character level, and its order feature is obtained from the order of the characters it contains. On the one hand, this avoids the effect of inaccurate word segmentation on the search results; on the other hand, searching based on character order makes it possible to learn the actual search intent of the corresponding user accurately, so that more reasonable results can be provided to the user.
In an exemplary embodiment, the training sample may include feature information of the historical search term and feature information of the historical exposed candidate word.
The feature information of the historical search term characterizes the attributes of the historical search term, and includes the order feature of the historical search term.
In one example, the feature information of the historical search term further includes a full-spelling feature of the historical search term.
In this case, the following steps may also be performed before the training sample is constructed:
(1) performing phonetic annotation on the characters or words that the historical search term contains to obtain their full spelling;
(2) generating the full-spelling feature of the historical search term from the full spelling of the characters or words that it contains.
In another example, the feature information of the historical search term further includes an initial-letter feature of the historical search term.
In this case, the following steps may also be performed before the training sample is constructed:
(1) performing phonetic annotation on the characters or words that the historical search term contains to obtain their initial letters;
(2) generating the initial-letter feature of the historical search term from the initial letters of the characters or words that it contains.
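The two phonetic features can be sketched as below. The application does not name a phonetic annotation tool, so a small hand-written lookup table (`PINYIN`, hypothetical and deliberately incomplete) stands in for one; characters missing from the table are simply skipped in this sketch.

```python
# Hypothetical stand-in for a real phonetic annotation tool.
PINYIN = {"王": "wang", "者": "zhe", "荣": "rong", "耀": "yao"}

def full_spelling(query):
    """Full spelling of each character, in input order."""
    return [PINYIN[ch] for ch in query if ch in PINYIN]

def initial_letters(query):
    """First letter of each character's spelling, in input order."""
    return [spelling[0] for spelling in full_spelling(query)]

spelling_feature = full_spelling("王者荣耀")
initial_feature = initial_letters("王者荣耀")
```

Such features would let a search term typed as "wangzhe" or "wzry" be related to the same candidate words as "王者荣耀".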
The feature information of the historical exposed candidate word characterizes the attributes of the historical exposed candidate word. It may include the order feature, the full-spelling feature, the initial-letter feature, the part-of-speech feature, the category feature, and so on, of the historical exposed candidate word.
Optionally, in the training sample, the order feature, the full-spelling feature, and the initial-letter feature of the historical search term are arranged in that sequence.
In addition, the training sample may also include user features, which characterize the attributes of the searching user. For example, the user features may include an age feature, a gender feature, and a region feature. User features are of course not limited to these and may cover any attribute related to the user.
Step 703: train the CTR prediction model with the training samples to obtain a trained CTR prediction model.
After the training samples are obtained, they can be used to train the CTR prediction model, yielding a trained CTR prediction model.
Optionally, after training is completed, the trained CTR prediction model may be verified. After successful verification, the CTR prediction model may be pushed online so that other computer devices can call it for estimation when needed.
In summary, in the technical solution provided by the embodiments of this application, a historical search term and the historical exposed candidate words corresponding to it are obtained; word segmentation is performed on the historical search term to extract its order feature; training samples containing the order feature of the historical search term are constructed; and the CTR prediction model is trained with the training samples to obtain a trained CTR prediction model. Compared with feature processing in the related art, the embodiments of this application take the order feature of the historical search term into account during model training, so that the trained model can better capture the user's search intent and thus provide more accurate recommendation results.
The training process is described below, taking a Wide&Deep model as an example of the CTR prediction model.
First, the architecture of the Wide&Deep model is briefly introduced. Referring to Figure 8, a schematic diagram of a Wide&Deep model is shown by way of example.
As shown in Figure 8, the Wide&Deep model consists of two parts: a Wide model (the left part of Figure 8) and a Deep model (the right part of Figure 8).
The Wide model is a generalized linear model, which can be expressed as:
$y = w^{T}x + b$;
where y denotes the output of the model, that is, the estimated CTR value; x denotes the input features of the Wide model, including low-dimensional dense features and high-dimensional sparse features, and x = [x1, x2, ..., xd] means that x is a d-dimensional vector; w denotes the parameters of the model, w = [w1, w2, ..., wd]; and b denotes the bias of the model.
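A minimal numeric sketch of this linear form is given below. The function name and the weight, feature, and bias values are illustrative assumptions, not values from the application.

```python
def wide_output(w, x, b):
    """Compute the linear form: dot product of w and x, plus the bias b."""
    if len(w) != len(x):
        raise ValueError("w and x must have the same dimension d")
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# Illustrative 3-dimensional example (made-up numbers).
y = wide_output(w=[0.5, -1.0, 2.0], x=[1.0, 0.0, 1.0], b=0.25)
```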
For the Deep model, the input at the input layer 81 is discrete feature information. Through embedding 82, the input discrete feature information is computed with a weight matrix and compressed into low-dimensional continuous dense vectors, which saves computing resources. These low-dimensional continuous dense vectors are passed in turn to two hidden layers 83, and the output 84 of the model is obtained after the computation in the hidden layers 83.
Figure 9 shows, by way of example, a schematic diagram of the Wide&Deep model training process. After the training samples 91 are obtained, they are input into the Wide&Deep model. The Wide model side takes continuous features and discrete features as input and computes the weights of the Wide model; the Deep model side first applies embedding 92 to the input discrete features to obtain the corresponding embedded values. The weights of the Wide model and the embedded values are then input into a fully connected layer 93 (Concatenated Layer) and pass through multiple activation layers 94, whose activation function may be ReLU. Finally, the loss layer 95 uses whether the user clicked as the label, and the model is trained by minimizing the value of a loss function (for example, a Sigmoid-based loss function).
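The data flow described for Figure 9 can be sketched in plain Python as a single forward pass followed by a loss computation. This is a simplified illustration under several assumptions: the Wide side is reduced to one weighted sum, only one ReLU hidden layer is used, the output is squashed with a sigmoid, and the loss is the log loss against the click label. All names and parameter values are hypothetical, and no gradient update is shown.

```python
import math

def relu(v):
    return [max(0.0, a) for a in v]

def dense(v, weights, bias):
    """Fully connected layer: one output per weight row."""
    return [sum(w * a for w, a in zip(row, v)) + b
            for row, b in zip(weights, bias)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(wide_x, wide_w, discrete_ids, embedding,
            hidden_w, hidden_b, out_w, out_b):
    """Return an estimated click probability for one sample."""
    wide_part = sum(w * x for w, x in zip(wide_w, wide_x))
    # Embedding lookup: each discrete id maps to a low-dimensional dense vector.
    deep_in = [a for i in discrete_ids for a in embedding[i]]
    joined = [wide_part] + deep_in          # join Wide and Deep sides
    hidden = relu(dense(joined, hidden_w, hidden_b))
    logit = sum(w * h for w, h in zip(out_w, hidden)) + out_b
    return sigmoid(logit)

def log_loss(p, label):
    """Loss against the click label (1 = clicked, 0 = not clicked)."""
    return -(label * math.log(p) + (1 - label) * math.log(1 - p))

# Made-up parameters: two embedded ids of dimension 2, two hidden units.
embedding = {0: [0.1, -0.2], 1: [0.3, 0.4]}
p = forward(wide_x=[1.0, 0.0], wide_w=[0.5, -0.5],
            discrete_ids=[0, 1], embedding=embedding,
            hidden_w=[[0.2] * 5, [-0.1] * 5], hidden_b=[0.0, 0.1],
            out_w=[1.0, -1.0], out_b=0.0)
loss = log_loss(p, 1)
```

In an actual training run, the parameters would be adjusted to reduce this loss over all training samples; that optimization step is omitted here.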
The complete training process of the CTR prediction model is briefly introduced below. Figure 10 shows, by way of example, a flowchart of a CTR prediction model training method. The computer device may obtain the user's search log and exposure log 100, and obtain from them the user's historical search terms and the historical exposed candidate words corresponding to those historical search terms. In addition, the computer device may obtain user data 101 of the user, which may include age information, gender information, region information, and so on; user features may then be generated from the user data for training the CTR prediction model. Next, word segmentation 102 is performed on the historical search term to extract its order feature, and phonetic annotation 103 is performed on the historical search term to extract its full-spelling feature and, further, its initial-letter feature. Training samples 104 are generated from the feature information of the historical search term (its order feature, full-spelling feature, and initial-letter feature) and the feature information of the historical exposed candidate word. The training samples can then be used for model training 105. Optionally, a pre-trained model 106 may be obtained, and training is performed on the basis of the pre-trained model 106. After training is completed, model verification 107 may be performed on the trained CTR prediction model. After successful verification, model pushing 108 may be performed, that is, the CTR prediction model is pushed online so that other computer devices can call it for estimation when needed, which constitutes the model service 109.
The following are apparatus embodiments of this application, which can be used to carry out the method embodiments of this application. For details not disclosed in the apparatus embodiments, refer to the method embodiments of this application.
Referring to Figure 11, a block diagram of a search recommendation apparatus provided by one embodiment of this application is shown. The apparatus has the functions of implementing the above search recommendation method examples; the functions may be implemented by hardware, or by hardware executing corresponding software. The apparatus may be the computer device described above, or may be provided in the computer device. The apparatus 1100 may include: a search term obtaining module 1101, a candidate word obtaining module 1102, a parameter generation module 1103, a model calling module 1104, and a recommendation selection module 1105.
The search term obtaining module 1101 is configured to obtain a search term.
The candidate word obtaining module 1102 is configured to obtain a candidate word set of the search term, the candidate word set including at least one candidate word related to the search term.
The parameter generation module 1103 is configured to generate at least one group of model input parameters according to the order feature of the search term, each group of model input parameters corresponding to one group consisting of the search term and a candidate word, and the order feature of the search term characterizing the input order of the characters or words that the search term contains.
The model calling module 1104 is configured to call the CTR prediction model and calculate the CTR value of the candidate word according to the model input parameters.
The recommendation selection module 1105 is configured to select, according to the CTR values of the candidate words, a recommended candidate word corresponding to the search term from the candidate word set.
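One simple way to realize the selection step is to rank candidates by estimated CTR and keep the highest ones. The sketch below assumes a top-k rule, which the text does not prescribe; the function name, the value of k, and the CTR values are hypothetical.

```python
def choose_recommended(ctr_by_candidate, k):
    """Return the k candidate words with the highest estimated CTR."""
    ranked = sorted(ctr_by_candidate.items(),
                    key=lambda item: item[1], reverse=True)
    return [word for word, _ in ranked[:k]]

# Made-up CTR estimates for three candidate words.
recommended = choose_recommended(
    {"王者归来": 0.10, "王者荣耀": 0.70, "王者天下": 0.30}, k=2)
```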
In summary, in the technical solution provided by the embodiments of this application, when the CTR value of each candidate word of a search term is estimated by the CTR prediction model, the feature information of the search term contained in the model input parameters includes the order feature of the search term, which characterizes the input order of the characters or words that the search term contains. Because the order feature of the search term can accurately reflect the user's actual search intent, taking it into account during CTR estimation makes the CTR values estimated for the candidate words more accurate, so that recommended candidate words that better match the user's needs are obtained and the accuracy of the recommended candidate words provided to the user is improved.
In some possible designs, as shown in Figure 12, the apparatus 1100 further includes a phonetic annotation module 1106 and a full-spelling feature generation module 1107.
The phonetic annotation module 1106 is configured to perform phonetic annotation on the characters or words that the search term contains to obtain their full spelling.
The full-spelling feature generation module 1107 is configured to generate the full-spelling feature of the search term from the full spelling of the characters or words that the search term contains, wherein the model input parameters further include the full-spelling feature of the search term.
In some possible designs, as shown in Figure 12, the apparatus 1100 further includes an initial-letter obtaining module 1108 and an initial-letter feature generation module 1109.
The initial-letter obtaining module 1108 is configured to perform phonetic annotation on the characters or words that the search term contains to obtain their initial letters.
The initial-letter feature generation module 1109 is configured to generate the initial-letter feature of the search term from the initial letters of the characters or words that the search term contains, wherein the model input parameters further include the initial-letter feature of the search term.
In some possible designs, the model input parameters include n feature positions for holding the order feature of the search term, where n is an integer greater than 1. When the number m of characters or words contained in the search term is greater than or equal to n, the n feature positions hold the first n characters or words of the search term, where m is an integer greater than 1. When m is less than n, the n feature positions hold the m characters or words of the search term, and the n-m feature positions after the m-th feature position are left empty.
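The truncate-or-pad rule for the n feature positions can be sketched as follows. The function name is hypothetical, and representing an empty position with `None` is an assumption made for this example.

```python
def fill_positions(units, n, empty=None):
    """Place the characters or words of a search term into n feature positions.

    If the term has m >= n units, keep the first n.
    If m < n, keep all m and leave the remaining n - m positions empty.
    """
    if n <= 1:
        raise ValueError("n must be an integer greater than 1")
    m = len(units)
    if m >= n:
        return list(units[:n])
    return list(units) + [empty] * (n - m)

truncated = fill_positions(list("王者荣耀"), n=3)   # m = 4 >= n = 3
padded = fill_positions(list("王者"), n=4)          # m = 2 <  n = 4
```

A fixed number of positions gives every search term an input of the same length regardless of how many characters the user typed.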
In some possible designs, the candidate word obtaining module 1102 is further configured to generate keywords corresponding to the search term, the keywords including at least one of the following: the characters or words that the search term contains, the full spelling of those characters or words, and the initial letters of those characters or words; and to query an inverted index table with the keywords to obtain the candidate word set of the search term, the inverted index table containing the correspondence between keywords and candidate words.
In some possible designs, as shown in Figure 12, the apparatus 1100 further includes a document set construction module 1110, a correspondence generation module 1111, and an index table generation module 1112.
The document set construction module 1110 is configured to construct a candidate document set that includes at least one document, the document including at least one of the following: a historical search term whose click volume is greater than a threshold, the title of a document in a search database, and the title of a popular search event.
The correspondence generation module 1111 is configured to generate, for each document, a correspondence related to the document. The keywords in the correspondence include at least one of the following: the characters or words contained in the document, the full spelling of those characters or words, and the initial letters of those characters or words. The candidate word in the correspondence is the document.
The index table generation module 1112 is configured to integrate the correspondences related to the documents to generate the inverted index table.
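Building and querying an inverted index of this kind can be sketched as below. For brevity the keyword function here uses only the characters of the text; full spellings and initial letters could be added as further keys in the same way. The function names and the three documents are hypothetical.

```python
from collections import defaultdict

def char_keywords(text):
    """Hypothetical keyword function: the distinct characters of the text."""
    return set(text)

def build_inverted_index(documents, keywords_of=char_keywords):
    """Map each keyword to the set of documents (candidate words) containing it."""
    index = defaultdict(set)
    for doc in documents:
        for key in keywords_of(doc):
            index[key].add(doc)
    return index

def lookup(index, query, keywords_of=char_keywords):
    """Union of the candidates recalled by each keyword of the search term."""
    candidates = set()
    for key in keywords_of(query):
        candidates |= index.get(key, set())
    return candidates

# Made-up candidate documents.
index = build_inverted_index(["王者荣耀", "荣耀手机", "天气预报"])
candidate_set = lookup(index, "荣耀")
```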
In some possible designs, as shown in Figure 12, the apparatus 1100 further includes a similarity calculation module 1113 and a word set update module 1114.
The similarity calculation module 1113 is configured to calculate the similarity between the search term and each candidate word.
The word set update module 1114 is configured to determine the candidate words whose similarity is greater than a preset similarity as an updated candidate word set, wherein the CTR prediction model calculates the CTR value of each candidate word in the updated candidate word set.
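The filtering step can be sketched as follows. The text does not specify a similarity measure, so this example assumes Jaccard similarity over character sets; the function names and the threshold value are also hypothetical.

```python
def jaccard(a, b):
    """Jaccard similarity of the character sets of two strings (an assumed measure)."""
    sa, sb = set(a), set(b)
    if not sa and not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)

def filter_candidates(query, candidates, preset_similarity):
    """Keep only candidates whose similarity to the search term exceeds the preset value."""
    return [c for c in candidates if jaccard(query, c) > preset_similarity]

updated = filter_candidates("王者", ["王者荣耀", "荣耀手机"], preset_similarity=0.3)
```

Filtering before CTR estimation reduces the number of candidates the CTR prediction model has to score.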
Referring to Figure 13, a block diagram of a training apparatus for a CTR prediction model provided by one embodiment of this application is shown. The apparatus has the functions of implementing the above examples of the CTR prediction model training method; the functions may be implemented by hardware, or by hardware executing corresponding software. The apparatus may be the computer device described above, or may be provided in the computer device. The apparatus 1300 may include: a word obtaining module 1310, a sample construction module 1320, and a model training module 1330.
The word obtaining module 1310 is configured to obtain a historical search term and the historical exposed candidate words corresponding to the historical search term.
The sample construction module 1320 is configured to construct at least one group of training samples according to the order feature of the historical search term, each group of training samples corresponding to one group consisting of the historical search term and a historical exposed candidate word, and the order feature of the historical search term characterizing the input order of the characters or words that the historical search term contains.
The model training module 1330 is configured to train the CTR prediction model with the training samples to obtain a trained CTR prediction model.
In summary, in the technical solution provided by the embodiments of this application, a historical search term and the historical exposed candidate words corresponding to it are obtained; word segmentation is performed on the historical search term to extract its order feature; training samples containing the order feature of the historical search term are constructed; and the CTR prediction model is trained with the training samples to obtain a trained CTR prediction model. Compared with feature processing in the related art, the embodiments of this application take the order feature of the historical search term into account during model training, so that the trained model can better capture the user's search intent and thus provide more accurate recommendation results.
In some possible designs, the apparatus 1300 further includes a word segmentation module 1340 and an order feature generation module 1350.
The word segmentation module 1340 is configured to divide the historical search term at the character level to obtain each character that the historical search term contains.
The order feature generation module 1350 is configured to obtain the order feature of the historical search term from the order of the characters that the historical search term contains.
In some possible designs, as shown in Figure 14, the apparatus 1300 further includes a phonetic annotation module 1360 and a full-spelling feature generation module 1370.
The phonetic annotation module 1360 is configured to perform phonetic annotation on the characters or words that the historical search term contains to obtain their full spelling.
The full-spelling feature generation module 1370 is configured to generate the full-spelling feature of the historical search term from the full spelling of the characters or words that the historical search term contains, wherein the training sample further includes the full-spelling feature of the historical search term.
In some possible designs, as shown in Figure 14, the apparatus 1300 further includes an initial-letter obtaining module 1380 and an initial-letter feature generation module 1390.
The initial-letter obtaining module 1380 is configured to perform phonetic annotation on the characters or words that the historical search term contains to obtain their initial letters.
The initial-letter feature generation module 1390 is configured to generate the initial-letter feature of the historical search term from the initial letters of the characters or words that the historical search term contains, wherein the training sample further includes the initial-letter feature of the historical search term.
It should be noted that, when the apparatus provided by the above embodiments implements its functions, the division into the functional modules described above is only an example. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the apparatus embodiments and the method embodiments provided above belong to the same concept; for the specific implementation process, refer to the method embodiments, which is not repeated here.
Referring to Figure 15, a schematic structural diagram of a computer device provided by one embodiment of this application is shown. Specifically:
The computer device 1600 includes a central processing unit (CPU) 1601, a system memory 1604 that includes a random access memory (RAM) 1602 and a read-only memory (ROM) 1603, and a system bus 1605 that connects the system memory 1604 and the central processing unit 1601. The computer device 1600 further includes a basic input/output system (I/O system) 1606 that helps transfer information between components within the computer, and a mass storage device 1607 for storing an operating system 1613, application programs 1614, and other program modules 1612.
The basic input/output system 1606 includes a display 1608 for displaying information and an input device 1609, such as a mouse or keyboard, for the user to input information. The display 1608 and the input device 1609 are both connected to the central processing unit 1601 through an input/output controller 1610 connected to the system bus 1605. The basic input/output system 1606 may also include the input/output controller 1610 for receiving and processing input from a number of other devices such as a keyboard, mouse, or electronic stylus. Similarly, the input/output controller 1610 also provides output to a display screen, printer, or other type of output device.
The mass storage device 1607 is connected to the central processing unit 1601 through a mass storage controller (not shown) connected to the system bus 1605. The mass storage device 1607 and its associated computer-readable medium provide non-volatile storage for the computer device 1600. That is, the mass storage device 1607 may include a computer-readable medium (not shown) such as a hard disk or a CD-ROM drive.
Without loss of generality, the computer-readable medium may include computer storage media and communication media. Computer storage media include volatile and non-volatile, removable and non-removable media implemented in any method or technology for storing information such as computer-readable instructions, data structures, program modules, or other data. Computer storage media include RAM, ROM, EPROM, EEPROM, flash memory or other solid-state storage technologies, CD-ROM, DVD or other optical storage, tape cartridges, magnetic tape, disk storage, or other magnetic storage devices. Those skilled in the art will of course appreciate that computer storage media are not limited to the above. The system memory 1604 and the mass storage device 1607 described above may be collectively referred to as the memory.
According to the various embodiments of this application, the computer device 1600 may also run by connecting, through a network such as the Internet, to a remote computer on the network. That is, the computer device 1600 may be connected to the network 1612 through a network interface unit 1611 connected to the system bus 1605; in other words, the network interface unit 1611 may also be used to connect to other types of networks or remote computer systems (not shown).
The memory further stores at least one instruction, at least one program, a code set, or an instruction set, which is configured to be executed by one or more processors to implement the above search recommendation method or the above CTR prediction model training method.
In an exemplary embodiment, a computer-readable storage medium is further provided. The computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, which, when executed by a processor, implements the above search recommendation method or the above CTR prediction model training method.
In an exemplary embodiment, a computer program product is further provided. When executed by a processor, the computer program product implements the above search recommendation method or the above CTR prediction model training method.
It should be understood that "a plurality of" herein means two or more. "And/or" describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" may mean that A exists alone, that A and B both exist, or that B exists alone. The character "/" generally indicates an "or" relationship between the objects before and after it.
The foregoing describes only exemplary embodiments of this application and is not intended to limit this application. Any modification, equivalent replacement, or improvement made within the spirit and principles of this application shall fall within the scope of protection of this application.

Claims (15)

1. A search recommendation method, characterized in that the method comprises:
obtaining a search term and a candidate word set of the search term, the candidate word set including at least one candidate word related to the search term;
generating at least one group of model input parameters according to an order feature of the search term, each group of model input parameters corresponding to one group consisting of the search term and a candidate word, and the order feature of the search term characterizing the input order of the characters or words that the search term contains;
calling a click-through rate (CTR) prediction model to calculate the CTR value of the candidate word according to the model input parameters;
selecting, according to the CTR value of the candidate word, a recommended candidate word corresponding to the search term from the candidate word set.
2. The method according to claim 1, characterized in that, before the generating at least one group of model input parameters according to the order feature of the search term, the method further comprises:
performing phonetic annotation on the characters or words that the search term contains to obtain the full spelling of those characters or words;
generating a full-spelling feature of the search term from the full spelling of the characters or words that the search term contains;
wherein the model input parameters further include the full-spelling feature of the search term.
3. The method according to claim 1, characterized in that, before the generating at least one group of model input parameters according to the order feature of the search term, the method further comprises:
performing phonetic annotation on the characters or words that the search term contains to obtain the initial letters of those characters or words;
generating an initial-letter feature of the search term from the initial letters of the characters or words that the search term contains;
wherein the model input parameters further include the initial-letter feature of the search term.
4. The method according to claim 1, characterized in that the model input parameters include n feature positions for holding the order feature of the search term, n being an integer greater than 1;
if the number m of characters or words contained in the search term is greater than or equal to n, the n feature positions hold the first n characters or words of the search term, m being an integer greater than 1;
if the number m of characters or words contained in the search term is less than n, the n feature positions hold the m characters or words of the search term, and the n-m feature positions after the m-th feature position among the n feature positions are left empty.
5. The method according to claim 1, characterized in that the obtaining a candidate word set of the search term comprises:
generating keywords corresponding to the search term, the keywords including at least one of the following: the characters or words that the search term contains, the full spelling of those characters or words, and the initial letters of those characters or words;
querying an inverted index table with the keywords to obtain the candidate word set of the search term, the inverted index table containing the correspondence between the keywords and the candidate words.
6. The method according to claim 5, characterized in that, before the querying an inverted index table with the keywords to obtain the candidate word set of the search term, the method further comprises:
constructing a candidate document set that includes at least one document, the document including at least one of the following: a historical search term whose click volume is greater than a threshold, the title of a document in a search database, and the title of a popular search event;
generating, for each document, a correspondence related to the document, the keywords in the correspondence including at least one of the following: the characters or words contained in the document, the full spelling of those characters or words, and the initial letters of those characters or words, and the candidate word in the correspondence being the document;
integrating the correspondences related to the documents to generate the inverted index table.
7. The method according to any one of claims 1 to 6, characterized in that, after the obtaining a candidate word set of the search term, the method further comprises:
calculating the similarity between the search term and each candidate word;
determining the candidate words whose similarity is greater than a preset similarity as an updated candidate word set;
wherein the CTR prediction model calculates the CTR value of each candidate word in the updated candidate word set.
8. A training method for a click-through rate (CTR) prediction model, characterized in that the method comprises:
obtaining a historical search term and historical exposed candidate words corresponding to the historical search term;
constructing at least one group of training samples according to an order feature of the historical search term, each group of training samples corresponding to one group consisting of the historical search term and a historical exposed candidate word, and the order feature of the historical search term characterizing the input order of the characters or words that the historical search term contains;
training the CTR prediction model with the training samples to obtain a trained CTR prediction model.
9. The method according to claim 8, characterized in that, before constructing at least one group of training samples according to the ordinal characteristics of the historical search word, the method further comprises:
Dividing the historical search word at the level of individual words to obtain each word included in the historical search word;
Obtaining the ordinal characteristics of the historical search word according to the order of the words included in the historical search word.
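A minimal sketch of how the ordinal characteristics of claims 8 and 9 might be turned into training samples is given below. The integer-id encoding, the zero padding, the `clicked` label, and the decision to encode the exposed candidate in the same way are all assumptions made for illustration; the claims only require that the input order of the words be preserved.

```python
def ordinal_features(search_word, vocab, max_len=8):
    """Encode the words of a search word as integer ids, preserving input order."""
    ids = [vocab.setdefault(w, len(vocab) + 1) for w in search_word[:max_len]]
    return ids + [0] * (max_len - len(ids))  # 0 is reserved for padding


def build_training_samples(log, max_len=8):
    """Build one sample per (search word, exposed candidate, clicked) record."""
    vocab = {}
    samples = []
    for search_word, exposed_candidate, clicked in log:
        samples.append({
            "query_ids": ordinal_features(search_word, vocab, max_len),
            "candidate_ids": ordinal_features(exposed_candidate, vocab, max_len),
            "label": int(clicked),
        })
    return samples, vocab


# Hypothetical exposure log: the same two words typed in a different order
# yield different ordinal characteristics.
samples, vocab = build_training_samples(
    [("天气", "天气预报", 1), ("气天", "天津", 0)], max_len=4
)
```

Keeping the ids in input order, rather than as an unordered bag of words, is what lets a sequence model distinguish `"天气"` from `"气天"`.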
10. The method according to claim 8, characterized in that, before constructing at least one group of training samples according to the ordinal characteristics of the historical search word, the method further comprises:
Performing phonetic notation processing on the words included in the historical search word to obtain the spelling of the words included in the historical search word;
Generating the spelling feature of the historical search word according to the spelling of the words included in the historical search word;
Wherein the training sample further includes the spelling feature of the historical search word.
11. The method according to claim 8, characterized in that, before constructing at least one group of training samples according to the ordinal characteristics of the historical search word, the method further comprises:
Performing phonetic notation processing on the words included in the historical search word to obtain the initials of the words included in the historical search word;
Generating the initial feature of the historical search word according to the initials of the words included in the historical search word;
Wherein the training sample further includes the initial feature of the historical search word.
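The spelling and initial features of claims 10 and 11 can be illustrated with the short sketch below. The `PRONUNCIATION` table is a hypothetical stand-in for a real phonetic notation dictionary, and falling back to the word itself when no entry exists is an assumption, not something the claims specify.

```python
# Hypothetical phonetic table (assumption): word -> romanized spelling.
PRONUNCIATION = {"天": "tian", "气": "qi", "预": "yu", "报": "bao"}


def spelling_feature(search_word):
    """Spelling of each word, in input order; unknown words are kept as-is."""
    return [PRONUNCIATION.get(w, w) for w in search_word]


def initial_feature(search_word):
    """Initial (first letter of the spelling) of each word, in input order."""
    return [s[0] for s in spelling_feature(search_word) if s]


spelling = spelling_feature("天气")
initials = initial_feature("天气预报")
```

Adding these two features alongside the ordinal characteristics lets the model relate a query typed as spelling or initials to candidates written in the original script.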
12. A search recommendation apparatus, characterized in that the apparatus comprises:
A search word obtaining module, configured to obtain a search word;
A candidate word obtaining module, configured to obtain a candidate word set of the search word, the candidate word set including at least one candidate word related to the search word;
A parameter generation module, configured to generate at least one group of model input parameters according to the ordinal characteristics of the search word, each group of model input parameters corresponding to one group of the search word and the candidate word, the ordinal characteristics of the search word being used to characterize the input order of the words included in the search word;
A model calling module, configured to call a click-through-rate (CTR) prediction model to calculate the CTR value of the candidate word according to the model input parameters;
A recommendation selection module, configured to select, according to the CTR value of the candidate word, a recommended candidate word corresponding to the search word from the candidate word set.
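The module chain of claim 12 can be sketched end to end as a single function, shown below. The `toy_ctr` scorer is a hypothetical placeholder for a trained CTR prediction model (it merely rewards a shared prefix), and `recommend`, `top_k`, and the sample candidates are likewise illustrative names rather than anything defined by the patent.

```python
def common_prefix_len(a, b):
    """Number of leading positions at which two strings agree."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def toy_ctr(search_word, candidate):
    """Placeholder scorer (assumption): stands in for the CTR prediction model."""
    return common_prefix_len(search_word, candidate) / (len(search_word) + len(candidate))


def recommend(search_word, candidates, predict_ctr, top_k=3):
    """Score each candidate with the CTR model and keep the top_k by CTR value."""
    scored = [(predict_ctr(search_word, c), c) for c in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in scored[:top_k]]


recommended = recommend("天气", ["天津", "天气预报", "天气"], toy_ctr, top_k=2)
```

Passing the scorer in as a parameter mirrors the separation between the model calling module and the recommendation selection module: the selection logic does not need to know how the CTR value is computed.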
13. A training apparatus for a click-through-rate (CTR) prediction model, characterized in that the apparatus comprises:
A word obtaining module, configured to obtain a historical search word and a historical exposure candidate word corresponding to the historical search word;
A sample construction module, configured to construct at least one group of training samples according to the ordinal characteristics of the historical search word, each group of training samples corresponding to one group of the historical search word and the historical exposure candidate word, the ordinal characteristics of the historical search word being used to characterize the input order of the words included in the historical search word;
A model training module, configured to train the CTR prediction model with the training samples to obtain a trained CTR prediction model.
14. A computer device, characterized in that the computer device includes a processor and a memory, the memory storing at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set being loaded and executed by the processor to implement the method according to any one of claims 1 to 7, or to implement the method according to any one of claims 8 to 11.
15. A computer-readable storage medium, characterized in that the computer-readable storage medium stores at least one instruction, at least one program, a code set, or an instruction set, and the at least one instruction, the at least one program, the code set, or the instruction set is loaded and executed by a processor to implement the method according to any one of claims 1 to 7, or to implement the method according to any one of claims 8 to 11.
CN201910675882.4A 2019-07-25 2019-07-25 Search recommendation method, training method, device and equipment of CTR (click-through rate) estimation model Active CN110390052B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910675882.4A CN110390052B (en) 2019-07-25 2019-07-25 Search recommendation method, training method, device and equipment of CTR (click-through rate) estimation model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910675882.4A CN110390052B (en) 2019-07-25 2019-07-25 Search recommendation method, training method, device and equipment of CTR (click-through rate) estimation model

Publications (2)

Publication Number Publication Date
CN110390052A (en) 2019-10-29
CN110390052B (en) 2022-10-28

Family

ID=68287267

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910675882.4A Active CN110390052B (en) 2019-07-25 2019-07-25 Search recommendation method, training method, device and equipment of CTR (click-through rate) estimation model

Country Status (1)

Country Link
CN (1) CN110390052B (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111143543A (en) * 2019-12-04 2020-05-12 北京达佳互联信息技术有限公司 Object recommendation method, device, equipment and medium
CN112084406A (en) * 2020-09-07 2020-12-15 维沃移动通信有限公司 Short message processing method and device, electronic equipment and storage medium
CN112434184A (en) * 2020-12-15 2021-03-02 四川长虹电器股份有限公司 Deep interest network sequencing method based on historical movie posters
CN112559895A (en) * 2021-02-19 2021-03-26 深圳平安智汇企业信息管理有限公司 Data processing method and device, electronic equipment and storage medium
CN113220983A (en) * 2020-02-06 2021-08-06 北京沃东天骏信息技术有限公司 Deep learning-based item selection method and device
CN113409284A (en) * 2021-06-28 2021-09-17 北京百度网讯科技有限公司 Circuit board fault detection method, device, equipment and storage medium
CN113495986A (en) * 2020-03-20 2021-10-12 华为技术有限公司 Data processing method and device
CN113626683A (en) * 2021-06-30 2021-11-09 北京三快在线科技有限公司 CTR (click-through rate) estimation processing method and device, electronic equipment and storage medium
CN114238705A (en) * 2020-09-09 2022-03-25 北京搜狗科技发展有限公司 Related search recommendation method and device and electronic equipment
CN114357205A (en) * 2021-12-24 2022-04-15 北京达佳互联信息技术有限公司 Candidate word mining method and device, electronic equipment and storage medium
CN118295968A (en) * 2022-08-01 2024-07-05 荣耀终端有限公司 File searching method and related device

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102360358A (en) * 2011-09-28 2012-02-22 百度在线网络技术(北京)有限公司 Keyword recommendation method and system
CN103365839A (en) * 2012-03-26 2013-10-23 腾讯科技(深圳)有限公司 Recommendation search method and device for search engines
WO2013166916A1 (en) * 2012-05-07 2013-11-14 深圳市世纪光速信息技术有限公司 Information search method and server
CN104636334A (en) * 2013-11-06 2015-05-20 阿里巴巴集团控股有限公司 Keyword recommending method and device
CN106339404A (en) * 2016-06-30 2017-01-18 北京奇艺世纪科技有限公司 Search word recognition method and device
CN106339510A (en) * 2016-10-28 2017-01-18 北京百度网讯科技有限公司 The click prediction method and device based on artificial intelligence
CN106547887A (en) * 2016-10-27 2017-03-29 北京百度网讯科技有限公司 Method and apparatus is recommended in search based on artificial intelligence
CN107273404A (en) * 2017-04-26 2017-10-20 努比亚技术有限公司 Appraisal procedure, device and the computer-readable recording medium of search engine
CN108984656A (en) * 2018-06-28 2018-12-11 北京春雨天下软件有限公司 Medicine label recommendation method and device
CN109145245A (en) * 2018-07-26 2019-01-04 腾讯科技(深圳)有限公司 Predict method, apparatus, computer equipment and the storage medium of clicking rate
CN109189990A (en) * 2018-07-25 2019-01-11 北京奇艺世纪科技有限公司 A kind of generation method of search term, device and electronic equipment
CN109582903A (en) * 2018-11-29 2019-04-05 广州市百果园信息技术有限公司 A kind of method, apparatus, equipment and storage medium that information is shown

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
YIQUN LIU 等: "How do users describe their information need: Query recommendation based on snippet click model", 《EXPERT SYSTEMS WITH APPLICATION》 *
王忠义 等: "基于点击率预测的微信公众号广告精准投放研究", 《知识管理论坛》 *
石雁 等: "基于朴素贝叶斯点击预测的查询推荐方法", 《计算机应用与软件》 *
纪文迪 等: "广告点击率估算技术综述", 《华东师范大学学报(自然科学版)》 *

Cited By (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111143543A (en) * 2019-12-04 2020-05-12 北京达佳互联信息技术有限公司 Object recommendation method, device, equipment and medium
CN113220983A (en) * 2020-02-06 2021-08-06 北京沃东天骏信息技术有限公司 Deep learning-based item selection method and device
CN113495986A (en) * 2020-03-20 2021-10-12 华为技术有限公司 Data processing method and device
CN112084406A (en) * 2020-09-07 2020-12-15 维沃移动通信有限公司 Short message processing method and device, electronic equipment and storage medium
CN114238705A (en) * 2020-09-09 2022-03-25 北京搜狗科技发展有限公司 Related search recommendation method and device and electronic equipment
CN112434184A (en) * 2020-12-15 2021-03-02 四川长虹电器股份有限公司 Deep interest network sequencing method based on historical movie posters
CN112434184B (en) * 2020-12-15 2022-03-01 四川长虹电器股份有限公司 Deep interest network sequencing method based on historical movie posters
CN112559895A (en) * 2021-02-19 2021-03-26 深圳平安智汇企业信息管理有限公司 Data processing method and device, electronic equipment and storage medium
CN112559895B (en) * 2021-02-19 2021-05-18 深圳平安智汇企业信息管理有限公司 Data processing method and device, electronic equipment and storage medium
CN113409284A (en) * 2021-06-28 2021-09-17 北京百度网讯科技有限公司 Circuit board fault detection method, device, equipment and storage medium
CN113409284B (en) * 2021-06-28 2023-10-20 北京百度网讯科技有限公司 Circuit board fault detection method, device, equipment and storage medium
CN113626683A (en) * 2021-06-30 2021-11-09 北京三快在线科技有限公司 CTR (click-through rate) estimation processing method and device, electronic equipment and storage medium
CN113626683B (en) * 2021-06-30 2023-05-30 北京三快在线科技有限公司 CTR (click-through rate) estimation processing method and device, electronic equipment and storage medium
CN114357205A (en) * 2021-12-24 2022-04-15 北京达佳互联信息技术有限公司 Candidate word mining method and device, electronic equipment and storage medium
CN118295968A (en) * 2022-08-01 2024-07-05 荣耀终端有限公司 File searching method and related device

Also Published As

Publication number Publication date
CN110390052B (en) 2022-10-28

Similar Documents

Publication Publication Date Title
CN110390052A (en) Search recommendation method, training method of CTR prediction model, device and equipment
US11334635B2 (en) Domain specific natural language understanding of customer intent in self-help
US20220245213A1 (en) Content recommendation method and apparatus, electronic device, and storage medium
CN112131350B (en) Text label determining method, device, terminal and readable storage medium
CN110032632A (en) Intelligent customer service answering method, device and storage medium based on text similarity
Hillard et al. Computer-assisted topic classification for mixed-methods social science research
CN111143576A (en) Event-oriented dynamic knowledge graph construction method and device
CN111382361A (en) Information pushing method and device, storage medium and computer equipment
CN106355446B (en) A kind of advertisement recommender system of network and mobile phone games
Shen et al. A voice of the customer real-time strategy: An integrated quality function deployment approach
CN112148831B (en) Image-text mixed retrieval method and device, storage medium and computer equipment
EP3732592A1 (en) Intelligent routing services and systems
CN111563158A (en) Text sorting method, sorting device, server and computer-readable storage medium
Bedi et al. CitEnergy: A BERT based model to analyse Citizens’ Energy-Tweets
CN112633690A (en) Service personnel information distribution method, service personnel information distribution device, computer equipment and storage medium
Xiong et al. DNCP: An attention-based deep learning approach enhanced with attractiveness and timeliness of News for online news click prediction
CN116010696A (en) News recommendation method, system and medium integrating knowledge graph and long-term interest of user
Smailović Sentiment analysis in streams of microblogging posts
CN114255067A (en) Data pricing method and device, electronic equipment and storage medium
Voronov et al. Forecasting popularity of news article by title analyzing with BN-LSTM network
Gupta et al. Real-time sentiment analysis of tweets: A case study of Punjab elections
Wang et al. Feeding what you need by understanding what you learned
Liu Python Machine Learning By Example: Implement machine learning algorithms and techniques to build intelligent systems
Liu et al. Answer Keyword Generation for Community Question Answering by Multiaspect Gamma–Poisson Matrix Completion
Hussain et al. A Novel Data Extraction Framework Using Natural Language Processing (DEFNLP) Techniques

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant