CN107220384A - Correlation-based search term processing method, apparatus, and computing device - Google Patents
- Publication number
- CN107220384A (application number CN201710515009.XA)
- Authority
- CN
- China
- Prior art keywords
- search word
- word
- keyword
- keyword sequence
- search
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a correlation-based search term processing method, apparatus, and computing device. The method includes: obtaining the search log of each user and extracting available search terms from it; performing word segmentation on each available search term to obtain one or more corresponding feature words; converting the feature words to generate corresponding keywords, and combining the one or more keywords to form the keyword sequence corresponding to the available search term; from the available search terms corresponding to each keyword sequence, selecting the one with the highest frequency of occurrence as the predetermined search term of that keyword sequence; inputting each keyword sequence into a correlation calculation model for training, and outputting, in descending order of correlation, a first quantity of keyword sequences related to the input keyword sequence; and replacing each of the first quantity of output keyword sequences with its corresponding predetermined search term, thereby forming a correspondence between the keyword sequence and the first quantity of predetermined search terms.
Description
Technical field
The present invention relates to the field of Internet technology, and in particular to a correlation-based search term processing method, apparatus, and computing device.
Background art
With the rapid development of Internet technology, more and more people enjoy the conveniences that the Internet brings to work and life. For example, when information is needed, a user can type a search term into a browser and use a search engine to find information related to that term. When searching for a keyword, the user often also intends to search for its associated keywords. For example, when the user enters "java", there may be keywords that better match the user's intent, such as "java web" or "java backend". Therefore, accurately offering related terms for a user's keyword, based on the relationships between different keywords, can save the user input time and improve the conversion rate.
The current mainstream approach combines follow-up terms with collaborative filtering. The main idea is as follows: suppose a user enters "Records of the Three Kingdoms" and, within a few minutes of obtaining the search results, enters "Dynasty Warriors" as a follow-up term. Query entries that share the same follow-up term are considered to have a certain similarity, so, given enough user input data, related search terms for these entries can be provided through collaborative filtering. However, this approach still has considerable shortcomings, and the problems are especially pronounced in the on-site search of recruitment industry websites.
Compared with large websites, search in the recruitment industry has a smaller data volume and user query entries are highly homogeneous, so many entries may have no follow-up term. Moreover, recruiters as users do not satisfy the precondition that "all search terms of the same user are related": the searches of such users are often entirely unrelated to one another, and follow-up terms fail in this case. In addition, popular terms such as "java" and "product manager" often become the follow-up terms of other terms, which disadvantages less popular related terms. Penalizing popular terms, however, requires manually set weights, which makes the project more difficult and has proved hard to control in actual projects.
Summary of the invention
To this end, the present invention provides a technical solution for correlation-based search term processing, in an effort to solve, or at least alleviate, the problems described above.
According to one aspect of the present invention, a correlation-based search term processing method is provided, suitable for execution in a computing device. The method comprises the following steps: obtaining the search log of each of a plurality of users, and extracting available search terms from the search logs; performing word segmentation on each available search term to obtain one or more corresponding feature words; converting the one or more feature words respectively to generate corresponding keywords, and combining the one or more corresponding keywords to form the keyword sequence corresponding to the available search term; from the available search terms corresponding to each keyword sequence, selecting the available search term with the highest frequency of occurrence as the predetermined search term of that keyword sequence; inputting each keyword sequence into a correlation calculation model for training, and outputting, in descending order of correlation, a first quantity of keyword sequences related to the input keyword sequence; and replacing each of the first quantity of output keyword sequences with its corresponding predetermined search term, thereby forming a correspondence between the keyword sequence and the first quantity of predetermined search terms.
Optionally, in the correlation-based search term processing method according to the present invention, the step of extracting available search terms from the search logs includes: obtaining the original search terms from a search log and counting their quantity; if the quantity is greater than a first value, directly deleting the original search terms of the user to whom the quantity corresponds; counting the number of searches of each original search term that has not been deleted; and filtering out the original search terms whose number of searches is less than a second value, the remaining original search terms being taken as the available search terms.
Optionally, in the correlation-based search term processing method according to the present invention, the step of converting the one or more feature words respectively to generate corresponding keywords includes: removing, from the one or more feature words, the feature words that are meaningless words or sensitive words; and performing synonym conversion on the feature words remaining after removal to generate the corresponding keywords.
Optionally, in the correlation-based search term processing method according to the present invention, the step of combining the one or more corresponding keywords to form the keyword sequence corresponding to the available search term includes: arranging the one or more corresponding keywords in ascending text order; and, for the arranged keywords, joining each pair of adjacent keywords with a first symbol to form the keyword sequence corresponding to the available search term.
Optionally, in the correlation-based search term processing method according to the present invention, the first symbol is an underscore.
Optionally, in the correlation-based search term processing method according to the present invention, after the step of forming the keyword sequence corresponding to the available search term, the method further includes: counting the number of times each keyword sequence recurs; if the number of times is less than a first value, removing the corresponding keyword sequence; and if the number of times is not less than the first value, retaining the corresponding keyword sequence.
Optionally, in the correlation-based search term processing method according to the present invention, when a query search term typed by a user is received, the method further includes: processing the query search term to form the keyword sequence corresponding to the query search term; obtaining the corresponding first quantity of predetermined search terms according to that keyword sequence, and selecting from them the top-ranked second quantity of predetermined search terms, the second quantity being not greater than the first quantity; and recommending the second quantity of predetermined search terms to the user as terms related to the query search term.
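The query-time lookup described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `recommend`, the callable `to_keyword_sequence` (standing in for the segmentation, conversion, and combination steps), and the contents of `relation` are all hypothetical.

```python
def recommend(query, to_keyword_sequence, relation, second_quantity):
    """Return up to `second_quantity` related terms for a typed query.

    relation: {keyword sequence: predetermined search terms, already ordered
    by descending correlation and at most `first quantity` long} (assumed shape).
    """
    sequence = to_keyword_sequence(query)
    # An unseen keyword sequence simply yields no recommendations.
    related = relation.get(sequence, [])
    # Slicing keeps the top-ranked entries; the second quantity is not
    # greater than the first quantity, so the slice never needs padding.
    return related[:second_quantity]


# Hypothetical data for illustration only.
relation = {"java_engineer": ["java backend", "java webpage", "java backend script"]}
print(recommend("java engineer", lambda q: q.replace(" ", "_"), relation, 2))
```

Keeping the stored lists pre-sorted means the recommendation step is a dictionary lookup plus a slice, so no correlation computation is needed at query time.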
According to another aspect of the present invention, a correlation-based search term processing apparatus is provided, suitable for residing in a computing device. The apparatus includes an extraction module, a word segmentation module, a conversion module, a selection module, a training module, and a replacement module. The extraction module is adapted to obtain the search log of each of a plurality of users and extract available search terms from the search logs. The word segmentation module is adapted to perform word segmentation on each available search term to obtain one or more corresponding feature words. The conversion module is adapted to convert the one or more feature words respectively to generate corresponding keywords, and to combine the one or more corresponding keywords to form the keyword sequence corresponding to the available search term. The selection module is adapted to select, from the available search terms corresponding to each keyword sequence, the one with the highest frequency of occurrence as the predetermined search term of that keyword sequence. The training module is adapted to input each keyword sequence into a correlation calculation model for training and to output, in descending order of correlation, a first quantity of keyword sequences related to the input keyword sequence. The replacement module is adapted to replace each of the first quantity of output keyword sequences with its corresponding predetermined search term, thereby forming a correspondence between the keyword sequence and the first quantity of predetermined search terms.
Optionally, in the correlation-based search term processing apparatus according to the present invention, the extraction module is further adapted to: obtain the original search terms from a search log and count their quantity; when the quantity is greater than a first value, directly delete the original search terms of the corresponding user; count the number of searches of each original search term that has not been deleted; and filter out the original search terms whose number of searches is less than a second value, taking the remaining original search terms as the available search terms.
Optionally, in the correlation-based search term processing apparatus according to the present invention, the conversion module is further adapted to: remove, from the one or more feature words, the feature words that are meaningless words or sensitive words; and perform synonym conversion on the feature words remaining after removal to generate the corresponding keywords.
Optionally, in the correlation-based search term processing apparatus according to the present invention, the conversion module is further adapted to: arrange the one or more corresponding keywords in ascending text order; and, for the arranged keywords, join each pair of adjacent keywords with a first symbol to form the keyword sequence corresponding to the available search term.
Optionally, in the correlation-based search term processing apparatus according to the present invention, the first symbol is an underscore.
Optionally, the correlation-based search term processing apparatus according to the present invention further includes a processing module adapted to: count the number of times each keyword sequence recurs; when the number of times is less than a first value, remove the corresponding keyword sequence; and when the number of times is not less than the first value, retain the corresponding keyword sequence.
Optionally, the correlation-based search term processing apparatus according to the present invention further includes a recommendation module adapted to: when a query search term typed by a user is received, process the query search term to form the keyword sequence corresponding to the query search term; obtain the corresponding first quantity of predetermined search terms according to that keyword sequence, and select from them the top-ranked second quantity of predetermined search terms, the second quantity being not greater than the first quantity; and recommend the second quantity of predetermined search terms to the user as terms related to the query search term.
According to another aspect of the present invention, a computing device is provided, including the correlation-based search term processing apparatus according to the present invention.
According to another aspect of the present invention, a computing device is provided, including one or more processors, a memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, and the one or more programs include instructions for performing the correlation-based search term processing method according to the present invention.
According to another aspect of the present invention, a computer-readable storage medium storing one or more programs is also provided. The one or more programs include instructions which, when executed by a computing device, cause the computing device to perform the correlation-based search term processing method according to the present invention.
According to the technical solution for correlation-based search term processing of the present invention, word segmentation is first performed on each available search term of a user to obtain one or more corresponding feature words; each feature word is converted to generate a corresponding keyword; and the keywords are combined to form the keyword sequence corresponding to the available search term. From the available search terms corresponding to each keyword sequence, the one with the highest frequency of occurrence is selected as the predetermined search term of that keyword sequence. Each keyword sequence is input into a correlation calculation model for training, and a first quantity of keyword sequences related to the input keyword sequence is output in descending order of correlation. Each of the first quantity of output keyword sequences is replaced with its corresponding predetermined search term, forming a correspondence between the keyword sequence and the first quantity of predetermined search terms. In this technical solution, the correlation calculation model considers only the distance between search terms, and when the window is set to infinity, a user's irregular searches do not affect the correlation calculation. The model also has a clear advantage in handling less popular terms, with no need to manually adjust the weights of popular terms. In addition, after the keyword sequences corresponding to the available search terms are formed, the number of times each keyword sequence recurs is counted, and keyword sequences whose count is less than the first value are removed, so that subsequent processing need not be performed on all keyword sequences, which reduces computational complexity and time cost. Furthermore, a user's available search terms are extracted in advance from the search logs, and junk users and search data with low search counts can be filtered out during extraction, which further increases processing speed while keeping the results valid and accurate.
Brief description of the drawings
To achieve the foregoing and related ends, certain illustrative aspects are described herein in conjunction with the following description and the accompanying drawings. These aspects indicate various ways in which the principles disclosed herein may be practiced, and all aspects and their equivalents are intended to fall within the scope of the claimed subject matter. The above and other objects, features, and advantages of the present disclosure will become more apparent from the following detailed description read in conjunction with the drawings. Throughout the disclosure, the same reference numerals generally refer to the same parts or elements.
Fig. 1 shows a structural block diagram of a computing device 100 according to an embodiment of the present invention;
Fig. 2 shows a flowchart of a correlation-based search term processing method 200 according to an embodiment of the present invention;
Fig. 3 shows a schematic diagram of a correlation-based search term processing apparatus 300 according to an embodiment of the present invention;
Fig. 4 shows a schematic diagram of a correlation-based search term processing apparatus 400 according to another embodiment of the present invention; and
Fig. 5 shows a schematic diagram of a correlation-based search term processing apparatus 500 according to yet another embodiment of the present invention.
Detailed description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and its scope fully conveyed to those skilled in the art.
Fig. 1 is a block diagram of an example computing device 100. In a basic configuration 102, the computing device 100 typically includes a system memory 106 and one or more processors 104. A memory bus 108 may be used for communication between the processors 104 and the system memory 106.
Depending on the desired configuration, the processor 104 may be any type of processor, including but not limited to a microprocessor (μP), a microcontroller (μC), a digital signal processor (DSP), or any combination thereof. The processor 104 may include one or more levels of cache, such as a level-one cache 110 and a level-two cache 112, as well as a processor core 114 and registers 116. An example processor core 114 may include an arithmetic logic unit (ALU), a floating-point unit (FPU), a digital signal processing core (DSP core), or any combination thereof. An example memory controller 118 may be used together with the processor 104, or in some implementations the memory controller 118 may be an internal part of the processor 104.
Depending on the desired configuration, the system memory 106 may be any type of memory, including but not limited to volatile memory (such as RAM), non-volatile memory (such as ROM or flash memory), or any combination thereof. The system memory 106 may include an operating system 120, one or more applications 122, and program data 124. In some embodiments, the applications 122 may be arranged to operate with the program data 124 on the operating system.
The computing device 100 may also include an interface bus 140 that facilitates communication from various interface devices (for example, output devices 142, peripheral interfaces 144, and communication devices 146) to the basic configuration 102 via a bus/interface controller 130. Example output devices 142 include a graphics processing unit 148 and an audio processing unit 150, which may be configured to communicate with various external devices such as a display or speakers via one or more A/V ports 152. Example peripheral interfaces 144 may include a serial interface controller 154 and a parallel interface controller 156, which may be configured to communicate via one or more I/O ports 158 with external devices such as input devices (for example, a keyboard, mouse, pen, voice input device, or touch input device) or other peripherals (for example, a printer or scanner). An example communication device 146 may include a network controller 160, which may be arranged to facilitate communication with one or more other computing devices 162 over a network communication link via one or more communication ports 164.
A network communication link may be one example of a communication medium. Communication media may typically be embodied as computer-readable instructions, data structures, or program modules in a modulated data signal such as a carrier wave or other transport mechanism, and may include any information delivery media. A "modulated data signal" may be a signal that has one or more of its characteristics set or changed in such a manner as to encode information in the signal. As a non-limiting example, communication media may include wired media such as a wired network or a dedicated-line network, and various wireless media such as acoustic, radio frequency (RF), microwave, infrared (IR), or other wireless media. The term computer-readable medium as used herein may include both storage media and communication media.
The computing device 100 may be implemented as a server, such as a file server, database server, application server, or web server. It may also be implemented as part of a small form-factor portable (or mobile) electronic device, such as a cell phone, a personal digital assistant (PDA), a personal media player device, a wireless web-browsing device, a personal headset device, an application-specific device, or a hybrid device that includes any of the above functions. The computing device 100 may also be implemented as a personal computer, in either desktop or notebook configurations. In some embodiments, the computing device 100 is implemented as a server configured to perform the correlation-based search term processing method 200 according to the present invention. The applications 122 include the correlation-based search term processing apparatus 300 according to the present invention.
Fig. 2 shows a flowchart of the correlation-based search term processing method 200 according to an embodiment of the present invention. The method 200 is suitable for execution in a computing device implemented as a server (such as the computing device 100 shown in Fig. 1).
As shown in Fig. 2, the method 200 begins at step S210. In step S210, the search log of each of a plurality of users is obtained, and available search terms are extracted from the search logs. According to one embodiment of the present invention, available search terms may be extracted from the search logs as follows. First, the original search terms are obtained from a search log and their quantity is counted; if the quantity is greater than a first value, the original search terms of the corresponding user are directly deleted. Then, the number of searches of each original search term that has not been deleted is counted, the original search terms whose number of searches is less than a second value are filtered out, and the remaining original search terms are taken as the available search terms. The first value is preferably 200 and the second value is preferably 3. In this embodiment, the search log of user A contains 150 original search terms in total, the search log of user B contains 237, and the search log of user C contains 89. Because the number of original search terms in user B's search log is greater than 200, user B's original search terms are directly deleted, which is equivalent to treating user B as a junk user and filtering that user out in advance. The 150 original search terms of user A and the 89 original search terms of user C are retained, and the number of searches of each original search term is then counted: 115 of user A's original search terms and 56 of user C's were searched fewer than 3 times. These are filtered out, and the remaining 68 original search terms (35 from user A and 33 from user C) are taken as the available search terms. Note that the search logs obtained cover one year of each user's activity: the full data set is obtained first and is then updated incrementally every day.
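The two filtering passes of step S210 can be sketched as below. This is a minimal sketch under stated assumptions: the function name and the log layout (one list entry per search, so a repeated term appears repeatedly) are illustrative choices, and search counts are assumed to be taken across all retained users; the source does not specify either.

```python
from collections import Counter


def extract_available_terms(logs, first_value=200, second_value=3):
    """Return {user: available search terms} from {user: original search terms}.

    Assumption: each log is a list with one entry per search.
    """
    # Pass 1: drop every user whose log holds more than `first_value`
    # original search terms (the text treats such a user as a junk user).
    kept = {user: terms for user, terms in logs.items() if len(terms) <= first_value}

    # Pass 2: count the searches of each surviving term across retained users.
    counts = Counter(term for terms in kept.values() for term in terms)

    # Keep the distinct terms searched at least `second_value` times.
    return {
        user: sorted({term for term in terms if counts[term] >= second_value})
        for user, terms in kept.items()
    }


logs = {
    "A": ["java", "java", "java", "php"],
    "B": ["spam"] * 201,           # more than 200 entries: removed entirely
    "C": ["java", "php", "go"],
}
print(extract_available_terms(logs))  # {'A': ['java'], 'C': ['java']}
```

The defaults 200 and 3 are the preferred first and second values given in the text.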
Next, in step S220, word segmentation is performed on each available search term to obtain one or more corresponding feature words. According to one embodiment of the present invention, the Jieba word segmentation tool is used. For example, when the available search term is "java software development engineer", the result of word segmentation is the three feature words "java", "software development", and "engineer". Note that the present invention does not limit the tool or algorithm used for word segmentation, provided that it segments accurately. Such alternatives are readily conceivable to those skilled in the art who understand the solution of the present invention, fall within its scope of protection, and are not described further here.
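Since the text leaves the segmentation tool open, the step can be illustrated with a toy segmenter. The sketch below uses forward maximum matching against a small vocabulary; it is a stand-in for illustration only and is not how Jieba works internally. The function name, the vocabulary, and the English rendering of the search term are assumptions.

```python
def segment(term, vocabulary, max_len=32):
    """Split `term` into feature words by forward maximum matching."""
    words = []
    i = 0
    while i < len(term):
        if term[i].isspace():
            i += 1  # whitespace only separates words
            continue
        # Try the longest candidate first and shrink until a vocabulary hit.
        for j in range(min(len(term), i + max_len), i, -1):
            if term[i:j] in vocabulary:
                words.append(term[i:j])
                i = j
                break
        else:
            # Nothing in the vocabulary starts here: emit a single character.
            words.append(term[i])
            i += 1
    return words


vocabulary = {"java", "software development", "engineer"}
print(segment("java software development engineer", vocabulary))
# ['java', 'software development', 'engineer']
```

A production system would use a real segmenter with a domain dictionary; the point here is only the shape of the input and output of step S220.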
After the feature words are obtained, step S230 is performed: the one or more feature words are converted respectively to generate corresponding keywords, and the one or more corresponding keywords are combined to form the keyword sequence corresponding to the available search term. According to one embodiment of the present invention, the feature words are converted as follows: first, feature words that are meaningless words or sensitive words are removed from the one or more feature words; then synonym conversion is performed on the remaining feature words to generate the corresponding keywords. In this embodiment, null characters among the feature words, such as " ", "\t", and "\n", as well as search terms consisting entirely of digits, can be removed directly as meaningless words. Synonym conversion then typically uses a corresponding dictionary to convert certain search terms unconditionally into a synonym. For example, "development", "development engineer", "software engineer", "software development", and "software development engineer" are all converted to "engineer", so "engineer" is the corresponding keyword; "supervisor", "head", "director", "chief inspector", and "tl" are all converted to "leader", so "leader" is the corresponding keyword. Accordingly, the three feature words "java", "software development", and "engineer" generated in step S220 correspond to the keywords "java", "engineer", and "engineer" respectively; because the last two are identical, the keywords finally obtained are "java" and "engineer". The conversion of feature words is not limited to the above, and the conversion rules may be adjusted as appropriate to the application scenario. Such adjustments are readily conceivable to those skilled in the art who understand the solution of the present invention, fall within its scope of protection, and are not described further here.
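The conversion of feature words into keywords can be sketched as a filter followed by a dictionary lookup. This is an illustrative sketch: the synonym entries are English renderings of the examples in the text, while the empty sensitive-word set and the function names are hypothetical placeholders.

```python
# Synonym dictionary built from the examples in the text (English renderings).
SYNONYMS = {
    "development": "engineer",
    "development engineer": "engineer",
    "software engineer": "engineer",
    "software development": "engineer",
    "software development engineer": "engineer",
    "supervisor": "leader",
    "head": "leader",
    "director": "leader",
    "chief inspector": "leader",
    "tl": "leader",
}
SENSITIVE_WORDS = set()  # hypothetical: a real deployment would supply its own list


def is_meaningless(word):
    # Whitespace-only strings (" ", "\t", "\n") and all-digit strings are dropped.
    return word.strip() == "" or word.isdigit()


def to_keywords(feature_words, synonyms=SYNONYMS, sensitive=SENSITIVE_WORDS):
    """Drop meaningless or sensitive feature words, map synonyms, and deduplicate."""
    keywords = []
    for word in feature_words:
        if is_meaningless(word) or word in sensitive:
            continue
        keyword = synonyms.get(word, word)  # words without an entry map to themselves
        if keyword not in keywords:         # identical keywords are kept only once
            keywords.append(keyword)
    return keywords


print(to_keywords(["java", "software development", "engineer"]))  # ['java', 'engineer']
```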
After the feature words have been converted into keywords, the one or more keywords are combined to form the keyword sequence corresponding to the available search term. According to one embodiment of the present invention, the keyword sequence may be formed as follows. First, the one or more corresponding keywords are arranged in ascending text order; then, for the arranged keywords, each pair of adjacent keywords is joined with a first symbol to form the keyword sequence. The first symbol is an underscore. In this embodiment, the two keywords "java" and "engineer" are arranged in ascending text order, with "java" first and "engineer" second (the keyword rendered here as "engineer" is Chinese text in the original, which sorts after Latin letters). Joining the two with an underscore gives "java_engineer" as the keyword sequence corresponding to the available search term "java software development engineer".
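The sort-and-join step is small enough to show directly. The sketch below assumes that "ascending text order" means ordering by Unicode code point, which is what Python's `sorted` does for strings and which matches the ordering of the example sequences. The Chinese keywords used in the demonstration are assumed originals of "engineer", "backend", and "script"; with English renderings the order would differ.

```python
def to_keyword_sequence(keywords, first_symbol="_"):
    """Sort keywords in ascending text order and join neighbours with `first_symbol`."""
    # sorted() compares strings by Unicode code point, so Latin-script
    # keywords such as "java" come before Chinese keywords.
    return first_symbol.join(sorted(keywords))


# Assumed original keywords: 工程师 ("engineer"), 后台 ("backend"), 脚本 ("script").
print(to_keyword_sequence(["工程师", "java"]))        # java_工程师
print(to_keyword_sequence(["脚本", "java", "后台"]))  # java_后台_脚本
```

Sorting before joining makes the sequence independent of the order in which the user typed the words, so differently ordered queries collapse onto one keyword sequence.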
To further reduce computational complexity, some keyword sequences may also be filtered out according to how often they recur once the keyword sequences have been obtained. According to another embodiment of the present invention, after the keyword sequences corresponding to the available search terms are formed, the number of times each keyword sequence recurs is counted; if the number is less than a first value, the corresponding keyword sequence is removed, and if it is not less than the first value, the keyword sequence is retained. Here the first value is preferably 20. In this embodiment, the keyword sequence "java_engineer" recurs 39 times, which is not less than 20, so it is retained. Note that removing keyword sequences is a step usually performed only where the application scenario requires it; whether removal is needed should be decided in light of the current scenario, and the first value should likewise be set according to actual conditions.
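The optional frequency filter can be sketched with a counter. The function name is an illustrative choice, the default threshold of 20 is the preferred value given in the text, and the second keyword sequence in the demonstration is made-up data.

```python
from collections import Counter


def filter_sequences(sequences, threshold=20):
    """Keep only keyword sequences that recur at least `threshold` times."""
    counts = Counter(sequences)
    # "Not less than the threshold" means the comparison is inclusive.
    return [seq for seq in sequences if counts[seq] >= threshold]


# "java_engineer" recurs 39 times as in the text; "java_webpage" is hypothetical.
sequences = ["java_engineer"] * 39 + ["java_webpage"] * 5
print(sorted(set(filter_sequences(sequences))))  # ['java_engineer']
```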
In step S240, from the available search terms corresponding to each keyword sequence, the one with the highest frequency of occurrence is selected as the predetermined search term of that keyword sequence. To illustrate step S240 and the subsequent processing steps, according to another embodiment of the present invention, step S210 yields the available search terms "java webpage", "java programmer", and "java backend script" for user A; "java programmer" and "java backend" for user B; and "java engineer" and "java backend" for user C. Steps S220 and S230 yield the keyword sequence corresponding to each available search term, as shown in Table 1:
Table 1
Next, the frequency of occurrence of the available search terms corresponding to each keyword sequence in Table 1 is counted, as shown in Table 2:
Table 2
According to the frequency of occurrence of each available search word in Table 2, the available search word with the highest frequency is selected as the predetermined search word of each keyword sequence. For the keyword sequence "java_engineer", the available search word "java programmer" occurs 2 times, more often than the available search word "java engineer", so its predetermined search word is "java programmer". Table 3 shows an example of keyword sequences and their corresponding predetermined search words according to an embodiment of the invention:
Table 3
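The selection in step S240 can be sketched as follows. The user search lists come from the example; the mapping from available search words to keyword sequences stands in for Table 1, whose contents are not reproduced here, and is inferred from the keyword sequences named in the text, so treat it as an assumption. The function and variable names are hypothetical.

```python
from collections import Counter, defaultdict

# Assumed stand-in for Table 1: available search word -> keyword sequence.
MAPPING_TABLE = {
    "java webpage": "java_webpage",
    "java programmer": "java_engineer",  # "programmer" converted to its synonym
    "java engineer": "java_engineer",
    "java backend": "java_backend",
    "java backend script": "java_backend_script",
}

# Available search words per user, as listed in the example.
USER_SEARCHES = {
    "A": ["java webpage", "java programmer", "java backend script"],
    "B": ["java programmer", "java backend"],
    "C": ["java engineer", "java backend"],
}


def select_predetermined_words(user_searches, mapping_table):
    """For each keyword sequence, pick its most frequent available search word."""
    counts = defaultdict(Counter)
    for words in user_searches.values():
        for word in words:
            counts[mapping_table[word]][word] += 1
    return {seq: c.most_common(1)[0][0] for seq, c in counts.items()}


MODE_REDUCTION_TABLE = select_predetermined_words(USER_SEARCHES, MAPPING_TABLE)
```

For "java_engineer" the counts are 2 for "java programmer" and 1 for "java engineer", so the former is chosen, matching the outcome stated in the text.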
Here, the correspondence between available search words and keyword sequences in Table 2 forms a new relation table, denoted the mapping table, and the correspondence between keyword sequences and predetermined search words in Table 3 forms another new relation table, denoted the mode reduction table, for use in subsequent steps.
Next, in step S250, each keyword sequence is input into a correlation calculation model for training, and a first quantity of keyword sequences related to the input keyword sequence are output in descending order of correlation. The first quantity is preferably 20. Of course, when the number of related keyword sequences is less than the first quantity, for example less than 20, all keyword sequences related to the input keyword sequence are output in descending order of correlation. According to one embodiment of the invention, the item2vec model is selected as the correlation calculation model, and the keyword sequences "java_webpage", "java_engineer", "java_backend" and "java_backend_script" obtained through step S240 are each input into the item2vec model for training.
The item2vec model differs from the word2vec model. The word2vec model treats a sentence as an ordered sequence of words, whereas the item2vec model discards the positional information of the words in the sentence and treats the sentence as a set of words. Moreover, the word2vec model takes only the words within the context window as context, whereas the item2vec model treats all words in a sample as the context of any given word; in other words, the context window of the item2vec model is regarded as infinite. It follows that if the context window of a word2vec model is set to a very large positive integer, that word2vec model can be trained as an item2vec model.
In this embodiment, the Gensim tool is used to call the word2vec model to train on each keyword sequence, with the following parameter settings: model vector dimension vecSize=200, number of training iterations itemNum=200, and context window window=1000000. The context window is set to 1000000 because that value exceeds the number of initial search words, so for every initial search word the context is the whole document, which realizes the item2vec model. After the correlation calculation model has been trained on each keyword sequence, the correlation coefficients obtained for the other keyword sequences associated with each keyword sequence are taken as the correlation; the correlation coefficient ranges from 0 to 1. Table 4 shows an example of keyword sequence correlation according to an embodiment of the invention, before any sorting has been applied:
Table 4
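The effect of the context window on what counts as context can be made concrete with a small sketch. It only enumerates (target, context) pairs and does not train anything; the function name is hypothetical, and treating one user's keyword sequences as a single training sample is an assumption about how the samples are formed.

```python
def context_pairs(items, window):
    """List (target, context) pairs using a symmetric context window.

    When `window` is at least as large as the sample, every other item in
    the sample becomes context, which is the item2vec-style set view.
    """
    pairs = []
    for i, target in enumerate(items):
        lo = max(0, i - window)
        hi = min(len(items), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((target, items[j]))
    return pairs


# Assumed sample: the keyword sequences of user A from the example.
sample = ["java_webpage", "java_engineer", "java_backend_script"]
narrow = context_pairs(sample, window=1)        # word2vec-style local context
wide = context_pairs(sample, window=1000000)    # effectively unbounded context
```

With `window=1` the first and last items never see each other, while with the very large window every item is paired with every other item regardless of position. In Gensim, the analogous setting is the `window` argument of `gensim.models.Word2Vec`.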
As shown in Table 4, coef1, coef2, coef3 and coef4 denote the corresponding correlation values, which are 0.75, 0.35, 0.86 and 0.61 respectively. The output keyword sequences are sorted according to these values, which finally gives the relevance-ranked keyword sequence correlation shown in Table 5:
Table 5
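Sorting by correlation and truncating to the first quantity can be sketched as below. The four coefficient values come from the text, but the tables that pair them with keyword sequences are not reproduced here, so the keys `seq_a` to `seq_d` are placeholders; the function name is also hypothetical.

```python
def top_related(correlations, first_quantity=20):
    """Return related keyword sequences in descending order of correlation,
    keeping at most `first_quantity` of them."""
    ranked = sorted(correlations.items(), key=lambda kv: kv[1], reverse=True)
    return [seq for seq, _ in ranked[:first_quantity]]


# Placeholder keys paired with coef1..coef4 from the text.
correlations = {"seq_a": 0.75, "seq_b": 0.35, "seq_c": 0.86, "seq_d": 0.61}
ranked = top_related(correlations)
```

Because the slice simply stops at the end of the list, an input with fewer than `first_quantity` related sequences returns all of them, which matches the fallback described for step S250.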
Finally, in step S260, the first quantity of output keyword sequences are replaced with their corresponding predetermined search words, forming the correspondence between each keyword sequence and the first quantity of predetermined search words. According to an embodiment of the invention, with reference to the mode reduction table, the output keyword sequences "java_engineer", "java_backend", "java_webpage" and "java_backend_script" are replaced in turn with their predetermined search words, namely "java programmer", "java backend", "java webpage" and "java backend script" respectively. Table 6 shows an example of the correspondence between keyword sequences and predetermined search words according to an embodiment of the invention:
Table 6
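The replacement step amounts to a lookup in the mode reduction table. The sketch below assumes the table contents implied by the example and uses a hypothetical function name; the related list for "java_webpage" follows the recommendation result given later in the description.

```python
# Keyword sequence -> predetermined search word, as implied by the example.
MODE_REDUCTION = {
    "java_engineer": "java programmer",
    "java_backend": "java backend",
    "java_webpage": "java webpage",
    "java_backend_script": "java backend script",
}


def replace_with_predetermined(related, reduction_table):
    """Map each ranked list of keyword sequences to predetermined search words."""
    return {
        seq: [reduction_table[other] for other in others]
        for seq, others in related.items()
    }


related_sequences = {"java_webpage": ["java_engineer", "java_backend"]}
correspondence = replace_with_predetermined(related_sequences, MODE_REDUCTION)
```

The ranking order is preserved, so the resulting lists can be stored and truncated later without re-sorting.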
Once the correspondence between keyword sequences and predetermined search words has been constructed, it is usually stored in a database so that it can be queried at any time; this correspondence therefore makes it possible to recommend related words for a query search word typed by a user quickly and accurately. According to still another embodiment of the invention, when a query search word typed by a user is received, the query search word is first processed to form its corresponding keyword sequence; the first quantity of predetermined search words corresponding to that keyword sequence are then obtained, and the top second quantity of predetermined search words are selected from them, the second quantity being not greater than the first quantity; finally, the second quantity of predetermined search words are recommended to the user as related words of the query search word. The second quantity is preferably 10. Of course, if the number of predetermined search words obtained is less than the second quantity, all of them are recommended to the user as related words of the query search word. Further, the correspondences between popular predetermined search words, such as "product manager" and "java engineer", and keyword sequences are placed in a hot cache to speed up the service.
In this embodiment, the query search word typed by the user is "java website". To improve processing efficiency, the mapping table is first searched for an available search word identical to the query search word. If one exists, the keyword sequence corresponding to that available search word is obtained directly, and the query search word does not need to be processed to form a keyword sequence; if none exists, the keyword sequence corresponding to the query search word is formed according to steps S220 and S230. Clearly, "java website" is not among the available search words in the mapping table, so it is processed, which gives the keyword sequence "java_webpage". Next, the hot cache is searched for the keyword sequence "java_webpage". If it is found, the first quantity of predetermined search words corresponding to the keyword sequence are obtained directly, and the top second quantity of them are recommended to the user as related words. If it is not found, the database is queried for the keyword sequence "java_webpage"; if it exists there, the first quantity of predetermined search words corresponding to the keyword sequence are obtained, and the top second quantity of them are recommended to the user as related words. Here, the keyword sequence "java_webpage" is found in the hot cache; because the number of predetermined search words is less than the second quantity, the related words finally recommended to the user are, in order, "java programmer" and "java backend".
Fig. 3 shows a schematic diagram of a correlation-based search word processing device 300 according to an embodiment of the invention. As shown in Fig. 3, the correlation-based search word processing device 300 includes an extraction module 310, a word segmentation module 320, a conversion module 330, a selection module 340, a training module 350 and a replacement module 360.
The extraction module 310 is adapted to obtain the search log of each of a plurality of users and to extract available search words from the search logs. The extraction module 310 is further adapted to: obtain initial search words from a search log and count their quantity; when the quantity is greater than a first value, directly delete the initial search words of the user corresponding to that quantity; count the number of searches of each initial search word that has not been deleted; and filter out the initial search words whose number of searches is less than a second value, taking the remaining initial search words as available search words. For details of how the extraction module 310 performs these operations, see step S210 of method 200; they are not repeated here.
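The two filters applied by the extraction module can be sketched as follows. The data layout (a dictionary from user to a list of initial search words), the function name and the threshold values in the usage example are all assumptions; this passage does not state the first and second values used for extraction.

```python
from collections import Counter


def extract_available_search_words(search_logs, first_value, second_value):
    """Drop users with too many queries, then drop rarely searched words.

    search_logs: dict mapping a user id to that user's initial search words.
    """
    # Users whose query count exceeds first_value are treated as spam and removed.
    kept_logs = {u: words for u, words in search_logs.items()
                 if len(words) <= first_value}
    # Count how often each remaining initial search word was searched overall.
    counts = Counter(w for words in kept_logs.values() for w in words)
    # Keep only words searched at least second_value times.
    return {u: [w for w in words if counts[w] >= second_value]
            for u, words in kept_logs.items()}


# Hypothetical logs and thresholds, chosen only to exercise both filters.
logs = {
    "A": ["java programmer", "java backend"],
    "B": ["java programmer"],
    "spam": ["x"] * 5,
}
available = extract_available_search_words(logs, first_value=3, second_value=2)
```

Removing over-active users before counting keeps their queries from inflating the search counts used by the second filter.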
The word segmentation module 320 is connected to the extraction module 310 and is adapted to perform word segmentation on each available search word to obtain its corresponding one or more feature words. For details of how the word segmentation module 320 performs these operations, see step S220 of method 200; they are not repeated here.
The conversion module 330 is connected to the word segmentation module 320 and is adapted to convert the one or more feature words respectively to generate corresponding keywords, and to combine the one or more corresponding keywords to form the keyword sequence corresponding to the available search word. The conversion module 330 is further adapted to reject, from the one or more feature words, those that are meaningless words or sensitive words, and to perform synonym conversion on the feature words remaining after rejection to generate the corresponding keywords. The conversion module 330 is further adapted to arrange the one or more corresponding keywords in ascending text order and, for the arranged keywords, to join every two adjacent keywords with a first symbol to form the keyword sequence corresponding to the available search word. The first symbol is an underscore. For details of how the conversion module 330 performs these operations, see step S230 of method 200; they are not repeated here.
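The conversion steps (reject, convert synonyms, sort, join) can be sketched as below. The function name, the stop-word set and the synonym dictionary are hypothetical, and the input is assumed to be already segmented into feature words. Python's `sorted` orders strings by code point, so the English tokens here come out as `engineer_java`; the `java_engineer` spelling used in the examples of this translation presumably reflects the ordering of the original-language tokens.

```python
def build_keyword_sequence(feature_words, rejected_words, synonyms, first_symbol="_"):
    """Turn segmented feature words into a keyword sequence."""
    # Reject meaningless or sensitive feature words.
    kept = [w for w in feature_words if w not in rejected_words]
    # Synonym conversion: map each remaining word to its canonical keyword.
    keywords = [synonyms.get(w, w) for w in kept]
    # Ascending text order, then join adjacent keywords with the first symbol.
    return first_symbol.join(sorted(keywords))


sequence = build_keyword_sequence(
    ["java", "programmer", "the"],
    rejected_words={"the"},
    synonyms={"programmer": "engineer"},
)
```

Sorting before joining means that queries containing the same keywords in a different order collapse to the same keyword sequence.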
The selection module 340 is connected to the conversion module 330 and is adapted to select, from the available search words corresponding to each keyword sequence, the available search word with the highest frequency of occurrence as the predetermined search word of that keyword sequence. For details of how the selection module 340 performs these operations, see step S240 of method 200; they are not repeated here.
The training module 350 is connected to the conversion module 330 and is adapted to input each keyword sequence into the correlation calculation model for training, and to output, in descending order of correlation, the first quantity of keyword sequences related to the input keyword sequence. For details of how the training module 350 performs these operations, see step S250 of method 200; they are not repeated here.
The replacement module 360 is connected to the selection module 340 and to the training module 350, and is adapted to replace the first quantity of output keyword sequences with their corresponding predetermined search words, forming the correspondence between keyword sequences and the first quantity of predetermined search words. For details of how the replacement module 360 performs these operations, see step S260 of method 200; they are not repeated here.
Fig. 4 shows a schematic diagram of a correlation-based search word processing device 400 according to another embodiment of the invention. As shown in Fig. 4, the extraction module 410, word segmentation module 420, conversion module 430, selection module 440, training module 450 and replacement module 460 of the correlation-based search word processing device 400 correspond one-to-one with, and are identical to, the extraction module 310, word segmentation module 320, conversion module 330, selection module 340, training module 350 and replacement module 360 of the correlation-based search word processing device 300 in Fig. 3, and a processing module 470 is added.
The processing module 470 is connected to the conversion module 430 and is adapted to count the number of times each keyword sequence is repeated; when the number is less than the first value, to reject the corresponding keyword sequence; and when the number is not less than the first value, to retain the corresponding keyword sequence. For details of how the processing module 470 performs these operations, see the processing in method 200, after step S230, in which a keyword sequence is rejected or retained according to the number of times it is repeated; they are not repeated here.
Fig. 5 shows a schematic diagram of a correlation-based search word processing device 500 according to yet another embodiment of the invention. As shown in Fig. 5, the extraction module 510, word segmentation module 520, conversion module 530, selection module 540, training module 550 and replacement module 560 of the correlation-based search word processing device 500 correspond one-to-one with, and are identical to, the extraction module 310, word segmentation module 320, conversion module 330, selection module 340, training module 350 and replacement module 360 of the correlation-based search word processing device 300 in Fig. 3, and a recommendation module 580 is added.
The recommendation module 580 is connected to the replacement module 560 and is adapted to: when a query search word typed by a user is received, process the query search word to form its corresponding keyword sequence; obtain the first quantity of predetermined search words corresponding to the keyword sequence, and select the top second quantity of predetermined search words from them, the second quantity being not greater than the first quantity; and recommend the second quantity of predetermined search words to the user as related words of the query search word. For details of how the recommendation module 580 performs these operations, see the processing in method 200, after step S260, in which related words of a query search word are recommended to the user when a query search word typed by the user is received; they are not repeated here.
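The recommendation flow, including the lookup order described for method 200 (mapping table, then hot cache, then database), can be sketched as follows. Plain dictionaries stand in for the hot cache and the database, and `stub_to_keyword_sequence` is a hypothetical stand-in for segmentation and conversion that only knows the single example from the description.

```python
def recommend(query, mapping_table, to_keyword_sequence,
              hot_cache, database, second_quantity=10):
    """Return up to `second_quantity` related words for a typed query."""
    # Reuse the keyword sequence if the query is already an available search word.
    sequence = mapping_table.get(query)
    if sequence is None:
        # Otherwise form it from the query (segmentation and conversion).
        sequence = to_keyword_sequence(query)
    # Look in the hot cache first, then fall back to the database.
    words = hot_cache.get(sequence)
    if words is None:
        words = database.get(sequence, [])
    # Fewer than second_quantity stored words means all of them are returned.
    return words[:second_quantity]


def stub_to_keyword_sequence(query):
    # Hypothetical stand-in for the real segmentation and conversion steps.
    return {"java website": "java_webpage"}.get(query, query.replace(" ", "_"))


hot_cache = {"java_webpage": ["java programmer", "java backend"]}
related_words = recommend("java website", {}, stub_to_keyword_sequence, hot_cache, {})
```

Checking the hot cache before the database keeps the common case to a single in-memory lookup, which is the stated reason for caching popular correspondences.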
The specific steps and embodiments of correlation-based search word processing have been disclosed in detail in the description based on Fig. 2 and are not repeated here.
In existing correlation-based search word processing methods, user query entries that share the same follow-up word are considered to have a certain similarity, and if users provide enough input data, related search words for these entries can be given on the basis of collaborative filtering. However, when the volume of search data is small and user query entries are highly homogeneous, many entries may have no follow-up word; moreover, if the search contents are unrelated to one another, relying on follow-up words fails, which is unfavorable for processing unpopular related words. According to the technical solution of correlation-based search word processing of the embodiments of the invention, word segmentation is first performed on each available search word of a user to obtain the corresponding one or more feature words; each feature word is converted to generate a corresponding keyword; the keywords are combined to form the keyword sequence corresponding to the available search word; from the available search words corresponding to each keyword sequence, the one with the highest frequency of occurrence is selected as the predetermined search word of that keyword sequence; each keyword sequence is input into the correlation calculation model for training, and the first quantity of keyword sequences related to the input keyword sequence are output in descending order of correlation; and the first quantity of output keyword sequences are replaced with their corresponding predetermined search words, forming the correspondence between keyword sequences and the first quantity of predetermined search words. In this technical solution, the correlation calculation model considers only the distance between search words; with the window set to infinity, users' irregular searches do not affect its correlation calculation, and it also has a clear advantage in processing unpopular words, with no need to adjust the weights of popular words manually. In addition, after the keyword sequences corresponding to the available search words are formed, the number of times each keyword sequence is repeated is counted, and keyword sequences whose count is less than the first value are rejected, so that not every keyword sequence needs subsequent processing, which reduces computational complexity and time cost. Furthermore, the available search words of users are extracted in advance from the search logs, and spam users and search data with low search counts are filtered out during extraction, which further improves processing speed while keeping the results efficient and accurate.
A7. The method of any one of A1-6, wherein, when a query search word typed by a user is received, the method further comprises:
processing the query search word to form the keyword sequence corresponding to the query search word;
obtaining the first quantity of predetermined search words corresponding to the keyword sequence, and selecting the top second quantity of predetermined search words from the first quantity of predetermined search words, the second quantity being not greater than the first quantity;
recommending the second quantity of predetermined search words to the user as related words of the query search word.
B9. The device of B8, wherein the extraction module is further adapted to:
obtain initial search words from the search log and count their quantity;
when the quantity is greater than a first value, directly delete the initial search words of the user corresponding to the quantity;
count the number of searches of each initial search word that has not been deleted;
filter out the initial search words whose number of searches is less than a second value, and take the remaining initial search words as available search words.
B10. The device of B8 or 9, wherein the conversion module is further adapted to:
reject, from the one or more feature words, the feature words that are meaningless words or sensitive words;
perform synonym conversion on the feature words remaining after rejection to generate the corresponding keywords.
B11. The device of any one of B8-10, wherein the conversion module is further adapted to:
arrange the one or more corresponding keywords in ascending text order;
for the arranged keywords, join every two adjacent keywords with a first symbol to form the keyword sequence corresponding to the available search word.
B12. The device of B11, wherein the first symbol is an underscore.
B13. The device of any one of B8-12, further comprising a processing module adapted to:
count the number of times each keyword sequence is repeated;
when the number is less than a first value, reject the keyword sequence corresponding to the number;
when the number is not less than the first value, retain the keyword sequence corresponding to the number.
B14. The device of any one of B8-13, further comprising a recommendation module adapted to:
when a query search word typed by a user is received, process the query search word to form the keyword sequence corresponding to the query search word;
obtain the first quantity of predetermined search words corresponding to the keyword sequence, and select the top second quantity of predetermined search words from the first quantity of predetermined search words, the second quantity being not greater than the first quantity;
recommend the second quantity of predetermined search words to the user as related words of the query search word.
In the description provided herein, numerous specific details are set forth. It is to be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques have not been shown in detail so as not to obscure the understanding of this description.
Similarly, it should be appreciated that, in order to streamline the disclosure and aid the understanding of one or more of the various inventive aspects, features of the invention are sometimes grouped together in a single embodiment, figure or description thereof in the above description of exemplary embodiments of the invention. This method of disclosure, however, is not to be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. The claims following the detailed description are therefore expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art should understand that the modules, units or components of the devices in the examples disclosed herein may be arranged in a device as described in the embodiment, or alternatively may be located in one or more devices different from the device in the example. The modules in the foregoing examples may be combined into one module or may be further divided into multiple sub-modules.
Those skilled in the art will appreciate that the modules in the device of an embodiment can be changed adaptively and arranged in one or more devices different from that embodiment. The modules, units or components of an embodiment may be combined into one module, unit or component, and may further be divided into multiple sub-modules, sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, an equivalent or a similar purpose.
In addition, those skilled in the art will appreciate that although some embodiments described herein include certain features included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments may be used in any combination.
In addition, some of the embodiments are described herein as methods, or combinations of method elements, that can be implemented by a processor of a computer system or by other means of performing the described function. A processor with the necessary instructions for implementing such a method or method element therefore forms a means for implementing the method or method element. Furthermore, an element of a device embodiment described herein is an example of a means for carrying out the function performed by that element for the purpose of carrying out the invention.
The various techniques described herein may be implemented in hardware or software, or in a combination of both. Thus, the method and device of the invention, or certain aspects or portions thereof, may take the form of program code (that is, instructions) embodied in tangible media, such as floppy disks, CD-ROMs, hard disk drives or any other machine-readable storage medium, wherein, when the program is loaded into a machine such as a computer and executed by that machine, the machine becomes a device for practicing the invention.
Where the program code is executed on programmable computers, the computing device generally includes a processor, a storage medium readable by the processor (including volatile and non-volatile memory and/or storage elements), at least one input device and at least one output device. The memory is configured to store the program code; the processor is configured to perform the correlation-based search word processing method of the invention according to the instructions in the program code stored in the memory.
By way of example and not limitation, computer-readable media include computer storage media and communication media. Computer storage media store information such as computer-readable instructions, data structures, program modules or other data. Communication media typically embody computer-readable instructions, data structures, program modules or other data in a modulated data signal such as a carrier wave or other transport mechanism, and include any information delivery media. Combinations of any of the above are also included within the scope of computer-readable media.
As used herein, unless otherwise specified, the use of the ordinals "first", "second", "third" and so on to describe a common object merely indicates that different instances of like objects are being referred to, and is not intended to imply that the objects so described must be in a given order, whether temporally, spatially, in ranking or in any other manner.
Although the invention has been described in terms of a limited number of embodiments, those skilled in the art, having the benefit of the above description, will appreciate that other embodiments can be envisaged within the scope of the invention thus described. It should also be noted that the language used in this specification has been selected principally for readability and instructional purposes, rather than to delineate or limit the subject matter of the invention. Accordingly, many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the appended claims. With respect to the scope of the invention, the disclosure made herein is illustrative and not restrictive, and the scope of the invention is defined by the appended claims.
Claims (10)
1. A correlation-based search word processing method, adapted to be executed in a computing device, the method comprising:
obtaining the search log of each of a plurality of users, and extracting available search words from the search logs;
performing word segmentation on each available search word to obtain its corresponding one or more feature words;
converting the one or more feature words respectively to generate corresponding keywords, and combining the one or more corresponding keywords to form the keyword sequence corresponding to the available search word;
selecting, from the available search words corresponding to each keyword sequence, the available search word with the highest frequency of occurrence as the predetermined search word of that keyword sequence;
inputting each keyword sequence into a correlation calculation model for training, and outputting, in descending order of correlation, a first quantity of keyword sequences related to the input keyword sequence;
replacing the first quantity of output keyword sequences with their corresponding predetermined search words, thereby forming the correspondence between keyword sequences and the first quantity of predetermined search words.
2. The method of claim 1, wherein the step of extracting available search words from the search log comprises:
obtaining initial search words from the search log and counting their quantity;
if the quantity is greater than a first value, directly deleting the initial search words of the user corresponding to the quantity;
counting the number of searches of each initial search word that has not been deleted;
filtering out the initial search words whose number of searches is less than a second value, and taking the remaining initial search words as available search words.
3. The method of claim 1 or 2, wherein the step of converting the one or more feature words respectively to generate corresponding keywords comprises:
rejecting, from the one or more feature words, the feature words that are meaningless words or sensitive words;
performing synonym conversion on the feature words remaining after rejection to generate the corresponding keywords.
4. The method of any one of claims 1-3, wherein the step of combining the one or more corresponding keywords to form the keyword sequence corresponding to the available search word comprises:
arranging the one or more corresponding keywords in ascending text order;
for the arranged keywords, joining every two adjacent keywords with a first symbol to form the keyword sequence corresponding to the available search word.
5. The method of claim 4, wherein the first symbol is an underscore.
6. The method of any one of claims 1-5, further comprising, after the step of forming the keyword sequence corresponding to the available search word:
counting the number of times each keyword sequence is repeated;
if the number is less than a first value, rejecting the keyword sequence corresponding to the number;
if the number is not less than the first value, retaining the keyword sequence corresponding to the number.
7. A correlation-based search word processing apparatus, adapted to reside in a computing device, the apparatus comprising:
An extraction module, adapted to obtain the search log of each of a plurality of users and to extract available search words from the search logs;
A word segmentation module, adapted to perform word segmentation on each available search word to obtain its corresponding one or more feature words;
A conversion module, adapted to convert the one or more feature words respectively to generate corresponding keywords, and to combine the one or more corresponding keywords to form a keyword sequence corresponding to the available search word;
A selection module, adapted to select, from the available search words corresponding to each keyword sequence, the available search word with the highest frequency of occurrence as the predetermined search word of that keyword sequence;
A training module, adapted to input each keyword sequence into a correlation calculation model for training, and to output, in descending order of correlation, a first quantity of keyword sequences related to the input keyword sequence;
A replacement module, adapted to replace each of the first quantity of output keyword sequences with its corresponding predetermined search word, thereby forming a correspondence between the keyword sequence and the first quantity of predetermined search words.
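The roles of the selection module and the replacement module can be sketched as follows. This is only an illustration under stated assumptions: all function names and sample data are hypothetical, ties in frequency are broken by first occurrence (the claim does not say how), and the `related` mapping is a stand-in for the output of the correlation calculation model, which is not sketched because the claim does not specify it.

```python
from collections import Counter, defaultdict


def select_predetermined_search_words(pairs):
    """pairs: iterable of (keyword_sequence, available_search_word).

    For each keyword sequence, pick its most frequent available search word.
    Assumption: ties are broken by first occurrence.
    """
    by_sequence = defaultdict(Counter)
    for sequence, search_word in pairs:
        by_sequence[sequence][search_word] += 1
    return {seq: c.most_common(1)[0][0] for seq, c in by_sequence.items()}


def replace_with_search_words(related, predetermined):
    """related: {sequence: [related sequences, in descending order of correlation]}.

    Hypothetical stand-in for the model output; each related sequence is
    replaced by its predetermined search word.
    """
    return {seq: [predetermined[r] for r in rel if r in predetermined]
            for seq, rel in related.items()}


pairs = [
    ("engineer_java", "java engineer"),
    ("engineer_java", "java engineer"),
    ("engineer_java", "engineer java"),
    ("developer_java", "java developer"),
]
predetermined = select_predetermined_search_words(pairs)
print(predetermined["engineer_java"])  # java engineer

related = {"engineer_java": ["developer_java"]}
print(replace_with_search_words(related, predetermined))
# {'engineer_java': ['java developer']}
```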
8. A computing device, comprising the correlation-based search word processing apparatus of claim 7.
9. A computing device, comprising:
One or more processors;
A memory; and
One or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs including instructions for performing any one of the methods according to claims 1 to 6.
10. A computer-readable storage medium storing one or more programs, the one or more programs including instructions which, when executed by a computing device, cause the computing device to perform any one of the methods according to claims 1 to 6.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710515009.XA CN107220384B (en) | 2017-06-29 | 2017-06-29 | A kind of search word treatment method based on correlation, device and calculate equipment |
CN201911033168.1A CN110795628B (en) | 2017-06-29 | 2017-06-29 | Search term processing method and device based on correlation and computing equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710515009.XA CN107220384B (en) | 2017-06-29 | 2017-06-29 | A kind of search word treatment method based on correlation, device and calculate equipment |
Related Child Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911033168.1A Division CN110795628B (en) | 2017-06-29 | 2017-06-29 | Search term processing method and device based on correlation and computing equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107220384A (en) | 2017-09-29 |
CN107220384B (en) | 2019-11-15 |
Family ID: 59950626
Family Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710515009.XA Active CN107220384B (en) | 2017-06-29 | 2017-06-29 | A kind of search word treatment method based on correlation, device and calculate equipment |
CN201911033168.1A Active CN110795628B (en) | 2017-06-29 | 2017-06-29 | Search term processing method and device based on correlation and computing equipment |
Family Applications After (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201911033168.1A Active CN110795628B (en) | 2017-06-29 | 2017-06-29 | Search term processing method and device based on correlation and computing equipment |
Country Status (1)
Country | Link |
---|---|
CN (2) | CN107220384B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112328752B (en) * | 2021-01-04 | 2021-06-15 | 平安科技(深圳)有限公司 | Course recommendation method and device based on search content, computer equipment and medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104143005A (en) * | 2014-08-04 | 2014-11-12 | 五八同城信息技术有限公司 | Related searching system and method |
CN104199822A (en) * | 2014-07-11 | 2014-12-10 | 五八同城信息技术有限公司 | Method and system for identifying demand classification corresponding to searching |
CN104239321A (en) * | 2013-06-14 | 2014-12-24 | 高德软件有限公司 | Data processing method and device for search engine |
CN105335391A (en) * | 2014-07-09 | 2016-02-17 | 阿里巴巴集团控股有限公司 | Processing method and device of search request on the basis of search engine |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103136213B (en) * | 2011-11-23 | 2017-04-12 | 阿里巴巴集团控股有限公司 | Method and device for providing related words |
US8700621B1 (en) * | 2012-03-20 | 2014-04-15 | Google Inc. | Generating query suggestions from user generated content |
CN104598583B (en) * | 2015-01-14 | 2018-01-09 | 百度在线网络技术(北京)有限公司 | The generation method and device of query statement recommendation list |
Cited By (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107609192A (en) * | 2017-10-12 | 2018-01-19 | 北京京东尚科信息技术有限公司 | The supplement searching method and device of a kind of search engine |
CN107798091B (en) * | 2017-10-23 | 2021-05-18 | 金蝶软件(中国)有限公司 | Data crawling method and related equipment thereof |
CN107798091A (en) * | 2017-10-23 | 2018-03-13 | 金蝶软件(中国)有限公司 | The method and its relevant device that a kind of data crawl |
CN110457339A (en) * | 2018-05-02 | 2019-11-15 | 北京京东尚科信息技术有限公司 | Data search method and device, electronic equipment, storage medium |
CN110750682A (en) * | 2018-07-06 | 2020-02-04 | 武汉斗鱼网络科技有限公司 | Title hot word automatic metering method, storage medium, electronic equipment and system |
CN110795612A (en) * | 2019-10-28 | 2020-02-14 | 北京字节跳动网络技术有限公司 | Search word recommendation method and device, electronic equipment and computer-readable storage medium |
CN112883295A (en) * | 2019-11-29 | 2021-06-01 | 北京搜狗科技发展有限公司 | Data processing method, device and medium |
CN112883295B (en) * | 2019-11-29 | 2024-02-23 | 北京搜狗科技发展有限公司 | Data processing method, device and medium |
CN112685361A (en) * | 2020-12-24 | 2021-04-20 | 北京浪潮数据技术有限公司 | Information query method and device and computer readable storage medium |
CN112685361B (en) * | 2020-12-24 | 2024-09-10 | 北京浪潮数据技术有限公司 | Information query method, device and computer readable storage medium |
CN113239183A (en) * | 2021-05-28 | 2021-08-10 | 北京达佳互联信息技术有限公司 | Training method and device of ranking model, electronic equipment and storage medium |
CN116340469A (en) * | 2023-05-29 | 2023-06-27 | 之江实验室 | Synonym mining method and device, storage medium and electronic equipment |
CN116340469B (en) * | 2023-05-29 | 2023-08-11 | 之江实验室 | Synonym mining method and device, storage medium and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
CN110795628B (en) | 2023-04-11 |
CN107220384B (en) | 2019-11-15 |
CN110795628A (en) | 2020-02-14 |
Similar Documents
Publication | Title | Publication Date |
---|---|---|
CN107220384B (en) | A kind of search word treatment method based on correlation, device and calculate equipment | |
CN109840287B (en) | Cross-modal information retrieval method and device based on neural network | |
CN111310438B (en) | Chinese sentence semantic intelligent matching method and device based on multi-granularity fusion model | |
CN108363790B (en) | Method, device, equipment and storage medium for evaluating comments | |
CN111444320B (en) | Text retrieval method and device, computer equipment and storage medium | |
CN106202010B (en) | Method and apparatus based on deep neural network building Law Text syntax tree | |
CN111190997B (en) | Question-answering system implementation method using neural network and machine learning ordering algorithm | |
CN113672708B (en) | Language model training method, question-answer pair generation method, device and equipment | |
CN112100326B (en) | Anti-interference question and answer method and system integrating retrieval and machine reading understanding | |
CN111310439B (en) | Intelligent semantic matching method and device based on depth feature dimension changing mechanism | |
CN111797214A (en) | FAQ database-based problem screening method and device, computer equipment and medium | |
CN107168954A (en) | Text key word generation method and device and electronic equipment and readable storage medium storing program for executing | |
CN116244418B (en) | Question answering method, device, electronic equipment and computer readable storage medium | |
CN105468719B (en) | A kind of inquiry error correction method, device and calculate equipment | |
CN107977347A (en) | A kind of topic De-weight method and computing device | |
CN107341233A (en) | A kind of position recommends method and computing device | |
CN111898369A (en) | Article title generation method, model training method and device and electronic equipment | |
WO2023109436A1 (en) | Part of speech perception-based nested named entity recognition method and system, device and storage medium | |
CN112287656B (en) | Text comparison method, device, equipment and storage medium | |
CN107688609A (en) | A kind of position label recommendation method and computing device | |
CN117421393B (en) | Generating type retrieval method and system for patent | |
CN117744652A (en) | Domain feature word mining method and device based on large language model | |
WO2023240839A1 (en) | Machine translation method and apparatus, and computer device and storage medium | |
US20130339003A1 (en) | Assisted Free Form Decision Definition Using Rules Vocabulary | |
CN108491423A (en) | A kind of sort method and device |
Legal Events
Code | Title | Description |
---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||