CN103559313B - Searching method and device - Google Patents
- Publication number: CN103559313B (CN201310586096.XA)
- Authority: CN (China)
- Prior art keywords: dictionary, search term, default, vocabulary, search
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/953—Querying, e.g. by the use of web search engines
- G06F16/9535—Search customisation based on user profiles and personalisation
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention discloses a searching method and device. The searching method includes: obtaining a default dictionary; counting the number of times each search term is sent by users through a client, and adding the search terms whose count is greater than a predetermined value to the default dictionary to obtain a current dictionary; receiving a search term sent by a user through the client, searching for the search term in the current dictionary to obtain a search result, and returning the search result to the client for display to the user. By counting how often each search term is sent and adding the terms whose count exceeds the predetermined value to the default dictionary, the searching method and device make it easier for popular words to hit related data, which raises the search hit rate.
Description
Technical field
The present invention relates to computer technology, and in particular to a searching method and device.
Background
Search engines consolidate information from a large number of websites and act as a navigation aid. They fall into two kinds, vertical search engines and universal search engines:
A universal search engine, much like the first portal websites on the Internet, integrates a large amount of information for navigation and answers queries very quickly. It organizes the information from all websites onto a single platform for users. The value of information was thereby widely recognized by businesses for the first time, and search quickly became the most valuable field on the Internet.
A vertical search engine is a specialized search engine for a particular industry and is a subdivision and extension of the search engine. It integrates one particular class of information in the web page library, extracts the required data field by field in a targeted way, processes it, and returns it to the user in some form.
Vertical search is a new search engine service model proposed in response to the drawbacks of universal search engines, such as excessive information volume, inaccurate queries and insufficient depth. It provides information of a certain value, together with related services, for a specific field, a specific group of users or a specific need. Its characteristics are "specialized, precise and deep", with an industry flavor. Compared with the massive, unordered information of a universal search engine, a vertical search engine is more focused, specific and in-depth.
The hit rate of existing vertical search depends heavily on the dictionary, and only an accurate dictionary yields a good search experience. A dictionary that is reasonably complete and can be updated efficiently is therefore needed.
Summary of the invention
In view of the above problems, the present invention is proposed to provide a searching method and device that overcome the above problems or at least partially solve them.
According to one aspect of the invention, a searching method is provided, including:
obtaining a default dictionary;
counting the number of times each search term is sent by users through a client, and adding the search terms whose count is greater than a predetermined value to the default dictionary to obtain a current dictionary;
receiving a search term sent by a user through the client, searching for the search term in the current dictionary to obtain a search result, and returning the search result to the client for display to the user.
According to another aspect of the invention, a searching device is provided, including:
an acquisition module, adapted to obtain a default dictionary;
an adding module, adapted to count the number of times each search term is sent by users through a client, and to add the search terms whose count is greater than a predetermined value to the default dictionary to obtain a current dictionary;
a search module, adapted to receive a search term sent by a user through the client, search for the search term in the current dictionary to obtain a search result, and return the search result to the client for display to the user.
By counting how often each search term is sent and adding the terms whose count exceeds the predetermined value to the default dictionary, the above searching method and device make it easier for popular words to hit related data, which raises the search hit rate.
The above is only an overview of the technical solution of the invention. So that the technical means of the invention can be understood more clearly and practiced according to the content of the specification, and so that the above and other objects, features and advantages of the invention become more apparent, specific embodiments of the invention are set out below.
Brief description of the drawings
Various other advantages and benefits will become clear to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for the purpose of illustrating the preferred embodiments and are not to be considered as limiting the invention. Throughout the drawings, the same reference numerals denote the same parts. In the drawings:
Fig. 1a shows a flow chart of a searching method according to an embodiment of the invention;
Fig. 1b shows a flow chart of a searching method according to another embodiment of the invention;
Fig. 2 shows a flow chart of a searching method according to another embodiment of the invention;
Fig. 3 shows a structural diagram of a searching device according to an embodiment of the invention;
Fig. 4 shows a structural diagram of a searching device according to another embodiment of the invention.
Detailed description
Exemplary embodiments of the present disclosure are described in more detail below with reference to the drawings. Although the drawings show exemplary embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure will be understood more thoroughly and its scope fully conveyed to those skilled in the art.
Fig. 1a shows a flow chart of a searching method according to an embodiment of the invention. As shown in Fig. 1a, the searching method includes:
Step S101: obtain a default dictionary.
The default dictionary is obtained by parsing, extracting and filtering web pages crawled from the Internet and then performing word segmentation on the processed page content.
The default dictionary includes default dictionaries of different categories; for example, the game dictionaries include a swordsman game dictionary, a business simulation game dictionary, and so on.
Step S103: count the number of times each search term is sent by users through the client, and add the search terms whose count is greater than a predetermined value to the default dictionary to obtain the current dictionary.
Step S103 includes: counting the number of times each search term is sent by users through the client, determining the category of each search term, adding the search terms in that category whose count is greater than the predetermined value to the default dictionary of the corresponding category, and obtaining the current dictionary of the corresponding category.
Because search terms are stored in the log, a script file run every hour can write the search terms stored in the log into a vocabulary: if the vocabulary does not contain the term, the term is added; if it already does, the term's count is incremented by one.
The code for obtaining the search terms from the log is as follows:
The obtained search terms are written into the vocabulary; if the vocabulary does not contain the term, the term is added, and if it already does, the term's count is incremented by one. The specific implementation code is as follows:
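The listing referred to above is not reproduced in this text. As a stand-in, the counting step can be sketched roughly as follows. This is a minimal sketch, not the patent's code: it assumes the log holds one search term per line, and the function name `update_vocabulary` is hypothetical.

```python
from collections import Counter
from typing import Iterable


def update_vocabulary(log_lines: Iterable[str], vocabulary: Counter) -> Counter:
    """Fold one batch of logged search terms into the vocabulary counts."""
    for line in log_lines:
        term = line.strip()  # assumption: one search term per log line
        if not term:
            continue  # skip blank log lines
        # A term not yet in the vocabulary is added with a count of one;
        # a term already present has its count incremented by one.
        vocabulary[term] += 1
    return vocabulary


vocabulary = Counter()
update_vocabulary(["sword", "card", "sword", ""], vocabulary)
print(vocabulary["sword"], vocabulary["card"])
```

Running such a function once per hour over the new log entries would keep the counts cumulative, which matches the hourly script described in the text.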
The vocabulary includes the keyword, the count of that keyword and the dictionary segmentation row; the keyword may be a Chinese word.
The vocabulary format is shown in Table 1:
Table 1: Vocabulary format
Keyword | Count | Segmentation row |
The category of each search term can be determined from the relevance between the search term and the dictionaries of the different categories. When the relevance between the current search term and the dictionary of one or more categories is greater than a predetermined value, the search term can be assigned the same category or categories as those dictionaries. The category of the current search term can also be determined by an empirical algorithm. The categories include the categories of various applications, such as game categories.
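The text does not define how relevance is computed, so the following is only an illustrative sketch. It assumes a hypothetical relevance measure (the best character-set Jaccard overlap between the search term and any word of a category dictionary); the names `relevance` and `classify` and the sample dictionaries are likewise assumptions.

```python
from typing import Dict, List, Set


def relevance(term: str, dictionary: Set[str]) -> float:
    """Hypothetical relevance: best character-set Jaccard overlap with any dictionary word."""
    term_chars = set(term)
    best = 0.0
    for word in dictionary:
        word_chars = set(word)
        union = term_chars | word_chars
        if union:
            best = max(best, len(term_chars & word_chars) / len(union))
    return best


def classify(term: str, dictionaries: Dict[str, Set[str]], threshold: float) -> List[str]:
    """Return every category whose dictionary relevance exceeds the threshold."""
    return [name for name, words in dictionaries.items()
            if relevance(term, words) > threshold]


dictionaries = {
    "swordsman": {"sword", "swordsman"},
    "management": {"card", "bank"},
}
print(classify("swords", dictionaries, 0.5))
```

Returning a list rather than a single category reflects the statement that a search term may match the dictionaries of one or more categories.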
The code for adding the vocabulary that contains the search terms of a category whose count is greater than the predetermined value to the default dictionary of the corresponding category is as follows:
Step S105: receive a search term sent by a user through the client, search for the search term in the current dictionary to obtain a search result, and return the search result to the client for display to the user.
In addition, after the current dictionary is obtained, the method may also include step S104, updating the index of the current dictionary, as shown in Fig. 1b.
If the index of the current dictionary has been updated before step S105, the search term entered by the user is searched for in the current dictionary with the updated index to obtain the search result.
As can be seen from the above, the search in the embodiments of the invention is a vertical search, that is, a search within a particular field such as games. Because the hit rate of vertical search depends heavily on the dictionary, a dictionary that is reasonably complete and efficiently updated is particularly important. The embodiments of the invention can update the dictionary quickly and easily and thereby provide a better search experience.
The above searching method is particularly suitable for highly time-sensitive fields such as games. By counting how often each search term is sent and adding the terms whose count exceeds the predetermined value to the default dictionary, it makes it easier for popular words to hit related data, which raises the search hit rate.
Fig. 2 shows a flow chart of a searching method according to another embodiment of the invention. As shown in Fig. 2, the method includes:
Step S201: obtain the vocabulary to be added to the default dictionary.
Before this step, the number of times each search term is sent by users through the client is counted first. This may be implemented as follows: a script file run every hour writes the search terms stored in the log into the vocabulary; if the vocabulary does not contain the term, the term is added, and if it already does, the term's count is incremented by one. The search terms whose count is greater than the predetermined value are then retained in the vocabulary, and the search terms whose count is less than the predetermined value are deleted from it.
Assume the current vocabulary is as shown in Table 2.
Table 2: Current vocabulary
Chinese word | Usage frequency | Segmentation row |
Heaven Sword | 10 | x:1 |
Frame card | 6 | x:1 |
The predetermined value can be set as needed, for example to 5; in that case the vocabulary that can be added to the corresponding default dictionary is as shown in Table 3. It can of course be set to another value, such as 8, in which case the vocabulary that can be added to the corresponding default dictionary is as shown in Table 4.
Table 3: Vocabulary added to the default dictionary
Chinese word | Usage frequency | Segmentation row |
Heaven Sword | 10 | x:1 |
Frame card | 6 | x:1 |
Table 4: Another vocabulary added to the default dictionary
Chinese word | Usage frequency | Segmentation row |
Heaven Sword | 10 | x:1 |
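The effect of the predetermined value on Tables 2 to 4 can be checked with a short sketch. The function name `terms_to_add` is hypothetical, and the keys "A" and "B" merely stand for the two rows of Table 2 with counts 10 and 6.

```python
from typing import Dict


def terms_to_add(vocabulary: Dict[str, int], threshold: int) -> Dict[str, int]:
    """Keep only the terms searched more than `threshold` times."""
    return {term: count for term, count in vocabulary.items() if count > threshold}


current = {"A": 10, "B": 6}       # the two rows of Table 2
print(terms_to_add(current, 5))   # both rows remain, as in Table 3
print(terms_to_add(current, 8))   # only the first row remains, as in Table 4
```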
Step S202: process the vocabulary format, generating from the vocabulary a vocabulary text that meets a predetermined format.
The predetermined format can be set flexibly as needed, for example to the mmseg format or another format. mmseg is a common dictionary-based word segmentation system for Chinese. It is based mainly on forward maximum matching, supplemented by several rules for resolving ambiguity. Because it is simple to implement, runs fast and gives relatively good results, it is widely used. The segmentation system generally includes one dictionary, two matching algorithms and four ambiguity resolution rules.
For example, Table 3 can be converted into the following form:
Heaven Sword [tab] 10
x:1
Frame card [tab] 6
x:1
Other rows
Table 4 can be converted into the following form:
Heaven Sword [tab] 10
x:1
Other rows
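The conversion shown above can be sketched as follows. This is a minimal illustration that assumes every entry uses the fixed segmentation row `x:1`, as in the tables; the function name `to_unigram_text` and the keys "A" and "B" are placeholders.

```python
from typing import Dict


def to_unigram_text(vocabulary: Dict[str, int]) -> str:
    """Render each term as 'term<TAB>count' followed by an 'x:1' line."""
    return "".join(f"{term}\t{count}\nx:1\n" for term, count in vocabulary.items())


print(to_unigram_text({"A": 10, "B": 6}), end="")
```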
Step S203: append the generated vocabulary text that meets the predetermined format to the end of the original vocabulary text unigram.txt of the corresponding category, save the result as a new vocabulary text unigram_new.txt, copy it to the directory where mmseg is located, and generate a new dictionary.
For example, the new dictionary unigram_new.txt.uni can be generated as follows:
/usr/local/mmseg3/bin/mmseg -u /usr/local/mmseg3/etc/unigram_new.txt
In this embodiment the predetermined value is assumed to be 5. "Heaven Sword" can therefore be added to the dictionary of swordsman games such as Heaven Sword And Dragon Sabre, and "Frame card" can be added to the dictionary of business simulation games such as Zillionaire. That is, each type of game has its own default dictionary, and a high-frequency search term entered by users can only be added to the dictionary of the corresponding type of game.
With the above steps, the dictionary for a newly released game can also be made more complete.
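The file handling in step S203 (append the new entries to the original vocabulary text and save the result under a new name) can be sketched as below. The helper name `build_new_vocabulary_file` is hypothetical, and the sketch works in a temporary directory rather than the mmseg installation directory; compiling the result with mmseg remains a separate step.

```python
import tempfile
from pathlib import Path


def build_new_vocabulary_file(original: Path, additions: str, target: Path) -> None:
    """Write the original vocabulary text followed by the new entries to `target`."""
    text = original.read_text(encoding="utf-8")
    if text and not text.endswith("\n"):
        text += "\n"  # make sure the appended entries start on a new line
    target.write_text(text + additions, encoding="utf-8")


with tempfile.TemporaryDirectory() as workdir:
    original = Path(workdir) / "unigram.txt"
    target = Path(workdir) / "unigram_new.txt"
    original.write_text("old\t3\nx:1", encoding="utf-8")   # sample original entry
    build_new_vocabulary_file(original, "A\t10\nx:1\n", target)
    result = target.read_text(encoding="utf-8")

print(result, end="")
```

Writing to a separate unigram_new.txt leaves the original unigram.txt untouched until the new dictionary has been generated.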
Step S204: replace the default dictionary with the new dictionary.
For example, the replacement can be done as follows:
mv /usr/local/mmseg3/etc/unigram_new.txt.uni /usr/local/mmseg3/etc/uni.lib
Through steps S201 to S204, the words that users are interested in, that is, the words searched more than the predetermined number of times, are added to the corresponding default dictionary.
Step S205: update the index of the current dictionary periodically and restart the search component searchd.
The specific implementation is as follows:
/usr/local/coreseek/bin/indexer -c /usr/local/coreseek/etc/c.conf --all --pidfile --rotate
Stop searchd:
ps auxww|grep searchd
kill 923230
Start searchd:
/usr/local/coreseek/bin/searchd -c /usr/local/coreseek/etc/c.conf --console --pidfile
searchd is the component that actually handles searches. At run time it behaves like a service: it communicates with the various application programming interfaces (APIs) called by client applications and is responsible for receiving queries, processing them and returning data sets.
Unlike indexer, searchd is not meant to be invoked from the command line or from an ordinary script. Instead, it is either started by init.d as a daemon (on Unix/Linux systems) or used as a service (on Windows systems). Not all command-line options are therefore always valid; this depends on the options chosen at build time.
Step S206: return a search result according to the keyword received.
Search operations can be performed once the dictionary index has been updated and the search component restarted.
As can be seen from the above, the search in the embodiments of the invention is a vertical search, that is, a search within a particular field such as games. Because the hit rate of vertical search depends heavily on the dictionary, a dictionary that is reasonably complete and efficiently updated is particularly important. The embodiments of the invention can update the dictionary quickly and easily and thereby provide a better search experience.
By counting how often each search term is sent and adding the terms whose count exceeds the predetermined value to the default dictionary, the above searching method makes it easier for popular words to hit related data, which raises the search hit rate.
Fig. 3 shows a structural diagram of a searching device according to an embodiment of the invention. As shown in Fig. 3, the searching device includes an acquisition module 31, an adding module 32 and a search module 33, where:
The acquisition module 31 is adapted to obtain a default dictionary. The adding module 32 is adapted to count the number of times each search term is sent by users through the client, and to add the search terms whose count is greater than a predetermined value to the default dictionary to obtain the current dictionary. The search module 33 is adapted to receive a search term sent by a user through the client, search for the search term in the current dictionary to obtain a search result, and return the search result to the client for display to the user.
The default dictionary is obtained by parsing, extracting and filtering web pages crawled from the Internet and then performing word segmentation on the processed page content; the acquisition module is specifically adapted to obtain default dictionaries of different categories. For example, the game dictionaries include a swordsman game dictionary, a business simulation game dictionary, and so on.
Because search terms are stored in the log, a script file run every hour can write the search terms stored in the log into a vocabulary: if the vocabulary does not contain the term, the term is added; if it already does, the term's count is incremented by one, which completes the count for each search term. At the same time the category of each search term is determined, and the search terms whose count is greater than the predetermined value are added to the default dictionary of the corresponding category, generating the current dictionary of the corresponding category. The vocabulary may include the keyword, the count of that keyword and the dictionary segmentation row; for the vocabulary format see Table 1. The keyword may be a Chinese word, an English word or a word in another language.
Specifically, the category of each search term can be determined from the relevance between the search term and the dictionaries of the different categories. When the relevance between the search term and the dictionary of one or more categories is greater than a predetermined value, the search term can be assigned the same category or categories as those dictionaries. The category of the current search term can also be determined by an empirical algorithm. The categories include the categories of various applications, such as game categories.
In addition, when the current dictionary of the corresponding category is generated, the vocabulary format needs to be processed so that a vocabulary text meeting the predetermined format is generated from the vocabulary; the predetermined format may be the mmseg format or another format. Specifically, the generated vocabulary text that meets the predetermined format is appended to the end of the original vocabulary text unigram.txt of the corresponding category, saved as a new vocabulary text unigram_new.txt, and copied to the directory where mmseg is located, so as to generate a new dictionary.
Further, the searching device may also include an update module 34, as shown in Fig. 4. The update module is adapted to update the index of the current dictionary after the adding module 32 obtains it, so that the search module 33, after receiving a search term sent by a user through the client, searches for the search term in the current dictionary with the updated index, obtains the search result and returns it to the client for display to the user.
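How the four modules cooperate can be illustrated with a small in-memory sketch. Everything here is an assumption made for illustration: the class name `SearchDevice`, the sample documents, and the use of a simple substring-based inverted index in place of a real indexer and search daemon.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Set


class SearchDevice:
    """Hypothetical sketch of the acquisition, adding, update and search modules."""

    def __init__(self, default_dictionary: Set[str], documents: Dict[str, str], threshold: int):
        self.dictionary = set(default_dictionary)  # acquisition module: default dictionary
        self.documents = documents
        self.threshold = threshold
        self.counts: Counter = Counter()
        self.index: Dict[str, Set[str]] = defaultdict(set)
        self.update_index()

    def record(self, term: str) -> None:
        """Adding module: count a query and promote it once it exceeds the threshold."""
        self.counts[term] += 1
        if self.counts[term] > self.threshold and term not in self.dictionary:
            self.dictionary.add(term)
            self.update_index()

    def update_index(self) -> None:
        """Update module: rebuild the index over the current dictionary."""
        self.index = defaultdict(set)
        for doc_id, text in self.documents.items():
            for word in self.dictionary:
                if word in text:
                    self.index[word].add(doc_id)

    def search(self, term: str) -> List[str]:
        """Search module: look the term up in the indexed current dictionary."""
        return sorted(self.index.get(term, set()))


documents = {"d1": "guide to the new raid boss", "d2": "sword crafting tips"}
device = SearchDevice({"sword"}, documents, threshold=2)
before = device.search("raid boss")   # not yet in the dictionary
for _ in range(3):
    device.record("raid boss")        # third query exceeds the threshold
after = device.search("raid boss")
print(before, after)
```

The point of the sketch is the ordering: a term only becomes searchable after it has been added to the current dictionary and the index has been rebuilt.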
The above searching device is particularly suitable for highly time-sensitive fields such as games.
As can be seen from the above, the search in the embodiments of the invention is a vertical search, that is, a search within a particular field such as games. Because the hit rate of vertical search depends heavily on the dictionary, a dictionary that is reasonably complete and efficiently updated is particularly important. The embodiments of the invention can update the dictionary quickly and easily and thereby provide a better search experience.
By counting how often each search term is sent and adding the terms whose count exceeds the predetermined value to the default dictionary, the above searching device makes it easier for popular words to hit related data, which raises the search hit rate.
The algorithms and displays provided here are not inherently related to any particular computer, virtual system or other device. Various general-purpose systems may also be used with the teachings herein. From the description above, the structure required to construct such systems is obvious. Moreover, the invention is not directed to any particular programming language. It should be understood that the content of the invention described here may be implemented in various programming languages, and that the above description of a specific language is given to disclose the best mode of the invention.
Numerous specific details are set forth in the specification provided here. It is to be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures and techniques are not shown in detail so as not to obscure the understanding of this description.
Similarly, it should be understood that, in order to streamline the disclosure and aid understanding of one or more of the inventive aspects, the features of the invention are sometimes grouped together in a single embodiment, figure or description thereof in the above description of exemplary embodiments. This method of disclosure, however, should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the following claims reflect, inventive aspects lie in fewer than all features of a single embodiment disclosed above. The claims following the detailed description are therefore expressly incorporated into the detailed description, with each claim standing on its own as a separate embodiment of the invention.
Those skilled in the art will appreciate that the modules in the device of an embodiment can be adaptively changed and arranged in one or more devices different from that embodiment. The modules or units or components of an embodiment can be combined into one module or unit or component, and can furthermore be divided into multiple sub-modules or sub-units or sub-components. Except where at least some of such features and/or processes or units are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract and drawings), and all processes or units of any method or device so disclosed, may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract and drawings) may be replaced by an alternative feature serving the same, equivalent or similar purpose.
Furthermore, those skilled in the art will understand that, although some embodiments described here include certain features that are included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the invention and to form different embodiments. For example, in the following claims, any of the claimed embodiments can be used in any combination.
The component embodiments of the invention may be implemented in hardware, in software modules running on one or more processors, or in a combination of these. Those skilled in the art will understand that a microprocessor or digital signal processor (DSP) may be used in practice to implement some or all of the functions of some or all of the components of the searching device according to embodiments of the invention. The invention may also be implemented as a device or apparatus program (for example, a computer program and a computer program product) for performing part or all of the method described here. Such a program implementing the invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such a signal may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.
It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference sign placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention may be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second and third does not indicate any order. These words may be interpreted as names.
Claims (10)
1. a kind of searching method, including:
Obtain acquiescence dictionary;The acquiescence dictionary is to carry out word segmentation processing acquisition to the web page contents of crawl;The acquiescence word
Storehouse includes different classes of acquiescence dictionary;
The search term being stored in daily record is write by vocabulary by script file;If there is no the search term in the vocabulary,
Add vocabulary;Otherwise, the number of the search term is added one;
The number for each search term that counting user is sent by client, judges classification corresponding to search term, by the category times
Number is added to more than the search term of predetermined value in the acquiescence dictionary of corresponding classification, obtains the current dictionary of corresponding classification;
Update the index of current dictionary;
The search term that is sent by client of user is received, searches for the search term in current dictionary, acquisition search result, and to
The client returns to the search result for being shown to user.
2. according to the method for claim 1, the acquiescence dictionary that obtains includes:Obtain different classes of acquiescence dictionary;Or
Person
The number for each search term that the counting user is sent by client, the search term that number is more than to predetermined value are added to
In the acquiescence dictionary, current dictionary is obtained, including:
The number for each search term that counting user is sent by client, judges classification corresponding to search term, by the category times
Number is added to more than the search term of predetermined value in the acquiescence dictionary of corresponding classification, obtains the current dictionary of corresponding classification.
3. The method according to claim 2, wherein counting the number of times each search term is sent by users through the client, determining the category corresponding to each search term, and adding the search terms whose count exceeds the predetermined value to the default dictionary of the corresponding category to obtain the current dictionary of the corresponding category comprises:
Writing the search terms stored in the log into the vocabulary table using a script file, and accumulating the count of each corresponding search term;
Determining the category corresponding to each search term;
Retaining in the vocabulary table the search terms whose count exceeds the predetermined value, adding the vocabulary table to the default dictionary of the corresponding category to generate the current dictionary of the corresponding category, and replacing the default dictionary of the corresponding category with the current dictionary of the corresponding category.
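The retain, merge, and replace steps of claim 3 can be sketched as follows. This is only an illustration under stated assumptions: dictionaries are modelled as Python sets keyed by category name, and `classify` is a hypothetical stand-in for whatever category judgment the system uses, which the claims do not specify.

```python
def promote_frequent_terms(vocabulary, classify, default_dicts, threshold):
    """Return (retained vocabulary, current dictionaries per category).

    vocabulary:    {search term: count}
    classify:      hypothetical callable mapping a term to a category name
    default_dicts: {category: set of words}
    threshold:     the predetermined value; terms must exceed it to be kept
    """
    # Retain only the terms whose count exceeds the predetermined value.
    retained = {term: n for term, n in vocabulary.items() if n > threshold}
    # Start each current dictionary as a copy of the default one.
    current_dicts = {cat: set(words) for cat, words in default_dicts.items()}
    # Add every retained term to the dictionary of its category.
    for term in retained:
        current_dicts.setdefault(classify(term), set()).add(term)
    return retained, current_dicts


# Illustrative data only; the keyword rule below is a toy classifier.
default_dicts = {"music": {"piano"}, "video": {"trailer"}}
vocabulary = {"new song": 12, "rare term": 1, "hot movie": 30}
retained, current_dicts = promote_frequent_terms(
    vocabulary, lambda t: "music" if "song" in t else "video", default_dicts, 10
)
default_dicts = current_dicts  # the current dictionary replaces the default one
```

Copying the default sets before merging keeps the old dictionary intact until the final assignment, so the replacement happens in one step.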
4. The method according to claim 3, wherein the vocabulary table comprises keywords, the count of each corresponding keyword, and a dictionary separator line.
5. The method according to claim 1, wherein the default dictionary is obtained by parsing, extracting, and filtering web pages crawled from the Internet, and then performing word segmentation on the processed web page content.
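The parse, extract, filter, and segment pipeline of claim 5 can be sketched with the Python standard library. The claims do not name a segmentation algorithm, so forward maximum matching against a small known-word set is used here purely as an assumed stand-in; all function names, and the choice to drop script/style content and punctuation as the "filtering" step, are likewise illustrative.

```python
from html.parser import HTMLParser


class _TextExtractor(HTMLParser):
    """Parses HTML and collects visible text, skipping <script>/<style> content."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        if not self._skip_depth and data.strip():
            self.chunks.append(data.strip())


def extract_text(html):
    """Parse one page and return its visible text."""
    parser = _TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def filter_text(text):
    """Keep letter/digit runs; punctuation and whitespace become run boundaries."""
    return "".join(ch if ch.isalnum() else " " for ch in text).split()


def segment(run, known_words, max_len=4):
    """Forward maximum matching: take the longest known word at each position,
    falling back to a single character when nothing matches."""
    words, i = [], 0
    while i < len(run):
        for size in range(min(max_len, len(run) - i), 0, -1):
            piece = run[i:i + size]
            if size == 1 or piece in known_words:
                words.append(piece)
                i += size
                break
    return words


def build_default_dictionary(pages, known_words, max_len=4):
    """Run parse -> extract -> filter -> segment over crawled pages."""
    dictionary = set()
    for html in pages:
        for run in filter_text(extract_text(html)):
            dictionary.update(segment(run, known_words, max_len))
    return dictionary
```

A production system would more likely use a dedicated segmenter; the matching routine above only shows where segmentation sits in the pipeline.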
6. A search device, comprising:
An acquisition module, adapted to obtain a default dictionary, wherein the default dictionary is obtained by performing word segmentation on crawled web page content, and the default dictionary comprises default dictionaries of different categories;
An adding module, adapted to count the number of times each search term is sent by users through a client, determine the category corresponding to each search term, and add the search terms whose count in that category exceeds a predetermined value to the default dictionary of the corresponding category, to obtain a current dictionary of the corresponding category;
An update module, adapted to update the index of the current dictionary;
A search module, adapted to receive a search term sent by a user through the client, search the current dictionary for the search term to obtain a search result, and return the search result to the client for display to the user;
The adding module is further adapted to write the search terms stored in a log into a vocabulary table by means of a script file; if a search term is not present in the vocabulary table, add it to the vocabulary table; otherwise, increment the count of that search term by one.
7. The device according to claim 6, wherein the acquisition module is specifically adapted to obtain default dictionaries of different categories; or
The adding module is specifically adapted to: count the number of times each search term is sent by users through the client, and add the search terms whose count in that category exceeds the predetermined value to the default dictionary of the corresponding category, to obtain the current dictionary of the corresponding category.
8. The device according to claim 7, wherein the adding module is specifically adapted to:
Write the search terms stored in the log into the vocabulary table using a script file, and accumulate the count of each corresponding search term; retain in the vocabulary table the search terms whose count exceeds the predetermined value, add the vocabulary table to the default dictionary of the corresponding category to generate the current dictionary of the corresponding category, and replace the default dictionary of the corresponding category with the current dictionary of the corresponding category.
9. The device according to claim 8, wherein the vocabulary table comprises keywords, the count of each corresponding keyword, and a dictionary separator line.
10. The device according to claim 6, wherein the default dictionary is obtained by parsing, extracting, and filtering web pages crawled from the Internet, and then performing word segmentation on the processed web page content.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310586096.XA CN103559313B (en) | 2013-11-20 | 2013-11-20 | Searching method and device |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310586096.XA CN103559313B (en) | 2013-11-20 | 2013-11-20 | Searching method and device |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103559313A (en) | 2014-02-05 |
CN103559313B (en) | 2018-02-23 |
Family
ID=50013559
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310586096.XA (Active, CN103559313B (en)) | Searching method and device | 2013-11-20 | 2013-11-20 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN103559313B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105404661A (en) * | 2015-11-05 | 2016-03-16 | 浪潮(北京)电子信息产业有限公司 | Index file updating method and system |
CN105893626A (en) * | 2016-05-10 | 2016-08-24 | 中广核工程有限公司 | Index library creation method used for nuclear power engineering and index system adopting index library creation method |
CN106502980B (en) * | 2016-10-09 | 2019-05-17 | 武汉斗鱼网络科技有限公司 | A kind of search method and system based on text morpheme cutting |
CN106971000B (en) * | 2017-04-12 | 2020-04-28 | 北京焦点新干线信息技术有限公司 | Searching method and device |
CN107247798B (en) * | 2017-06-27 | 2021-05-25 | 北京京东尚科信息技术有限公司 | Method and device for constructing search word bank |
CN109542612A (en) * | 2017-09-22 | 2019-03-29 | 阿里巴巴集团控股有限公司 | A kind of hot spot keyword acquisition methods, device and server |
CN112507181B (en) * | 2019-09-16 | 2023-09-29 | 百度在线网络技术(北京)有限公司 | Search request classification method, device, electronic equipment and storage medium |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1763739A (en) * | 2004-10-21 | 2006-04-26 | 北京大学 | Search method based on semantics in search engine |
CN1936893A (en) * | 2006-06-02 | 2007-03-28 | 北京搜狗科技发展有限公司 | Method and system for generating input-method word frequency base based on internet information |
CN101038596A (en) * | 2007-04-29 | 2007-09-19 | 北京搜狗科技发展有限公司 | Method and system for classifying website |
CN103106227A (en) * | 2012-08-03 | 2013-05-15 | 人民搜索网络股份公司 | System and method of looking up new word based on webpage text |
Family Cites Families (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN100498790C (en) * | 2007-02-06 | 2009-06-10 | 腾讯科技(深圳)有限公司 | Retrieving method and system |
US20100114878A1 (en) * | 2008-10-22 | 2010-05-06 | Yumao Lu | Selective term weighting for web search based on automatic semantic parsing |
CN102289436B (en) * | 2010-06-18 | 2013-12-25 | 阿里巴巴集团控股有限公司 | Method and device for determining weighted value of search term and method and device for generating search results |
CN103064838B (en) * | 2011-10-19 | 2016-03-30 | 阿里巴巴集团控股有限公司 | Data search method and device |
2013-11-20: CN application CN201310586096.XA, patent CN103559313B (en), status Active
Also Published As
Publication number | Publication date |
---|---|
CN103559313A (en) | 2014-02-05 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103559313B (en) | Searching method and device | |
CN106649818B (en) | Application search intention identification method and device, application search method and server | |
CN102612691B (en) | Method and system for scoring texts | |
CN107704503A (en) | User's keyword extracting device, method and computer-readable recording medium | |
US8949227B2 (en) | System and method for matching entities and synonym group organizer used therein | |
US8825620B1 (en) | Behavioral word segmentation for use in processing search queries | |
CN103020845A (en) | Mobile application pushing method and system | |
US8793120B1 (en) | Behavior-driven multilingual stemming | |
JP2013545189A (en) | Determining category information using multistage | |
CN103473317A (en) | Method and equipment for extracting keywords | |
US10810245B2 (en) | Hybrid method of building topic ontologies for publisher and marketer content and ad recommendations | |
CN110765761A (en) | Contract sensitive word checking method and device based on artificial intelligence and storage medium | |
US10055408B2 (en) | Method of extracting an important keyword and server performing the same | |
US10599760B2 (en) | Intelligent form creation | |
CN103577534A (en) | Searching method and search engine | |
CN112925883B (en) | Search request processing method and device, electronic equipment and readable storage medium | |
CN109522275B (en) | Label mining method based on user production content, electronic device and storage medium | |
CN105653547A (en) | Method and device for extracting keywords of text | |
CN105608113A (en) | Method and apparatus for judging POI data in text | |
KR101638535B1 (en) | Method of detecting issue patten associated with user search word, server performing the same and storage medium storing the same | |
CN107844493A (en) | A kind of file association method and system | |
CN110069769A (en) | Using label generating method, device and storage equipment | |
CN103577547B (en) | Webpage type identification method and device | |
CN110245357B (en) | Main entity identification method and device | |
CN105243053A (en) | Method and apparatus for extracting key sentence of document |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| C06 | Publication | |
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
| TR01 | Transfer of patent right | Effective date of registration: 20220727. Address after: Room 801, 8th floor, No. 104, floors 1-19, building 2, yard 6, Jiuxianqiao Road, Chaoyang District, Beijing 100015. Patentee after: BEIJING QIHOO TECHNOLOGY Co.,Ltd. Address before: 100088 room 112, block D, 28 new street, new street, Xicheng District, Beijing (Desheng Park). Patentee before: BEIJING QIHOO TECHNOLOGY Co.,Ltd.; Qizhi software (Beijing) Co.,Ltd. |