CN105512199A - Search method, search device and search server - Google Patents


Info

Publication number
CN105512199A
Authority
CN
China
Prior art keywords
ageing
time
search results
search word
search
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510843765.6A
Other languages
Chinese (zh)
Other versions
CN105512199B (en)
Inventor
尹康平
王祥志
王刚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Alibaba China Co Ltd
Original Assignee
Guangzhou Shenma Mobile Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangzhou Shenma Mobile Information Technology Co Ltd filed Critical Guangzhou Shenma Mobile Information Technology Co Ltd
Priority to CN201510843765.6A priority Critical patent/CN105512199B/en
Publication of CN105512199A publication Critical patent/CN105512199A/en
Application granted granted Critical
Publication of CN105512199B publication Critical patent/CN105512199B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/951 Indexing; Web crawling techniques
    • G06F16/957 Browsing optimisation, e.g. caching or content distillation
    • G06F16/9574 Browsing optimisation, e.g. caching or content distillation of access to content, e.g. by caching

Landscapes

  • Engineering & Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a search method, a search device and a search server. The search method comprises the following steps: receiving a search request of a user in order to acquire a search word; acquiring the search results obtained based on the search word, the quantity of search results, and the quantity of searches related to the search word; performing a time series analysis on the quantity of search results and the quantity of searches within a certain period of time in order to determine the timeliness of the search word; and, in response to determining that the search word has timeliness, adjusting the order of the search results using timeliness as the re-ranking basis. The search method can therefore determine a user's timeliness demand accurately and quickly even when the search lacks timeliness feature words, so that the search experience of the user is improved.

Description

Search method, search device and search server
Technical field
The present invention relates to the field of search, and in particular to determining the timeliness of a search word.
Background art
With the development of the Internet and the improvement of search technology, people increasingly rely on search engines to obtain the information they need, promptly and accurately, from the massive amount of information on the Internet. A user obtains results by submitting a query to a search engine. Under normal circumstances, the search engine returns the most relevant results. For breaking or hot events, however, users tend to prefer results with better timeliness, for example the latest news or recent special news coverage.
The timeliness of information means that the usefulness of information depends on time and has a certain time limit; its value is closely related to the time at which the information is provided. Traditional methods analyze the user's timeliness demand from the literal text of the query, and such methods work only when the user's search request contains timeliness feature words. Users, however, usually tend to submit short keywords, and this is even more pronounced on mobile devices. Determining the user's timeliness demand quickly and accurately when timeliness feature words are absent, and providing more timely or more recent information, is therefore very important for search, and for mobile search in particular.
Summary of the invention
In general, timeliness triggered by breaking and hot events is referred to as burst timeliness. From the perspective of a search service, a burst-timeliness event is usually accompanied by a burst of resources and a burst of related searches, that is, a significant increase in both related news and search volume.
Based on this understanding, the present application builds a time-series statistical analysis model by counting in real time the quantity of online resources related to the user's search and the quantity of related searches, and determines whether the user has a timeliness demand by analyzing that time-series model. The analysis may be a white-noise test or a trend analysis based on a number of preset time points. When the series fails the white-noise test, or the trend analysis reveals an abrupt change, the change point in the series is then located, and the intensity and trend of the timeliness are determined by comparing the series data before and after the change point, which in turn guides the re-ranking of the search results.
The present invention can thus determine the user's timeliness demand quickly and accurately when timeliness feature words are absent.
In one aspect of the invention, a search method is disclosed, comprising: receiving a search request of a user to obtain a search word; obtaining the search results obtained based on the search word, the search result quantity, and the search quantity related to the search word; performing time series analysis on the search result quantity and the search quantity within a certain time period to determine the timeliness of the search word; and, in response to determining that the search word has timeliness, adjusting the order of the search results using timeliness as the re-ranking basis.
In this way, merely by performing time series analysis on the quantity of search results and the quantity of search requests, the timeliness intent of a user search that lacks time words can be inferred, which improves the accuracy of the returned content and thereby the user's search experience.
Preferably, the step of performing time series analysis on the search result quantity and the search quantity to determine the timeliness feature of the search word may comprise: dividing the search result quantity and the search quantity within the time period by time interval to generate first time-series data; and performing a white-noise test on the first time-series data and determining the timeliness feature of the search word according to the result of the white-noise test.
The "time interval" may be, for example, one day; for instance, the data of two months are divided by day to obtain the first time-series data. In this way, analyzing the existing quantity data with an ordinary white-noise test provides a relatively simple implementation of the search method of the present invention.
Preferably, the step of performing a white-noise test on the first time-series data and determining the timeliness feature of the search word according to the result of the white-noise test may comprise: assuming that the $Q_{LB}$ statistic of the first time-series data $x_1, x_2, x_3, \ldots, x_n$ follows a chi-square distribution:
$$Q_{LB} = n(n+2)\sum_{k=1}^{m}\frac{\hat{\rho}_k^{2}}{n-k} \sim \chi^{2}(m),$$
where n is the number of values obtained by dividing the time period by the time interval, m is the degrees of freedom, and $\hat{\rho}_k$ is the lag-k autocorrelation coefficient; and, in response to the P value of the $Q_{LB}$ statistic being smaller than a first threshold serving as the agreed significance level, determining that the search word has the timeliness feature.
In this way, the white-noise test is performed with the Q value and P value well known in the art, which provides a relatively simple implementation of the search method of the present invention.
Preferably, the step of performing time series analysis on the search result quantity to determine the timeliness feature of the search word may comprise: selecting preset time points at different distances from the current time, and calculating the search result quantities $M_1, M_2, \ldots, M_{j-1}, M_j$ in the periods from the current time back to each preset time point, where j is the number of preset time points and the longest of these periods is no longer than the time period over which the search result quantity is obtained; computing the mean value $M_m$ of the per-interval values over the remainder of the time period beyond the longest of these periods; and calculating the ratios of $M_1, M_2, \ldots, M_{j-1}, M_j$ to $M_m$:
$$R_1 = \frac{M_1}{M_m},\quad R_2 = \frac{M_2}{M_m},\quad \ldots,\quad R_{j-1} = \frac{M_{j-1}}{M_m},\quad R_j = \frac{M_j}{M_m}$$
In response to any one of the R values being greater than a second threshold, the search word is determined to have the timeliness feature.
In this way, trend analysis is performed on the existing search result quantity data by setting preset time points, and the timeliness of the search word is determined by simple calculation, which provides another simple implementation for achieving the object of the present application.
Preferably, if the search word is determined to have timeliness, the time series analysis may further comprise change-point detection to find the position of the change point that characterizes the timeliness.
Because not only is timeliness determined but the position of the change point is also found, search results can be returned to the user more accurately, which improves the user's search experience.
Preferably, the change-point detection may be performed on the first time-series data and comprise: finding the value of k for which all five of the following values are greater than a third threshold, so as to determine the position of the change point:
$$\mathrm{diff}_1 = x_k - x_{k-1}$$
$$\mathrm{diff}_2 = x_k - x_{k-2}$$
$$\mathrm{diff}_3 = x_k - x_{k-3}$$
$$\mathrm{diff}_4 = x_{k+1} - x_{k-1}$$
$$\mathrm{diff}_5 = x_{k+2} - x_{k-1}$$
In this way, the position of the change point can be found by simple differencing, and the search results can be re-ranked more accurately according to that position, so as to return content that better meets the user's needs.
Preferably, according to the value of k found, the first time-series data may further be divided into two independent time series $S_1 = x_1, x_2, \ldots, x_{k-1}$ and $S_2 = x_k, x_{k+1}, \ldots, x_n$; whether the timeliness is strengthening, decaying or levelling off is determined from the data models of $S_1$ and $S_2$; and the order of the search results is adjusted using the strengthening, decaying or levelling-off state of the timeliness as the re-ranking basis.
In this way, by modelling the data before and after point k separately, the trend of the timeliness can be determined, allowing the search results to be re-ranked even more accurately.
According to a further aspect of the invention, a search device is disclosed, comprising: a receiving unit for receiving a search request of a user to obtain a search word; an acquiring unit for obtaining the search results obtained based on the search word, the search result quantity, and the search quantity related to the search word; a time series analysis unit for performing time series analysis on the search result quantity and the search quantity within a certain time period to determine the timeliness of the search word; and a re-ranking unit for adjusting, in response to determining that the search word has timeliness, the order of the search results using timeliness as the re-ranking basis.
Preferably, the time series analysis unit may further be used to divide the search result quantity and the search quantity within the time period by time interval to generate first time-series data, and to perform a white-noise test on the first time-series data and determine the timeliness of the search word according to the result of the white-noise test.
Preferably, the time series analysis unit may further be used to: assume that the $Q_{LB}$ statistic of the first time-series data $x_1, x_2, x_3, \ldots, x_n$ follows a chi-square distribution:
$$Q_{LB} = n(n+2)\sum_{k=1}^{m}\frac{\hat{\rho}_k^{2}}{n-k} \sim \chi^{2}(m),$$
where n is the number of values obtained by dividing the time period by the time interval, m is the degrees of freedom, and $\hat{\rho}_k$ is the lag-k autocorrelation coefficient; and, in response to the P value of the $Q_{LB}$ statistic being smaller than a first threshold serving as the agreed significance level, determine that the search word has the timeliness feature.
Preferably, the time series analysis unit may further be used to: select preset time points at different distances from the current time, and calculate the search result quantities $M_1, M_2, \ldots, M_{j-1}, M_j$ in the periods from the current time back to each preset time point, where j is the number of preset time points and the longest of these periods is no longer than the time period over which the search result quantity is obtained; compute the mean value $M_m$ of the per-interval values over the remainder of the time period beyond the longest of these periods; and calculate the ratios of $M_1, M_2, \ldots, M_{j-1}, M_j$ to $M_m$:
$$R_1 = \frac{M_1}{M_m},\quad R_2 = \frac{M_2}{M_m},\quad \ldots,\quad R_{j-1} = \frac{M_{j-1}}{M_m},\quad R_j = \frac{M_j}{M_m}$$
In response to any one of the R values being greater than a second threshold, the search word is determined to have the timeliness feature.
Preferably, if the time series analysis unit determines that the search word has timeliness, the time series analysis unit may further perform change-point detection to find the position of the change point that characterizes the timeliness.
Preferably, the change-point detection may comprise finding the value of k for which all five of the following values are greater than a third threshold, so as to determine the position of the change point:
$$\mathrm{diff}_1 = x_k - x_{k-1}$$
$$\mathrm{diff}_2 = x_k - x_{k-2}$$
$$\mathrm{diff}_3 = x_k - x_{k-3}$$
$$\mathrm{diff}_4 = x_{k+1} - x_{k-1}$$
$$\mathrm{diff}_5 = x_{k+2} - x_{k-1}$$
Preferably, according to the value of k found, the time series analysis unit may further divide the first time-series data into two independent time series $S_1 = x_1, x_2, \ldots, x_{k-1}$ and $S_2 = x_k, x_{k+1}, \ldots, x_n$, and determine from the data models of $S_1$ and $S_2$ whether the timeliness is strengthening, decaying or levelling off; and the re-ranking unit may adjust the order of the search results using the strengthening, decaying or levelling-off state of the timeliness as the re-ranking basis.
According to yet another aspect of the invention, a search server is disclosed, comprising: a memory for storing network information in association with search words and for storing users' search records for search words; a receiving device for receiving a search request of a user; a processor, connected to the memory and the receiving device, for obtaining a search word from the search request received by the receiving device, obtaining from the memory the search results obtained based on the search word, the search result quantity and the search quantity related to the search word, performing time series analysis on the search result quantity and the search quantity within a certain time period to determine the timeliness of the search word, and, in response to determining that the search word has timeliness, adjusting the order of the search results using timeliness as the re-ranking basis; and a sending device for sending to the user's client device the search results whose order has been adjusted using timeliness as the re-ranking basis.
Hardware support is thereby provided for the search method according to the invention.
Brief description of the drawings
The above and other objects, features and advantages of the present disclosure will become more apparent from the more detailed description of exemplary embodiments of the disclosure given in conjunction with the accompanying drawings, in which the same reference numerals generally denote the same components.
Fig. 1 is a schematic flowchart of a search method according to an embodiment of the invention.
Fig. 2 is a schematic block diagram of a search device according to an embodiment of the invention.
Fig. 3 is a hardware composition diagram of a search server according to an embodiment of the invention.
Detailed description of the embodiments
Preferred embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show preferred embodiments of the disclosure, it should be understood that the disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the disclosure is thorough and complete and so that its scope is fully conveyed to those skilled in the art.
Fig. 1 is a schematic flowchart of a search method according to an embodiment of the invention.
In step S110, a search request of a user is received to obtain a search word.
In step S120, the search results obtained based on the search word, the search result quantity, and the search quantity related to the search word are obtained.
In step S130, time series analysis is performed on the search result quantity and the search quantity within a certain time period to determine the timeliness of the search word.
In step S140, in response to determining that the search word has timeliness, the order of the search results is adjusted using timeliness as the re-ranking basis.
Thus, the timeliness of the search word can be determined merely by performing time series analysis on the quantity of search results and the quantity of searches within a certain time, and the order of the search results can be adjusted accordingly.
Here, time series analysis refers to the concept known in the art, namely "fitting a set of observations arranged in chronological order (called a time series) to a parametric model and analyzing it".
Step S110 is explained further. Search engines retrieve documents according to the user's search request. In general, users' search requests are highly varied. If queries were matched only literally, a large number of relevant results would likely be missed. The user's request therefore needs some processing, for example removing unimportant words from the search request and suitably transforming some of the words in the user's search, so as to increase the number of results retrieved.
The process of obtaining the search word from the search request is not the main subject of the invention and is not described further here.
In a preferred embodiment, step S130 may comprise dividing the search result quantity and the search quantity within the above time period by a certain time interval to generate first time-series data. The "time interval" may be, for example, one day, and the "time period" may be, for example, two months (treated as 60 days for ease of calculation), so that, for example, the data of the last two months can be divided by day to generate the first time-series data. The preparation of the search result quantity and search quantity data is described in detail below with reference to examples.
The present invention determines the user's timeliness demand mainly on the basis of two sets of data: one is the quantity of relevant information the user requests (for example, the quantity of related news), and the other is the quantity of similar searches made by other users. In a preferred embodiment, these two sets of data can be built in advance into an information index and a query log index, respectively, to facilitate subsequent statistics and retrieval.
In a preferred embodiment, the information index may comprise two parts: a day-level update index and a real-time update index. The data source of the information index may be, for example, news seed pages selected manually and by machine; a crawler crawls the pages linked from the seed pages in real time, and the index is built from the crawled news data.
In a preferred embodiment, the query log index may comprise two parts: a day-level update index and an hour-level update index. The data of the query log may come from the real-time query logs of online user searches.
In a preferred embodiment, the information index may index the information pages (for example, news pages crawled by the crawler) from the two months preceding the current time, and the value obtained is the quantity of relevant information pages.
In a preferred embodiment, the query log index may correspondingly cover the query logs of all user searches in the two months preceding the current time, and the value obtained is the quantity of related or similar queries.
In a preferred embodiment, the day-level update index may be an index updated day by day for 1 day, 2 days, 3 days, ..., 60 days. The hour-level update may, for example, be performed at 1 hour, 3 hours, 6 hours, 9 hours, 12 hours, 15 hours, 18 hours, 21 hours and 24 hours.
It should be understood that the above limits of two months, per day, and every three hours are merely examples given for ease of description, and those skilled in the art may choose different values for a specific implementation. The "certain time period" over which the raw data are obtained may be any suitable period other than two months, such as two weeks, one month or half a year. The "time interval" used to divide that period may be any interval other than one day, such as half a day, every two days or every three days. The "hour-level update" may be, for example, every hour, every two hours, or even at unequal intervals. Evidently, all such variations fall within the scope covered by the principle of the invention.
In a preferred embodiment, the quantity of resources retrieved for the user's query and the quantity of related searches are divided by a certain time interval to obtain a vector with a time index; the data are sorted by their distance from the current time to generate the time-series data $x_1, x_2, x_3, \ldots, x_n$. For example, the data of the last two months (taken as 60 days for ease of calculation) are divided by day to obtain the time-series data $x_1, x_2, x_3, \ldots, x_n$, where n is an integer between 1 and 60.
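Purely as a non-limiting illustration, the day-by-day division described above might be sketched in Python as follows. The function name `daily_series` and its parameters are hypothetical and do not appear in the source; the sketch assumes that each retrieved resource or logged query carries a timestamp, and that $x_1$ denotes the interval closest to the current time.

```python
from datetime import datetime, timedelta

def daily_series(timestamps, now, days=60):
    """Count events per 24-hour interval, ordered by distance from `now`.

    Element 0 (x_1) covers the most recent 24 hours, element 1 (x_2) the
    24 hours before that, and so on. Events in the future or older than
    `days` days are ignored.
    """
    counts = [0] * days
    for ts in timestamps:
        age = now - ts
        if timedelta(0) <= age < timedelta(days=days):
            counts[age.days] += 1  # whole days elapsed selects the bucket
    return counts
```

The same helper could be applied once to the timestamps of retrieved information pages and once to the timestamps of related queries, yielding one series for each of the two data sources.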
When analyzing the first time-series data, the following basic assumptions are made first: under normal, random queries the retrieved relevant results and related queries follow a normal distribution, and the resulting time series is white noise; when the query has the timeliness feature, the retrieved time series contains a change point, and the distributions of the data before and after the change point differ markedly.
Basic statistics can then be computed for the time-series data $x_1, x_2, x_3, \ldots, x_n$. The basic statistics may include the mean, variance, autocovariance, autocorrelation coefficient, and so on.
Mean: $$\hat{\mu} = \bar{X}_t = \frac{\sum_{t=1}^{n} x_t}{n}$$
Variance: $$\hat{D} = \frac{\sum_{t=1}^{n} (x_t - \hat{\mu})^{2}}{n-1} = \hat{\gamma}(0)$$
Autocovariance: $$\forall\, 0 < k < n,\quad \hat{\gamma}(k) = \frac{\sum_{t=1}^{n-k} (x_t - \hat{\mu})(x_{t+k} - \hat{\mu})}{n-k}$$
Autocorrelation coefficient: $$\forall\, 0 < k < n,\quad \hat{\rho}(k) = \frac{\hat{\gamma}(k)}{\hat{\gamma}(0)}$$
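As a minimal sketch only, the four basic statistics could be computed in Python as shown below. The function names are hypothetical. The denominators follow the formulas given here (n - 1 for the variance, n - k for the autocovariance); returning 0.0 for a constant series is an assumption of the sketch, added to avoid division by zero.

```python
def mean(xs):
    """Sample mean of the series."""
    return sum(xs) / len(xs)

def variance(xs):
    """Sample variance with denominator n - 1, used here as gamma(0)."""
    mu = mean(xs)
    return sum((x - mu) ** 2 for x in xs) / (len(xs) - 1)

def autocovariance(xs, k):
    """Lag-k autocovariance with denominator n - k, for 0 < k < n."""
    n, mu = len(xs), mean(xs)
    return sum((xs[t] - mu) * (xs[t + k] - mu) for t in range(n - k)) / (n - k)

def autocorrelation(xs, k):
    """Lag-k autocorrelation: autocovariance(k) divided by the variance."""
    var = variance(xs)
    # Assumption of this sketch: a constant series has no autocorrelation.
    return autocovariance(xs, k) / var if var > 0 else 0.0
```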
In a preferred embodiment, step S130 may further comprise performing a white-noise test on the first time-series data and determining the timeliness of the search word according to the result of the white-noise test.
In a preferred embodiment, the $Q_{LB}$ statistic of the first time-series data $x_1, x_2, x_3, \ldots, x_n$ is assumed to follow a chi-square distribution:
$$Q_{LB} = n(n+2)\sum_{k=1}^{m}\frac{\hat{\rho}_k^{2}}{n-k} \sim \chi^{2}(m),$$
where n is the number of values obtained by dividing the time period by the time interval, m is the degrees of freedom, and $\hat{\rho}_k$ is the lag-k autocorrelation coefficient; in response to the P value (P-value) of the $Q_{LB}$ statistic being smaller than a first threshold serving as the agreed significance level, the search word is determined to have the timeliness feature.
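By way of a hedged example, the white-noise test above might be sketched in Python using only the standard library. The names `chi2_sf`, `ljung_box` and `has_timeliness` are hypothetical, as are the default lag count `m=5` and significance level `alpha=0.05`; the source fixes neither value. The chi-square upper-tail probability is computed from the standard recurrence for the regularized incomplete gamma function, which is exact for integer degrees of freedom.

```python
import math

def chi2_sf(q, df):
    """Upper-tail probability P(X > q) for chi-square with integer df >= 1."""
    x = q / 2.0
    if df % 2 == 0:
        a, prob = 1.0, math.exp(-x)               # Q(1, x)
    else:
        a, prob = 0.5, math.erfc(math.sqrt(x))    # Q(1/2, x)
    # Recurrence: Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1)
    while a < df / 2.0 - 1e-9:
        if x > 0:
            prob += math.exp(a * math.log(x) - x - math.lgamma(a + 1.0))
        a += 1.0
    return prob

def _rho(xs, k):
    """Lag-k autocorrelation using the denominators n - k and n - 1."""
    n = len(xs)
    mu = sum(xs) / n
    c0 = sum((x - mu) ** 2 for x in xs) / (n - 1)
    if c0 == 0:
        return 0.0  # assumption: a constant series is treated as uncorrelated
    ck = sum((xs[t] - mu) * (xs[t + k] - mu) for t in range(n - k)) / (n - k)
    return ck / c0

def ljung_box(xs, m):
    """Return (Q_LB, p_value) for lags 1..m."""
    n = len(xs)
    if not 0 < m < n:
        raise ValueError("m must satisfy 0 < m < len(xs)")
    q = n * (n + 2) * sum(_rho(xs, k) ** 2 / (n - k) for k in range(1, m + 1))
    return q, chi2_sf(q, m)

def has_timeliness(xs, m=5, alpha=0.05):
    """True when the white-noise hypothesis is rejected at level alpha."""
    _, p_value = ljung_box(xs, m)
    return p_value < alpha
```

A series with a sustained burst, such as twenty quiet days and ten busy days, is strongly autocorrelated and is rejected as white noise, whereas a flat series is not.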
Specifically, values aggregated at day level can be used directly here. When only 30 days of data are used and today is the 30th, the series under test consists of the data for the 30th, 29th, 28th, ..., 3rd, 2nd and 1st of this month.
In a preferred embodiment, step S130 may comprise selecting preset time points at different distances from the current time, and calculating the search result quantities $M_1, M_2, \ldots, M_{j-1}, M_j$ in the periods from the current time back to each preset time point, where j is the number of preset time points and the longest of these periods is no longer than the time period over which the search result quantity is obtained; computing the mean value $M_m$ of the per-interval values over the remainder of the time period beyond the longest of these periods; and calculating the ratios of $M_1, M_2, \ldots, M_{j-1}, M_j$ to $M_m$:
$$R_1 = \frac{M_1}{M_m},\quad R_2 = \frac{M_2}{M_m},\quad \ldots,\quad R_{j-1} = \frac{M_{j-1}}{M_m},\quad R_j = \frac{M_j}{M_m}$$
In response to any one of the R values being greater than a second threshold, the search word is determined to have the timeliness feature.
Specifically, in a preferred embodiment, when the original time series contains the data of the last two months (for example, 60 days), the following five preset time points may be selected: 1 hour, 3 hours, 1 day, 3 days and 7 days from the current time. The data in each of the periods delimited by these five time points are summed. For the sixth period, namely the 7th to the 60th day, the mean $M_m$ of the basic time-series data in that period is computed. In a preferred embodiment, one maximum value and one minimum value may be removed when computing the mean, to prevent abnormal data from causing the statistics to deviate greatly from the real model. The average resource quantities of the six periods are thus obtained:
$$M_{h1},\ M_{h3},\ M_{d1},\ M_{d3},\ M_{d7},\ M_m$$
The ratios of the first five periods to the sixth, background, period are then computed:
$$R_{h1} = \frac{M_{h1}}{M_m},\quad R_{h3} = \frac{M_{h3}}{M_m},\quad R_{d1} = \frac{M_{d1}}{M_m},\quad R_{d3} = \frac{M_{d3}}{M_m},\quad R_{d7} = \frac{M_{d7}}{M_m}$$
These ratios are then analyzed: if any one of the R values is greater than a certain threshold, the search results are considered to exhibit the timeliness feature.
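The ratio test above can be illustrated, under stated assumptions, by the following Python sketch. All names are hypothetical, and the default threshold of 3.0 is an arbitrary placeholder, since the source gives no value. The sketch assumes that the caller has already expressed each period quantity on the same scale as the background values (the source does not spell out this normalisation), and it treats a zero background mean as an infinite ratio for any non-zero period quantity.

```python
def trimmed_mean(values):
    """Mean after removing one maximum and one minimum value."""
    if len(values) < 3:
        raise ValueError("need at least three values to trim both ends")
    core = sorted(values)[1:-1]
    return sum(core) / len(core)

def trend_ratios(period_values, background):
    """Map each period name to R = M / M_m.

    period_values: dict such as {"h1": ..., "h3": ..., "d1": ..., ...}
    background: per-interval values of the background period
                (for example, days 7 to 60).
    """
    m_m = trimmed_mean(background)
    ratios = {}
    for name, value in period_values.items():
        if m_m > 0:
            ratios[name] = value / m_m
        else:
            # Assumption of this sketch: no background activity at all.
            ratios[name] = float("inf") if value > 0 else 0.0
    return ratios

def has_timeliness_by_trend(period_values, background, threshold=3.0):
    """True when any ratio exceeds the (hypothetical) second threshold."""
    ratios = trend_ratios(period_values, background)
    return any(r > threshold for r in ratios.values())
```

Removing one maximum and one minimum before averaging keeps a single anomalous day in the background period from masking, or faking, a recent surge.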
It follows that time series analysis can be performed on the search result quantity and the search quantity within a certain time period, and the timeliness of the search word can be determined by a white-noise test on the search result quantity and the search quantity. A trend analysis at preset time points can also be performed on the search result quantity, as a supplement to or a replacement for the determination by white-noise test described above.
In a preferred embodiment, after the user's current query has been determined by the above approaches to have the timeliness feature, a finer analysis can be performed on the time-series data to determine the intensity and trend of the timeliness.
In a preferred embodiment, after the search word has been determined to have timeliness, change-point detection can be performed on the first time-series model to find the position of the change point that characterizes the timeliness.
When the data fail the white-noise test and/or the distributions of the time-series data split by the preset time points differ markedly, the current user query can be considered likely to have timeliness. According to the earlier assumption, the time-series data corresponding to such a timely query can then be considered to contain a change point; finding the change point and analyzing how the distribution of the series data changes before and after it helps to analyze and determine the intensity and trend of the timeliness.
In a preferred embodiment, difference detection can be used to find the change point. Specifically, it can be calculated whether the difference values at certain time points have changed sharply. Here, the time-series data $x_1, x_2, x_3, \ldots, x_n$ of the preceding example can still be used. With n an integer between 1 and 60, the change point is searched for, at day granularity, within the period covered by the series. In a preferred embodiment, only the data of the last 30 days may be used for the calculation.
In a preferred embodiment, the following five difference values can be calculated to determine the approximate position of the change point of the time series.
$$\mathrm{diff}_1 = x_k - x_{k-1}$$
$$\mathrm{diff}_2 = x_k - x_{k-2}$$
$$\mathrm{diff}_3 = x_k - x_{k-3}$$
$$\mathrm{diff}_4 = x_{k+1} - x_{k-1}$$
$$\mathrm{diff}_5 = x_{k+2} - x_{k-1}$$
When the absolute values of these five values are all greater than a certain threshold, the point k can be inferred to be the change point of the time series. That is, the distribution of the time-series data undergoes a structural change at this point.
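A minimal Python sketch of this five-difference check is given below for illustration. The function name `find_change_points` and the threshold passed to it are hypothetical. The sketch assumes the series is supplied in chronological order (oldest value first), which the source does not state explicitly, and it reports 1-based positions k so that they line up with $x_k$ in the formulas.

```python
def find_change_points(xs, threshold):
    """Return every 1-based k whose five differences all exceed threshold.

    The differences need x_{k-3} and x_{k+2}, so k runs from 4 to n - 2.
    """
    n = len(xs)
    x = [None] + list(xs)  # shift to 1-based indexing to mirror x_k
    hits = []
    for k in range(4, n - 1):
        diffs = (
            x[k] - x[k - 1],
            x[k] - x[k - 2],
            x[k] - x[k - 3],
            x[k + 1] - x[k - 1],
            x[k + 2] - x[k - 1],
        )
        if all(abs(d) > threshold for d in diffs):
            hits.append(k)
    return hits
```

Comparing $x_{k+1}$ and $x_{k+2}$ against $x_{k-1}$ as well means that a one-day spike followed by an immediate return to the old level is not reported; only a shift that persists for at least three intervals qualifies.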
It should be understood that more or fewer difference values (for example, 3 or 7) may also be calculated depending on the specific implementation, all of which fall within the scope covered by the principle of the invention.
After the change point has been found, the first time-series data can be divided, according to the value of k found, into two independent time series $S_1 = x_1, x_2, \ldots, x_{k-1}$ and $S_2 = x_k, x_{k+1}, \ldots, x_n$; whether the timeliness is strengthening, decaying or levelling off is determined from the data models of $S_1$ and $S_2$, and that state is used as the basis for re-ranking the search results.
In general, if the mean of the data after the change point is much larger than the mean of the data before it, the resources for the corresponding query can be considered to have surged after the moment of the change point. If the differences after the change point are continuously positive, the event or object corresponding to the search word can be considered to be still developing; if the second-order differences are also continuously positive, there is further evidence that its timeliness is continuing to strengthen. If the differences are found to be continuously negative after a certain point, the quantity of corresponding resources or the number of user searches can be considered to be declining, which indicates that the timeliness of the current user query has decayed. If positive and negative differences are roughly balanced, the timeliness can be considered to be slowly returning to a steady level.
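The qualitative rules above could be approximated, as a rough sketch only, by the Python code below. The names `split_at` and `classify_trend` and the three labels are hypothetical. The sketch assumes a chronologically ordered series and a 1-based change point k, interprets "continuously positive" (or negative) as every first difference in $S_2$ being positive (or negative), and treats everything else as levelling off; the source leaves the exact criteria open.

```python
def split_at(xs, k):
    """Split a series at 1-based change point k into (S1, S2).

    S1 = x_1 .. x_{k-1}, S2 = x_k .. x_n.
    """
    return xs[:k - 1], xs[k - 1:]

def classify_trend(xs, k):
    """Label the timeliness trend after the change point.

    Returns "strengthening", "decaying" or "steady".
    """
    _, after = split_at(xs, k)
    diffs = [b - a for a, b in zip(after, after[1:])]
    if diffs and all(d > 0 for d in diffs):
        return "strengthening"
    if diffs and all(d < 0 for d in diffs):
        return "decaying"
    # Mixed signs, flat data, or too few points: treated as levelling off.
    return "steady"
```

The resulting label could then be passed to the re-ranking stage, for instance to weight recent documents more heavily while the trend is strengthening; how the label is used is outside the scope of this sketch.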
The search method and its preferred embodiments have been described above with reference to Fig. 1. In the device described below, the functions of the units and components are the same as those of the corresponding steps described above with reference to Fig. 1 and its preferred embodiments. To avoid repetition, the description below focuses on the structure and components the device may have; for details omitted here, refer to the corresponding description above.
Fig. 2 is a schematic block diagram of a search device 20 according to an embodiment of the invention. The search device 20 may comprise a receiving unit 100, an acquiring unit 200, a time series analysis unit 300, and a re-ranking unit 400.
The receiving unit 100 may receive a user's search request to obtain a search word.
The acquiring unit 200 may obtain the search results obtained based on the search word, the number of search results, and the number of searches related to the search word.
The time series analysis unit 300 may perform time series analysis on the number of search results and the number of searches within a certain time period, to judge the timeliness of the search word.
In response to a judgment that the search word has timeliness, the re-ranking unit 400 may adjust the order of the search results using timeliness as the re-ranking basis.
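How units 100 to 400 fit together can be sketched as four pluggable callables. This is only an illustrative skeleton: the class name `SearchDevice`, the field names, and the type signatures are hypothetical, and the stub functions in the usage lines stand in for real retrieval and analysis logic.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple

Results = List[str]


@dataclass
class SearchDevice:
    receive: Callable[[str], str]                                            # unit 100
    acquire: Callable[[str], Tuple[Results, Sequence[int], Sequence[int]]]   # unit 200
    analyze: Callable[[Sequence[int], Sequence[int]], bool]                  # unit 300
    rerank: Callable[[Results], Results]                                     # unit 400

    def handle(self, request: str) -> Results:
        """Run one request through the four units in order."""
        word = self.receive(request)
        results, result_counts, search_counts = self.acquire(word)
        # Results are reordered only when the analysis reports timeliness.
        if self.analyze(result_counts, search_counts):
            return self.rerank(results)
        return results


# Stub units for illustration only.
device = SearchDevice(
    receive=str.strip,
    acquire=lambda word: (["old page", "new page"], [1, 1, 9], [2, 2, 30]),
    analyze=lambda result_counts, search_counts: max(search_counts) > 10,
    rerank=lambda results: list(reversed(results)),
)
print(device.handle("  earthquake "))
```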
In a preferred embodiment, the time series analysis unit 300 may divide the number of search results and the number of searches in the above time period by time interval to generate first time series data, perform a white noise test on the first time series data, and judge the timeliness of the search word according to the result of the white noise test. The white noise test performed by the time series analysis unit 300 may be the same as the specific method described above and is not repeated here.
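A minimal sketch of such a white noise test follows, assuming the QLB statistic of claim 3 is the standard Ljung-Box statistic $Q_{LB} = n(n+2)\sum_{k=1}^{m}\hat{\rho}_k^2/(n-k)$ compared against a chi-square distribution with m degrees of freedom. The function names are hypothetical, and to stay within the standard library the p-value uses the closed form that holds only for an even number of degrees of freedom.

```python
import math
from typing import Sequence


def autocorr(x: Sequence[float], lag: int) -> float:
    """Sample autocorrelation coefficient of x at the given lag."""
    n = len(x)
    mean = sum(x) / n
    denom = sum((v - mean) ** 2 for v in x)
    if denom == 0:
        return 0.0  # a constant series carries no autocorrelation signal
    num = sum((x[t] - mean) * (x[t + lag] - mean) for t in range(n - lag))
    return num / denom


def ljung_box_q(x: Sequence[float], m: int) -> float:
    """Ljung-Box Q statistic over lags 1..m."""
    n = len(x)
    return n * (n + 2) * sum(autocorr(x, k) ** 2 / (n - k) for k in range(1, m + 1))


def chi2_sf_even(q: float, m: int) -> float:
    """P(chi-square with m d.o.f. > q), closed form valid for even m only."""
    if m <= 0 or m % 2:
        raise ValueError("this sketch supports positive even m only")
    half = q / 2.0
    term = total = 1.0
    for i in range(1, m // 2):
        term *= half / i
        total += term
    return math.exp(-half) * total


def is_timely_by_white_noise(x: Sequence[float], m: int = 6, alpha: float = 0.05) -> bool:
    """Reject 'x is white noise' when the p-value falls below alpha."""
    return chi2_sf_even(ljung_box_q(x, m), m) < alpha


print(is_timely_by_white_noise([5] * 20 + [50] * 10))
```

Here `alpha` plays the role of the first threshold (the agreed significance level), and `m = 6` is an arbitrary illustrative choice. A production system would more likely call a statistics library for the p-value.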
In a preferred embodiment, the time series analysis unit 300 may also perform the trend analysis at default change points described above.
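The default-point analysis (the ratio test of claim 4) can be sketched as below. The ratio formula is not reproduced in this text, so the sketch assumes the plain ratio $R_i = M_i / M_m$; it also assumes that the baseline mean $M_m$ is taken over the intervals older than the longest look-back period. The function names and the representation of periods as numbers of intervals are hypothetical.

```python
import math
from typing import List, Sequence


def default_point_ratios(counts: Sequence[float], lookbacks: Sequence[int]) -> List[float]:
    """counts: per-interval result counts, oldest first.
    lookbacks: period lengths (in intervals) back from the current time.
    Returns R_i = M_i / M_m for each period (assumed form of the ratio)."""
    longest = max(lookbacks)
    if min(lookbacks) < 1 or longest >= len(counts):
        raise ValueError("lookbacks must be >= 1 and shorter than the series")
    baseline = counts[:-longest]             # intervals before the longest period
    m_mean = sum(baseline) / len(baseline)   # M_m
    ratios = []
    for lb in lookbacks:
        m_i = sum(counts[-lb:])              # M_i: count within the last lb intervals
        if m_mean > 0:
            ratios.append(m_i / m_mean)
        else:
            ratios.append(math.inf if m_i > 0 else 0.0)
    return ratios


def is_timely_by_ratio(counts: Sequence[float], lookbacks: Sequence[int], threshold: float) -> bool:
    """True when any ratio exceeds the (second) threshold."""
    return any(r > threshold for r in default_point_ratios(counts, lookbacks))


spike = [10] * 24 + [10, 10, 40, 80]
print(default_point_ratios(spike, [1, 2, 4]))
```

Because $M_i$ accumulates over periods of different lengths under this assumption, the threshold has to be chosen with the longest period in mind; normalizing each $M_i$ by its period length would be an alternative design.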
In a preferred embodiment, the time series analysis unit 300 may also perform change-point detection. When it judges that the search word has timeliness, the time series analysis unit 300 may perform change-point detection on the first time series to find the position of the change point that characterizes the timeliness.
In a preferred embodiment, the change-point detection may likewise locate the change point using the difference method described above for the search method, and generate two different time series models, one before and one after the change point, to judge the timeliness trend; this is not repeated here.
In a preferred embodiment, the re-ranking unit 400 may use whether the timeliness is strengthening, decaying, or leveling off as the basis for re-ranking the search results.
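The text does not specify how the trend changes the ordering, so the following is only one hypothetical way unit 400 could use it: each result's base relevance is combined with a freshness term whose weight depends on the trend. The `Result` fields, the weights in `FRESHNESS_WEIGHT`, and the exponential half-life decay are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical weights: how strongly freshness counts under each trend.
FRESHNESS_WEIGHT = {"strengthening": 1.0, "steady": 0.5, "decaying": 0.2}


@dataclass
class Result:
    url: str
    relevance: float   # base ranking score, assumed to lie in [0, 1]
    age_hours: float   # time since the page was published


def rerank(results: List[Result], trend: str, half_life_hours: float = 24.0) -> List[Result]:
    """Sort results by relevance plus a trend-weighted freshness bonus."""
    weight = FRESHNESS_WEIGHT[trend]

    def score(r: Result) -> float:
        freshness = 0.5 ** (r.age_hours / half_life_hours)  # 1.0 when brand new
        return r.relevance + weight * freshness

    return sorted(results, key=score, reverse=True)


pages = [Result("a", 0.9, 240.0), Result("b", 0.6, 2.0)]
print([r.url for r in rerank(pages, "strengthening")])
print([r.url for r in rerank(pages, "decaying")])
```

With these illustrative numbers the fresh page is promoted while timeliness is strengthening and the more relevant older page stays on top once timeliness has decayed.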
The functional-module implementation of the search method of the invention has been described above with reference to Fig. 2. The hardware support of the corresponding device is described below with reference to Fig. 3.
Fig. 3 is a hardware composition diagram of a search server 30 according to an embodiment of the invention. The search server 30 may comprise a processor 31, a memory 32, a receiving device 33, and a sending device 34.
The memory 32 may store users' search records for search words and the network information stored in association with the search words.
The receiving device 33 may receive a user's search request.
The processor 31 is connected to the memory 32, the receiving device 33, and the sending device 34. The processor 31 may process the search request received by the receiving device 33 to obtain the search word; obtain from the memory 32 the search results obtained based on the search word, the number of search results, and the number of searches related to the search word; perform time series analysis on the number of search results and the number of searches within a certain time period to judge the timeliness of the search word; and, in response to judging that the search word has timeliness, adjust the order of the search results using timeliness as the re-ranking basis.
The sending device 34 may send to the user's client device the search results whose order has been adjusted using timeliness as the re-ranking basis.
The search server 30 and the search device 20 of Fig. 2 may be the same device, characterized in terms of hardware and of functional modules respectively, or they may be different devices. Both can implement the method described in the example of Fig. 1 and its preferred embodiments.
The search method and device according to the invention have been described in detail above with reference to the accompanying drawings.
In addition, the method according to the invention may also be implemented as a computer program comprising computer program code instructions for performing the steps defined in the above method of the invention. Alternatively, the method according to the invention may be implemented as a computer program product comprising a computer-readable medium on which is stored a computer program for performing the functions defined in the above method of the invention. Those skilled in the art will further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or a combination of both.
The flowcharts and block diagrams in the drawings show the possible architecture, functions, and operations of systems and methods according to various embodiments of the invention. In this regard, each block in a flowchart or block diagram may represent a module, a program segment, or a portion of code, which comprises one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur in an order different from that noted in the drawings. For example, two consecutive blocks may in fact be executed substantially in parallel, and they may sometimes be executed in the reverse order, depending on the functions involved. It should also be noted that each block in the block diagrams and/or flowcharts, and combinations of such blocks, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
Various embodiments of the invention have been described above. The foregoing description is exemplary, not exhaustive, and is not limited to the disclosed embodiments. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the described embodiments. The terms used herein were chosen to best explain the principles of the embodiments, their practical application, or their improvement over technologies in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (15)

1. A search method, comprising:
receiving a search request from a user to obtain a search word;
obtaining search results obtained based on the search word, a number of search results, and a number of searches related to the search word;
performing time series analysis on the number of search results and the number of searches within a certain time period, to judge the timeliness of the search word; and
in response to judging that the search word has timeliness, adjusting the order of the search results using timeliness as the re-ranking basis.
2. The method of claim 1, wherein the step of performing time series analysis on the number of search results and the number of searches to judge the timeliness of the search word comprises:
dividing the number of search results and the number of searches in the time period by time interval to generate first time series data; and
performing a white noise test on the first time series data, and judging the timeliness characteristic of the search word according to the result of the white noise test.
3. The method of claim 2, wherein the step of performing a white noise test on the first time series data and judging the timeliness of the search word according to the result of the white noise test comprises:
assuming that the QLB statistic of the first time series data $x_1, x_2, x_3, \ldots, x_n$ follows a chi-square distribution:
$$Q_{LB} = n(n+2)\sum_{k=1}^{m}\frac{\hat{\rho}_k^{2}}{n-k} \sim \chi^{2}(m)$$
where n is the number of values obtained by dividing the time period at the time interval, m is the degrees of freedom, and $\hat{\rho}_k$ is the lag-$k$ autocorrelation coefficient; and
in response to the P value of the QLB statistic being less than a first threshold serving as the agreed significance level, judging that the search word has timeliness.
4. The method of claim 1, wherein the step of performing time series analysis on the number of search results within a certain time period to judge the timeliness of the search word comprises:
selecting default time points at different periods from the current time, and calculating the numbers of search results $M_1, M_2, \ldots, M_{j-1}, M_j$ in the respective periods from the current time back to each default time point, where j is the number of default time points and the longest of the periods is no longer than the time period over which the number of search results is obtained;
computing the mean value $M_m$ of the per-interval values, divided at the time interval, over the span from the longest of the periods to the time period, and calculating the ratios of $M_1, M_2, \ldots, M_{j-1}, M_j$ to $M_m$:
$$R_i = \frac{M_i}{M_m}, \quad i = 1, 2, \ldots, j$$
in response to any value of R being greater than a second threshold, judging that the search word has timeliness.
5. The method of claim 1, wherein, if the search word is judged to have timeliness, the time series analysis further comprises change-point detection to find the position of the change point that characterizes the timeliness.
6. The method of claim 5, wherein performing change-point detection comprises:
for the first time series data generated by dividing the number of search results and the number of searches in the time period by time interval, finding the k value for which all five of the following values are greater than a third threshold, to determine the position of the change point:
$diff_1 = x_k - x_{k-1}$
$diff_2 = x_k - x_{k-2}$
$diff_3 = x_k - x_{k-3}$
$diff_4 = x_{k+1} - x_{k-1}$
$diff_5 = x_{k+2} - x_{k-1}$
7. The method of claim 6, further comprising:
dividing, according to the k value found, the first time series data into two independent time series $S_1 = x_1, x_2, \ldots, x_{k-1}$ and $S_2 = x_k, x_{k+1}, \ldots, x_n$, and judging from the data models of $S_1$ and $S_2$ whether the timeliness is strengthening, decaying, or leveling off; and
adjusting the order of the search results using whether the timeliness is strengthening, decaying, or leveling off as the re-ranking basis.
8. A search device, comprising:
a receiving unit for receiving a search request from a user to obtain a search word;
an acquiring unit for obtaining search results obtained based on the search word, a number of search results, and a number of searches related to the search word;
a time series analysis unit for performing time series analysis on the number of search results and the number of searches within a certain time period, to judge the timeliness of the search word; and
a re-ranking unit for, in response to judging that the search word has timeliness, adjusting the order of the search results using timeliness as the re-ranking basis.
9. The device of claim 8, wherein the time series analysis unit is further for:
dividing the number of search results and the number of searches in the time period by time interval to generate first time series data; and
performing a white noise test on the first time series data, and judging the timeliness of the search word according to the result of the white noise test.
10. The device of claim 9, wherein the time series analysis unit is further for:
assuming that the QLB statistic of the first time series data $x_1, x_2, x_3, \ldots, x_n$ follows a chi-square distribution:
$$Q_{LB} = n(n+2)\sum_{k=1}^{m}\frac{\hat{\rho}_k^{2}}{n-k} \sim \chi^{2}(m)$$
where n is the number of values obtained by dividing the time period at the time interval, m is the degrees of freedom, and $\hat{\rho}_k$ is the lag-$k$ autocorrelation coefficient; and
in response to the P value of the QLB statistic being less than a first threshold serving as the agreed significance level, judging that the search word has timeliness.
11. The device of claim 8, wherein the time series analysis unit is further for:
selecting default time points at different periods from the current time, and calculating the numbers of search results $M_1, M_2, \ldots, M_{j-1}, M_j$ in the respective periods from the current time back to each default time point, where j is the number of default time points and the longest of the periods is no longer than the time period over which the number of search results is obtained;
computing the mean value $M_m$ of the per-interval values, divided at the time interval, over the span from the longest of the periods to the time period, and calculating the ratios of $M_1, M_2, \ldots, M_{j-1}, M_j$ to $M_m$:
$$R_i = \frac{M_i}{M_m}, \quad i = 1, 2, \ldots, j$$
in response to any value of R being greater than a second threshold, judging that the search word has timeliness.
12. The device of claim 9, wherein, if the time series analysis unit judges that the search word has timeliness, the time series analysis unit further performs change-point detection to find the position of the change point that characterizes the timeliness.
13. The device of claim 12, wherein the change-point detection comprises:
for the first time series data generated by dividing the number of search results and the number of searches in the time period by time interval, finding the k value for which all five of the following values are greater than a third threshold, to determine the position of the change point:
$diff_1 = x_k - x_{k-1}$
$diff_2 = x_k - x_{k-2}$
$diff_3 = x_k - x_{k-3}$
$diff_4 = x_{k+1} - x_{k-1}$
$diff_5 = x_{k+2} - x_{k-1}$
14. The device of claim 13, wherein the time series analysis unit further divides, according to the k value found, the first time series data into two independent time series $S_1 = x_1, x_2, \ldots, x_{k-1}$ and $S_2 = x_k, x_{k+1}, \ldots, x_n$, and judges from the data models of $S_1$ and $S_2$ whether the timeliness is strengthening, decaying, or leveling off; and
the re-ranking unit adjusts the order of the search results using whether the timeliness is strengthening, decaying, or leveling off as the re-ranking basis.
15. A search server, comprising:
a memory for storing users' search records for search words and the network information stored in association with the search words;
a receiving device for receiving a search request from a user;
a processor, connected to the memory and the receiving device, for obtaining a search word from the search request received by the receiving device; obtaining from the memory the search results obtained based on the search word, a number of search results, and a number of searches related to the search word; performing time series analysis on the number of search results and the number of searches within a certain time period to judge the timeliness of the search word; and, in response to judging that the search word has timeliness, adjusting the order of the search results using timeliness as the re-ranking basis; and
a sending device for sending to the user's client device the search results whose order has been adjusted using timeliness as the re-ranking basis.
CN201510843765.6A 2015-11-27 2015-11-27 Search method, search device and search server Expired - Fee Related CN105512199B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510843765.6A CN105512199B (en) 2015-11-27 2015-11-27 Search method, search device and search server

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510843765.6A CN105512199B (en) 2015-11-27 2015-11-27 Search method, search device and search server

Publications (2)

Publication Number Publication Date
CN105512199A true CN105512199A (en) 2016-04-20
CN105512199B CN105512199B (en) 2020-04-14

Family

ID=55720181

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510843765.6A Expired - Fee Related CN105512199B (en) 2015-11-27 2015-11-27 Search method, search device and search server

Country Status (1)

Country Link
CN (1) CN105512199B (en)

Also Published As

Publication number Publication date
CN105512199B (en) 2020-04-14

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right
Effective date of registration: 2020-08-18
Address after: 310052 room 508, floor 5, building 4, No. 699, Wangshang Road, Changhe street, Binjiang District, Hangzhou City, Zhejiang Province
Patentee after: Alibaba (China) Co.,Ltd.
Address before: 510627 Guangdong city of Guangzhou province Whampoa Tianhe District Road No. 163 Xiping Yun Lu Yun Ping square B radio tower 12 layer self unit 01
Patentee before: GUANGZHOU SHENMA MOBILE INFORMATION TECHNOLOGY Co.,Ltd.
CF01 Termination of patent right due to non-payment of annual fee
Granted publication date: 2020-04-14
Termination date: 2020-11-27