CN109190004A - Method for reducing search complexity based on a specific policy to cope with overload search requests - Google Patents
- Publication number
- CN109190004A CN201810999884.4A
- Authority
- CN
- China
- Prior art keywords
- search
- searching request
- user
- load
- policy
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5027—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The invention discloses a method for reducing search complexity based on a specific policy to cope with overload search requests, characterized by comprising: step 1, identifying search requests; step 2, distinguishing requests and attaching processing policy flags; step 3, processing requests according to their policy; step 4, returning results; step 5, monitoring periodically and adjusting the policy dynamically; step 6, recording the process and assessing the impact. The method keeps the scope of the lossy (degraded) service to a minimum, and lets the degree to which search complexity is reduced follow the concurrent load of search requests dynamically.
Description
Technical field
The present invention relates to the technical field of data search, and in particular to a method for reducing search complexity based on a specific policy to cope with overload search requests.
Background
A search system is a heavily loaded computing system that consumes a large amount of CPU time and memory. An online search system must also remain highly available, with response times in the millisecond range, so the system resources it occupies must stay within a certain proportion of the maximum number of concurrent search requests it can handle.
A company must consider economic returns when it operates, so its hardware budget is limited. Within the overall business system, a search platform is a back-end service provider that is open to every connected business system, and its total processing capacity is relatively limited. When overload search requests occur, that is, when the number of concurrent search requests exceeds the system's peak capacity, instances of the search platform tend to run out of memory or become so heavily loaded that the application crashes and can no longer provide service. To cope with such overload search requests at limited hardware cost, the search platform needs a way to reduce search complexity based on a specific policy, applying the idea of lossy (degraded) service to cope with overload requests.
Lossy service is a widely accepted idea: with limited resources, part of the non-essential or non-core user experience is sacrificed so that basic service demand is met to the greatest extent possible. A search platform provides a search service that is inherently somewhat fuzzy. Unlike a database system with its consistency requirements, it may return different result sets for the same search term, as long as the result set meets a certain level of accuracy and relevance. In an e-commerce search platform the search business is highly complex. A search computation includes not only Lucene-based scoring, recall, and ranking, but also a large amount of complex custom business logic, for example taking into account the quality of the product information in the search scope, the merchant's membership level, historical transactions, and user clicks, while also applying a suitable decay to products concentrated under the same member merchant. This processing logic consumes a great deal of computing and memory resources, especially when personalized search is enabled, which additionally considers the search behavior preferences in the searching user's profile, related networks, and so on. By reducing search complexity, a search platform can raise the number of concurrent search requests it handles on the same search hardware, which makes this a very practical application scenario.
Summary of the invention
The technical problem to be solved by the present invention is to overcome the deficiencies of the prior art and provide a method for reducing search complexity based on a specific policy to cope with overload search requests. All received search requests are divided into different user groups according to different policies, and search complexity is reduced for those requests whose impact on business value is small. The reduction consists of removing the sorting from the search query and reducing the number of recalled results, which lowers the load each request places on the search servers and raises the throughput of the search service cluster, so that a large burst of overload search requests does not leave the entire service cluster unable to serve. Overload search requests are concurrent search requests that exceed the search platform's maximum processing load and cause the system's Load Average to rise.
To solve the above technical problem, the present invention provides a method for reducing search complexity based on a specific policy to cope with overload search requests, comprising the following steps:
Step 1: identify search requests. Every search request sent to the search service cluster must carry a specific user identifier, which includes the user's source IP, the cookie information of the logged-in user, and a Spider (crawler engine) flag. All search requests are first submitted to a search cluster dispatcher (Dispatcher) for processing;
Step 2: distinguish requests and attach processing policy flags. Following the policy rules configured by the administrator, the Dispatcher assigns user priorities to the search requests and judges whether the current number of search requests and the server load have reached the threshold that triggers the surge response. It distinguishes the search requests of different users, attaches a specific search processing policy flag to them, and forwards them to a search service instance (Searcher) in the cluster for processing;
Step 3: process requests according to their policy. The Searcher in the search cluster that receives a search request carrying a search processing policy flag processes it according to the value of that flag;
Step 4: return results. The Searcher returns the processed search results to the Dispatcher, which returns them to the search client that initiated the request;
Step 5: monitor periodically and adjust the policy dynamically. When a periodic scan by the Dispatcher finds that the overload is easing or worsening (that is, when the load and traffic indicators fluctuate around their boundary values), it adjusts the flags according to specific processing logic until the overload ends and all indicators return to normal;
Step 6: record the process and assess the impact. The search system records all processing steps and the search requests involved and, after the surge ends, automatically notifies the relevant business stakeholders by email in the background so that they can assess the impact of reducing search complexity.
Step 2 further comprises:
Step 2-1: set the policy rules, which includes setting the policy level indicators and setting the rules for distinguishing and handling users;
Setting the policy level indicators means that the administrator configures the policy in four levels, from level 1 to level 4, and the level invoked depends on the current number of search requests and the server load. The data threshold of each level can be configured manually by the administrator in terms of the policy level indicators, which are: the number of concurrent search requests per second against the current index, and the 5-minute load average of the servers in the current search cluster (taking the Load Average value of the Linux operating system as reference);
Setting the rules for distinguishing and handling users means that the administrator can assign different level rules to different users. Users are distinguished by a crawler identifier Spidername and a user identifier Username, and each is assigned to a policy level;
Step 2-2: level 1 policy processing. After a periodic scan of the search server load, if the number of search requests exceeds the system's preset maximum capacity, or the 5-minute Load Average of the search servers exceeds the total number of server CPU cores, the level 1 response policy starts. Among all user identifiers received by the Dispatcher, the policy identifies the search requests that carry a Spidername and selects those carrying the OtherSpider flag. A specific search processing policy flag is then attached to these requests before they are passed to the corresponding search cluster service instance for processing; the flag attached at this point is *A*;
Step 2-3: level 2 policy processing. After a periodic scan, if the number of search requests still exceeds the system's preset maximum capacity, or the 5-minute Load Average of the search servers is still greater than the total number of server CPU cores, the level 2 response policy starts. Among all user identifiers received by the Dispatcher, the policy identifies every search request carrying a crawler flag, no longer distinguishing Spidername priorities. A specific search processing policy flag is attached to these requests before they are passed to the corresponding search cluster service instance for processing; the flag attached at this point is *A*, and the flag originally attached in step 2-2 is changed to *B*;
Step 2-4: level 3 policy processing. After a periodic scan, if the Dispatcher finds that the number of search requests still exceeds the system's preset maximum capacity, or the 5-minute Load Average of the search servers is still greater than the total number of server CPU cores, the level 3 response policy starts. The policy uses the user cookie information received by the Dispatcher to judge whether the user is a registered user of the corresponding business platform. If the user in the request is not a registered user of the platform, the user's Username is marked as Other. A specific search processing policy flag is attached to these requests before they are passed to the corresponding search cluster service instance for processing; the flag attached at this point is *A*. Escalating step by step, the flag originally attached in step 2-3 is changed to *B*, and the flag originally attached in step 2-2 is changed to *C*;
Step 2-5: level 4 policy processing. After a periodic scan, if the Dispatcher finds that the number of search requests still exceeds the system's preset maximum capacity, or the 5-minute Load Average of the search servers is still greater than the total number of server CPU cores, the level 4 response policy starts. Among all user requests received by the Dispatcher, the policy attaches a specific search processing policy flag to the requests carrying the registered-user identifier, that is, requests with Username=Login, before they are passed to the corresponding search cluster service instance for processing; the flag attached at this point is *A*. At the same time the flag originally attached in step 2-4 is changed to *B*, the flag originally attached in step 2-3 is changed to *C*, and the flag originally attached in step 2-2 is changed to *D*;
Step 2-6: escalate the policy step by step. After a periodic scan, if the Dispatcher finds that the number of search requests still exceeds the system's preset maximum capacity, or the 5-minute Load Average of the search servers is still greater than the total number of server CPU cores, the Dispatcher advances, in order, the search policy flags of the requests covered by steps 2-2, 2-3, 2-4, and 2-5 from *A* toward *D*, until the flag of every user request other than those of registered users of the business system (that is, users with Username=Login) is *D*. For search requests that the Cookie identifies as coming from registered users of the system, the policy flag is raised to *C* at most.
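The escalation described in steps 2-2 to 2-6 can be sketched as a small function. This is only an illustrative sketch, not the patented implementation: the group names (`spider_other`, `spider_major`, `user_other`, `user_login`) and the idea of counting consecutive overloaded scans in a single `rounds` value are assumptions introduced here for clarity.

```python
from typing import Optional

FLAGS = ["*A*", "*B*", "*C*", "*D*"]

# Hypothetical group names, listed in the order in which levels 1 to 4
# first tag them (steps 2-2 to 2-5).
GROUPS = ["spider_other", "spider_major", "user_other", "user_login"]


def flag_for(group: str, rounds: int) -> Optional[str]:
    """Flag carried by `group` after `rounds` consecutive overloaded scans.

    Returns None while the group is still processed normally.
    """
    first_tagged = GROUPS.index(group) + 1   # level at which the group gets *A*
    steps = rounds - first_tagged
    if steps < 0:
        return None
    # Step 2-6: registered users are raised to *C* at most,
    # every other group may reach *D*.
    cap = 2 if group == "user_login" else 3
    return FLAGS[min(steps, cap)]
```

With `rounds=4` this reproduces the state described in step 2-5 (`*D*`, `*C*`, `*B*`, `*A*` for the four groups); with `rounds=6` or more it reaches the end state of step 2-6.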
Step 3 further comprises:
Step 3-1: load from cache. When a Searcher receives a search request carrying the specific search processing policy flag *A*, the request is first served from the cache of the current Searcher; only if nothing can be loaded from the cache is a fresh search performed. Serving from the cache reduces the computational complexity of the search;
Step 3-2: single-node lossy service. When a Searcher receives a search request carrying the flag *B*, it applies single-node lossy service to the request. The search service in the cluster uses a distributed search mechanism: a single search normally asks two or more Searcher instances to execute the request separately and then merges their results. Once a Searcher starts single-node lossy service, it no longer queries the other Searcher instances and returns as soon as the single Searcher has finished;
Step 3-3: reduced-dimension simplification. When a Searcher receives a search request carrying the flag *C*, it applies reduced-dimension lossy service to the request, which simplifies the computation of the user's search request by skipping time-consuming business logic, for example ignoring custom scoring and ranking and reducing the number of recalled results;
Step 3-4: return an empty result. When a Searcher receives a search request carrying the flag *D*, it rejects the request directly and returns an empty search result.
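The four processing paths of steps 3-1 to 3-4 amount to a dispatch on the flag value. The following is a minimal sketch under stated assumptions: the function name `handle`, the dictionary cache, and the three search callables (`full_search`, `local_search`, `basic_search`) are hypothetical stand-ins for whatever the Searcher actually provides, and cache population is left out of scope.

```python
from typing import Callable, Dict, List, Optional

Results = List[str]


def handle(query: str,
           flag: Optional[str],
           cache: Dict[str, Results],
           full_search: Callable[[str], Results],
           local_search: Callable[[str], Results],
           basic_search: Callable[[str], Results]) -> Results:
    """Pick a processing path for one request based on its policy flag."""
    if flag == "*D*":
        return []                      # step 3-4: reject, return an empty result
    if flag == "*C*":
        return basic_search(query)     # step 3-3: skip custom scoring, recall less
    if flag == "*B*":
        return local_search(query)     # step 3-2: this node only, no fan-out
    if flag == "*A*" and query in cache:
        return cache[query]            # step 3-1: serve from cache when possible
    return full_search(query)          # cache miss or no flag: normal search
```

Keeping the paths in one function makes the ordering explicit: the most drastic degradation is checked first, and an unflagged request always falls through to the normal search.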
Step 5 further comprises:
Step 5-1: after a periodic scan, if the Dispatcher finds that the number of search requests is less than or equal to the system's preset maximum capacity, or the 5-minute Load Average of the search servers is less than or equal to the total number of server CPU cores, the specific search processing policy flags of all current search requests at the various processing levels are downshifted. The downshift proceeds step by step: from *D* to *C*, from *C* to *B*, and from *B* to *A*;
Step 5-2: after a further periodic scan, if the Dispatcher finds that the number of search requests is still less than or equal to the system's preset maximum capacity, or the 5-minute Load Average of the search servers is still less than or equal to the total number of server CPU cores, the flag is removed from all search requests carrying *A*, and they return to normal search processing;
Step 5-3: the rules of steps 5-1 and 5-2 are repeated until all processing flags have been removed, that is, until all search requests are processed normally. If, in a subsequent scan period, the number of search requests is found to exceed the system's preset maximum capacity, or the 5-minute Load Average of the search servers exceeds the total number of server CPU cores, the flags of all current search requests at the various processing levels are upshifted. The upshift rule is: from *C* to *D*, from *B* to *C*, and from *A* to *B*.
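The up- and downshifting in step 5 can be illustrated with a short sketch. It is deliberately simplified and makes several assumptions: steps 5-1 and 5-2 are collapsed into a single one-step downshift per calm scan, the cap on registered users from step 2-6 is not modeled, and the names `shift`, `rescan`, `max_qps`, and `cpu_cores` are hypothetical.

```python
from typing import Dict, Optional

# None means the request group is processed normally (no flag).
ORDER = [None, "*A*", "*B*", "*C*", "*D*"]


def shift(flag: Optional[str], overloaded: bool) -> Optional[str]:
    """Move one flag a single step up (overloaded) or down (calm)."""
    i = ORDER.index(flag)
    if overloaded:
        # Only groups that already carry a flag are upshifted here; tagging
        # new groups is the job of the level rules in step 2.
        return flag if flag is None else ORDER[min(i + 1, len(ORDER) - 1)]
    return ORDER[max(i - 1, 0)]


def rescan(flags: Dict[str, Optional[str]], qps: float, load_avg: float,
           max_qps: float, cpu_cores: int) -> Dict[str, Optional[str]]:
    """One periodic scan: compare the indicators with their limits, then shift."""
    overloaded = qps > max_qps or load_avg > cpu_cores
    return {group: shift(f, overloaded) for group, f in flags.items()}
```

Calling `rescan` once per scan period yields the ladder behavior the text describes: flags drift down while the indicators stay within limits and climb again as soon as a limit is exceeded.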
Step 6 further comprises:
Step 6-1: the search system records the email contact details of the stakeholders of each search business and reports to them a collated summary of all handling so that they can assess the impact. Each surge that is handled generates one dedicated search request handling report;
Step 6-2: the search system records the server CPU load and the number of concurrent search requests at the time each policy level applied, plots them as a bar comparison chart, and adds the chart to the handling report;
Step 6-3: the search system also records how the processing flags of the search requests changed at each policy level, for example the history of changes from *A* to *B*, and adds it to the handling report;
Step 6-4: the handling report is sent by email to all stakeholders of the corresponding business;
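As a rough illustration of the data that steps 6-2 and 6-3 collect, the sketch below keeps one record per scan and renders a plain-text summary. The record layout (`ScanRecord`) and the function `build_report` are assumptions made for this example; the bar chart and the email delivery mentioned in the text are not covered.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ScanRecord:
    level: int                      # policy level active at the time
    cpu_load: float                 # server CPU load (Load Average)
    concurrent_requests: int        # concurrent search requests
    flag_changes: List[str] = field(default_factory=list)  # e.g. "*A* -> *B*"


def build_report(records: List[ScanRecord]) -> str:
    """Render the recorded scans as a plain-text handling report."""
    lines = ["Search request handling report"]
    for r in records:
        changes = ", ".join(r.flag_changes) or "none"
        lines.append(
            f"level {r.level}: load={r.cpu_load:.1f}, "
            f"concurrent={r.concurrent_requests}, flag changes: {changes}"
        )
    return "\n".join(lines)
```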
In step 2-1, the policy level indicators are set as:
1:[200,24];2:[250,24];3:[300,24];4:[400,32];
Taking the first entry as an example, the data threshold of the level 1 policy is 200 or more requests per second and a 5-minute CPU load average above 24; the remaining entries follow the same pattern. These parameters support dynamic configuration;
The rules for distinguishing and handling users are set as follows:
1:[Spidername=Other:*A*];
2:[Spidername=All:*A*];
3:[Username=Other:*A*];
4:[Username=Login:*A*].
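The threshold configuration above can be read as a lookup from level to a pair of limits. The sketch below is one possible interpretation, not the patent's definition: whether a level requires both indicators or either one is not stated unambiguously in the text, so the `require_both` parameter is an assumption, as are the names `LevelThreshold`, `triggered_level`, and `read_five_minute_load`.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class LevelThreshold:
    qps: int          # concurrent search requests per second
    load_avg: float   # 5-minute Load Average


# Example values from the text: 1:[200,24]; 2:[250,24]; 3:[300,24]; 4:[400,32]
THRESHOLDS = {
    1: LevelThreshold(200, 24),
    2: LevelThreshold(250, 24),
    3: LevelThreshold(300, 24),
    4: LevelThreshold(400, 32),
}


def triggered_level(qps: float, load_avg: float,
                    thresholds=THRESHOLDS, require_both: bool = True) -> int:
    """Return the highest level whose threshold is reached, or 0 if none."""
    level = 0
    for lvl in sorted(thresholds):
        t = thresholds[lvl]
        reached_qps = qps >= t.qps            # inclusive: "200 or more"
        reached_load = load_avg > t.load_avg  # strict: "above 24"
        # Assumption: by default both indicators must be reached.
        if require_both:
            hit = reached_qps and reached_load
        else:
            hit = reached_qps or reached_load
        if hit:
            level = lvl
    return level


def read_five_minute_load() -> float:
    """5-minute Load Average of the local host (Unix only)."""
    return os.getloadavg()[1]
```

For example, `triggered_level(300, 25)` evaluates to level 3 under these example thresholds, while a load of 24 exactly triggers nothing because the load bound is treated as strict.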
Beneficial effects of the invention:
(1) The user identity information carried by the search client serves as the basis for grading users. All user classification and grading is done by the Dispatcher in the search platform, and the client needs no further configuration. When high-load search requests occur, no client intervention is required: all countermeasures are carried out automatically by the Dispatcher and the Searcher instances in the search platform.
(2) The invention applies the idea of lossy service by combining the different weights that users carry for the business with lossy service of different scope and different degree. Rather than a simple one-size-fits-all circuit breaker, it copes with high-load search requests through lossy service that steps up and down level by level, keeping the part of the business affected by lossy service as small as possible.
(3) The invention exploits the characteristics of a distributed search platform: it reduces recall across concurrent nodes and recalls search results from a single node to reduce search complexity. Under high load, limiting the recall scope lowers computational complexity while still maintaining search accuracy, which is friendlier to clients than rejecting the excess requests outright.
(4) Observing that a search request contains two computational parts, the basic Lucene query and the custom business scoring and ranking, the invention proposes reduced-dimension service: under high load, the Searcher completes only the computation of the basic Lucene query and drops the custom business scoring and ranking. This reduces search complexity and improves performance under high-load search requests.
(5) After high-load search requests have been handled, the details of how the system automatically reduced complexity are sent as a handling report to the business stakeholders configured in advance, so that they know the scope affected and can assess the impact on the business. Administrators can then refine how the specific policies are configured, so that the whole search platform copes better with high-load search requests without additional spending on server hardware.
Brief description of the drawings
Fig. 1 is a schematic flowchart of the method according to an exemplary embodiment of the present invention.
Detailed description
The present invention is further described below with reference to the accompanying drawing and an exemplary embodiment:
As shown in Fig. 1, the present invention comprises the following steps:
Step 1: identify search requests
Every search request sent to the search service cluster must carry a specific user identifier, which includes the user's source IP, the cookie information of the logged-in user, and a Spider (crawler engine) flag. All search requests are first submitted to a search cluster dispatcher (Dispatcher) for processing;
Step 2: distinguish requests and attach processing policy flags
Following the policy rules configured by the administrator, the Dispatcher assigns user priorities to the search requests and judges whether the current number of search requests and the server load have reached the threshold that triggers the surge response. It distinguishes the search requests of different users, attaches a specific search processing policy flag to them, and forwards them to a search service instance (Searcher) in the cluster for processing;
Step 3: process requests according to their policy
The Searcher in the search cluster that receives a search request carrying a search processing policy flag processes it according to the value of that flag;
Step 4: return results
The Searcher returns the processed search results to the Dispatcher, which returns them to the search client that initiated the request;
Step 5: monitor periodically and adjust the policy dynamically
When a periodic scan by the Dispatcher finds that the overload is easing or worsening (that is, when the load and traffic indicators fluctuate around their boundary values), it adjusts the flags according to specific processing logic until the overload ends and all indicators return to normal;
Step 6: record the process and assess the impact
The search system records all processing steps and the search requests involved and, after the surge ends, automatically notifies the relevant business stakeholders by email in the background so that they can assess the impact of reducing search complexity.
Step 2 further comprises:
Step 2-1: set the policy rules, which includes setting the policy level indicators and setting the rules for distinguishing and handling users
Setting the policy level indicators
The administrator configures the policy in four levels, from level 1 to level 4, and the level invoked depends on the current number of search requests and the server load. The data threshold of each level can be configured manually by the administrator in terms of the policy level indicators, which are: the number of concurrent search requests per second against the current index, and the 5-minute load average of the servers in the current search cluster (taking the Load Average value of the Linux operating system as reference). For example, the setting may be 1:[200,24]; 2:[250,24]; 3:[300,24]; 4:[400,32]. Taking the first entry as an example, the data threshold of the level 1 policy is 200 or more requests per second and a 5-minute CPU load average above 24. The remaining entries follow the same pattern, and these parameters support dynamic configuration.
Setting the rules for distinguishing and handling users
The administrator can assign different level rules to different users. Users are distinguished by a crawler identifier Spidername and a user identifier Username. For example, the Spiders of Google and Bing have high priority and other Spiders have lower priority, so they are assigned to different policy levels; likewise, users are distinguished by Username into registered users (Login) and other, non-registered users (Other).
Example setting:
1:[Spidername=Other:*A*];
2:[Spidername=All:*A*];
3:[Username=Other:*A*];
4:[Username=Login:*A*]
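To make the user rules concrete, the sketch below maps a request's identity information to the key and value used in those rules. It is an assumption-laden illustration: the `SearchRequest` fields, the `classify` function, and the set of high-priority crawler names are hypothetical (the text only names Google and Bing as examples of high-priority Spiders).

```python
from dataclasses import dataclass
from typing import Optional, Tuple

# Assumption: crawler names treated as high priority; the text cites
# Google and Bing as examples.
HIGH_PRIORITY_SPIDERS = {"google", "bing"}


@dataclass
class SearchRequest:
    source_ip: str
    spider_name: Optional[str] = None   # value of the Spider flag, if present
    logged_in: bool = False             # derived from the login cookie


def classify(req: SearchRequest) -> Tuple[str, str]:
    """Map a request to the (key, value) pair used in the user rules."""
    if req.spider_name is not None:
        if req.spider_name.lower() in HIGH_PRIORITY_SPIDERS:
            return ("Spidername", req.spider_name)
        return ("Spidername", "Other")
    return ("Username", "Login" if req.logged_in else "Other")
```

A request from an unknown crawler therefore classifies as `("Spidername", "Other")` and is the first to be degraded, while a logged-in user classifies as `("Username", "Login")` and is the last.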
Step 2-2: level 1 policy processing
After a periodic scan of the search server load, if the number of search requests exceeds the system's preset maximum capacity, or the 5-minute Load Average of the search servers exceeds the total number of server CPU cores, the level 1 response policy starts. The policy rule was defined in advance in step 2-1. Among all user identifiers received by the Dispatcher, it identifies the search requests that carry a Spidername. Taking the level 1 user handling rule from the example in step 2-1, "1:[Spidername=Other:*A*]", the search requests carrying the OtherSpider flag are selected, and a specific search processing policy flag is attached to these requests before they are passed to the corresponding search cluster service instance for processing; the flag attached at this point is *A*;
Step 2-3: level 2 policy processing
After a periodic scan, if the number of search requests still exceeds the system's preset maximum capacity, or the 5-minute Load Average of the search servers is still greater than the total number of server CPU cores, the level 2 response policy starts. Taking the level 2 user handling rule from the example in step 2-1, "2:[Spidername=All:*A*]", the policy identifies, among all user identifiers received by the Dispatcher, every search request carrying a crawler flag, no longer distinguishing by Spidername. A specific search processing policy flag is attached to these requests before they are passed to the corresponding search cluster service instance for processing; the flag attached at this point is *A*, and the flag originally attached in step 2-2 is changed to *B*. This is the step-by-step policy escalation shown in Fig. 1;
Step 2-4: level 3 policy processing
After a periodic scan, if the Dispatcher finds that the number of search requests still exceeds the system's preset maximum capacity, or the 5-minute Load Average of the search servers is still greater than the total number of server CPU cores, the level 3 response policy starts. Taking the level 3 user handling rule from the example in step 2-1, "3:[Username=Other:*A*]", the policy uses the user cookie information received by the Dispatcher to judge whether the user is a registered user of the corresponding business platform. If the user in the request is not a registered user of the platform, the user's Username is marked as Other. A specific search processing policy flag is attached to these requests before they are passed to the corresponding search cluster service instance for processing; the flag attached at this point is *A*. Escalating step by step, the flag originally attached in step 2-3 is changed to *B*, and the flag originally attached in step 2-2 is changed to *C*;
Step 2-5: level 4 policy processing
After a periodic scan, if the Dispatcher finds that the number of search requests still exceeds the system's preset maximum capacity, or the 5-minute Load Average of the search servers is still greater than the total number of server CPU cores, the level 4 response policy starts. Taking the level 4 user handling rule from the example in step 2-1, "4:[Username=Login:*A*]", the policy attaches a specific search processing policy flag to the requests, among all user requests received by the Dispatcher, that carry the registered-user identifier, that is, requests with Username=Login, before they are passed to the corresponding search cluster service instance for processing; the flag attached at this point is *A*. At the same time the flag originally attached in step 2-4 is changed to *B*, the flag originally attached in step 2-3 is changed to *C*, and the flag originally attached in step 2-2 is changed to *D*;
Step 2-6: Step-by-step policy escalation
After a periodic scan, if the Dispatcher finds that the number of search requests exceeds the system's preset maximum capacity, or that the search server's 5-minute Load Average still exceeds the server's total number of CPU cores, the Dispatcher advances, in order, the search policy flags of the search requests covered by steps 2-2, 2-3, 2-4 and 2-5, progressing from *A* toward *D*. This continues until the flags of all user requests other than those of registered users of the business system (that is, users with Username=Login) are *D*. For search requests whose Cookie identifies a registered user of the system, the policy flag is raised no higher than *C*.
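The escalation described in steps 2-2 to 2-6 can be sketched as a small state transition. This is only an illustrative sketch, not the patented implementation: the group names (`other_spider`, `known_spider`, `anonymous_user`, `login_user`) and the function names are hypothetical, and it assumes that each overloaded scan brings in exactly one new user group.

```python
from typing import Dict, Optional

FLAGS = ["A", "B", "C", "D"]  # increasing degree of degradation

# Hypothetical group names, in the order levels 1 to 4 start tagging them.
GROUPS = ["other_spider", "known_spider", "anonymous_user", "login_user"]

# Registered users are never pushed beyond *C* (step 2-6).
MAX_FLAG = {"login_user": "C"}


def bump(group: str, flag: str) -> str:
    """Advance a flag by one step, respecting the per-group cap."""
    cap = FLAGS.index(MAX_FLAG.get(group, "D"))
    return FLAGS[min(FLAGS.index(flag) + 1, cap)]


def escalate(state: Dict[str, Optional[str]]) -> Dict[str, Optional[str]]:
    """One overloaded scan: advance existing flags, tag the next group with A."""
    new_state = {g: (bump(g, f) if f is not None else None)
                 for g, f in state.items()}
    for group in GROUPS:
        if state[group] is None:  # first group that is not yet tagged
            new_state[group] = "A"
            break
    return new_state


state: Dict[str, Optional[str]] = {g: None for g in GROUPS}
for _ in range(4):  # levels 1 to 4
    state = escalate(state)
print(state)
```

Calling `escalate` further after level 4 keeps advancing the flags until every group except `login_user` sits at `D`, which mirrors the end condition of step 2-6.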
Step 3 further comprises:
Step 3-1: Load from cache
When a Searcher receives a search request carrying the specific search processing policy flag *A*, the request is, according to this flag, first served from the current Searcher's cache; only if nothing can be loaded from the cache is a fresh search computation performed. Serving from the cache reduces search computation complexity;
Step 3-2: Single-node degraded service
When a Searcher receives a search request carrying the specific search processing policy flag *B*, the request is, according to this flag, given single-node degraded service. Because the search service in the cluster uses a distributed search mechanism, one search normally has to ask two or more Searcher instances to execute the request separately and then merge their results. Once a Searcher starts single-node degraded service, it no longer queries the other Searcher instances; it returns as soon as its own execution finishes, which reduces search computation complexity. The returned result set is smaller than that of normal search processing, but most search requests still get search results back;
Step 3-3: Dimension-reduction simplification
When a Searcher receives a search request carrying the specific search processing policy flag *C*, the request is, according to this flag, given dimension-reduced degraded service: the computational complexity of the user's search request is simplified by skipping the more time-consuming parts of the business logic, for example ignoring custom scoring and sorting or reducing the number of recalled results. The returned results may be less accurate than those of normal search processing, but a certain degree of relevance is still guaranteed while search computation complexity is reduced;
Step 3-4: Return empty results
When a Searcher receives a search request carrying the specific search processing policy flag *D*, the Searcher, according to this flag, rejects the request outright and returns an empty search result, thereby reducing system load.
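The four flag-specific behaviours of steps 3-1 to 3-4 can be pictured as a single dispatch function inside a Searcher. The following is a rough sketch under stated assumptions: `handle`, `local`, `remotes`, `rerank` and `reduced_recall` are hypothetical names, hits are plain strings, and storing a fresh result in the cache after a miss is an assumption the text does not spell out.

```python
from typing import Callable, Dict, List, Optional, Sequence

Search = Callable[[str], List[str]]


def handle(query: str,
           flag: Optional[str],
           cache: Dict[str, List[str]],
           local: Search,
           remotes: Sequence[Search],
           rerank: Callable[[List[str]], List[str]],
           reduced_recall: int = 10) -> List[str]:
    """Process one search request according to its policy flag."""
    if flag == "D":                      # step 3-4: reject, return empty result
        return []
    if flag == "A" and query in cache:   # step 3-1: serve from cache if possible
        return cache[query]
    hits = list(local(query))
    if flag != "B":                      # step 3-2: flag B skips the fan-out
        for remote in remotes:
            hits.extend(remote(query))
    if flag == "C":                      # step 3-3: no custom scoring, fewer hits
        return hits[:reduced_recall]
    result = rerank(hits)                # normal path: custom scoring and sorting
    if flag == "A":                      # assumption: refill the cache after a miss
        cache[query] = result
    return result


# Toy collaborators standing in for real Searcher instances.
local_search = lambda q: ["l1", "l2"]
remote_searches = [lambda q: ["r1"]]
by_score = lambda hits: sorted(hits, reverse=True)

print(handle("shoes", "B", {}, local_search, remote_searches, by_score))
```

A request without a flag (`None`) takes the full path: local search, fan-out to the other instances, then custom scoring.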
Step 5 further comprises:
Step 5-1: After a periodic scan, if the Dispatcher finds that the number of search requests is less than or equal to the system's preset maximum capacity, or that the search server's 5-minute Load Average is less than or equal to the server's total number of CPU cores, it downshifts the specific search processing policy flags of all search requests at every processing level. The downshift proceeds step by step: from *D* to *C*, from *C* to *B*, and from *B* to *A*;
Step 5-2: After a further periodic scan, if the Dispatcher finds that the number of search requests is still less than or equal to the system's preset maximum capacity, or that the search server's 5-minute Load Average is still less than or equal to the server's total number of CPU cores, it removes the processing flag from all search requests flagged *A*, which then return to normal search processing;
Step 5-3: Steps 5-1 and 5-2 are repeated until all processing flags have been removed, that is, until all search requests are processed normally. If, in a later scan period, the number of search requests is found to exceed the system's preset maximum capacity, or the search server's 5-minute Load Average is found to exceed the server's total number of CPU cores, the specific search processing policy flags of all search requests at every processing level are upshifted. The upshift rule is: from *C* to *D*, from *B* to *C*, and from *A* to *B*.
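The periodic adjustment in step 5 can be sketched as a pure function over the current flags plus an overload test. This is a simplified illustration with hypothetical names (`overloaded`, `adjust`, `read_load5`): it treats every calm scan as lowering each flag by one step (with *A* being removed), and it follows the "less than or equal" boundary used in step 5-1. `os.getloadavg()` is a real standard-library call, but it is only available on Unix-like systems.

```python
import os
from typing import Dict, Optional

FLAGS = ["A", "B", "C", "D"]


def overloaded(requests: int, max_requests: int,
               load5: float, cpu_cores: int) -> bool:
    """Overload test: too many requests, or 5-minute load above core count."""
    return requests > max_requests or load5 > cpu_cores


def read_load5() -> float:
    """5-minute load average of this host (Unix only, not used in the demo)."""
    return os.getloadavg()[1]


def downshift(flag: Optional[str]) -> Optional[str]:
    """D -> C -> B -> A -> no flag (normal processing)."""
    if flag is None or flag == "A":
        return None
    return FLAGS[FLAGS.index(flag) - 1]


def upshift(flag: Optional[str]) -> Optional[str]:
    """A -> B -> C -> D; untagged requests stay untagged here."""
    if flag is None:
        return None
    return FLAGS[min(FLAGS.index(flag) + 1, len(FLAGS) - 1)]


def adjust(flags: Dict[str, Optional[str]],
           is_overloaded: bool) -> Dict[str, Optional[str]]:
    """Apply one scan period to every group's flag."""
    step = upshift if is_overloaded else downshift
    return {group: step(flag) for group, flag in flags.items()}


flags = {"crawlers": "D", "anonymous": "B", "registered": "A"}
calm = not overloaded(requests=150, max_requests=200, load5=8.0, cpu_cores=24)
print(adjust(flags, is_overloaded=not calm))
```

Keeping `adjust` free of I/O makes the shifting rules easy to check in isolation; a real Dispatcher would feed it live values, for example from `read_load5()` and `os.cpu_count()`.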
Step 6 further comprises:
Step 6-1: The search system records the e-mail contact details of the stakeholders of the corresponding search business. Because coping with overload search requests by reducing search complexity costs some search precision and may affect the business, all handling actions need to be aggregated and reported to the stakeholders so that the impact can be assessed; each surge that is handled generates a specific search request handling report;
Step 6-2: The search system records the server CPU load and the number of concurrent search requests at each policy level at the time, plots them as a comparative bar chart, and adds the chart to the handling report;
Step 6-3: The search system also records how the processing flags of the search requests at each policy level changed at the time, for example the history of changes from *A* to *B*, and adds this to the handling report;
Step 6-4: The handling report is sent by e-mail to all stakeholders of the corresponding business.
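As a rough illustration of steps 6-1 to 6-4, the report can be assembled as an e-mail message with Python's standard `email.message.EmailMessage`. Everything specific here is an assumption made for the example: the `Sample` record, the `build_report` function, the text-only bar rendering (standing in for a real bar chart), and the addresses. Actually sending the message (for example with `smtplib`) is left out.

```python
from dataclasses import dataclass
from email.message import EmailMessage
from typing import List, Sequence, Tuple


@dataclass
class Sample:
    level: int        # policy level active at the time
    cpu_load: float   # 5-minute load average
    concurrent: int   # concurrent search requests


def build_report(samples: Sequence[Sample],
                 flag_changes: Sequence[Tuple[str, str, str]],
                 recipients: List[str],
                 sender: str = "search-platform@example.com") -> EmailMessage:
    """Aggregate load samples and flag changes into one e-mail report."""
    lines = ["Surge handling report", "", "level  load   requests"]
    for s in samples:
        bar = "#" * round(s.cpu_load)  # crude text bar instead of a chart
        lines.append(f"{s.level:<6} {s.cpu_load:<6.1f} {s.concurrent:<8} {bar}")
    lines += ["", "Flag changes:"]
    lines += [f"  {group}: {old} -> {new}" for group, old, new in flag_changes]

    msg = EmailMessage()
    msg["Subject"] = "Search surge handling report"
    msg["From"] = sender
    msg["To"] = ", ".join(recipients)
    msg.set_content("\n".join(lines))
    return msg


report = build_report(
    samples=[Sample(1, 25.0, 210), Sample(2, 27.5, 260)],
    flag_changes=[("other_spider", "A", "B")],
    recipients=["owner@example.com"],
)
print(report.get_content())
```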
In other words, targeting real technical application scenarios, the invention identifies and distinguishes users by means of a specific policy. Every search request received by the search platform includes the user's Cookie, IP, and crawler flag information, from which different user levels can be distinguished. Once users are divided into levels, different response policy levels can be set for overload search requests, and a different response processing flag can be attached for each response policy level. Once these flags are attached to the corresponding search requests, the search platform can process them differently and reduce complexity to different degrees;
Search complexity can be reduced by serving from cache, by reducing the number of nodes queried concurrently, and by cutting the custom sorting computation of a search request. Combined with differentiating user groups, rather than applying one rule to everyone, this keeps the impact of degraded service to a minimum. By setting specific policies, the degree of complexity reduction also follows the concurrent load of search requests dynamically: when overload requests keep increasing, the degree and scope of complexity reduction are raised; when overload requests keep decreasing, they are lowered, until the intervention is withdrawn completely. Details of the intervention are then sent to the relevant business stakeholders in a handling report;
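The user differentiation described above can be sketched as a classifier over the identifiers carried by each request. This is an assumption-laden illustration: `SearchRequest`, `classify`, `FIRST_LEVEL` and the `KNOWN_SPIDERS` allow-list are hypothetical, the text does not say how a crawler is judged to be "Other", and the source IP is carried but not used for classification in this sketch.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

# Hypothetical allow-list; crawlers outside it count as Spidername=Other.
KNOWN_SPIDERS = {"Googlebot", "Baiduspider"}


@dataclass
class SearchRequest:
    source_ip: str
    spider_name: Optional[str] = None   # crawler flag, if any
    login_cookie: Optional[str] = None  # present for registered users


def classify(req: SearchRequest) -> Tuple[str, str]:
    """Map a request to an (identifier, value) pair such as Username=Login."""
    if req.spider_name is not None:
        kind = "Known" if req.spider_name in KNOWN_SPIDERS else "Other"
        return ("Spidername", kind)
    return ("Username", "Login" if req.login_cookie else "Other")


# Policy level at which each group first receives a degradation flag.
FIRST_LEVEL: Dict[Tuple[str, str], int] = {
    ("Spidername", "Other"): 1,
    ("Spidername", "Known"): 2,
    ("Username", "Other"): 3,
    ("Username", "Login"): 4,
}

req = SearchRequest("203.0.113.7", spider_name="FooBot")
group = classify(req)
print(group, "is first degraded at level", FIRST_LEVEL[group])
```

Ordering the groups this way means the least business-critical traffic is degraded first and registered users last.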
The main purpose of the invention is to provide a method for reducing search complexity based on a specific policy to cope with overload search requests. Its beneficial effects are:
(1) The user identity information carried by the search client can be used to grade users. All user classification and grading is done by the Dispatcher in the search platform, so the client needs no further configuration. When high-load search requests occur, no client intervention is needed; all countermeasures are carried out automatically by the Dispatcher and the Searcher instances in the search platform.
(2) The invention applies the idea of degraded service, combining the different weights that users carry in terms of business impact with degraded-service methods of different scope and degree. Rather than a one-size-fits-all circuit-breaking approach, it copes with high-load search requests with degraded service that steps up and down level by level, keeping the portion of the business affected by degraded service as small as possible.
(3) The invention exploits a characteristic of the distributed search platform: by reducing the number of nodes recalled concurrently, it recalls search results from a single node to reduce search complexity. Under high load, limiting the recall scope reduces computational complexity while still preserving search accuracy, which is friendlier to clients than rejecting the excess requests outright.
(4) Since a search request involves two kinds of computation, the basic Lucene query and the custom business scoring and sorting, the invention proposes dimension-reduced service: under high load, the Searcher of the search platform completes only the computation of the basic Lucene query and drops the custom business scoring and sorting. This reduces search complexity and improves performance when coping with high-load search requests.
(5) After high-load search requests have been handled, the invention sends the details of every automatic complexity-reduction action to each pre-configured business stakeholder in a handling report, so that stakeholders know the scope affected by the reduction and can assess its impact on the business. The administrator can then refine how the specific policies are configured, so that the whole search platform copes better with high-load search requests without additional spending on hardware servers.
The above embodiments do not limit the invention in any way; any other improvement or application made to the above embodiments by way of equivalent transformation falls within the protection scope of the invention.
Claims (6)
1. A method for reducing search complexity based on a specific policy to cope with overload search requests, characterized in that it comprises the following steps:
Step 1: Identify search requests. Every search request sent to the search service cluster must carry specific user identifiers, including the user's source IP, the cookie information of the logged-in user, and a Spider (crawler engine) flag; all search requests are first submitted to the search cluster forwarder (Dispatcher) for processing;
Step 2: Differentiate requests and attach processing policy flags. The search cluster forwarder (Dispatcher) differentiates user priority among these search requests according to the policy rules configured by the administrator, and at the same time judges whether the current number of search requests and the server load have reached the criteria that trigger the surge response; it distinguishes the search requests of different users, attaches a specific search processing policy flag, and then forwards each request to a search service instance (Searcher) in the cluster for processing;
Step 3: Process requests by policy. After a search service instance (Searcher) in the search cluster receives a search request carrying a search processing policy flag, it performs the processing that corresponds to that flag;
Step 4: Return results. The search service instance (Searcher) returns the processed search result to the search cluster forwarder (Dispatcher), which in turn returns the result to the search client that initiated the request;
Step 5: Monitor periodically and adjust the policy dynamically. When the periodic scan of the search cluster forwarder (Dispatcher) finds that the overload of search requests is easing or worsening (that is, the load and traffic metrics fluctuate around their boundary values), it adjusts the flags according to the specific processing logic until the overload ends and all metrics return to normal;
Step 6: Record the process and assess the impact. The search system records all processing flows and the search request information involved and, after the surge ends, automatically notifies the relevant business stakeholders by e-mail from the back end, so that the impact of reducing search complexity can be assessed.
2. The method for reducing search complexity based on a specific policy to cope with overload search requests according to claim 1, characterized in that step 2 further comprises:
Step 2-1: Set the policy rules, which includes setting the policy level thresholds and setting the user differentiation rules;
Setting the policy level thresholds means: the administrator configures the policy in four levels, from level 1 to level 4, which are invoked according to the current number of search requests and the server load; the data thresholds of each level can be configured manually by the administrator through the policy level thresholds. The policy level thresholds comprise: the number of concurrent search requests per second for the corresponding index, and the 5-minute average server load (taking the Load Average value of the Linux operating system in the current search cluster as reference);
Setting the user differentiation rules means: the administrator can assign different level rules to different users, classifying users by the crawler identifier Spidername and the user identifier Username and placing them in different policy levels;
Step 2-2: Level 1 policy processing. After a periodic scan of the search server load, if the number of search requests exceeds the system's preset maximum capacity, or the search server's 5-minute Load Average exceeds the server's total number of CPU cores, the level 1 response policy is started. Among the user identifiers received by the Dispatcher, this policy identifies the search requests carrying a Spidername, picks out those with the OtherSpider flag, adds the specific search processing policy flag to these requests, and hands them to the corresponding search cluster service instance for processing; the flag added at this point is *A*;
Step 2-3: Level 2 policy processing. After a periodic scan, if the number of search requests exceeds the system's preset maximum capacity, or the search server's 5-minute Load Average still exceeds the server's total number of CPU cores, the level 2 response policy is started. Among the user identifiers received by the Dispatcher, this policy identifies all search requests carrying a crawler tag, without further distinguishing the priority of the Spidername; it then adds the specific search processing policy flag to these requests and hands them to the corresponding search cluster service instance for processing. The flag added at this point is *A*; at the same time the flag originally added in step 2-2 is changed to *B*;
Step 2-4: Level 3 policy processing. After a periodic scan, if the Dispatcher finds that the number of search requests exceeds the system's preset maximum capacity, or that the search server's 5-minute Load Average still exceeds the server's total number of CPU cores, the level 3 response policy is started. From the user cookie information received by the Dispatcher, this policy judges whether the user is a registered user of the corresponding business platform; if the user in the request is judged not to be a registered platform user, the user's Username is identified as Other, the specific search processing policy flag is added to these requests, and they are handed to the corresponding search cluster service instance for processing. The flag added at this point is *A*; at the same time, escalating step by step, the flag originally added in step 2-3 is changed to *B* and the flag originally added in step 2-2 is changed to *C*;
Step 2-5: Level 4 policy processing. After a periodic scan, if the Dispatcher finds that the number of search requests exceeds the system's preset maximum capacity, or that the search server's 5-minute Load Average still exceeds the server's total number of CPU cores, the level 4 response policy is started. From all user requests received by the Dispatcher, this policy takes those carrying the registered-user identifier (Username=Login), adds the specific search processing policy flag to them, and hands them to the corresponding search cluster service instance for processing. The flag added at this point is *A*; at the same time the flag originally added in step 2-4 is changed to *B*, the flag originally added in step 2-3 is changed to *C*, and the flag originally added in step 2-2 is changed to *D*;
Step 2-6: Step-by-step policy escalation. After a periodic scan, if the Dispatcher finds that the number of search requests exceeds the system's preset maximum capacity, or that the search server's 5-minute Load Average still exceeds the server's total number of CPU cores, the Dispatcher advances, in order, the search policy flags of the search requests covered by steps 2-2, 2-3, 2-4 and 2-5, progressing from *A* toward *D*, until the flags of all user requests other than those of registered users of the business system (that is, users with Username=Login) are *D*; for search requests whose Cookie identifies a registered user of the system, the policy flag is raised no higher than *C*.
3. The method for reducing search complexity based on a specific policy to cope with overload search requests according to claim 2, characterized in that step 3 further comprises:
Step 3-1: Load from cache. When a Searcher receives a search request carrying the specific search processing policy flag *A*, the request is, according to this flag, first served from the current Searcher's cache; only if nothing can be loaded from the cache is a fresh search computation performed, so that serving from the cache reduces search computation complexity;
Step 3-2: Single-node degraded service. When a Searcher receives a search request carrying the specific search processing policy flag *B*, the request is, according to this flag, given single-node degraded service. Because the search service in the search cluster uses a distributed search mechanism, one search normally has to ask two or more Searcher instances to execute the request separately and then merge their results; once a Searcher starts single-node degraded service, it no longer queries the other Searcher instances and returns as soon as its own execution finishes;
Step 3-3: Dimension-reduction simplification. When a Searcher receives a search request carrying the specific search processing policy flag *C*, the request is, according to this flag, given dimension-reduced degraded service, which simplifies the computational complexity of the user's search request by skipping the more time-consuming parts of the business logic, for example ignoring custom scoring and sorting or reducing the number of recalled results;
Step 3-4: Return empty results. When a Searcher receives a search request carrying the specific search processing policy flag *D*, the Searcher, according to this flag, rejects the request outright and returns an empty search result.
4. The method for reducing search complexity based on a specific policy to cope with overload search requests according to claim 3, characterized in that step 5 further comprises:
Step 5-1: After a periodic scan, if the Dispatcher finds that the number of search requests is less than or equal to the system's preset maximum capacity, or that the search server's 5-minute Load Average is less than or equal to the server's total number of CPU cores, it downshifts the specific search processing policy flags of all search requests at every processing level; the downshift proceeds step by step, from *D* to *C*, from *C* to *B*, and from *B* to *A*;
Step 5-2: After a further periodic scan, if the Dispatcher finds that the number of search requests is still less than or equal to the system's preset maximum capacity, or that the search server's 5-minute Load Average is still less than or equal to the server's total number of CPU cores, it removes the processing flag from all search requests flagged *A*, which then return to normal search processing;
Step 5-3: Steps 5-1 and 5-2 are repeated until all processing flags have been removed, that is, until all search requests are processed normally; if, in a later scan period, the number of search requests is found to exceed the system's preset maximum capacity, or the search server's 5-minute Load Average is found to exceed the server's total number of CPU cores, the specific search processing policy flags of all search requests at every processing level are upshifted, the upshift rule being from *C* to *D*, from *B* to *C*, and from *A* to *B*.
5. The method for reducing search complexity based on a specific policy to cope with overload search requests according to claim 4, characterized in that step 6 further comprises:
Step 6-1: The search system records the e-mail contact details of the stakeholders of the corresponding search business, and all handling actions are aggregated and reported to the stakeholders so that the impact can be assessed; each surge that is handled generates a specific search request handling report;
Step 6-2: The search system records the server CPU load and the number of concurrent search requests at each policy level at the time, plots them as a comparative bar chart, and adds the chart to the handling report;
Step 6-3: The search system also records how the processing flags of the search requests at each policy level changed at the time, for example the history of changes from *A* to *B*, and adds this to the handling report;
Step 6-4: The handling report is sent by e-mail to all stakeholders of the corresponding business.
6. The method for reducing search complexity based on a specific policy to cope with overload search requests according to claim 5, characterized in that in step 2-1 the policy level thresholds are set as:
1:[200,24];2:[250,24];3:[300,24];4:[400,32];
Taking the first entry as an example, the level 1 policy data thresholds are: more than 200 requests per second (200 inclusive), and a 5-minute average CPU server load of more than 24; the remaining entries follow the same pattern, and these parameters support dynamic configuration;
The user differentiation rules are set as follows:
1:[Spidername=Other:*A*];
2:[Spidername=All:*A*];
3:[Username=Other:*A*];
4:[Username=Login:*A*].
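To make the threshold notation concrete, the following sketch parses the level string into a lookup table and lists which levels' thresholds are met for a given reading. The function names are hypothetical, and two points are interpretations rather than statements of the claims: the two thresholds are combined with "or", as in step 2-2, and the request threshold is treated as inclusive while the load threshold is strict.

```python
import re
from typing import Dict, List, Tuple

RAW_THRESHOLDS = "1:[200,24];2:[250,24];3:[300,24];4:[400,32];"


def parse_thresholds(raw: str) -> Dict[int, Tuple[int, int]]:
    """Turn 'level:[requests_per_second,load]' entries into a dict."""
    pattern = r"(\d+):\[(\d+),(\d+)\]"
    return {int(level): (int(rps), int(load))
            for level, rps, load in re.findall(pattern, raw)}


def levels_met(thresholds: Dict[int, Tuple[int, int]],
               requests_per_second: int, load5: float) -> List[int]:
    """Levels whose request threshold (inclusive) or load threshold is met."""
    return [level
            for level, (max_rps, max_load) in sorted(thresholds.items())
            if requests_per_second >= max_rps or load5 > max_load]


thresholds = parse_thresholds(RAW_THRESHOLDS)
print(thresholds)
print(levels_met(thresholds, requests_per_second=260, load5=10.0))
```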
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810999884.4A CN109190004B (en) | 2018-08-30 | 2018-08-30 | Method for reducing search complexity based on specific strategy |
Publications (2)
Publication Number | Publication Date |
---|---|
CN109190004A (en) | 2019-01-11 |
CN109190004B (en) | 2020-07-07 |
Family
ID=64917232
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810999884.4A (Active), CN109190004B (en) | Method for reducing search complexity based on specific strategy | 2018-08-30 | 2018-08-30 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109190004B (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113869976A (en) * | 2021-09-26 | 2021-12-31 | 中国联合网络通信集团有限公司 | Cargo list generation method and device, server and readable storage medium |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1627688A (en) * | 2003-12-10 | 2005-06-15 | 联想(北京)有限公司 | Method for searching sharing files under wireless network grid |
CN101379757B (en) * | 2006-02-07 | 2011-12-07 | 思科技术公司 | Methods and systems for providing telephony services and enforcing policies in a communication network |
CN103346971A (en) * | 2013-06-19 | 2013-10-09 | 华为技术有限公司 | Data forwarding method, controller, forwarding device and system |
CN103428101A (en) * | 2013-08-01 | 2013-12-04 | 华为技术有限公司 | Load sharing method and device |
CN103650439A (en) * | 2011-07-15 | 2014-03-19 | 瑞典爱立信有限公司 | Policy tokens in communication networks |
US9038079B2 (en) * | 2009-12-30 | 2015-05-19 | International Business Machines Corporation | Reducing cross queue synchronization on systems with low memory latency across distributed processing nodes |
CN106407011A (en) * | 2016-09-20 | 2017-02-15 | 焦点科技股份有限公司 | A routing table-based search system cluster service management method and system |
CN106453564A (en) * | 2016-10-18 | 2017-02-22 | 北京京东尚科信息技术有限公司 | Elastic cloud distributed massive request processing method, device and system |
Also Published As
Publication number | Publication date |
---|---|
CN109190004B (en) | 2020-07-07 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||