CN109190004A - Method for reducing search complexity based on specific policies to cope with overload search requests - Google Patents

Method for reducing search complexity based on specific policies to cope with overload search requests

Info

Publication number
CN109190004A
Authority
CN
China
Prior art keywords
search
searching request
user
load
policy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201810999884.4A
Other languages
Chinese (zh)
Other versions
CN109190004B (en
Inventor
姜平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Focus Technology Co Ltd
Original Assignee
Focus Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Focus Technology Co Ltd
Priority to CN201810999884.4A
Publication of CN109190004A
Application granted
Publication of CN109190004B
Legal status: Active
Anticipated expiration


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46: Multiprogramming arrangements
    • G06F9/50: Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005: Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027: Allocation of resources, e.g. of the central processing unit [CPU] to service a request, the resource being a machine, e.g. CPUs, Servers, Terminals

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a method for reducing search complexity based on specific policies to cope with overload search requests, characterized by comprising: step 1, identifying search requests; step 2, classifying requests and attaching processing-policy flags; step 3, processing requests according to policy; step 4, returning results; step 5, monitoring periodically and adjusting the policy dynamically; step 6, recording the process and assessing the impact. The method keeps the scope affected by degraded (lossy) service to a minimum, and lets the degree of complexity reduction adjust dynamically with the concurrent search-request load.

Description

Method for reducing search complexity based on specific policies to cope with overload search requests
Technical field
The present invention relates to the field of data search technology, and in particular to a method for reducing search complexity based on specific policies to cope with overload search requests.
Background
A search system is a high-load computing system that consumes large amounts of CPU time and memory. At the same time, an online search system must remain highly available, with response times in the millisecond range. The system resources it occupies and the maximum number of concurrent search requests it can handle must therefore stay within a certain proportion.
A company has to consider economic return, so its hardware budget is limited. A search platform is a back-end service provider within the overall business system: it is open to every connected business system, and its total processing capacity is relatively limited. When overload search requests occur, that is, when the number of concurrent search requests exceeds the system's maximum load, search instances tend to run out of memory or become so heavily loaded that the application crashes and can no longer provide service. To cope with such overload under a limited hardware budget, the search platform needs a way to reduce search complexity according to specific policies, handling overload requests through the idea of degraded (lossy) service.
The widely accepted idea of degraded service is that, with limited resources, part of the non-essential or non-core user experience is sacrificed in order to meet basic service needs as far as possible. A search platform provides a search service that is inherently fuzzy, unlike the consistency requirements of a database system: for the same search term it may return different result sets, as long as they meet a certain level of accuracy and relevance. In an e-commerce search platform the search business is highly complex. The search computation includes not only Lucene-based scoring, recall and ranking but also a great deal of custom business logic, such as the quality of the product information in the searched fields, the seller's membership level, historical transactions and user clicks, together with a suitable decay for products concentrated under the same member seller. This logic consumes considerable computing and memory resources; with personalized search, for example, the search-behavior preferences in the user profile, the relationship network and so on must also be taken into account. By reducing search complexity, a search platform can raise the number of concurrent search requests it handles on the same hardware, which makes this a very effective application scenario.
Summary of the invention
The technical problem to be solved by the present invention is to overcome the deficiencies of the prior art by providing a method for reducing search complexity based on specific policies to cope with overload search requests. All received search requests are divided into user groups according to different policies, and search complexity is reduced for the requests whose effect on business value is small. The reduction consists of removing the sorting from the search query and reducing the amount of recall, which lowers the load on the search servers and raises the throughput of the search service cluster, so that a sudden burst of overload search requests does not leave the whole cluster unable to serve. Overload search requests are concurrent search requests that exceed the platform's maximum processing load and drive up the system Load Average.
To solve the above technical problem, the present invention provides a method for reducing search complexity based on specific policies to cope with overload search requests, comprising the following steps:
Step 1: identify search requests. Every search request sent to the search service cluster must carry specific user identification, which includes the user's source IP, the cookie information of a logged-in user, and a Spider (crawler engine) flag. All search requests are first submitted to the search cluster dispatcher (Dispatcher);
Step 2: classify requests and attach processing-policy flags. The Dispatcher prioritizes the search requests by user according to the policy rules configured by the administrator, and at the same time judges whether the current number of search requests and the server load have reached the threshold that triggers the surge response. It classifies the search requests of different users, attaches a specific search-processing policy flag to each, and forwards each request to a search service instance (Searcher) in the cluster;
Step 3: process requests according to policy. A Searcher in the cluster receives a search request carrying a search-processing policy flag and processes it according to that flag;
Step 4: return results. The Searcher returns the processed search results to the Dispatcher, which returns them to the search client that initiated the request;
Step 5: monitor periodically and adjust the policy dynamically. When the Dispatcher's periodic scan finds that the overload is easing or worsening (the load and traffic indicators fluctuate around their boundary values), it raises or lowers the flags according to specific processing logic until the overload ends and all indicators return to normal;
Step 6: record the process and assess the impact. The search system records all processing flows and the search requests involved. After the surge ends, a background job automatically notifies the relevant business stakeholders by email so that they can assess the scope affected by the reduction in search complexity.
Step 2 further comprises:
Step 2-1: set the policy rules, which includes setting the policy-level indicators and setting the user-differentiation rules;
Setting the policy-level indicators: the administrator configures four policy levels, level 1 to level 4, which are invoked according to the current number of search requests and the server load. The data thresholds of each level can be configured manually by the administrator in terms of the policy-level indicators, which are: the number of concurrent search requests per second for the current index, and the 5-minute average server load in the current search cluster (taking the Load Average value of the Linux operating system as reference);
Setting the user-differentiation rules: the administrator sets rules of different levels for different users. Users are identified by a crawler identifier Spidername or a user identifier Username, and each is assigned to a policy level;
Step 2-2: level 1 policy processing. After a periodic scan of the search server load, if the number of search requests exceeds the system's preset maximum capacity, or the 5-minute Load Average of the search servers is greater than the total number of server CPU cores, the level 1 response policy starts. From the user identifiers of all requests received by the Dispatcher, the policy picks out the search requests that carry a Spidername and selects those with the Other Spider flag. A specific search-processing policy flag is added to these requests before they are handed to the corresponding search service instance; the flag added at this point is *A*;
Step 2-3: level 2 policy processing. After a periodic scan, if the number of search requests exceeds the preset maximum capacity, or the 5-minute Load Average of the search servers is still greater than the total number of server CPU cores, the level 2 response policy starts. From the user identifiers of all requests received by the Dispatcher, the policy picks out every search request with a crawler flag, no longer distinguishing priority by Spidername. A policy flag is added to these requests before they are handed to the corresponding search service instance; the flag added at this point is *A*, and the flag previously added in step 2-2 is changed to *B*;
Step 2-4: level 3 policy processing. After a periodic scan, if the Dispatcher finds that the number of search requests exceeds the preset maximum capacity, or the 5-minute Load Average is still greater than the total number of server CPU cores, the level 3 response policy starts. The policy judges from the user cookie information received by the Dispatcher whether the user is a registered user of the corresponding business platform. If the user in the request is not a registered platform user, the user's Username is marked as Other. A policy flag is added to these requests before they are handed to the corresponding search service instance; the flag added at this point is *A*. Escalating step by step, the flag previously added in step 2-3 is changed to *B* and the flag previously added in step 2-2 is changed to *C*;
Step 2-5: level 4 policy processing. After a periodic scan, if the Dispatcher finds that the number of search requests exceeds the preset maximum capacity, or the 5-minute Load Average is still greater than the total number of server CPU cores, the level 4 response policy starts. Among all user requests received by the Dispatcher, the policy adds a policy flag to the requests of registered users, that is, requests with Username=Login, before they are handed to the corresponding search service instance; the flag added at this point is *A*. The flag previously added in step 2-4 is changed to *B*, the flag from step 2-3 to *C*, and the flag from step 2-2 to *D*;
Step 2-6: escalate the policy step by step. After a periodic scan, if the Dispatcher finds that the number of search requests exceeds the preset maximum capacity, or the 5-minute Load Average is still greater than the total number of server CPU cores, the Dispatcher advances, in order, the policy flags of the search requests covered by steps 2-2, 2-3, 2-4 and 2-5 from *A* towards *D*. This continues until the flags of all user requests other than those of registered users of the business system (Username=Login) are *D*, and the flags of requests that the cookie identifies as coming from registered users have been raised to their maximum of *C*.
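The stepwise escalation in steps 2-2 to 2-6 can be read as a simple rule: each group is first flagged *A* at its own level and moves one letter further at every later escalation, up to its cap. The following is a minimal Python sketch under that reading. The names `GROUPS`, `FLAGS` and `flag_for` and the notion of a numbered "stage" are illustrative assumptions, not part of the patent; "Spidername=All" here stands for the crawlers not already covered by "Spidername=Other".

```python
from typing import Optional

# Groups in the order in which policy levels 1-4 start flagging them.
GROUPS = ["Spidername=Other", "Spidername=All", "Username=Other", "Username=Login"]
FLAGS = ["*A*", "*B*", "*C*", "*D*"]


def flag_for(group: str, stage: int) -> Optional[str]:
    """Return the policy flag of a group at an escalation stage (0 = normal)."""
    position = GROUPS.index(group) + 1   # stage at which the group is first flagged
    if stage < position:
        return None                      # group not yet affected
    # Registered users are capped at *C*; every other group may reach *D*.
    cap = 2 if group == "Username=Login" else 3
    return FLAGS[min(stage - position, cap)]
```

Under this sketch stages 1 to 4 correspond to policy levels 1 to 4, and stages 5 and 6 correspond to the further escalation of step 2-6, after which no flag changes any more.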
Step 3 further comprises:
Step 3-1: load from cache. When a Searcher receives a search request carrying the policy flag *A*, the request is first served from the cache of the current Searcher. Only if nothing can be loaded from the cache is the search computed again. Answering from the cache reduces the search computation;
Step 3-2: single-point degraded service. When a Searcher receives a search request carrying the policy flag *B*, the request is given single-point degraded service. The search business in the cluster uses a distributed search mechanism: one search normally sends the request to two or more Searcher instances and merges their results afterwards. Once a Searcher starts single-point degraded service, it no longer queries the other Searcher instances; the single Searcher executes the request and returns directly;
Step 3-3: dimension-reduced simplification. When a Searcher receives a search request carrying the policy flag *C*, the request is given dimension-reduced degraded service. The computational complexity of the user's search request is reduced by skipping business logic that takes a long time, for example ignoring the custom scoring and ranking and reducing the number of recalled results;
Step 3-4: return an empty result. When a Searcher receives a search request carrying the policy flag *D*, it rejects the request directly and returns an empty search result.
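The four flag-specific behaviours of step 3 amount to a dispatch on the policy flag. A minimal Python sketch follows; the function `handle` and its callable parameters (`full`, `single_node`, `simplified`) are hypothetical stand-ins for the real search paths, and writing the result back to the cache on a miss is an assumption that the text does not state.

```python
from typing import Callable, Dict, List, Optional

Results = List[str]


def handle(query: str, flag: Optional[str], cache: Dict[str, Results],
           full: Callable[[str], Results],
           single_node: Callable[[str], Results],
           simplified: Callable[[str], Results]) -> Results:
    """Process one search request according to its policy flag."""
    if flag == "*D*":
        return []                     # step 3-4: reject, return an empty result
    if flag == "*C*":
        return simplified(query)      # step 3-3: skip custom scoring, recall less
    if flag == "*B*":
        return single_node(query)     # step 3-2: do not query other Searchers
    if flag == "*A*":
        if query in cache:            # step 3-1: try the local cache first
            return cache[query]
        results = full(query)         # cache miss: compute the search again
        cache[query] = results        # assumption: store the result for reuse
        return results
    return full(query)                # no flag: normal processing
```

Ordering the checks from *D* down to *A* keeps the cheapest outcome first, so the most heavily degraded requests cost the Searcher almost nothing.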
Step 5 further comprises:
Step 5-1: after a periodic scan, if the Dispatcher finds that the number of search requests is less than or equal to the preset maximum capacity, or that the 5-minute Load Average of the search servers is less than or equal to the total number of server CPU cores, the policy flags of all search requests currently at the different processing levels are shifted down one step: from *D* to *C*, from *C* to *B*, and from *B* to *A*;
Step 5-2: after the next periodic scan, if the number of search requests is still less than or equal to the preset maximum capacity, or the 5-minute Load Average is still less than or equal to the total number of server CPU cores, the flag is removed from all search requests flagged *A*, which return to normal search processing;
Step 5-3: steps 5-1 and 5-2 are repeated until all flags have been removed and all search requests are processed normally. If, in the next scan period, the number of search requests is found to be greater than the preset maximum capacity, or the 5-minute Load Average is greater than the total number of server CPU cores, the policy flags of all search requests currently at the different processing levels are shifted up one step: from *C* to *D*, from *B* to *C*, and from *A* to *B*.
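The shift rules of steps 5-1 to 5-3 can be written as three small mappings over the current flags. This is a minimal Python sketch; the function names and the dictionary from user group to flag are illustrative assumptions, and the *C* cap for registered users from step 2-6 is not modelled here.

```python
from typing import Dict, Optional

Flags = Dict[str, Optional[str]]   # user group -> current policy flag (None = normal)

DOWN = {"*D*": "*C*", "*C*": "*B*", "*B*": "*A*"}
UP = {"*A*": "*B*", "*B*": "*C*", "*C*": "*D*"}


def step_5_1_downshift(flags: Flags) -> Flags:
    """Load is back within limits: move every flag one step down; *A* stays *A*."""
    return {group: DOWN.get(flag, flag) for group, flag in flags.items()}


def step_5_2_clear(flags: Flags) -> Flags:
    """Load is still within limits: requests flagged *A* return to normal processing."""
    return {group: (None if flag == "*A*" else flag) for group, flag in flags.items()}


def step_5_3_upshift(flags: Flags) -> Flags:
    """Load exceeds the limits again: move every existing flag one step up; *D* stays *D*."""
    return {group: UP.get(flag, flag) for group, flag in flags.items()}
```

Keeping the two relief steps separate follows the text literally: a group flagged *B* needs two quiet scan periods before it is processed normally again.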
Step 6 further comprises:
Step 6-1: the search system records the email contact details of the stakeholders of each search business and reports to them a summary of all handling actions so that they can assess the scope affected. Each handled surge produces one specific search-request handling report;
Step 6-2: the search system records the server CPU load and the number of concurrent search requests at the time of each policy level, plots them as a bar chart for comparison, and adds the chart to the handling report;
Step 6-3: the search system also records how the flags of the search requests at each policy level changed, for example the history of a change from *A* to *B*, and adds this to the handling report;
Step 6-4: the handling report is sent by email to all stakeholders of the corresponding business.
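The report of steps 6-1 to 6-4 could be assembled roughly as follows. This is a minimal Python sketch using the standard library `email.message.EmailMessage`; the function `build_report`, its parameters and the plain-text layout are assumptions, the bar chart of step 6-2 is replaced by a text table, and actually sending the message is left out.

```python
from email.message import EmailMessage
from typing import List, Tuple


def build_report(stakeholders: List[str],
                 samples: List[Tuple[int, float, int]],
                 flag_changes: List[Tuple[str, str, str]]) -> EmailMessage:
    """Assemble the handling report for one surge as an email message.

    samples:      (policy level, 5-minute load, concurrent requests per second)
    flag_changes: (user group, old flag, new flag)
    """
    lines = ["Search request handling report", "",
             "Load and concurrency per policy level:"]
    for level, load, qps in samples:
        lines.append(f"  level {level}: load={load:.1f}, requests/s={qps}")
    lines.append("")
    lines.append("Flag changes:")
    for group, old, new in flag_changes:
        lines.append(f"  {group}: {old} -> {new}")

    message = EmailMessage()
    message["Subject"] = "Search request handling report"
    message["To"] = ", ".join(stakeholders)   # step 6-4: all business stakeholders
    message.set_content("\n".join(lines))
    return message
```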
In step 2-1, the policy-level indicators are set as:
1:[200,24];2:[250,24];3:[300,24];4:[400,32];
Taking the first entry as an example, the data threshold of the level 1 policy is 200 or more requests per second and a 5-minute average CPU load above 24; the other entries are read in the same way. These parameters support dynamic configuration;
The user-differentiation rules are set as follows:
1:[Spidername=Other:*A*];
2:[Spidername=All:*A*];
3:[Username=Other:*A*];
4:[Username=Login:*A*].
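The two configuration tables above map naturally onto two dictionaries keyed by policy level. The following minimal Python sketch shows one way to evaluate them. The names `POLICY_LEVELS`, `USER_RULES`, `level_triggered` and `next_level` are illustrative, and two points are assumptions: that either indicator alone triggers a level (following the "or" wording of step 2-2), and that a scan escalates by at most one level.

```python
from typing import Dict, Tuple

# level -> (concurrent search requests per second, 5-minute Load Average)
POLICY_LEVELS: Dict[int, Tuple[int, float]] = {
    1: (200, 24), 2: (250, 24), 3: (300, 24), 4: (400, 32),
}

# level -> (user group selector, flag attached when the level starts)
USER_RULES: Dict[int, Tuple[str, str]] = {
    1: ("Spidername=Other", "*A*"),
    2: ("Spidername=All", "*A*"),
    3: ("Username=Other", "*A*"),
    4: ("Username=Login", "*A*"),
}


def level_triggered(level: int, requests_per_second: int, load_average: float) -> bool:
    """True when either indicator reaches the threshold configured for the level."""
    max_requests, max_load = POLICY_LEVELS[level]
    # The request threshold is inclusive ("200 or more"); the load must exceed its value.
    return requests_per_second >= max_requests or load_average > max_load


def next_level(current: int, requests_per_second: int, load_average: float) -> int:
    """Escalate by at most one level per periodic scan (assumption)."""
    candidate = current + 1
    if candidate in POLICY_LEVELS and level_triggered(
            candidate, requests_per_second, load_average):
        return candidate
    return current
```

For example, `next_level(0, 210, 5.0)` moves from normal operation to level 1, after which `USER_RULES[1]` names the group that receives the *A* flag.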
Advantageous effects of the invention:
(1) The user identity information carried by the search client serves as the basis for grading users. All user classification and grading is done by the Dispatcher in the search platform, and the client needs no further configuration. When high-load search requests occur, no client intervention is required; all countermeasures are carried out automatically by the Dispatcher and the Searcher instances in the search platform.
(2) The invention applies the idea of degraded service by combining the different weights that users carry for the business with degraded-service methods of different scope and degree. Instead of a simple one-size-fits-all circuit breaker, it raises and lowers the degradation in steps to cope with high-load search requests, keeping the scope of business affected by degraded service as small as possible.
(3) The invention exploits the characteristics of a distributed search platform: it reduces recall from concurrent nodes and uses single-node recall to lower search complexity. Limiting the recall range under high load reduces computation while still preserving search accuracy, which is friendlier to clients than simply rejecting the excess requests.
(4) Because a search request contains two computations, the basic Lucene query and the custom business scoring and ranking, the invention proposes dimension-reduced service: under high load the Searcher performs only the basic Lucene query and drops the custom business scoring and ranking. This lowers search complexity and improves the ability to cope with high-load search requests.
(5) After a high-load episode has been handled, the details of how the system automatically reduced complexity are sent as a handling report to the business stakeholders configured in advance, so that they know the scope affected and can assess the impact on the business. The administrator can then refine the policy configuration so that the search platform copes better with high-load search requests without additional spending on hardware servers.
Brief description of the drawings
Fig. 1 is a schematic flow chart of the method of an exemplary embodiment of the present invention.
Detailed description
The present invention is further described below with reference to the drawing and an exemplary embodiment:
As shown in Fig. 1, the present invention comprises the following steps:
Step 1: identify search requests
Every search request sent to the search service cluster must carry specific user identification, which includes the user's source IP, the cookie information of a logged-in user, and a Spider (crawler engine) flag. All search requests are first submitted to a search cluster dispatcher (Dispatcher);
Step 2: classify requests and attach processing-policy flags
The Dispatcher prioritizes the search requests by user according to the policy rules configured by the administrator, and at the same time judges whether the current number of search requests and the server load have reached the threshold that triggers the surge response. It classifies the search requests of different users, attaches a specific search-processing policy flag to each, and forwards each request to a search service instance (Searcher) in the cluster;
Step 3: process requests according to policy
A Searcher in the cluster receives a search request carrying a search-processing policy flag and processes it according to that flag;
Step 4: return results
The Searcher returns the processed search results to the Dispatcher, which returns them to the search client that initiated the request;
Step 5: monitor periodically and adjust the policy dynamically
When the Dispatcher's periodic scan finds that the overload is easing or worsening (the load and traffic indicators fluctuate around their boundary values), it raises or lowers the flags according to specific processing logic until the overload ends and all indicators return to normal;
Step 6: record the process and assess the impact
The search system records all processing flows and the search requests involved. After the surge ends, a background job automatically notifies the relevant business stakeholders by email so that they can assess the scope affected by the reduction in search complexity.
Step 2 further comprises:
Step 2-1: set the policy rules, which includes setting the policy-level indicators and setting the user-differentiation rules
Setting the policy-level indicators
The administrator configures four policy levels, level 1 to level 4, which are invoked according to the current number of search requests and the server load. The data thresholds of each level can be configured manually by the administrator in terms of the policy-level indicators, which are: the number of concurrent search requests per second for the current index, and the 5-minute average server load in the current search cluster (taking the Load Average value of the Linux operating system as reference). An example setting is 1:[200,24];2:[250,24];3:[300,24];4:[400,32]. Taking the first entry as an example, the data threshold of the level 1 policy is 200 or more requests per second and a 5-minute average CPU load above 24; the other entries are read in the same way. These parameters support dynamic configuration.
Setting the user-differentiation rules
The administrator sets rules of different levels for different users. Users are identified by a crawler identifier Spidername or a user identifier Username. Spiders such as Google and Bing have high priority and other Spiders have lower priority, so they are placed in different policy levels. Likewise, users are distinguished by Username into registered users (Login) and other, non-registered users (Other).
Example settings: 1:[Spidername=Other:*A*];
2:[Spidername=All:*A*];
3:[Username=Other:*A*];
4:[Username=Login:*A*]
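Before any of these rules can be applied, the Dispatcher has to map each request to one of the user groups. A minimal Python sketch of that classification follows. The `SearchRequest` fields, the `HIGH_PRIORITY_SPIDERS` set (built from the Google and Bing example) and the function `classify` are illustrative assumptions rather than the patent's actual data model.

```python
from dataclasses import dataclass
from typing import Optional

# Assumption: crawler names treated as high priority, following the Google/Bing example.
HIGH_PRIORITY_SPIDERS = {"google", "bing"}


@dataclass
class SearchRequest:
    source_ip: str
    spider: Optional[str] = None   # value of the Spider flag, if the request has one
    logged_in: bool = False        # derived from the login cookie


def classify(request: SearchRequest) -> str:
    """Map a request to the user group referenced by the rules of step 2-1."""
    if request.spider is not None:
        if request.spider.lower() in HIGH_PRIORITY_SPIDERS:
            return "Spidername=" + request.spider
        return "Spidername=Other"  # low-priority crawler, degraded first
    return "Username=Login" if request.logged_in else "Username=Other"
```

With this grouping, the level 1 rule matches only "Spidername=Other", while the level 2 rule "Spidername=All" matches every group whose name starts with "Spidername=".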
2-2:1 grades of strategy processing of step
When periodic scanning search server load after, if when searching request quantity be more than systemic presupposition maximum capacity or 5 minutes Load Average values of search server load are greater than the total nucleus number of server CPU, i.e. starting 1 grade of countermeasure of rank, The policing rule is pre-defined in step 2-1, can be identified in all user identifiers that Dispatcher is received The searching request of Spidername is had out, and, with the 1 grade of user's processing strategie rule illustrated in step 2-1
" 1:[Spidername=Other:*A*] " for, the searching request with OtherSpider flag bit is filtered Out, then by these request after specific search process policy flag position be added give corresponding search cluster service example again It is handled, the search process policy flag position being added at this time is * A*;
2-3:2 grades of strategy processing of step
After periodic scanning, when searching request quantity is more than the 5 of systemic presupposition maximum capacity or search server load When server CPU total nucleus number is still greater than in minute Load Average value, i.e. starting 2 grades of countermeasures of rank, with step 2-1 For 2 grades of user processing strategies rules " 1:[Spidername=All:*A*] " of middle citing, which can be in Dispatcher All searching requests with crawler label are identified in all user identifiers received, do not repartition Spidername;Then By these request after specific search process policy flag position be added give corresponding search cluster service example again and handle, The search process policy flag position being added at this time is * A*, while the search sign position of the corresponding addition of original in step 2-2 being changed For * B*, this is handled with regard to escalation policy step by step described in Fig. 1;
2-4:3 grades of strategy processing of step
After a periodic scan, if the Dispatcher finds that the number of search requests still exceeds the system's preset maximum capacity, or that the 5-minute Load Average of the search servers is still greater than the total number of server CPU cores, it starts the level 3 response policy. Taking the level 3 user-handling rule "3:[Username=Other:*A*]" given in step 2-1 as an example, the policy uses the user cookie information received by the Dispatcher to judge whether the user is a registered user of the corresponding business platform. If the user in the request is judged not to be a registered platform user, the user's Username is identified as Other, and the specific search processing policy flag is added to these requests before they are handed to the corresponding search cluster service instance for processing. The flag added at this point is *A*; at the same time, escalating step by step, the search flag originally added in step 2-3 is changed to *B*, and the search flag originally added in step 2-2 is changed to *C*;
Step 2-5: Level 4 policy processing
After a periodic scan, if the Dispatcher finds that the number of search requests still exceeds the system's preset maximum capacity, or that the 5-minute Load Average of the search servers is still greater than the total number of server CPU cores, it starts the level 4 response policy. Taking the level 4 user-handling rule "4:[Username=Login:*A*]" given in step 2-1 as an example, the policy picks out, from all user requests received by the Dispatcher, the requests of registered users, i.e. those with Username=Login, adds the specific search processing policy flag to them, and hands them to the corresponding search cluster service instance for processing. The flag added at this point is *A*; at the same time, the search flag originally added in step 2-4 is changed to *B*, the one originally added in step 2-3 is changed to *C*, and the one originally added in step 2-2 is changed to *D*;
Step 2-6: Step-by-step policy escalation
After a periodic scan, if the Dispatcher finds that the number of search requests still exceeds the system's preset maximum capacity, or that the 5-minute Load Average of the search servers is still greater than the total number of server CPU cores, the Dispatcher advances, in order, the search policy flags of the search requests handled in steps 2-2, 2-3, 2-4 and 2-5 one step at a time from *A* toward *D*. This continues until the search policy flags of all user requests other than those of registered users of the business system (i.e. users with Username=Login) are *D*, and the policy flags of search requests that the cookie identifies as coming from registered users have been raised to their highest level, *C*.
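The way the flags move through steps 2-2 to 2-6 can be illustrated with a short Python sketch. It is only one possible reading of the text: the class names `other_spider`, `known_spider`, `anonymous` and `login` and the function `flags_for_step` are hypothetical, and the sketch assumes that each consecutive overloaded scan advances the policy by exactly one step.

```python
# Hypothetical user classes, ordered from first-degraded to last-degraded.
CLASSES = ["other_spider", "known_spider", "anonymous", "login"]
FLAGS = ["A", "B", "C", "D"]  # A = mildest degradation, D = empty result


def flags_for_step(step):
    """Return {user_class: flag or None} after `step` consecutive overloaded scans.

    Steps 1 to 4 correspond to policy levels 1 to 4; steps above 4 model
    the further escalation described in step 2-6 (an assumption).
    """
    result = {}
    for i, user_class in enumerate(CLASSES):
        if step <= i:
            result[user_class] = None  # this class is not degraded yet
            continue
        # Registered users are capped at C; every other class is capped at D.
        cap = 2 if user_class == "login" else 3
        result[user_class] = FLAGS[min(step - 1 - i, cap)]
    return result


if __name__ == "__main__":
    for step in range(1, 7):
        print(step, flags_for_step(step))
```

At step 4 the sketch yields D, C, B and A for the four classes in order, matching the level 4 description; from step 6 onward every non-registered class stays at D and registered users stay at C.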
Step 3 further includes:
Step 3-1: Load from cache
The Searcher receives a search request carrying the specific search processing policy flag *A*. According to this flag, the request is first loaded from the cache of the current Searcher; only if the cache cannot serve it is a fresh search computation performed. Calling the cache reduces the search computation complexity;
Step 3-2: Single-node degraded service
The Searcher receives a search request carrying the specific search processing policy flag *B*. According to this flag, the request receives single-node degraded (lossy) service. Because the search business in the search cluster uses a distributed search mechanism, a single search normally requires two or more Searcher instances to each execute the search request, after which the results are merged. Once a Searcher starts single-node degraded service, it no longer sends requests to the other Searcher instances; the single Searcher returns directly after executing the search itself, which reduces the search computation complexity. The returned results will be fewer than those of normal search processing, but most search requests will still receive search results;
Step 3-3: Dimension-reduction simplification
The Searcher receives a search request carrying the specific search processing policy flag *C*. According to this flag, the request receives dimension-reduced degraded service, which simplifies the computation of the user's search request by skipping the more time-consuming parts of the business logic, for example ignoring custom scoring and ranking and reducing the number of recalled results. The accuracy of the returned results will deviate from that of normal search processing, but a certain degree of relevance is still guaranteed, and the search computation complexity is reduced;
Step 3-4: Return an empty result
The Searcher receives a search request carrying the specific search processing policy flag *D*. According to this flag, the Searcher directly rejects the request and returns an empty search result, thereby reducing the system load.
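The four handling modes of step 3 can be sketched as a single dispatch function. This is a minimal Python illustration under stated assumptions, not the patented implementation: `handle_request` and all of its parameters are hypothetical stand-ins, the flags are treated as independent rather than cumulative, and `recall_limit` is an invented parameter for "reducing the number of recalled results".

```python
def handle_request(query, flag, cache, local_search, peer_searches, rescore,
                   recall_limit=10):
    """Dispatch one search request according to its policy flag.

    Hypothetical parameters:
      cache          dict mapping query -> previously computed results
      local_search   callable(query) -> hits from this Searcher
      peer_searches  list of callable(query) -> hits from other Searchers
      rescore        callable(hits) -> hits re-ranked by custom business scoring
    """
    if flag == "D":
        return []  # step 3-4: reject and return an empty result
    if flag == "A" and query in cache:
        return cache[query]  # step 3-1: serve from cache when possible
    hits = list(local_search(query))
    if flag != "B":
        # Normal fan-out to the other Searcher instances;
        # skipped for B (step 3-2, single-node degraded service).
        for peer in peer_searches:
            hits.extend(peer(query))
    if flag == "C":
        # Step 3-3: skip custom scoring and cut the number of recalled hits.
        return hits[:recall_limit]
    return rescore(hits)


if __name__ == "__main__":
    local = lambda q: ["l1", "l2"]
    peers = [lambda q: ["p1"]]
    rank = lambda hits: sorted(hits, reverse=True)
    for flag in (None, "A", "B", "C", "D"):
        print(flag, handle_request("x", flag, {}, local, peers, rank))
```

A request with flag *A* that misses the cache falls through to the full search path, as step 3-1 describes.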
Step 5 further includes:
Step 5-1: After a periodic scan, if the Dispatcher finds that the number of search requests is less than or equal to the system's preset maximum capacity, or that the 5-minute Load Average of the search servers is less than or equal to the total number of server CPU cores, it downshifts the specific search processing policy flags of the search requests at all current handling levels. The downshift proceeds step by step: from *D* to *C*, from *C* to *B*, and from *B* to *A*;
Step 5-2: After a further periodic scan, if the Dispatcher finds that the number of search requests is still less than or equal to the system's preset maximum capacity, or that the 5-minute Load Average of the search servers is still less than or equal to the total number of server CPU cores, it removes the processing flag from all search requests carrying the *A* flag, and these requests return to normal search processing;
Step 5-3: The rules of steps 5-1 and 5-2 are repeated until all processing flags have been removed, i.e. all search requests are processed normally. If, in the next periodic scan, the number of search requests is found to be greater than the system's preset maximum capacity, or the 5-minute Load Average of the search servers is found to be greater than the total number of server CPU cores, the specific search processing policy flags of the search requests at all current handling levels are upshifted. The upshift rule is: from *C* to *D*, from *B* to *C*, and from *A* to *B*.
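The recovery rules of step 5 amount to moving each flag one position per scan. The Python sketch below is a simplified reading in which every recovery scan lowers each flag by one step and removes *A*; the names `downshift`, `upshift` and `rescan` are hypothetical, and requests that currently carry no flag are assumed to stay unflagged during an upshift.

```python
FLAGS = ["A", "B", "C", "D"]


def downshift(flag):
    """One recovery scan: D -> C -> B -> A -> None (normal processing)."""
    if flag is None:
        return None
    index = FLAGS.index(flag)
    return FLAGS[index - 1] if index > 0 else None


def upshift(flag):
    """One overloaded scan during recovery: A -> B -> C -> D; D stays D."""
    if flag is None:
        return None  # assumption: unflagged requests are left untouched
    index = FLAGS.index(flag)
    return FLAGS[min(index + 1, len(FLAGS) - 1)]


def rescan(flags, overloaded):
    """Apply one periodic scan to a {user_class: flag} mapping."""
    shift = upshift if overloaded else downshift
    return {user_class: shift(flag) for user_class, flag in flags.items()}


if __name__ == "__main__":
    state = {"other_spider": "D", "known_spider": "C", "anonymous": "B", "login": "A"}
    for overloaded in (False, False, True, False, False, False):
        state = rescan(state, overloaded)
        print(overloaded, state)
```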
Step 6 further includes:
Step 6-1: The search system records the e-mail contact details of the stakeholders of the corresponding search business. Because coping with overload search requests by reducing search complexity causes some loss of search precision and may affect the business, all handling details need to be aggregated and reported to the stakeholders so that the scope of impact can be assessed. Each time a traffic surge is handled, a specific search request handling report is generated;
Step 6-2: The search system records the server CPU load and the number of concurrent search requests at each policy level at the time, plots them as a bar comparison chart, and adds the chart to the handling report;
Step 6-3: The search system also records how the processing flags of the search requests at each policy level changed at the time, for example the history of when a flag changed from *A* to *B*, and adds this to the handling report;
Step 6-4: The handling report is sent by e-mail to all stakeholders of the corresponding business.
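A handling report of the kind described in step 6 could be assembled roughly as follows. This Python sketch is illustrative only: `build_report`, `build_message` and the tuple layouts are hypothetical, the bar chart of step 6-2 is replaced by a plain-text table, and the message is only constructed, not sent. `EmailMessage` is the standard-library class from `email.message`.

```python
from email.message import EmailMessage


def build_report(samples, flag_changes):
    """Render a plain-text handling report.

    samples       list of (time, policy_level, load_average, requests_per_second)
    flag_changes  list of (time, user_class, old_flag, new_flag)
    """
    lines = ["Search overload handling report", "", "Load samples:"]
    for ts, level, load, rps in samples:
        lines.append(f"  {ts}  level={level}  load={load:.1f}  rps={rps}")
    lines.append("")
    lines.append("Flag changes:")
    for ts, user_class, old, new in flag_changes:
        lines.append(f"  {ts}  {user_class}: {old or '-'} -> {new or '-'}")
    return "\n".join(lines)


def build_message(report, stakeholders):
    """Wrap the report in an e-mail addressed to every stakeholder."""
    msg = EmailMessage()
    msg["Subject"] = "Search overload handling report"
    msg["To"] = ", ".join(stakeholders)
    msg.set_content(report)
    return msg


if __name__ == "__main__":
    text = build_report(
        [("10:00", 1, 25.3, 210), ("10:05", 2, 27.8, 260)],
        [("10:00", "other_spider", None, "A"), ("10:05", "other_spider", "A", "B")],
    )
    print(build_message(text, ["owner@example.com"]))
```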
In other words, targeting a real technical application scenario, the present invention identifies and distinguishes users by means of specific policies. Every search request received by the search platform carries the user's cookie, IP and crawler flag information, and this information allows users to be divided into different levels. Once users are divided into levels, different response policy levels can be set for overload search requests, and a different response processing flag can be added for each of these levels. After the response processing flags are attached to the corresponding search requests on the search platform, the platform can process the requests differently and reduce complexity to different degrees.
Search complexity can be reduced by calling the cache, by reducing the number of nodes queried concurrently, and by dropping the custom ranking computation of a search request. Combined with the distinction between user groups, rather than a one-size-fits-all approach, this keeps the impact of degraded service to a minimum. Setting specific policies also allows the degree of complexity reduction to follow the concurrent load of search requests dynamically: as the overload persists and grows, the degree and scope of complexity reduction are raised; as the overload subsides, they are lowered, until the intervention is withdrawn completely. The details of the intervention are then sent to the relevant business stakeholders in the form of a handling report.
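The user distinction described above can be sketched as a small classifier. This is a hypothetical Python illustration: the request layout, the `KNOWN_SPIDERS` allow-list, the `user_id` cookie field and the class names are all assumptions, and the source IP, which the text also mentions, is not used here.

```python
# Hypothetical allow-list of crawlers that are not treated as "Other".
KNOWN_SPIDERS = {"Googlebot", "Bingbot"}


def classify(request):
    """Map a request dict to one of four user classes.

    `request` is a hypothetical dict with optional keys:
      "spider"  crawler name taken from the Spider flag, or None
      "cookie"  dict of login-cookie fields, or None
    """
    spider = request.get("spider")
    if spider is not None:
        # Corresponds to Spidername=Other versus the remaining crawlers.
        return "known_spider" if spider in KNOWN_SPIDERS else "other_spider"
    cookie = request.get("cookie") or {}
    # Corresponds to Username=Login versus Username=Other.
    return "login" if cookie.get("user_id") else "anonymous"


if __name__ == "__main__":
    print(classify({"spider": "UnknownBot"}))
    print(classify({"cookie": {"user_id": "42"}}))
    print(classify({}))
```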
The present invention mainly provides a method of reducing search complexity based on specific policies to cope with overload search requests. Its beneficial effects are:
(1) In the present invention, the user identification information carried by the search client serves as the basis for grading users. All user classification and grading is done by the Dispatcher in the search platform, and the client needs no additional configuration. That is, when high-load search requests occur, no client intervention is needed; all countermeasures are carried out automatically by the Dispatcher and Searcher instances in the search platform.
(2) The present invention applies the idea of degraded service by combining the different weights that users carry for the business with degraded-service methods of different scope and degree. Instead of a simple one-size-fits-all circuit breaker, it copes with high-load search requests using degraded service that steps up and down like a ladder, keeping the scope of business affected by degraded service as small as possible.
(3) The present invention exploits the characteristics of a distributed search platform: by reducing the number of nodes recalled concurrently, it recalls search results from a single node to reduce search complexity. Limiting the recall scope under high load reduces computation complexity while still preserving search accuracy, which is friendlier to the client than directly rejecting the excess requests.
(4) The present invention exploits the fact that a search request contains two computation demands, a basic Lucene query and custom business scoring and ranking, and proposes the idea of dimension-reduced service: under high load, the Searcher of the search platform completes only the computation for the basic Lucene query and gives up the custom business scoring and ranking. This dimension-reduced service reduces search complexity and improves performance under high-load search requests.
(5) The present invention proposes that, after high-load search requests have been handled, the detailed course of all automatic complexity reductions is sent as a handling report to each business stakeholder configured in advance, so that the stakeholders know the scope affected by the complexity reduction and can assess its business impact. On this basis the administrator can improve the configuration of the specific policies so that the whole search platform copes better with high-load search requests without additional spending on hardware servers.
The above embodiments do not limit the present invention in any way. All other improvements and applications made to the above embodiments by way of equivalent transformation fall within the protection scope of the present invention.

Claims (6)

1. A method of reducing search complexity based on specific policies to cope with overload search requests, characterized by comprising the following steps:
Step 1: Identify search requests. All search requests sent to the search service cluster must carry specific user identification, which includes the user's source IP, the cookie information of the user's login, and a Spider (crawler engine) flag; all search requests are first submitted to the search cluster dispatcher (Dispatcher) for processing;
Step 2: Distinguish requests and attach processing policy flags. According to the policy rules configured by the administrator, the search cluster dispatcher (Dispatcher) differentiates these search requests by user priority, and at the same time judges whether the current number of search requests and the server load have reached the threshold that triggers the traffic-surge response; it distinguishes the search requests of different users and attaches the specific search processing policy flag, and the Dispatcher then forwards them to a search service instance (Searcher) in the cluster for processing;
Step 3: Process requests by policy. After a search service instance (Searcher) in the search cluster receives a search request carrying the search processing policy flag, it performs the processing that corresponds to that flag;
Step 4: Return results. The search service instance (Searcher) returns the processed search results to the search cluster dispatcher (Dispatcher), which in turn returns them to the search client that initiated the request;
Step 5: Monitor periodically and adjust policies dynamically. When a periodic scan by the search cluster dispatcher (Dispatcher) finds that the overload of search requests is easing or worsening (i.e. the load and traffic indicators fluctuate around their boundary values), it applies the corresponding fluctuation handling according to the specific processing logic, until the overload ends and all indicators return to normal;
Step 6: Record the process and assess the impact. The search system records all processing flows and the search request information involved and, after the traffic surge, automatically notifies the relevant business stakeholders by e-mail in the background, so that the scope of impact of the complexity reduction can be assessed.
2. The method of reducing search complexity based on specific policies to cope with overload search requests as claimed in claim 1, characterized in that step 2 further includes:
Step 2-1: Set the policy rules, including setting the policy level indicators and setting the rules for distinguishing and handling users;
Setting the policy level indicators means: the administrator configures the policy in four levels, from level 1 to level 4, which are invoked according to the current number of search requests and the server load; the data threshold of each level can be configured manually by the administrator according to the policy level indicators. The policy level indicators include: the number of concurrent search requests per second for the current index, and the 5-minute average server load (taking the Load Average value of the Linux operating system in the current search cluster as reference);
Setting the rules for distinguishing and handling users means: the administrator can set different level rules to distinguish users, dividing users by crawler identifier Spidername and user identifier Username and assigning them to different policy levels;
Step 2-2: Level 1 policy processing. After a periodic scan of the search server load, if the number of search requests exceeds the system's preset maximum capacity, or the 5-minute Load Average of the search servers is greater than the total number of server CPU cores, the level 1 response policy is started. From all user identifiers received by the Dispatcher, the policy identifies the search requests that carry a Spidername, selects those carrying the OtherSpider flag, adds the specific search processing policy flag to these requests, and hands them to the corresponding search cluster service instance for processing; the search processing policy flag added at this point is *A*;
Step 2-3: Level 2 policy processing. After a periodic scan, if the number of search requests exceeds the system's preset maximum capacity, or the 5-minute Load Average of the search servers is still greater than the total number of server CPU cores, the level 2 response policy is started. From all user identifiers received by the Dispatcher, the policy identifies all search requests carrying a crawler flag, no longer distinguishing Spidername priority; it then adds the specific search processing policy flag to these requests and hands them to the corresponding search cluster service instance for processing. The search processing policy flag added at this point is *A*, and the search flag originally added in step 2-2 is changed to *B*;
Step 2-4: Level 3 policy processing. After a periodic scan, if the Dispatcher finds that the number of search requests exceeds the system's preset maximum capacity, or that the 5-minute Load Average of the search servers is still greater than the total number of server CPU cores, the level 3 response policy is started. The policy uses the user cookie information received by the Dispatcher to judge whether the user is a registered user of the corresponding business platform; if the user in the request is judged not to be a registered platform user, the user's Username is identified as Other, and the specific search processing policy flag is added to these requests before they are handed to the corresponding search cluster service instance for processing. The search processing policy flag added at this point is *A*; at the same time, escalating step by step, the search flag originally added in step 2-3 is changed to *B*, and the search flag originally added in step 2-2 is changed to *C*;
Step 2-5: Level 4 policy processing. After a periodic scan, if the Dispatcher finds that the number of search requests exceeds the system's preset maximum capacity, or that the 5-minute Load Average of the search servers is still greater than the total number of server CPU cores, the level 4 response policy is started. From all user requests received by the Dispatcher, the policy picks out the requests of registered users, i.e. those with Username=Login, adds the specific search processing policy flag to them, and hands them to the corresponding search cluster service instance for processing. The search processing policy flag added at this point is *A*; at the same time, the search flag originally added in step 2-4 is changed to *B*, the one originally added in step 2-3 is changed to *C*, and the one originally added in step 2-2 is changed to *D*;
Step 2-6: Step-by-step policy escalation. After a periodic scan, if the Dispatcher finds that the number of search requests exceeds the system's preset maximum capacity, or that the 5-minute Load Average of the search servers is still greater than the total number of server CPU cores, the Dispatcher advances, in order, the search policy flags of the search requests handled in steps 2-2, 2-3, 2-4 and 2-5 one step at a time from *A* toward *D*, until the search policy flags of all user requests other than those of registered users of the business system (i.e. users with Username=Login) are *D*, and the policy flags of search requests that the cookie identifies as coming from registered users have been raised to their highest level, *C*.
3. The method of reducing search complexity based on specific policies to cope with overload search requests as claimed in claim 2, characterized in that step 3 further includes:
Step 3-1: Load from cache. The Searcher receives a search request carrying the specific search processing policy flag *A*; according to this flag, the request is first loaded from the cache of the current Searcher, and only if the cache cannot serve it is a fresh search computation performed, so that calling the cache reduces the search computation complexity;
Step 3-2: Single-node degraded service. The Searcher receives a search request carrying the specific search processing policy flag *B*; according to this flag, the request receives single-node degraded service. Because the search business in the search cluster uses a distributed search mechanism, a single search normally requires two or more Searcher instances to each execute the search request, after which the results are merged; once a Searcher starts single-node degraded service, it no longer sends requests to the other Searcher instances, and the single Searcher returns directly after executing the search itself;
Step 3-3: Dimension-reduction simplification. The Searcher receives a search request carrying the specific search processing policy flag *C*; according to this flag, the request receives dimension-reduced degraded service, which simplifies the computation of the user's search request by skipping the more time-consuming parts of the business logic, for example ignoring custom scoring and ranking and reducing the number of recalled results;
Step 3-4: Return an empty result. The Searcher receives a search request carrying the specific search processing policy flag *D*; according to this flag, the Searcher directly rejects the request and returns an empty search result.
4. The method of reducing search complexity based on specific policies to cope with overload search requests as claimed in claim 3, characterized in that step 5 further includes:
Step 5-1: After a periodic scan, if the Dispatcher finds that the number of search requests is less than or equal to the system's preset maximum capacity, or that the 5-minute Load Average of the search servers is less than or equal to the total number of server CPU cores, it downshifts the specific search processing policy flags of the search requests at all current handling levels; the downshift proceeds step by step: from *D* to *C*, from *C* to *B*, and from *B* to *A*;
Step 5-2: After a further periodic scan, if the Dispatcher finds that the number of search requests is still less than or equal to the system's preset maximum capacity, or that the 5-minute Load Average of the search servers is still less than or equal to the total number of server CPU cores, it removes the processing flag from all search requests carrying the *A* flag, and these requests return to normal search processing;
Step 5-3: The rules of steps 5-1 and 5-2 are repeated until all processing flags have been removed, i.e. all search requests are processed normally; if, in the next periodic scan, the number of search requests is found to be greater than the system's preset maximum capacity, or the 5-minute Load Average of the search servers is found to be greater than the total number of server CPU cores, the specific search processing policy flags of the search requests at all current handling levels are upshifted; the upshift rule is: from *C* to *D*, from *B* to *C*, and from *A* to *B*.
5. The method of reducing search complexity based on specific policies to cope with overload search requests as claimed in claim 4, characterized in that step 6 further includes:
Step 6-1: The search system records the e-mail contact details of the stakeholders of the corresponding search business, and all handling details are aggregated and reported to the stakeholders so that the scope of impact can be assessed; each time a traffic surge is handled, a specific search request handling report is generated;
Step 6-2: The search system records the server CPU load and the number of concurrent search requests at each policy level at the time, plots them as a bar comparison chart, and adds the chart to the handling report;
Step 6-3: The search system also records how the processing flags of the search requests at each policy level changed at the time, for example the history of when a flag changed from *A* to *B*, and adds this to the handling report;
Step 6-4: The handling report is sent by e-mail to all stakeholders of the corresponding business.
6. The method of reducing search complexity based on specific policies to cope with overload search requests as claimed in claim 5, characterized in that, in step 2-1, the policy level indicators are set as:
1:[200,24];2:[250,24];3:[300,24];4:[400,32];
Taking the first parameter as an example, it means that the data threshold of the level 1 policy is: requests per second exceeding 200 (200 included), 5-minute average server CPU load exceeding 24; the remaining parameters follow by analogy. These parameters support dynamic configuration;
The rules for distinguishing and handling users are set as follows:
1:[Spidername=Other:*A*];
2:[Spidername=All:*A*];
3:[Username=Other:*A*];
4:[Username=Login:*A*].
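The threshold string in claim 6 can be parsed and checked as in the following Python sketch. The function names `parse_thresholds` and `is_triggered` are hypothetical. The sketch assumes that a level is triggered when either indicator is exceeded, following the "or" used in the step descriptions, with the request rate compared inclusively and the load compared strictly; the claim text itself does not state how the two values are combined.

```python
import re


def parse_thresholds(text):
    """Parse '1:[200,24];2:[250,24];...' into {level: (min_rps, max_load)}."""
    thresholds = {}
    for level, rps, load in re.findall(r"(\d+):\[(\d+),(\d+)\]", text):
        thresholds[int(level)] = (int(rps), int(load))
    return thresholds


def is_triggered(thresholds, level, rps, load_average):
    """Return True if the given level's threshold is exceeded.

    Assumption: requests per second at or above the limit, OR a
    5-minute load average strictly above the limit, triggers the level.
    """
    min_rps, max_load = thresholds[level]
    return rps >= min_rps or load_average > max_load


if __name__ == "__main__":
    config = parse_thresholds("1:[200,24];2:[250,24];3:[300,24];4:[400,32];")
    print(config)
    print(is_triggered(config, 1, rps=200, load_average=10.0))
```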
CN201810999884.4A 2018-08-30 2018-08-30 Method for reducing search complexity based on specific strategy Active CN109190004B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810999884.4A CN109190004B (en) 2018-08-30 2018-08-30 Method for reducing search complexity based on specific strategy

Publications (2)

Publication Number Publication Date
CN109190004A true CN109190004A (en) 2019-01-11
CN109190004B CN109190004B (en) 2020-07-07

Family

ID=64917232

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810999884.4A Active CN109190004B (en) 2018-08-30 2018-08-30 Method for reducing search complexity based on specific strategy

Country Status (1)

Country Link
CN (1) CN109190004B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113869976A (en) * 2021-09-26 2021-12-31 中国联合网络通信集团有限公司 Cargo list generation method and device, server and readable storage medium

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1627688A (en) * 2003-12-10 2005-06-15 联想(北京)有限公司 Method for searching sharing files under wireless network grid
CN101379757B (en) * 2006-02-07 2011-12-07 思科技术公司 Methods and systems for providing telephony services and enforcing policies in a communication network
CN103346971A (en) * 2013-06-19 2013-10-09 华为技术有限公司 Data forwarding method, controller, forwarding device and system
CN103428101A (en) * 2013-08-01 2013-12-04 华为技术有限公司 Load sharing method and device
CN103650439A (en) * 2011-07-15 2014-03-19 瑞典爱立信有限公司 Policy tokens in communication networks
US9038079B2 (en) * 2009-12-30 2015-05-19 International Business Machines Corporation Reducing cross queue synchronization on systems with low memory latency across distributed processing nodes
CN106407011A (en) * 2016-09-20 2017-02-15 焦点科技股份有限公司 A routing table-based search system cluster service management method and system
CN106453564A (en) * 2016-10-18 2017-02-22 北京京东尚科信息技术有限公司 Elastic cloud distributed massive request processing method, device and system

Also Published As

Publication number Publication date
CN109190004B (en) 2020-07-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant