CN115080617A - Gateway flow control-based Elasticsearch query method and system - Google Patents

Gateway flow control-based Elasticsearch query method and system

Info

Publication number
CN115080617A
Authority
CN
China
Prior art keywords
query
dsl
gateway
request
cache
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210667993.2A
Other languages
Chinese (zh)
Other versions
CN115080617B (en)
Inventor
宋岩强
白剑波
李青龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Smart Starlight Information Technology Co ltd
Original Assignee
Beijing Smart Starlight Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Smart Starlight Information Technology Co ltd
Priority to CN202210667993.2A
Publication of CN115080617A
Application granted
Publication of CN115080617B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/217 Database tuning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24564 Applying rules; Deductive queries
    • G06F16/24565 Triggers; Constraints
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2468 Fuzzy queries

Abstract

The invention discloses an Elasticsearch query method and system based on gateway flow control. The method comprises the following steps: judging whether the DSL request parameters meet the query protection conditions, and if not, terminating the query; if so, performing normalization and time period segmentation on the query time according to a preset cache time interval; calculating a hash value of the request parameters, the hash value serving as the query cache ID; judging whether the query cache ID exists among the gateway cache IDs; if it exists, returning the query result stored in memory; if it does not exist, calculating a DSL score value; judging whether the score value is larger than the remaining score value of the current time period; if it is larger, queuing the DSL query request to wait for the gateway to release score values again; if it is not larger, subtracting the DSL score value from the remaining score value of the current time period to obtain the updated remaining score value, executing the DSL query request in the ES cluster, and returning the query result; and storing the query cache ID and the query result in the ES gateway memory. Through these steps, high-concurrency and complex queries are prevented from exhausting the computing resources of the ES cluster.

Description

Gateway flow control-based Elasticsearch query method and system
Technical Field
The invention relates to the technical field of data text processing, and in particular to an Elasticsearch query method and system based on gateway flow control.
Background
Elasticsearch, abbreviated as ES, is a distributed, highly scalable, near-real-time search and data analysis engine. It makes it convenient to search, analyze, and explore large amounts of data. Making full use of the horizontal scalability of Elasticsearch makes data more valuable in a production environment. Elasticsearch itself serves as a normalized, centralized data store, but highly concurrent, complex queries can exhaust the computing resources of the ES cluster.
Disclosure of Invention
Therefore, the gateway flow control-based Elasticsearch query method and system provided by the embodiments of the invention use gateway flow control to prevent high-concurrency and complex queries from exhausting the computing resources of the ES cluster.
In order to achieve the purpose, the invention provides the following technical scheme:
In a first aspect, an embodiment of the present invention provides an Elasticsearch query method based on gateway flow control, including:
obtaining request parameters of a DSL query request, wherein the request parameters comprise a query time;
judging whether the request parameters meet the query protection conditions of the ES cluster;
if the request parameters do not meet the query protection conditions of the ES cluster, terminating the DSL query request and returning a query abnormal value;
if the request parameters meet the query protection conditions of the ES cluster, performing time normalization processing and time period segmentation processing on the query time according to a preset cache time interval;
calculating a hash value of the request parameters, and using the hash value as the query cache ID of the DSL query request;
judging whether the query cache ID exists among the gateway cache IDs of the ES gateway;
if the query cache ID exists among the gateway cache IDs of the ES gateway, sending the query result that is stored in the memory of the ES gateway and corresponds to the query cache ID;
if the query cache ID does not exist among the gateway cache IDs of the ES gateway, calculating the DSL score value of the DSL query request;
judging whether the DSL score value is larger than the remaining score value of the ES cluster in the current time period;
if the DSL score value is larger than the remaining score value of the ES cluster in the current time period, queuing the DSL query request to wait for the ES gateway to release score values again at the next time;
if the DSL score value is less than or equal to the remaining score value of the ES cluster in the current time period, subtracting the DSL score value from the remaining score value of the current time period to obtain the updated remaining score value of the current time period, executing the search of the DSL query request in the ES cluster, and returning a query result;
and storing the query cache ID and the query result corresponding to the DSL query request in the ES gateway memory.
In one embodiment, the query protection condition of the ES cluster includes: the query time span is smaller than the preset time span, the number of fuzzy matching conditions is smaller than the preset number, the aggregation dimensionality is smaller than the preset dimensionality, and the ES cluster load value is smaller than the preset load.
In one embodiment, the step of performing time normalization processing and time period segmentation processing on the query time according to a preset cache time interval includes:
calculating the integer-multiple time points of the preset cache time interval;
converting the query start time and the query end time of the query time to integer multiples of the preset cache time interval to obtain the normalized query time;
and performing time period segmentation on the normalized query time according to the preset cache time interval.
In one embodiment, the step of calculating the DSL score value of the DSL query request comprises:
determining the query time range, the number of fuzzy matching conditions, and the aggregation dimension according to the DSL query request;
and accumulating the score values of the query time range, the number of fuzzy matching conditions, and the aggregation dimension according to the preset calculation scores to obtain the DSL score value of the DSL query request.
In one embodiment, the score value released by the ES gateway at each time is determined according to the ES cluster load and the ES cluster hardware.
In one embodiment, the method further comprises: recording a request start time and a request end time of the DSL query request.
In a second aspect, an embodiment of the present invention provides an Elasticsearch query system based on gateway flow control, including:
an obtaining module, configured to obtain request parameters of a DSL query request, where the request parameters include a query time;
a first judging module, configured to judge whether the request parameters meet the query protection conditions of the ES cluster;
a first processing module, configured to terminate the DSL query request and return a query abnormal value if the request parameters do not meet the query protection conditions of the ES cluster;
a second processing module, configured to perform time normalization processing and time period segmentation processing on the query time according to a preset cache time interval if the request parameters meet the query protection conditions of the ES cluster;
a third processing module, configured to calculate a hash value of the request parameters and use the hash value as the query cache ID of the DSL query request;
a second judging module, configured to judge whether the query cache ID exists among the gateway cache IDs of the ES gateway;
a fourth processing module, configured to send, if the query cache ID exists among the gateway cache IDs of the ES gateway, the query result that is stored in the ES gateway memory and corresponds to the query cache ID;
a fifth processing module, configured to calculate the DSL score value of the DSL query request if the query cache ID does not exist among the gateway cache IDs of the ES gateway;
a third judging module, configured to judge whether the DSL score value is greater than the remaining score value of the ES cluster in the current time period;
a sixth processing module, configured to queue the DSL query request to wait for the ES gateway to release score values again at the next time if the DSL score value is greater than the remaining score value of the ES cluster in the current time period;
a seventh processing module, configured to, if the DSL score value is less than or equal to the remaining score value of the ES cluster in the current time period, subtract the DSL score value from the remaining score value of the current time period to obtain the updated remaining score value of the current time period, execute the search of the DSL query request in the ES cluster, and return a query result;
and an eighth processing module, configured to store the query cache ID and the query result corresponding to the DSL query request in the ES gateway memory.
In a third aspect, an embodiment of the present invention provides a computer device, including: at least one processor, and a memory communicatively connected to the at least one processor, wherein the memory stores instructions executable by the at least one processor, and the instructions are executed by the at least one processor to cause the at least one processor to perform the gateway flow control-based Elasticsearch query method according to the first aspect of the embodiment of the present invention.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, where computer instructions are stored, and the computer instructions are configured to cause a computer to execute the gateway flow control based Elasticsearch query method according to the first aspect of the embodiment of the present invention.
The technical scheme of the invention has the following advantages:
The invention provides an Elasticsearch query method and system based on gateway flow control. First, it is judged whether the DSL request parameters meet the query protection conditions of the ES cluster; if not, the query is terminated and a query abnormal value is returned; if so, time normalization processing and time period segmentation processing are performed on the query time according to a preset cache time interval. Then, a hash value of the request parameters is calculated and used as the query cache ID, and it is judged whether the query cache ID exists among the gateway cache IDs of the ES gateway. If it exists, the query result stored in the memory of the ES gateway is fed back directly to the client; if not, the DSL score value of the DSL query request is calculated. Next, it is judged whether the DSL score value is larger than the remaining score value of the ES cluster in the current time period. If it is larger, the DSL query request is queued to wait for the ES gateway to release score values again at the next time; if it is not larger, the DSL score value is subtracted from the remaining score value of the current time period to obtain the updated remaining score value, the search of the DSL query request is executed in the ES cluster, and a query result is returned. Finally, the query cache ID and the query result corresponding to the DSL query request are stored in the ES gateway memory. These steps implement flow control at the ES gateway and prevent high-concurrency and complex queries from exhausting the computing resources of the ES cluster.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained by those skilled in the art without creative efforts.
Fig. 1 is a flowchart of a specific example of an Elasticsearch query method based on gateway flow control according to an embodiment of the present invention;
Fig. 2 is a flowchart of another specific example of an Elasticsearch query method based on gateway flow control according to an embodiment of the present invention;
Fig. 3 is a schematic diagram of the ES gateway memory in an idle condition in the gateway flow control-based Elasticsearch query method according to an embodiment of the present invention;
Fig. 4 is a histogram of cache misses in the ES gateway memory in the gateway flow control-based Elasticsearch query method according to an embodiment of the present invention;
Fig. 5 is a schematic diagram of ES gateway memory usage in the gateway flow control-based Elasticsearch query method according to an embodiment of the present invention;
Fig. 6 is a histogram of cache hits in the ES gateway memory in the gateway flow control-based Elasticsearch query method according to an embodiment of the present invention;
Fig. 7 is a block diagram of an example of an Elasticsearch query system based on gateway flow control according to an embodiment of the present invention;
Fig. 8 is a block diagram of a specific example of a computer device according to an embodiment of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it should be understood that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
In addition, the technical features involved in the different embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Example 1
An embodiment of the present invention provides an Elasticsearch query method based on gateway flow control, as shown in fig. 1, the method includes the following steps:
step S1: obtaining request parameters of a DSL query request, wherein the request parameters comprise query time.
In this embodiment, DSL (Domain Specific Language) is the JSON-based syntax used for Elasticsearch queries. The request parameters of the DSL query request include parameters such as the query time and the query conditions of the DSL query request. The query time includes a query start time and a query end time, that is, the query time range corresponding to the DSL query request.
Step S2: judging whether the request parameters meet the query protection conditions of the ES cluster.
Elasticsearch, abbreviated as ES, is a Lucene-based search server.
In this embodiment, the ES cluster (Elasticsearch cluster) needs to be protected so that its computing pressure does not become excessive and so that customer computing requirements are met stably and to the greatest extent possible. Query protection conditions are therefore set for the ES cluster. If a DSL query request that does not satisfy the query protection conditions were executed, it could consume a large amount of the computing power of the ES cluster, leaving the cluster unable to process other query requests and affecting the queries of other users. The requested search is performed only when the DSL query request meets the query protection conditions.
In this embodiment, when the query time span is too large, there are too many fuzzy matching conditions, the aggregation dimension is too large, or the ES cluster is overloaded, the cluster is protected by temporarily tightening the flow so that the cluster pressure does not become excessive. Therefore, the query protection conditions of the ES cluster include: the query time span is smaller than the preset time span, the number of fuzzy matching conditions is smaller than the preset number, the aggregation dimension is smaller than the preset dimension, and the ES cluster load value is smaller than the preset load.
Specifically, the preset time span may be 5 years, the preset number may be 2, the preset dimension may be 3 dimensions, and the preset load may be 80% of the overall load of the ES cluster. This is only illustrated schematically in the present embodiment, and is not limited thereto.
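The protection check described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the names `ProtectionLimits` and `meets_protection_conditions` are hypothetical, the thresholds are the example values given in this embodiment, and expressing the time span in days and the load as a fraction are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProtectionLimits:
    # Example thresholds from the embodiment; all are configurable presets.
    max_span_days: float = 5 * 365   # preset time span: 5 years (in days, an assumption)
    max_fuzzy_conditions: int = 2    # preset number of fuzzy matching conditions
    max_agg_dimensions: int = 3      # preset aggregation dimension
    max_cluster_load: float = 0.80   # preset load: 80% of the overall cluster load


def meets_protection_conditions(span_days, fuzzy_count, agg_dims, cluster_load,
                                limits=ProtectionLimits()):
    """Return True only if every value is strictly below its preset limit."""
    return (span_days < limits.max_span_days
            and fuzzy_count < limits.max_fuzzy_conditions
            and agg_dims < limits.max_agg_dimensions
            and cluster_load < limits.max_cluster_load)
```

A request that fails any single condition is rejected, which matches the "smaller than" wording of the conditions; for instance, two fuzzy matching conditions would already be refused under a preset number of 2.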
Step S3: if the request parameters do not meet the query protection conditions of the ES cluster, terminating the DSL query request and returning a query abnormal value.
In this embodiment, when the request parameters do not satisfy the configured ES cluster query protection conditions, the DSL query request cannot be performed; that is, the DSL query request is terminated and a query abnormal value is returned. The query abnormal value indicates that the DSL query request cannot be executed. Specifically, it may be an exception code such as 404, or a message such as "query exception"; this is only a schematic illustration in the present embodiment, and the value may be set reasonably according to actual needs.
Step S4: if the request parameters meet the query protection conditions of the ES cluster, performing time normalization processing and time period segmentation processing on the query time according to a preset cache time interval.
In this embodiment, when the request parameters satisfy the ES cluster query protection conditions, the query time is normalized and sliced according to a preset cache time interval.
The preset cache time interval must divide 3600 evenly (1 hour equals 3600 seconds), for example 5 minutes, 3 minutes, 15 seconds, or 30 seconds, so that time can be cut into equal slices. Time normalization converts the query start time and the query end time of the DSL query request to nearby fixed time points so that query times are unified. After normalization, the normalized query time is cut into equal slices according to the preset cache time interval.
Step S5: calculating a hash value of the request parameters, and using the hash value as the query cache ID of the DSL query request.
In this embodiment, the request parameters, after the query time has been sliced, are converted into a character string; md5() is then applied to obtain a hash value, and the hash value is used as the query cache ID of the DSL query request.
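A minimal sketch of the cache ID computation, using Python's standard `hashlib.md5`. The function name `query_cache_id` is hypothetical, and serializing the parameters as JSON with sorted keys is an assumption; the text only states that the parameters are converted into a character string.

```python
import hashlib
import json


def query_cache_id(request_params: dict) -> str:
    # Assumption: sort keys and use compact separators so that logically
    # identical requests always produce the same string, whatever the key order.
    canonical = json.dumps(request_params, sort_keys=True,
                           separators=(",", ":"), ensure_ascii=False)
    # The MD5 hex digest of the canonical string serves as the query cache ID.
    return hashlib.md5(canonical.encode("utf-8")).hexdigest()
```

Because the query time has already been normalized, two requests whose raw time ranges differ by a few seconds but fall on the same fixed time points would yield the same ID.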
Step S6: judging whether the query cache ID exists among the gateway cache IDs of the ES gateway.
In this embodiment, the hash values of historical DSL query requests are stored as the gateway cache IDs of the ES gateway, and the results of those historical requests are stored in the ES gateway memory. The gateway uses OpenResty's LRU memory cache (key-value): the key is the ID and the value is the search result.
The query cache ID corresponding to the DSL query request is matched against the hash values among the gateway cache IDs. When the query cache ID exists among the gateway cache IDs, step S7 is executed; when it does not exist, step S8 is executed.
Step S7: if the query cache ID exists among the gateway cache IDs of the ES gateway, sending the query result that is stored in the memory of the ES gateway and corresponds to the query cache ID.
In this embodiment, when the query cache ID exists among the gateway cache IDs of the ES gateway, the DSL query request is the same as a previous historical query and its query result already exists in the ES gateway memory. There is therefore no need to search the ES cluster again; the query result stored in the ES gateway memory is returned to the requesting end, which prevents repeated requests from consuming ES cluster computing resources.
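The key-value LRU lookup can be illustrated with a small in-memory cache. This is a sketch in Python rather than the OpenResty cache the embodiment uses; the class name `GatewayCache` and its capacity parameter are hypothetical.

```python
from collections import OrderedDict


class GatewayCache:
    """Tiny LRU cache: key is the query cache ID, value is the query result."""

    def __init__(self, capacity=1000):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, cache_id):
        # A miss returns None, which corresponds to proceeding to step S8.
        if cache_id not in self._items:
            return None
        self._items.move_to_end(cache_id)  # mark as most recently used
        return self._items[cache_id]

    def put(self, cache_id, result):
        self._items[cache_id] = result
        self._items.move_to_end(cache_id)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # evict the least recently used entry
```

A hit returns the stored result without touching the ES cluster; a miss falls through to scoring and admission.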
Step S8: if the query cache ID does not exist among the gateway cache IDs of the ES gateway, calculating the DSL score value of the DSL query request.
In this embodiment, when the query cache ID does not exist among the gateway cache IDs of the ES gateway, the DSL query request differs from previous historical queries and must be queried in the ES cluster. The DSL score value of the DSL query request is calculated first: the DSL query request is scored according to its syntactic complexity to obtain a score value.
Step S9: judging whether the DSL score value is larger than the remaining score value of the ES cluster in the current time period.
In this embodiment, when the DSL score value is greater than the remaining score value of the ES cluster in the current time period, step S10 is executed; when the DSL score value is not greater than the remaining score value of the ES cluster in the current time period, step S11 is executed.
Step S10: if the DSL score value is larger than the remaining score value of the ES cluster in the current time period, queuing the DSL query request to wait for the ES gateway to release score values again at the next time.
In this embodiment, the remaining score value of the ES cluster in the current time period represents the current data processing capacity of the ES cluster. When the DSL score value is greater than the remaining score value of the current time period, the current processing capacity of the ES cluster cannot satisfy the DSL query request, so the DSL query request is placed in a serial queue. Once the score of the current period has been consumed, the request waits for the ES gateway to release score values again at the next time, and the scores are then compared again.
The score value released by the ES gateway at each time is determined according to the ES cluster load and the ES cluster hardware. Specifically, in each time interval, the release score of the current time interval is set according to conditions such as the pressure on the ES cluster and the number of ES cluster servers; for example, the ES cluster releases 1000 points per minute. When the ES cluster load is low and the ES cluster hardware capacity is high, the release score may be increased; otherwise, the release score may be decreased. This is only a schematic illustration in this embodiment, and the value may be set reasonably according to actual needs.
Step S11: if the DSL score value is less than or equal to the remaining score value of the ES cluster in the current time period, subtracting the DSL score value from the remaining score value of the current time period to obtain the updated remaining score value of the current time period, executing the search of the DSL query request in the ES cluster, and returning a query result.
In this embodiment, when the DSL score value is less than or equal to the remaining score value of the ES cluster in the current time period, the DSL score value of the query request is subtracted from the remaining score value of the current time period. The DSL query request is then executed as a search in the ES cluster, and the query result is returned to the requesting end.
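Steps S9 to S11 amount to a per-period score budget with a waiting queue. The sketch below is one possible reading, with a hypothetical class `ScoreBudget`; resetting the budget to a fixed value at each release and admitting queued requests in first-in, first-out order are assumptions not stated in the text.

```python
from collections import deque


class ScoreBudget:
    def __init__(self, release_per_period):
        self.release_per_period = release_per_period  # e.g. 1000 points per minute
        self.remaining = release_per_period           # remaining score of the current period
        self.waiting = deque()                        # queued (request_id, score) pairs

    def try_admit(self, request_id, score):
        """Admit the request if its score fits the remaining budget, else queue it."""
        if score > self.remaining:
            self.waiting.append((request_id, score))  # step S10: queue and wait
            return False
        self.remaining -= score                       # step S11: deduct the score
        return True

    def release(self):
        """Start a new period and admit queued requests while the budget allows.

        Note: a request scoring more than release_per_period would never be
        admitted by this sketch; a real gateway would need a policy for that.
        """
        self.remaining = self.release_per_period
        admitted = []
        while self.waiting and self.waiting[0][1] <= self.remaining:
            request_id, score = self.waiting.popleft()
            self.remaining -= score
            admitted.append(request_id)
        return admitted
```

In use, `release()` would be called by a timer once per period, and every admitted request would then be executed against the ES cluster.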
Step S12: storing the query cache ID and the query result corresponding to the DSL query request in the ES gateway memory.
In this embodiment, the query cache ID and the query result corresponding to the DSL query request are stored in the ES gateway memory, so that a subsequent identical query request is answered directly from the gateway memory, saving ES cluster resources.
In summary, it is first judged whether the DSL request parameters meet the query protection conditions of the ES cluster; if not, the query is terminated and a query abnormal value is returned; if so, time normalization processing and time period segmentation processing are performed on the query time according to a preset cache time interval. Then, a hash value of the request parameters is calculated and used as the query cache ID, and it is judged whether the query cache ID exists among the gateway cache IDs of the ES gateway. If it exists, the query result stored in the memory of the ES gateway is fed back directly to the client; if not, the DSL score value of the DSL query request is calculated. Next, it is judged whether the DSL score value is larger than the remaining score value of the ES cluster in the current time period. If it is larger, the DSL query request is queued to wait for the ES gateway to release score values again at the next time; if it is not larger, the DSL score value is subtracted from the remaining score value of the current time period to obtain the updated remaining score value, the search of the DSL query request is executed in the ES cluster, and a query result is returned. Finally, the query cache ID and the query result corresponding to the DSL query request are stored in the ES gateway memory. These steps implement flow control at the ES gateway and prevent high-concurrency and complex queries from exhausting the computing resources of the ES cluster.
In the embodiment of the present invention, step S4 of performing time normalization processing and time period segmentation processing on the query time according to the preset cache time interval includes steps S41-S43.
Step S41: calculating the integer-multiple time points of the preset cache time interval.
In this embodiment, the preset cache time interval is set so that it divides 3600 evenly (1 hour equals 3600 seconds), for example 5 minutes, 3 minutes, 15 seconds, or 30 seconds, so that time can be sliced equally.
Step S42: converting the query start time and the query end time of the query time to integer multiples of the preset cache time interval to obtain the normalized query time.
In this embodiment, the query start time and the query end time are converted to nearby fixed time points. Specifically, if the preset cache time interval is set to 2 minutes, 1 hour is divided into 30 equal parts, that is, into the following time points (00:00, 02:00, 04:00, 06:00, 08:00, ..., 58:00, 1:00:00), and the query time t is rounded down to the nearest of these time points (the closest one on the left).
For example:
(a) The cache time interval is set to 10 seconds
Query time range: start: 2021-04-01 10:00:04, end: 2021-04-01 10:10:06
Slicing time range: start: 2021-04-01 10:00:00, end: 2021-04-01 10:10:00
(b) The cache time interval is set to 2 minutes
Query time range: start: 2021-04-01 10:01:04, end: 2021-04-03 10:16:06
Slicing time range: start: 2021-04-01 10:00:00, end: 2021-04-03 10:16:00
(c) The cache time interval is set to 15 seconds
Query time range: start: 2021-04-01 10:01:16, end: 2021-05-01 13:10:44
Slicing time range: start: 2021-04-01 10:01:15, end: 2021-05-01 13:10:30
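The rounding used in examples (a) to (c) can be reproduced with a short function. This is a sketch under the assumption that times are rounded down within the hour; the function name `normalize_down` is hypothetical.

```python
from datetime import datetime, timedelta


def normalize_down(t: datetime, interval_seconds: int) -> datetime:
    """Round t down to the nearest multiple of the cache interval within its hour."""
    if interval_seconds <= 0 or 3600 % interval_seconds != 0:
        raise ValueError("the cache interval must divide 3600 evenly")
    seconds_into_hour = t.minute * 60 + t.second
    floored = seconds_into_hour - seconds_into_hour % interval_seconds
    # Rebuild the timestamp from the start of the hour plus the floored offset.
    return t.replace(minute=0, second=0, microsecond=0) + timedelta(seconds=floored)
```

With a 15-second interval, 10:01:16 becomes 10:01:15 and 13:10:44 becomes 13:10:30, which agrees with example (c).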
Step S43: performing time period segmentation on the normalized query time according to the preset cache time interval.
In this embodiment, the normalized query time range is divided into equal slices according to the preset cache time interval.
In this way, the query start time and the query end time are converted to nearby fixed time points, which normalizes the query time and facilitates subsequent queries.
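The equal slicing of step S43 can be sketched as below. The function name `slice_range` is hypothetical, and the sketch assumes that both endpoints have already been normalized to multiples of the cache interval.

```python
from datetime import datetime, timedelta


def slice_range(start: datetime, end: datetime, interval_seconds: int):
    """Split [start, end) into consecutive slices of one cache interval each."""
    step = timedelta(seconds=interval_seconds)
    slices = []
    cursor = start
    while cursor < end:
        slices.append((cursor, cursor + step))
        cursor += step
    return slices
```

For long ranges with short intervals the list becomes large, so a real gateway might generate slices lazily instead; that choice is outside what the text specifies.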
In the embodiment of the present invention, the step of calculating the DSL score value of the DSL query request in step S8 includes steps S81 to S82.
Step S81: determine the range of the query time, the number of fuzzy matching conditions and the aggregation dimension according to the DSL query request.
In this embodiment, the range of the query time, the number of fuzzy matching conditions, and the aggregation dimension may be obtained from the request parameters of the DSL query request. A fuzzy matching condition is an Elasticsearch query condition that is relatively vague (for example, one using a wildcard that stands for any character). Such a condition can match too much content, slow the query down, consume a large share of the ES cluster's computing power and increase the cluster load, so it should be avoided as far as possible in real-time queries.
Step S82: accumulate the scores for the query time range, the number of fuzzy matching conditions and the aggregation dimension according to the preset calculation scores to obtain the DSL score value of the DSL query request.
In this embodiment, the preset calculation scores include a first score coefficient corresponding to the query time, a second score coefficient corresponding to the fuzzy matching conditions, and a third score coefficient corresponding to the aggregation dimension. Specifically, the first score coefficient is 5 points/month, the second score coefficient is 5 points/condition, and the third score coefficient is 10 points/dimension.
Total score value = N1 × C1 + N2 × C2 + N3 × C3
where N1 is the number of months covered by the query time range, C1 is the first score coefficient, N2 is the number of fuzzy matching conditions, C2 is the second score coefficient, N3 is the number of aggregation dimensions, and C3 is the third score coefficient.
In this way, the score of the DSL query request is calculated from the query time span, the number of fuzzy matching conditions and the aggregation dimension, and the score serves as a measure of the syntactic complexity of the query request.
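As a worked illustration of the scoring formula, the accumulation in step S82 can be written as a single function. This is a sketch only; the function name `dsl_score` and its parameter names are hypothetical, and the default coefficients (5, 5 and 10) are the example values given in this embodiment.

```python
def dsl_score(months: int, fuzzy_conditions: int, agg_dimensions: int,
              c1: int = 5, c2: int = 5, c3: int = 10) -> int:
    """Total score value = N1*C1 + N2*C2 + N3*C3."""
    # months           -> N1, months covered by the query time range
    # fuzzy_conditions -> N2, number of fuzzy matching conditions
    # agg_dimensions   -> N3, number of aggregation dimensions
    return months * c1 + fuzzy_conditions * c2 + agg_dimensions * c3
```

For instance, a query spanning 3 months with 2 fuzzy matching conditions and 1 aggregation dimension scores 3 × 5 + 2 × 5 + 1 × 10 = 35 points.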
In the embodiment of the present invention, the method further includes: recording a request start time and a request end time of the DSL query request.
In this embodiment, when the DSL query request is acquired in step S1, the request start time may be recorded; when the query request finishes, the request end time is recorded. Specifically, the DSL query request, the request start time and the request end time are recorded in a log table, which makes it easy to analyze and monitor the duration of the whole query request afterwards.
The following describes in detail an Elasticsearch query method based on gateway flow control with a specific example, and a detailed flowchart is shown in fig. 2.
(1) Log recording
Record the request start time and the request parameters of the DSL query request.
(2) Red-line check
Protect the ES cluster by setting red-line (limit) values, generally on the query time span, the number of fuzzy matching conditions and the aggregation dimension; an exception is returned if a limit is exceeded.
(3) Red line reached: set an exception code and return directly
For example: the query spans 5 years and contains complex fuzzy matching conditions.
(4) Time slicing grid
The configured interval must divide 3600 evenly (1 hour = 3600 seconds), for example 5 minutes, 3 minutes, 15 seconds or 30 seconds, so that time can be cut into equal slices. The start time and the end time are each snapped to a fixed time point. With a cache time of 2 minutes, 1 hour is cut into 30 equal parts (00:00, 02:00, 04:00, 06:00, 08:00, ..., 58:00, 1:00:00), and a time t is rounded down to the nearest boundary at or before it.
For example:
(a) The cache time is set to 10 seconds
Query time range: start: 2021-04-01 10:00:04, end: 2021-04-01 10:10:06
Sliced time range: start: 2021-04-01 10:00:00, end: 2021-04-01 10:10:00
(b) The cache time is set to 2 minutes
Query time range: start: 2021-04-01 10:01:04, end: 2021-04-03 10:16:06
Sliced time range: start: 2021-04-01 10:00:00, end: 2021-04-03 10:16:00
(c) The cache time is set to 15 seconds
Query time range: start: 2021-04-01 10:01:16, end: 2021-05-01 13:10:44
Sliced time range: start: 2021-04-01 10:01:15, end: 2021-05-01 13:10:30
(5) Compute the gateway cache ID for the DSL (JSON form)
The DSL is converted to a string and md5() is computed over it. The result, a unique ID (hash value) of the query statement, serves as the cache ID.
(6) Query the gateway cache with the cache ID
The gateway cache uses OpenResty's LRU memory cache (key-value); the key is the cache ID and the value is the search result. If the cache ID is found, go to step (7).
(7) Fetch from cache
Obtain the search result by its cache ID and return it to the requester.
(8) DSL scoring
The score is set manually according to the syntactic complexity of the DSL; scores range from 1 to 100 points.
(9) Remaining score for the current time period
The score released for each time period is set according to the ES cluster pressure in that period (for example, the ES cluster releases 1000 points per minute); each new request consumes (subtracts) its own score from the remaining score.
(10) Queue and wait
If the score of the current ES cluster is used up, the request waits until the ES gateway releases score again at the next time point.
(11) Determine the ES cluster
To cope with a crash of the ES cluster, the ES gateway switches to the ES backup cluster according to the cluster administrator's settings, which keeps the product stable.
(12) Execute the DSL
The ES cluster executes the DSL statement and returns the result.
(13) Cache the execution result under the request's unique ID (key-value)
An LRU cache is used because memory is limited.
(20) Log merge
Record the request end time and the corresponding status code.
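Steps (5) to (7) and (13) of the walk-through above can be sketched in a few lines. The example describes an OpenResty LRU cache; the Python below is only a stand-in to show the idea, not the gateway's actual code. The names `cache_id`, `LRUCache` and `query` are hypothetical, and serializing the DSL with sorted keys is an added assumption so that logically identical queries map to the same ID.

```python
import hashlib
import json
from collections import OrderedDict


def cache_id(dsl: dict) -> str:
    """MD5 hex digest of the DSL serialized as a string (step 5)."""
    # Assumption: sorted keys and compact separators give a stable string.
    text = json.dumps(dsl, sort_keys=True, separators=(",", ":"))
    return hashlib.md5(text.encode("utf-8")).hexdigest()


class LRUCache:
    """Bounded key-value cache that evicts the least recently used entry."""

    def __init__(self, capacity: int):
        self.capacity = capacity
        self._items = OrderedDict()

    def get(self, key):
        if key not in self._items:
            return None
        self._items.move_to_end(key)  # mark as most recently used
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self.capacity:
            self._items.popitem(last=False)  # drop the oldest entry


def query(dsl: dict, cache: LRUCache, execute):
    """Return a cached result if present (steps 6-7), else execute and cache (12-13)."""
    key = cache_id(dsl)
    hit = cache.get(key)
    if hit is not None:
        return hit
    result = execute(dsl)  # `execute` stands in for the ES cluster call
    cache.put(key, result)
    return result
```

Calling `query` twice with the same DSL invokes `execute` only once; the second call is served from the cache. Bounding the cache by entry count is a simplification, since a real gateway would more likely bound it by memory size.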
The following specific example shows how the ES gateway memory is used when the gateway flow control based Elasticsearch query method runs in practice. FIG. 3 shows the idle ES gateway memory; FIG. 4 is a histogram of cache misses in the ES gateway memory; FIG. 5 shows the used ES gateway memory; FIG. 6 is a histogram of cache hits in the ES gateway memory.
As can be seen from the figures, the total capacity of the ES gateway memory is 10 GB, and data is aggregated over 5-minute intervals. FIG. 3 shows the idle ES gateway memory, and FIG. 4 shows the number of DSL query requests that miss the ES gateway memory. FIG. 5 shows the used ES gateway memory, and FIG. 6 shows the number of DSL query requests that hit the ES gateway memory.
Specifically, take the first column in FIG. 4 and FIG. 6 as an example. Between 09:30 and 09:35, the number of cache hits is 10, that is, 10 DSL query requests need no ES cluster query and the query results stored in the ES gateway memory are returned directly; the number of cache misses is 50, that is, the results of 50 DSL query requests are not in the ES gateway memory and an ES cluster query must be performed. The results of repeated requests are stored in the ES gateway memory and returned directly to the client without querying the ES cluster again, which avoids traffic to the cluster and reduces the consumption of computing resources.
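The figures in this example translate directly into a cache hit ratio. The helper below is a small illustrative calculation; the name `hit_ratio` is hypothetical and the ratio itself is not stated in the description.

```python
def hit_ratio(hits: int, misses: int) -> float:
    """Fraction of requests answered from the gateway cache."""
    total = hits + misses
    return hits / total if total else 0.0
```

For the 09:30 to 09:35 interval, `hit_ratio(10, 50)` is 10 / 60, or about 16.7%, so roughly one request in six was answered without touching the ES cluster.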
Example 2
An embodiment of the present invention provides an Elasticsearch query system based on gateway flow control, as shown in fig. 7, including:
an obtaining module 1, configured to obtain a request parameter of a DSL query request, where the request parameter includes a query time; this module executes the method described in step S1 in embodiment 1, and is not described herein again.
A first judging module 2, configured to judge whether the request parameter meets an inquiry protection condition of the ES cluster; this module executes the method described in step S2 in embodiment 1, and is not described herein again.
A first processing module 3, configured to terminate the DSL query request and return a query abnormal value if the request parameter does not satisfy the query protection condition of the ES cluster; this module executes the method described in step S3 in embodiment 1, and is not described herein again.
The second processing module 4 is configured to, if the request parameter meets the query protection condition of the ES cluster, perform time normalization processing and time segment segmentation processing on the query time according to a preset cache time interval; this module executes the method described in step S4 in embodiment 1, and is not described herein again.
A third processing module 5, configured to calculate a hash value of the request parameter, and use the hash value as the query cache ID of the DSL query request; this module executes the method described in step S5 in embodiment 1, and is not described herein again.
A second judging module 6, configured to judge whether the query cache ID exists in a gateway cache ID of the ES gateway; this module executes the method described in step S6 in embodiment 1, and is not described herein again.
A fourth processing module 7, configured to send, if the query cache ID exists in a gateway cache ID of an ES gateway, a query result corresponding to the query cache ID stored in an ES gateway memory; this module executes the method described in step S7 in embodiment 1, and is not described herein again.
A fifth processing module 8, configured to calculate a DSL score value of the DSL query request if the query cache ID does not exist in the gateway cache ID of the ES gateway; this module executes the method described in step S8 in embodiment 1, and is not described herein again.
A third judging module 9, configured to judge whether the DSL score value is greater than the remaining score value of the ES cluster in the current time period; this module executes the method described in step S9 in embodiment 1, and is not described herein again.
A sixth processing module 10, configured to queue the DSL query request to wait until the ES gateway releases score again at the next time point if the DSL score value is greater than the remaining score value of the ES cluster in the current time period; this module executes the method described in step S10 in embodiment 1, and is not described herein again.
A seventh processing module 11, configured to, if the DSL score value is less than or equal to the current time period remaining score value of the ES cluster, subtract the DSL score value from the current time period remaining score value as an updated current time period remaining score value, perform a search of a DSL query request in the ES cluster, and return a query result; this module executes the method described in step S11 in embodiment 1, and is not described herein again.
An eighth processing module 12, configured to store the query cache ID and the query result corresponding to the DSL query request in an ES gateway memory; this module executes the method described in step S12 in embodiment 1, and is not described herein again.
In this embodiment of the present invention, the query protection condition of the ES cluster includes: the query time span is smaller than the preset time span, the number of fuzzy matching conditions is smaller than the preset number, the aggregation dimensionality is smaller than the preset dimensionality, and the ES cluster load value is smaller than the preset load.
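The four query protection conditions amount to a conjunction of threshold checks. The sketch below illustrates this under stated assumptions: the `Limits` fields, the function name `passes_protection`, the choice of days as the unit for the time span, and any concrete threshold values are hypothetical, since the description does not fix them.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    """Preset red-line values (illustrative units, chosen for this sketch)."""
    max_span_days: int
    max_fuzzy_conditions: int
    max_agg_dimensions: int
    max_cluster_load: float


def passes_protection(span_days: int, fuzzy_conditions: int,
                      agg_dimensions: int, cluster_load: float,
                      limits: Limits) -> bool:
    """True only if every value is strictly below its preset limit."""
    return (span_days < limits.max_span_days
            and fuzzy_conditions < limits.max_fuzzy_conditions
            and agg_dimensions < limits.max_agg_dimensions
            and cluster_load < limits.max_cluster_load)
```

A request that fails any one check would be terminated with a query abnormal value rather than forwarded to the ES cluster.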
In an embodiment of the present invention, the second processing module includes: a first processing unit, configured to calculate the integer multiples of the preset cache time interval, where this unit executes the method described in step S41 in embodiment 1, and details are not described herein again; a second processing unit, configured to convert the query start time and the query end time of the query time to integer multiples of the preset cache time interval to obtain a normalized query time, where this unit executes the method described in step S42 in embodiment 1, and details are not described herein again; a third processing unit, configured to perform time period segmentation on the normalized query time according to the preset cache time interval, where this unit executes the method described in step S43 in embodiment 1, and details are not described herein again.
In an embodiment of the present invention, the fifth processing module includes: a fourth processing unit, configured to determine, according to the DSL query request, the range of the query time, the number of fuzzy matching conditions, and the aggregation dimension, where this unit executes the method described in step S81 in embodiment 1, and details are not described herein again; a fifth processing unit, configured to accumulate the scores for the query time range, the number of fuzzy matching conditions, and the aggregation dimension according to the preset calculation scores to obtain the DSL score value of the DSL query request, where this unit executes the method described in step S82 in embodiment 1, and details are not described herein again.
In the embodiment of the invention, the score value released by the ES gateway at each time point is determined according to the ES cluster load and the ES cluster hardware.
In the embodiment of the present invention, the system further includes: a ninth processing module, configured to record a request start time and a request end time of the DSL query request.
The gateway flow control based Elasticsearch query system provided by the embodiment of the invention first judges whether the DSL request parameters meet the query protection conditions of the ES cluster. If not, the query is terminated and a query abnormal value is returned; if so, time normalization and time period segmentation are performed on the query time according to the preset cache time interval. The system then calculates a hash value of the request parameters, uses it as the query cache ID, and judges whether the query cache ID exists among the gateway cache IDs of the ES gateway. If it exists, the query result stored in the ES gateway memory is fed back directly to the client; if not, the DSL score value of the DSL query request is calculated. The system then judges whether the DSL score value is larger than the remaining score value of the ES cluster in the current time period. If it is larger, the DSL query request is queued to wait until the ES gateway releases score again at the next time point; if it is not larger, the DSL score value is subtracted from the remaining score value of the current time period to obtain the updated remaining score value, the search of the DSL query request is executed in the ES cluster, and the query result is returned. Finally, the query cache ID and the query result corresponding to the DSL query request are stored in the ES gateway memory. The system thereby implements flow control at the ES gateway and prevents highly concurrent, complex queries from exhausting the computing resources of the ES cluster.
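The score-based admission described here behaves like a budget that is refilled once per time period. The following sketch shows one way to express it; it is an illustration under assumptions, not the disclosed implementation. The class name `ScoreBudget`, its methods, and the first-in-first-out handling of queued requests at release time are all hypothetical choices.

```python
from collections import deque


class ScoreBudget:
    """Per-period score budget with a waiting queue for requests that do not fit."""

    def __init__(self, release_per_period: int):
        self.release_per_period = release_per_period
        self.remaining = release_per_period
        self.waiting = deque()

    def submit(self, request_id: str, score: int) -> bool:
        """Return True if the request may run now; otherwise queue it."""
        if score > self.remaining:
            self.waiting.append((request_id, score))
            return False
        self.remaining -= score  # consume the request's score
        return True

    def release(self) -> list:
        """Start a new period: refill the budget, then admit queued requests in order."""
        self.remaining = self.release_per_period
        admitted = []
        while self.waiting and self.waiting[0][1] <= self.remaining:
            request_id, score = self.waiting.popleft()
            self.remaining -= score
            admitted.append(request_id)
        return admitted
```

With a budget of 100 points per period, a 60-point request runs immediately, a following 50-point request is queued, and the queued request is admitted when `release()` is called for the next period. A request that scores more than the whole per-period budget would never be admitted in this sketch, so a real gateway would need to cap or reject such requests.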
Example 3
An embodiment of the present invention provides a computer device, as shown in fig. 8, including: at least one processor 401, such as a CPU (Central Processing Unit), at least one communication interface 403, a memory 404, and at least one communication bus 402. The communication bus 402 is used to enable connection and communication between these components. The communication interface 403 may include a display and a keyboard, and optionally may also include a standard wired interface and a standard wireless interface. The memory 404 may be a RAM (Random Access Memory) or a non-volatile memory, such as at least one disk memory. The memory 404 may optionally be at least one memory device located remotely from the processor 401. The processor 401 may execute the gateway flow control based Elasticsearch query method of embodiment 1. A set of program codes is stored in the memory 404, and the processor 401 calls the program codes stored in the memory 404 to execute the gateway flow control based Elasticsearch query method of embodiment 1.
The communication bus 402 may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus. The communication bus 402 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one line is shown in FIG. 8, but this does not represent only one bus or one type of bus.
The memory 404 may include a volatile memory, such as a random-access memory (RAM); the memory may also include a non-volatile memory, such as a flash memory, a hard disk drive (HDD) or a solid-state drive (SSD); the memory 404 may also comprise a combination of memories of the kinds described above.
The processor 401 may be a Central Processing Unit (CPU), a Network Processor (NP), or a combination of a CPU and an NP.
The processor 401 may further include a hardware chip. The hardware chip may be an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or a combination thereof. The PLD may be a complex programmable logic device (CPLD), a field-programmable gate array (FPGA), a generic array logic (GAL), or any combination thereof.
Optionally, the memory 404 is also used to store program instructions. The processor 401 may call a program instruction to implement the gateway flow control based Elasticsearch query method in embodiment 1 as described in this application.
An embodiment of the present invention further provides a computer-readable storage medium, where a computer-executable instruction is stored in the computer-readable storage medium, and the computer-executable instruction may execute the gateway flow control based Elasticsearch query method according to embodiment 1. The storage medium may be a magnetic Disk, an optical Disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a Flash Memory (Flash Memory), a Hard Disk Drive (Hard Disk Drive, abbreviated as HDD), or a Solid-State Drive (SSD); the storage medium may also comprise a combination of memories of the kind described above.
It should be understood that the above embodiments are given only for clarity of illustration and do not limit the implementations. Other variations and modifications will be apparent to persons skilled in the art in light of the above description. It is neither necessary nor possible to enumerate all embodiments here. Obvious variations or modifications may be made without departing from the spirit or scope of the invention.

Claims (9)

1. An Elasticsearch query method based on gateway flow control, characterized by comprising the following steps:
obtaining request parameters of a DSL query request, wherein the request parameters comprise a query time;
judging whether the request parameters meet the query protection conditions of the ES cluster;
if the request parameter does not meet the query protection condition of the ES cluster, terminating the DSL query request and returning a query abnormal value;
if the request parameter meets the query protection condition of the ES cluster, performing time normalization processing and time period segmentation processing on the query time according to a preset cache time interval;
calculating a hash value of the request parameters, and using the hash value as the query cache ID of the DSL query request;
judging whether the query cache ID exists in a gateway cache ID of the ES gateway or not;
if the query cache ID exists in the gateway cache ID of the ES gateway, sending a query result which is stored in an internal memory of the ES gateway and corresponds to the query cache ID;
if the query cache ID does not exist in the gateway cache ID of the ES gateway, calculating the DSL score value of the DSL query request;
judging whether the DSL score value is larger than the remaining score value of the ES cluster in the current time period;
if the DSL score value is larger than the remaining score value of the ES cluster in the current time period, queuing the DSL query request to wait until the ES gateway releases score again at the next time point;
if the DSL score value is less than or equal to the remaining score value of the ES cluster in the current time period, subtracting the DSL score value from the remaining score value of the current time period to obtain the updated remaining score value of the current time period, executing the search of the DSL query request in the ES cluster, and returning the query result;
and storing the query cache ID and the query result corresponding to the DSL query request to an ES gateway memory.
2. The gateway flow control based Elasticsearch query method according to claim 1, wherein the query protection conditions of the ES cluster include: the query time span is smaller than the preset time span, the number of fuzzy matching conditions is smaller than the preset number, the aggregation dimension is smaller than the preset dimension, and the ES cluster load value is smaller than the preset load.
3. The gateway flow control based Elasticsearch query method according to claim 1, wherein the step of performing time normalization processing and time period segmentation processing on the query time according to a preset cache time interval comprises:
calculating integral multiple time of a preset caching time interval;
converting the query starting time and the query ending time of the query time to integral times of a preset cache time interval to obtain normalized query time;
and carrying out time period segmentation on the normalized query time according to a preset cache time interval.
4. The gateway flow control based Elasticsearch query method of claim 1, wherein the step of calculating the DSL score value of the DSL query request comprises:
determining the range of query time, the number of fuzzy matching conditions and the aggregation dimension according to the DSL query request;
and accumulating the scores of the query time range, the number of fuzzy matching conditions and the aggregation dimension according to the preset calculation score to obtain the DSL score of the DSL query request.
5. The gateway flow control based Elasticsearch query method according to claim 1, wherein the score value released by the ES gateway at each time point is determined according to the ES cluster load and the ES cluster hardware.
6. The gateway flow control based Elasticsearch query method according to any one of claims 1 to 5, further comprising:
recording a request start time and a request end time of the DSL query request.
7. An Elasticsearch query system based on gateway flow control, characterized by comprising:
an obtaining module, configured to obtain a request parameter of a DSL query request, where the request parameter includes a query time;
the first judgment module is used for judging whether the request parameter meets the query protection condition of the ES cluster;
a first processing module, configured to terminate the DSL query request and return a query abnormal value if the request parameter does not satisfy the query protection condition of the ES cluster;
the second processing module is used for carrying out time normalization processing and time period segmentation processing on the query time according to a preset cache time interval if the request parameter meets the query protection condition of the ES cluster;
a third processing module, configured to calculate a hash value of the request parameter, and use the hash value as the query cache ID of the DSL query request;
the second judging module is used for judging whether the query cache ID exists in the gateway cache ID of the ES gateway or not;
a fourth processing module, configured to send, if the query cache ID exists in a gateway cache ID of an ES gateway, a query result corresponding to the query cache ID stored in an ES gateway memory;
a fifth processing module, configured to calculate a DSL score value of the DSL query request if the query cache ID does not exist in the gateway cache ID of the ES gateway;
a third judging module, configured to judge whether the DSL score value is greater than the remaining score value of the ES cluster in the current time period;
a sixth processing module, configured to queue the DSL query request to wait until the ES gateway releases score again at the next time point if the DSL score value is greater than the remaining score value of the ES cluster in the current time period;
a seventh processing module, configured to, if the DSL score value is less than or equal to the current time period remaining score value of the ES cluster, subtract the DSL score value from the current time period remaining score value as an updated current time period remaining score value, perform a search of a DSL query request in the ES cluster, and return a query result;
and the eighth processing module is configured to store the query cache ID and the query result corresponding to the DSL query request in an ES gateway memory.
8. A computer device, comprising: at least one processor, and a memory communicatively coupled to the at least one processor, wherein the memory stores instructions executable by the at least one processor, the instructions being executable by the at least one processor to cause the at least one processor to perform the gateway flow control based Elasticsearch query method of any of claims 1-6.
9. A computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions for causing the computer to execute the gateway flow control based Elasticsearch query method of any of claims 1-6.
CN202210667993.2A 2022-06-14 2022-06-14 Gateway flow control-based elastic search query method and system Active CN115080617B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210667993.2A CN115080617B (en) 2022-06-14 2022-06-14 Gateway flow control-based elastic search query method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210667993.2A CN115080617B (en) 2022-06-14 2022-06-14 Gateway flow control-based elastic search query method and system

Publications (2)

Publication Number Publication Date
CN115080617A true CN115080617A (en) 2022-09-20
CN115080617B CN115080617B (en) 2024-04-12

Family

ID=83250903

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210667993.2A Active CN115080617B (en) 2022-06-14 2022-06-14 Gateway flow control-based elastic search query method and system

Country Status (1)

Country Link
CN (1) CN115080617B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020161860A1 (en) * 2001-02-28 2002-10-31 Benjamin Godlin Method and system for differential distributed data file storage, management and access
US20170109436A1 (en) * 2015-10-16 2017-04-20 Arris Enterprises, Inc. Apparatus and method for providing alerts for network device errors and for resolving network device errors
CN107133267A (en) * 2017-04-01 2017-09-05 北京京东尚科信息技术有限公司 Inquire about method, device, electronic equipment and the readable storage medium storing program for executing of elasticsearch clusters
CN110417901A (en) * 2019-07-31 2019-11-05 北京金山云网络技术有限公司 Data processing method, device and gateway server

Also Published As

Publication number Publication date
CN115080617B (en) 2024-04-12

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant