CN112463825A - Elasticsearch cluster protection device, method, storage medium and computer equipment - Google Patents

Elasticsearch cluster protection device, method, storage medium and computer equipment

Info

Publication number
CN112463825A
Authority
CN
China
Prior art keywords
cluster
query
threshold
data
slow query
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011204062.6A
Other languages
Chinese (zh)
Inventor
王士强
刘伟
赵子健
杨晓勤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Construction Bank Corp
Original Assignee
China Construction Bank Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Construction Bank Corp
Priority to CN202011204062.6A
Publication of CN112463825A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24: Querying
    • G06F16/245: Query processing
    • G06F16/2453: Query optimisation
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20: Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27: Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Debugging And Monitoring (AREA)

Abstract

The embodiment of the invention provides an Elasticsearch cluster protection device, method, storage medium and computer equipment, and relates to the field of Elasticsearch cluster protection. The device includes: a configuration management unit, used for acquiring a recording threshold of the slow query log and sending the recording threshold to the Elasticsearch cluster so that the cluster records the slow query log according to the recording threshold; a data acquisition unit, used for acquiring the slow query logs recorded by the cluster from the cluster and parsing the slow query logs to form structured slow query data; a data analysis unit, used for acquiring metric data of set metrics from the structured slow query data; and an alarm handling unit, used for comparing the metric data of each set metric with the metric threshold corresponding to that metric and, if the metric data exceeds the corresponding metric threshold, sending a control instruction for canceling the query task corresponding to the slow query to the Elasticsearch cluster. The embodiment of the invention can provide protection for the Elasticsearch cluster and ensure its stable operation.

Description

Elasticsearch cluster protection device, method, storage medium and computer equipment
Technical Field
The invention relates to the field of Elasticsearch cluster protection, and in particular to an Elasticsearch cluster protection device, method, storage medium and computer equipment.
Background
Traditional enterprises choose to use relational databases (typically MySQL, Oracle, etc.) to store business data. However, in scenarios that involve writing large volumes of data, fields that are not fixed, and simple queries that need fast results, a traditional relational database falls short. In recent years, non-relational data stores represented by the Elasticsearch search engine have risen rapidly, meet enterprises' storage and query requirements for massive document data, and are widely applied.
In practical applications, the Elasticsearch cluster performs well at storing and retrieving massive data. However, when the Elasticsearch cluster processes write requests and query requests, the two types of requests tend to affect each other. When there are many query requests and each involves a large volume of data, the Elasticsearch cluster spends most of its CPU and memory resources on completing the users' query requests. At that point the cluster may process write requests more slowly or fail to process them at all, and it can no longer run stably.
Disclosure of Invention
The embodiment of the invention provides an Elasticsearch cluster protection device, method, storage medium and computer equipment, which aim to solve the problem that an Elasticsearch cluster in the prior art cannot run stably when processing query requests with large data volumes.
In one aspect, an embodiment of the present invention provides an Elasticsearch cluster protection device, where the cluster protection device includes:
a configuration management unit for performing the following processing: acquiring a configured recording threshold of a slow query log, and sending the recording threshold to an Elasticsearch cluster so that the Elasticsearch cluster records the slow query log according to the recording threshold;
a data acquisition unit for performing the following processing: acquiring, from the Elasticsearch cluster, the slow query logs recorded by the cluster, and parsing the slow query logs to form structured slow query data;
a data analysis unit for performing the following processing: obtaining metric data of set metrics of the slow query from the structured slow query data, where the set metrics include: the proportion of the number of hits of the slow query to the total number of indexes, and the number of query shards involved in the slow query;
an alarm handling unit for performing the following processing: comparing the metric data of each set metric with the metric threshold corresponding to that metric, and if any metric data exceeds the corresponding metric threshold, sending a control instruction for canceling the query task corresponding to the slow query to the Elasticsearch cluster.
In one implementation of the present embodiment,
the set metrics further include: the number of waiting tasks in the query waiting queue;
the alarm handling unit is further configured to: compare the number of waiting tasks with a waiting task number threshold, and send a control instruction for shielding subsequent query requests to an Nginx (a high-performance HTTP and reverse proxy web server) access proxy server when the number of waiting tasks exceeds the waiting task number threshold.
In one implementation of the present embodiment,
the alarm handling unit is further configured to: detect whether the number of waiting tasks has returned to normal, and send a control instruction for canceling the shielding of subsequent query requests to the Nginx access proxy server when it has, where returning to normal means that the number of waiting tasks no longer exceeds the waiting task number threshold.
In one implementation of the present embodiment,
the configuration management unit is further configured to perform the following processing: acquiring a configured Nginx shielding threshold;
the data acquisition unit is further configured to perform the following processing: acquiring a user access log from the Nginx access proxy server, and parsing the user access log to form structured user access data;
the data analysis unit is further configured to perform the following processing: obtaining, from the structured user access data, the number of requests sent per second by each IP address, comparing that number with the shielding threshold, and if the number of requests sent per second by any IP address exceeds the shielding threshold, sending to the Nginx access proxy server a control instruction for shielding subsequent requests from that IP address.
In an implementation manner of this embodiment, the configuration management unit is further configured to perform the following processing:
acquiring the IP address and the installation path of the Elasticsearch server input by the user,
and displaying, according to the IP address and the installation path, the running state of the corresponding Elasticsearch cluster.
In another aspect, an embodiment of the present invention provides an Elasticsearch cluster protection method, where the method includes:
acquiring a configured recording threshold of a slow query log, and sending the recording threshold to an Elasticsearch cluster so that the Elasticsearch cluster records the slow query log according to the recording threshold;
acquiring, from the Elasticsearch cluster, the slow query logs recorded by the cluster, and parsing the slow query logs to form structured slow query data;
obtaining metric data of set metrics of the slow query from the structured slow query data, where the set metrics include: the proportion of the number of hits of the slow query to the total number of indexes, and the number of query shards involved in the slow query;
and comparing the metric data of each set metric with the metric threshold corresponding to that metric, and if any metric data exceeds the corresponding metric threshold, sending a control instruction for canceling the query task corresponding to the slow query to the Elasticsearch cluster.
In an implementation manner of this embodiment, the set metrics further include: the number of waiting tasks in the query waiting queue;
the method further comprises the following steps:
comparing the number of waiting tasks with a waiting task number threshold, and sending a control instruction for shielding subsequent query requests to the Nginx access proxy server when the number of waiting tasks exceeds the waiting task number threshold.
In an implementation manner of this embodiment, the method further includes:
detecting whether the number of waiting tasks has returned to normal, and sending a control instruction for canceling the shielding of subsequent query requests to the Nginx access proxy server when it has, where returning to normal means that the number of waiting tasks no longer exceeds the waiting task number threshold.
In an implementation manner of this embodiment, the method further includes:
acquiring a configured Nginx shielding threshold;
acquiring a user access log from the Nginx access proxy server, and parsing the user access log to form structured user access data;
and obtaining, from the structured user access data, the number of requests sent per second by each IP address, comparing that number with the shielding threshold, and if the number of requests sent per second by any IP address exceeds the shielding threshold, sending to the Nginx access proxy server a control instruction for shielding subsequent requests from that IP address.
In an implementation manner of this embodiment, the method further includes:
acquiring the IP address and the installation path of the Elasticsearch server input by the user,
and displaying, according to the IP address and the installation path, the running state of the corresponding Elasticsearch cluster.
In still another aspect, an embodiment of the present invention provides a computer storage medium, on which computer instructions are stored, where the computer instructions can be executed by a processor to implement the Elasticsearch cluster protection method described in any implementation manner of the foregoing embodiment.
In another aspect, an embodiment of the present invention provides a computer device, including:
a memory having a computer program stored thereon;
a processor, configured to execute the computer program to implement the Elasticsearch cluster protection method described in any implementation manner of the foregoing embodiments.
Compared with the prior art, the Elasticsearch cluster protection device, method, storage medium and computer equipment provided by the embodiment of the invention have the following beneficial technical effects:
the embodiment of the invention can continuously monitor and collect the slow query log of the current slow query recorded by the Elasticsearch cluster, continuously obtain the metric data of the set metrics of the current slow query from that log, and decide whether to cancel the query task of the current slow query according to the metric data and the corresponding metric thresholds. In this way, slow queries can be analyzed in real time and those that consume too many resources can be canceled promptly, so that the Elasticsearch cluster does not spend resources on unnecessary slow query requests. This saves resources and ensures stable operation of the cluster.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings based on them without creative effort.
Fig. 1 is a schematic block diagram of an Elasticsearch cluster protection apparatus according to embodiment 1 of the present invention;
fig. 2 is a flowchart of a method of Elasticsearch cluster protection according to embodiment 2 of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, embodiments of the present invention will be described in detail with reference to the accompanying drawings. It is to be understood that the following examples are illustrative only and are not intended to limit the scope of the present invention.
[ Embodiment 1 ]
Fig. 1 is a schematic block diagram of a cluster protection device according to embodiment 1 of the present invention. As shown in fig. 1, the Elasticsearch cluster protection apparatus of this embodiment may include a configuration management unit 11, a data acquisition unit 12, a data analysis unit 13, and an alarm handling unit 14.
The configuration management unit 11 is configured to perform the following processing: acquiring a configured recording threshold of the slow query log, and sending the recording threshold to the Elasticsearch cluster 20, so that the Elasticsearch cluster 20 records the slow query log according to the recording threshold;
the data acquisition unit 12 is configured to perform the following processing: acquiring, from the Elasticsearch cluster 20, the slow query logs recorded by the cluster, and parsing the slow query logs to form structured slow query data;
the data analysis unit 13 is configured to perform the following processing: obtaining metric data of set metrics of the slow query from the structured slow query data, where the set metrics include: the proportion of the number of hits of the slow query to the total number of indexes, and the number of query shards involved in the slow query;
the alarm handling unit 14 is configured to perform the following processing: comparing the metric data of each set metric with the metric threshold corresponding to that metric, and if any metric data exceeds the corresponding metric threshold, sending a control instruction for canceling the query task corresponding to the slow query to the Elasticsearch cluster.
The configuration management unit 11 may include a back-end application and a front-end page, where the back-end application may be a microservice application developed with Spring Boot (a development framework for Spring applications), and the front-end page may be a user page implemented with Vue (a progressive JavaScript framework for building user interfaces). Through the front-end page of the configuration management unit 11, the user can configure the recording threshold of the slow query log, which may be a query elapsed-time threshold. After configuration, the configuration management unit 11 may obtain the recording threshold configured by the user and send it to the Elasticsearch cluster 20 through a recording instruction, instructing the Elasticsearch cluster 20 to record the query log of any query that takes longer than the recording threshold; such a query is a slow query, and its query log is a slow query log. For example, if the configured recording threshold is 5 seconds, the configuration management unit 11 may send the recording threshold to the Elasticsearch cluster through a recording instruction, so as to instruct the cluster to treat any query that takes more than 5 seconds as a slow query and to record its query log. After receiving the recording instruction sent by the configuration management unit 11, each data node in the Elasticsearch cluster may record slow query logs according to the recording threshold carried in the instruction.
The configuration management unit 11 of this embodiment provides a visual front-end page for the user to configure the recording threshold, and sends the configured recording threshold to the Elasticsearch cluster through a recording instruction, thereby instructing the cluster to record the slow query log according to that threshold. The traditional way of configuring the slow query log recording threshold requires logging in to the server, modifying the configuration file of the Elasticsearch cluster, and restarting the service. By comparison, this embodiment requires no server login, lets operation and maintenance personnel manage the configuration conveniently, and reduces the configuration management workload.
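As a hedged illustration of how a recording threshold might be pushed to a cluster without editing configuration files: Elasticsearch exposes search slow log thresholds as dynamic index settings (for example `index.search.slowlog.threshold.query.warn`) that can be changed through the index settings API. The sketch below only builds such a request and does not send it. The helper name `build_slowlog_request`, the index name `orders` and the choice of the `warn` level are assumptions made for illustration; the embodiment does not specify the form of its recording instruction.

```python
import json


def build_slowlog_request(index_name: str, threshold_seconds: int) -> dict:
    """Build (but do not send) a settings update carrying the recording threshold."""
    if threshold_seconds <= 0:
        raise ValueError("threshold_seconds must be positive")
    return {
        "method": "PUT",
        "path": f"/{index_name}/_settings",
        "body": {
            # Queries slower than this value are written to the search slow log.
            "index.search.slowlog.threshold.query.warn": f"{threshold_seconds}s",
        },
    }


# Hypothetical index name; 5 seconds mirrors the example threshold in the text.
request = build_slowlog_request("orders", 5)
print(request["method"], request["path"])
print(json.dumps(request["body"]))
```

An HTTP client of choice could then submit the method, path and body to a cluster node.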
In an implementation manner of this embodiment, the configuration management unit 11 may further provide an option of whether to record the slow query log for a user to select, and when the user selects to record the slow query log, obtain a recording threshold configured by the user, and further send the recording threshold to the Elasticsearch cluster to instruct the Elasticsearch cluster to record the slow query log according to the recording threshold.
In an implementation manner of this embodiment, the configuration management unit 11 may communicate with a plurality of Elasticsearch clusters, and perform configuration management on the plurality of Elasticsearch clusters. At this time, the user may select one of the Elasticsearch clusters through the configuration management unit 11 to configure the recording threshold of the cluster. After configuration, the configuration management unit 11 may obtain the recording threshold of the cluster, and send the recording threshold of the cluster to the cluster, so as to instruct the cluster to record the slow query log according to the recording threshold of the cluster. In other implementation manners, the user may also configure the recording threshold of each Elasticsearch cluster through the configuration management unit 11, and after configuration, the configuration management unit 11 may send the recording threshold of each cluster to each corresponding cluster, so as to instruct each cluster to record the slow query log according to the respective recording threshold.
After the Elasticsearch cluster records the slow query log, the data acquisition unit 12 may monitor and collect the slow query log of the current slow query on each data node of the Elasticsearch cluster through a flash (a log collection system) data acquisition agent, and then parse that log through Logstash (a data processing pipeline for collecting, parsing and transforming logs) to form the structured slow query data of the current slow query.
In this embodiment, the Elasticsearch cluster protection apparatus may further include a data storage unit, which may, for convenience, use an Elasticsearch cluster to provide the data storage service. After forming the structured slow query data, the data acquisition unit 12 may store the structured slow query data of the current slow query in the data storage unit.
The data analysis unit 13 may obtain the structured slow query data of the current slow query from the data storage unit, then obtain from it the metric data of the set metrics, such as the proportion of the number of hits of the current slow query to the total number of indexes and the number of query shards involved, and then send the metric data of the current slow query to the alarm handling unit 14.
After receiving the metric data of the current slow query, the alarm handling unit 14 may compare each piece of metric data with the corresponding metric threshold, and if any metric data exceeds its threshold, send a control instruction to the Elasticsearch cluster 20 to cancel the query task corresponding to the current slow query. For example, if the current slow query is slow query A, the alarm handling unit 14 may obtain the proportion of the number of hits of slow query A to the total number of indexes and the number of query shards it involves, then compare the proportion with the hit-proportion threshold and the shard count with the shard-number threshold. If either or both of the two metrics exceed their corresponding thresholds, the alarm handling unit 14 may send a control instruction for canceling the query task corresponding to slow query A to the Elasticsearch cluster 20 that is executing slow query A. The hit-proportion threshold and the shard-number threshold may be pre-configured in the cluster protection device of this embodiment; these two metric thresholds may be empirical values, or may be generated dynamically from metric data within a set time period. After receiving the control instruction sent by the alarm handling unit 14, the Elasticsearch cluster 20 may cancel the query task of slow query A.
Generally speaking, in an Elasticsearch cluster, the number of query hits and the number of shards involved in a query should stay within a reasonable range. If the proportion of the number of hits to the total number of indexes is too large, or the number of shards involved in the query is too large, the user's query may consume too many resources and the query itself may be poorly constructed, which may affect the normal operation of the Elasticsearch cluster. For example, if the number of query hits accounts for too large a proportion of the total number of indexes, the user's query condition is not precise enough and a large number of results may be returned, which may cause memory overflow or slow responses in the Elasticsearch cluster. In view of this characteristic, when such a request is detected, this embodiment sends a control instruction to the Elasticsearch cluster to cancel the query task of that request, thereby canceling the query that consumes too many resources and preventing the cluster from spending resources on unnecessary query requests. By canceling such query tasks, the Elasticsearch cluster saves resource overhead and can devote resources to necessary processing such as handling write requests, which ensures stable operation of the cluster.
Those skilled in the art can understand that the Elasticsearch cluster protection device provided by this embodiment can be a real-time stream processing device. The data acquisition unit can continuously monitor and collect the slow query log of the current slow query recorded by the Elasticsearch cluster and parse it into structured slow query data. The data analysis unit can continuously obtain the set metric data of the current slow query from that structured data and send it to the alarm handling unit, so that the alarm handling unit continuously compares the metric data of the current slow query with the corresponding metric thresholds in order to cancel query tasks that consume excessive resources. In this way, slow queries in the Elasticsearch cluster can be analyzed in real time, and the query tasks of slow queries that consume excessive resources can be canceled promptly.
In other implementation manners of this embodiment, in addition to the two metrics above, namely the proportion of the number of query hits to the total number of indexes and the number of shards involved in the query, this embodiment may also obtain other query metrics of the current query and combine them with these two metrics for alarm handling. For example, besides obtaining the hit proportion and the number of query shards involved from the structured slow query data of the current slow query, the data analysis unit 13 may further obtain the query elapsed time of the current slow query from the same data, and then send the hit proportion, the number of query shards involved, and the query elapsed time to the alarm handling unit 14. The alarm handling unit 14 may compare these three pieces of metric data with the corresponding metric thresholds, and if the metric data of two metrics exceeds the corresponding thresholds, the alarm handling unit 14 may send a request for canceling the query task corresponding to the slow query to the Elasticsearch cluster. For example, if the query elapsed time of the current slow query exceeds the elapsed-time threshold and the number of query shards involved exceeds the shard-number threshold, a request for canceling the query task corresponding to the slow query is sent to the Elasticsearch cluster.
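The parsing step can be sketched in a few lines. The embodiment performs it with Logstash; the Python regular expressions below are a stand-in used only to show what "structured slow query data" might contain. The sample line follows the general plain-text layout of an Elasticsearch search slow log entry, but its values, the node name and the index name are made up, and the field selection is an assumption.

```python
import re
from typing import Optional

# Illustrative slow log line; all values are invented for this sketch.
SAMPLE_LINE = (
    "[2020-11-02T10:15:30,123][WARN ][index.search.slowlog.query] [node-1] "
    "[orders][0] took[5.2s], took_millis[5200], total_hits[120000], "
    "search_type[QUERY_THEN_FETCH], total_shards[40], "
    'source[{"query":{"match_all":{}}}]'
)

# Matches "[index][shard] took[" to recover the index name and shard id.
INDEX_PATTERN = re.compile(r"\[([^\[\]]+)\]\[(\d+)\] took\[")
# Captures the leading digits of the numeric fields of interest.
FIELD_PATTERN = re.compile(r"(took_millis|total_hits|total_shards)\[(\d+)")


def parse_slow_log(line: str) -> Optional[dict]:
    """Turn one slow log line into a flat record, or None if it does not match."""
    index_match = INDEX_PATTERN.search(line)
    if index_match is None:
        return None
    record = {"index": index_match.group(1), "shard": int(index_match.group(2))}
    for name, value in FIELD_PATTERN.findall(line):
        record[name] = int(value)
    return record


record = parse_slow_log(SAMPLE_LINE)
print(record)
```

Records of this shape carry the hit count, shard count and elapsed time that the later analysis steps compare against thresholds.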
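The comparison described above can be sketched as follows, under stated assumptions. "Total number of indexes" is interpreted here as the total number of documents in the queried index, which is an interpretation rather than something the text confirms. The threshold values, the `SlowQuery` fields and the task id are hypothetical. The path returned by `cancel_path` follows the Elasticsearch task management API (`POST _tasks/<task_id>/_cancel`); how the device maps a slow query to a task id is not described in the source.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SlowQuery:
    task_id: str      # hypothetical identifier of the running search task
    hits: int         # number of hits of the slow query
    index_total: int  # assumed meaning of "total number of indexes"
    shards: int       # number of query shards involved


def exceeded_metrics(query: SlowQuery, ratio_threshold: float, shard_threshold: int) -> List[str]:
    """Return the names of the metrics whose data exceeds its threshold."""
    exceeded = []
    hit_ratio = query.hits / query.index_total if query.index_total else 0.0
    if hit_ratio > ratio_threshold:
        exceeded.append("hit_ratio")
    if query.shards > shard_threshold:
        exceeded.append("shards")
    return exceeded


def cancel_path(task_id: str) -> str:
    """Path of the task cancel endpoint for the given task."""
    return f"/_tasks/{task_id}/_cancel"


# Slow query A hits 60% of the index but touches only 12 shards.
query_a = SlowQuery(task_id="node-1:42", hits=600_000, index_total=1_000_000, shards=12)
over = exceeded_metrics(query_a, ratio_threshold=0.5, shard_threshold=20)
if over:  # any single exceeded metric is enough to cancel
    print("cancel via POST", cancel_path(query_a.task_id), "because of", over)
```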
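A minimal sketch of this three-metric variant, assuming that "two metrics exceed their thresholds" means at least two of the three. The metric names and threshold values below are hypothetical.

```python
from typing import Dict


def should_cancel(metrics: Dict[str, float], thresholds: Dict[str, float], required: int = 2) -> bool:
    """Cancel when at least `required` metrics exceed their thresholds."""
    exceeded = sum(
        1 for name, limit in thresholds.items() if metrics.get(name, 0) > limit
    )
    return exceeded >= required


# Hypothetical thresholds for hit proportion, shard count and elapsed time.
THRESHOLDS = {"hit_ratio": 0.5, "shards": 20, "took_millis": 10_000}

# Elapsed time and shard count both exceed their thresholds: cancel.
slow_and_wide = {"hit_ratio": 0.1, "shards": 40, "took_millis": 15_000}
# Only the elapsed time exceeds its threshold: keep running.
slow_only = {"hit_ratio": 0.1, "shards": 5, "took_millis": 15_000}

print(should_cancel(slow_and_wide, THRESHOLDS))
print(should_cancel(slow_only, THRESHOLDS))
```

Requiring two exceeded metrics rather than one makes cancellation more conservative, since a query that is merely slow is left alone unless it is also broad.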
In another implementation manner of this embodiment, in addition to canceling the query task corresponding to a slow query that consumes too many resources, this embodiment may also shield subsequent query requests in coordination with the Nginx access proxy server, thereby ensuring stable operation of the Elasticsearch cluster.
Specifically, the data analysis unit 13 may obtain the structured slow query data of the current slow query, obtain from it the number of waiting tasks in the current query queue, and send that number to the alarm handling unit 14. After receiving the number of waiting tasks, the alarm handling unit 14 may compare it with the waiting task number threshold, and if the number of waiting tasks exceeds the threshold, may send a control instruction to the Nginx access proxy server to shield the query requests that follow the current query request. In response to this control instruction, the Nginx access proxy server may shield all requests that follow the current query request until the number of waiting tasks returns to normal.
Further, the data analysis unit 13 may continuously obtain the number of waiting tasks in the current query queue and send it to the alarm handling unit 14. The alarm handling unit 14 may continuously compare the number of waiting tasks in the current query queue with the waiting task number threshold to detect whether the number of waiting tasks has returned to normal. If it detects that the number of waiting tasks in the current query queue no longer exceeds the waiting task number threshold, the alarm handling unit 14 may send a control instruction to the Nginx access proxy server to cancel the shielding of subsequent query requests. In response to this control instruction, the Nginx access proxy server may stop shielding subsequent query requests. The waiting task number threshold may be configured in advance in the Elasticsearch cluster protection apparatus of this embodiment.
In an Elasticsearch cluster, the number of waiting tasks in the query queue normally stays at 0. Once the number of waiting tasks rises rapidly and exceeds the waiting task number threshold, excessive queries may be affecting the stable operation of the Elasticsearch cluster. This embodiment continuously detects whether the number of waiting tasks in the current query queue exceeds the set waiting task number threshold, and when it does, sends a control instruction for shielding subsequent query requests to the Nginx access proxy server. In this way, all subsequent query requests can be shielded by the Nginx access proxy server whenever the number of waiting tasks exceeds the set threshold, which ensures stable operation of the Elasticsearch cluster.
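The shield and cancel-shielding behavior in the two paragraphs above amounts to a small two-state machine. The sketch below is one possible reading, with a hypothetical class name `QueueGuard` and made-up sample values. It only decides which control instruction to emit; how the instruction reaches Nginx is not specified in the source and is not modeled here.

```python
from typing import Optional


class QueueGuard:
    """Tracks whether subsequent query requests are currently shielded."""

    def __init__(self, waiting_threshold: int) -> None:
        self.waiting_threshold = waiting_threshold
        self.shielded = False

    def observe(self, waiting_tasks: int) -> Optional[str]:
        """Return "shield", "unshield", or None when no instruction is needed."""
        if not self.shielded and waiting_tasks > self.waiting_threshold:
            self.shielded = True
            return "shield"
        # "Returned to normal" means the count no longer exceeds the threshold.
        if self.shielded and waiting_tasks <= self.waiting_threshold:
            self.shielded = False
            return "unshield"
        return None


guard = QueueGuard(waiting_threshold=10)
samples = [0, 4, 25, 30, 10, 0]  # hypothetical successive queue lengths
instructions = [guard.observe(n) for n in samples]
print(instructions)
```

Emitting an instruction only on a state change avoids resending the same instruction to the proxy on every sample.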
If a certain IP address frequently sends query requests to the Elasticsearch cluster, it will also adversely affect the stable operation of the Elasticsearch cluster. In an implementation manner of this embodiment, the Elasticsearch cluster protection device may further link with the Nginx access proxy server to block subsequent query requests from IP addresses that send query requests too frequently, thereby ensuring stable operation of the Elasticsearch cluster and protecting it.
Specifically, in addition to providing an option to configure the recording threshold, the configuration management unit 11 may also provide an option for the user to configure the Nginx blocking threshold. After the user configures the Nginx blocking threshold, the configuration management unit 11 may obtain it and send it to the alarm handling unit 14.
While monitoring and acquiring the slow query log of the Elasticsearch cluster, the data acquisition unit 12 may also continuously monitor and acquire the current user access log of the Nginx access proxy server, parse it into structured user access data, and store that data in the data storage unit. The data analysis unit 13 may continuously obtain the structured user access data of each current IP address from the data storage unit, derive from it the number of requests that each current IP address sends per second, and send these numbers to the alarm handling unit 14.
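As a rough illustration of turning an access log into per-IP request rates, the sketch below assumes that the proxy writes Nginx's default "combined" log format, in which each line begins with the client IP address and carries a timestamp with one-second resolution. The function name `requests_per_second` and the choice to report each IP address's peak per-second count are assumptions made for this example; the source does not specify either.

```python
import re
from collections import Counter

# Assumed log shape: "<ip> - <user> [02/Nov/2020:10:15:30 +0800] ..."
LINE_RE = re.compile(r'^(?P<ip>\S+) \S+ \S+ \[(?P<ts>[^\]]+)\]')


def requests_per_second(lines):
    """Count requests per (ip, second) and return each IP's peak count."""
    per_second = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip lines that do not fit the assumed format
        per_second[(match["ip"], match["ts"])] += 1
    peak = {}
    for (ip, _ts), count in per_second.items():
        peak[ip] = max(peak.get(ip, 0), count)
    return peak
```

The returned dictionary maps each IP address to a single number, which is the shape the comparison against the blocking threshold needs.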
The alarm handling unit 14 may receive the Nginx blocking threshold sent by the configuration management unit 11 and the number of requests per second of each current IP address sent by the data analysis unit 13, and compare each IP address's number of requests per second with the Nginx blocking threshold. If any IP address's number of requests per second exceeds the Nginx blocking threshold, the alarm handling unit 14 may send the Nginx access proxy server a control instruction to block subsequent requests from that IP address. In response to the instruction, the Nginx access proxy server may block subsequent requests from the IP addresses whose number of requests per second exceeds the blocking threshold. For example, suppose two IP addresses are currently sending query requests to the Elasticsearch cluster through the Nginx access proxy server, IP address 1 sends 5 requests per second, IP address 2 sends 8 requests per second, and the Nginx blocking threshold is 6. The alarm handling unit 14 may then send a control instruction to the Nginx access proxy server to block subsequent query requests from IP address 2, and in response the Nginx access proxy server may block them.
Further, the data analysis unit 13 may continuously obtain the number of requests per second of each IP address that is currently blocked and send it to the alarm handling unit 14. The alarm handling unit 14 may continuously compare this number with the Nginx blocking threshold to detect whether the IP address's request rate has returned to normal. If the number of requests per second of the IP address falls back below the Nginx blocking threshold, the alarm handling unit 14 may send the Nginx access proxy server a control instruction to unblock subsequent requests from that IP address, so that the IP address can again send requests to the Elasticsearch cluster through the Nginx access proxy server. For example, after the Nginx access proxy server has blocked subsequent query requests from IP address 2, the number of requests per second of IP address 2 may be detected continuously, and if it falls back below the Nginx blocking threshold, a control instruction to unblock subsequent query requests from IP address 2 may be sent to the Nginx access proxy server. In response to the control instruction, the Nginx access proxy server may unblock subsequent query requests from IP address 2, so that IP address 2 can again send query requests to the Elasticsearch cluster through the Nginx access proxy server.
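The per-IP decisions in the example above can be sketched as follows. This is an illustration only: the function name `plan_instructions` is hypothetical, the addresses `192.0.2.1` and `192.0.2.2` stand in for "IP address 1" and "IP address 2", and treating a blocked IP address with no observed requests as having a rate of 0 is an assumption the source does not state.

```python
def plan_instructions(rates, threshold, blocked):
    """Return (to_block, to_unblock) lists of IP addresses.

    rates     -- dict mapping IP address to requests per second
    threshold -- the configured Nginx blocking threshold
    blocked   -- set of IP addresses that are currently blocked
    """
    # Block IP addresses whose rate exceeds the threshold.
    to_block = sorted(ip for ip, rate in rates.items()
                      if rate > threshold and ip not in blocked)
    # Unblock IP addresses whose rate has fallen below the threshold.
    to_unblock = sorted(ip for ip in blocked
                        if rates.get(ip, 0) < threshold)
    return to_block, to_unblock


# The example from the text: 5 and 8 requests per second, threshold 6.
print(plan_instructions({"192.0.2.1": 5, "192.0.2.2": 8}, 6, set()))
```

An IP address whose rate equals the threshold is neither blocked nor unblocked, which follows the wording "exceeds" for blocking and "lower than" for unblocking.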
By linking with the Nginx access proxy server to block subsequent query requests from IP addresses that send requests too frequently, the Elasticsearch cluster protection device provided by this embodiment can further ensure stable operation of the Elasticsearch cluster.
In an implementation manner of this embodiment, the configuration management unit 11 may allow the user not only to configure the recording threshold and the Nginx blocking threshold, but also to view the operating state of the Elasticsearch cluster.
Specifically, the configuration management unit 11 may present the running state of each Elasticsearch cluster it manages for the user to select and view. The user may also enter, in the configuration management unit 11, the IP address and installation path of an Elasticsearch server of the cluster to be viewed. In response, the configuration management unit 11 may use the IP address and installation path to send the corresponding Elasticsearch cluster an instruction for obtaining its running state, obtain the running state of that cluster, and display it on the user's front-end page.
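The source does not say how the running state is fetched from the IP address and installation path. Purely as an illustration, the sketch below swaps in Elasticsearch's HTTP cluster health endpoint (`/_cluster/health`) instead; this is an assumption, not the mechanism the embodiment describes, and the installation path is not used. The helper names `health_url` and `summarize_health` are hypothetical, and port 9200 is Elasticsearch's default HTTP port.

```python
import json


def health_url(ip, port=9200):
    """Build the URL of the cluster health endpoint for one server."""
    return f"http://{ip}:{port}/_cluster/health"


def summarize_health(body):
    """Reduce a cluster health JSON response to a one-line status."""
    data = json.loads(body)
    return (f"{data['cluster_name']}: status={data['status']}, "
            f"nodes={data['number_of_nodes']}, "
            f"unassigned_shards={data['unassigned_shards']}")
```

A front-end page could call `summarize_health` on the response body for each managed cluster and list the results side by side, which is the centralized view this implementation manner aims for.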
Compared with the traditional approach, in which a user must log in to each Elasticsearch cluster server to check the running state of each cluster, this implementation manner manages multiple Elasticsearch clusters centrally. Through the cluster protection device alone, the user can visually check the running state of any Elasticsearch cluster it manages without logging in to each cluster, which makes centralized operation and maintenance of multiple Elasticsearch clusters convenient for operation and maintenance personnel.
[ example 2 ]
This embodiment provides an Elasticsearch cluster protection method. Fig. 2 shows a flowchart of the Elasticsearch cluster protection method according to embodiment 2 of the present invention. As shown in Fig. 2, the method may include the following processing:
S101: acquiring a configured recording threshold of the slow query log, and sending the recording threshold to an Elasticsearch cluster so that the Elasticsearch cluster records the slow query log according to the recording threshold;
S102: acquiring, from the Elasticsearch cluster, the slow query logs recorded by the Elasticsearch cluster, and parsing the slow query logs into structured slow query data;
S103: obtaining metric data of set metrics of the slow query from the structured slow query data, the set metrics comprising: the ratio of the slow query's hit count to the total count of the index, and the number of shards involved in the slow query;
S104: comparing the metric data of each set metric with the metric threshold corresponding to that metric, and if any metric data exceeds its corresponding metric threshold, sending the Elasticsearch cluster a control instruction for canceling the query task corresponding to the slow query.
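Steps S102 to S104 can be sketched roughly as below. Several things here are assumptions rather than details from the source: the slow-log line is assumed to carry bracketed fields named `took_millis`, `total_hits` and `total_shards` (the exact layout varies between Elasticsearch versions), the "total count of the index" is interpreted as a number passed in by the caller as `index_total`, and the function names and thresholds are hypothetical.

```python
import re

# Matches fields such as "total_hits[12000 hits]" or "total_shards[5]".
FIELD_RE = re.compile(r'(took_millis|total_hits|total_shards)\[(\d+)')


def parse_slow_query(line):
    """S102: extract numeric fields from one slow-log line into a dict."""
    return {name: int(value) for name, value in FIELD_RE.findall(line)}


def should_cancel(record, index_total, max_hit_ratio, max_shards):
    """S103/S104: return True if any set metric exceeds its threshold."""
    # Metric 1: ratio of the query's hit count to the index total.
    hit_ratio = record["total_hits"] / index_total if index_total else 0.0
    # Metric 2: number of shards involved in the query.
    return hit_ratio > max_hit_ratio or record["total_shards"] > max_shards
```

When `should_cancel` returns True, the device would send the cancel instruction to the cluster. On a real cluster this could go through Elasticsearch's task management API (`POST _tasks/<task_id>/_cancel`), but finding the task identifier for a given slow query is outside this sketch.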
In an implementation manner of this embodiment, the set metrics further include: the number of waiting tasks in the query waiting queue;
the method further comprises:
comparing the number of waiting tasks with a waiting task number threshold, and sending a control instruction for blocking subsequent query requests to the Nginx access proxy server when the number of waiting tasks exceeds the waiting task number threshold.
In an implementation manner of this embodiment, the method further includes:
detecting whether the number of waiting tasks has returned to normal, and sending a control instruction for unblocking subsequent query requests to the Nginx access proxy server when it has, wherein returning to normal means that the number of waiting tasks no longer exceeds the waiting task number threshold.
In an implementation manner of this embodiment, the method further includes:
acquiring a configured Nginx blocking threshold;
acquiring a user access log from the Nginx access proxy server, and parsing the user access log into structured user access data;
and obtaining the number of requests sent per second by each IP address from the structured user access data, comparing each number with the blocking threshold, and if any IP address's number of requests per second exceeds the blocking threshold, sending the Nginx access proxy server a control instruction for blocking subsequent requests from that IP address.
In an implementation manner of this embodiment, the method further includes:
acquiring the IP address and installation path of an Elasticsearch server input by the user,
and displaying, according to the IP address and the installation path, the running state of the corresponding Elasticsearch cluster.
Those skilled in the art can understand that the method for protecting an Elasticsearch cluster provided in this embodiment may be applied to the Elasticsearch cluster protection apparatus provided in embodiment 1, and specific processing of each step may refer to a corresponding process in the foregoing embodiment of the Elasticsearch cluster protection apparatus, and is not described herein again.
Through the above description of the embodiments, those skilled in the art will clearly understand that the present invention can be implemented by software combined with a hardware platform. With this understanding, all or part of the technical solution of the present invention that contributes over the background art can be embodied in the form of a software product, which can be stored in a storage medium such as a ROM/RAM, a magnetic disk, or an optical disk, and which includes instructions for causing a computer device (such as a personal computer, a server, or a network device) to execute the methods described in the embodiments or in parts of the embodiments.
[ example 3 ]
The present embodiment provides a computer-readable storage medium, such as a hard disk, an optical disk, a flash memory, a floppy disk, a magnetic tape, etc., on which computer-readable instructions are stored, which can be executed by a processor to implement the process of the Elasticsearch cluster protection method described in embodiment 2.
[ example 4 ]
The present embodiment provides a computer device, including:
a memory having a computer program stored thereon,
a processor which can execute the computer program to realize the processing of the Elasticsearch cluster protection method described in embodiment 2.
The terms and expressions used in the present specification are used as terms of illustration only and are not meant to be limiting. It will be appreciated by those skilled in the art that changes could be made to the details of the above-described embodiments without departing from the underlying principles thereof. The scope of the invention is, therefore, indicated by the appended claims, in which all terms are intended to be interpreted in their broadest reasonable sense unless otherwise indicated.

Claims (12)

1. An Elasticsearch cluster protection apparatus, the cluster protection apparatus comprising:
a configuration management unit for performing the following processing: acquiring a configured recording threshold of the slow query log, and sending the recording threshold to an Elasticsearch cluster so that the Elasticsearch cluster records the slow query log according to the recording threshold;
a data acquisition unit for performing the following processing: acquiring, from the Elasticsearch cluster, the slow query logs recorded by the Elasticsearch cluster, and parsing the slow query logs into structured slow query data;
a data analysis unit for performing the following processing: obtaining metric data of set metrics of the slow query from the structured slow query data, the set metrics comprising: the ratio of the slow query's hit count to the total count of the index, and the number of shards involved in the slow query;
an alarm handling unit for performing the following processing: comparing the metric data of each set metric with the metric threshold corresponding to that metric, and if any metric data exceeds its corresponding metric threshold, sending the Elasticsearch cluster a control instruction for canceling the query task corresponding to the slow query.
2. The cluster protection device according to claim 1, wherein
the set metrics further include: the number of waiting tasks in the query waiting queue;
the alarm handling unit is further configured to: compare the number of waiting tasks with a waiting task number threshold, and send a control instruction for blocking subsequent query requests to the Nginx access proxy server when the number of waiting tasks exceeds the waiting task number threshold.
3. The cluster protection device according to claim 2, wherein
the alarm handling unit is further configured to: detect whether the number of waiting tasks has returned to normal, and send a control instruction for unblocking subsequent query requests to the Nginx access proxy server when it has, wherein returning to normal means that the number of waiting tasks no longer exceeds the waiting task number threshold.
4. The cluster protection device according to claim 1, wherein
the configuration management unit is further configured to perform the following processing: acquiring a configured Nginx blocking threshold;
the data acquisition unit is further configured to perform the following processing: acquiring a user access log from the Nginx access proxy server, and parsing the user access log into structured user access data;
the data analysis unit is further configured to perform the following processing: obtaining the number of requests sent per second by each IP address from the structured user access data, comparing each number with the blocking threshold, and if any IP address's number of requests per second exceeds the blocking threshold, sending the Nginx access proxy server a control instruction for blocking subsequent requests from that IP address.
5. The cluster protection device of claim 1, wherein the configuration management unit is further configured to perform the following processing:
acquiring the IP address and installation path of an Elasticsearch server input by the user,
and displaying, according to the IP address and the installation path, the running state of the corresponding Elasticsearch cluster.
6. An Elasticsearch cluster protection method, characterized in that the method comprises:
acquiring a configured recording threshold of the slow query log, and sending the recording threshold to an Elasticsearch cluster so that the Elasticsearch cluster records the slow query log according to the recording threshold;
acquiring, from the Elasticsearch cluster, the slow query logs recorded by the Elasticsearch cluster, and parsing the slow query logs into structured slow query data;
obtaining metric data of set metrics of the slow query from the structured slow query data, the set metrics comprising: the ratio of the slow query's hit count to the total count of the index, and the number of shards involved in the slow query;
and comparing the metric data of each set metric with the metric threshold corresponding to that metric, and if any metric data exceeds its corresponding metric threshold, sending the Elasticsearch cluster a control instruction for canceling the query task corresponding to the slow query.
7. The cluster protection method of claim 6, wherein
the set metrics further include: the number of waiting tasks in the query waiting queue;
the method further comprises:
comparing the number of waiting tasks with a waiting task number threshold, and sending a control instruction for blocking subsequent query requests to the Nginx access proxy server when the number of waiting tasks exceeds the waiting task number threshold.
8. The cluster protection method of claim 7, further comprising:
detecting whether the number of waiting tasks has returned to normal, and sending a control instruction for unblocking subsequent query requests to the Nginx access proxy server when it has, wherein returning to normal means that the number of waiting tasks no longer exceeds the waiting task number threshold.
9. The cluster protection method of claim 6, further comprising:
acquiring a configured Nginx blocking threshold;
acquiring a user access log from the Nginx access proxy server, and parsing the user access log into structured user access data;
and obtaining the number of requests sent per second by each IP address from the structured user access data, comparing each number with the blocking threshold, and if any IP address's number of requests per second exceeds the blocking threshold, sending the Nginx access proxy server a control instruction for blocking subsequent requests from that IP address.
10. The cluster protection method of claim 6, further comprising:
acquiring the IP address and installation path of an Elasticsearch server input by the user,
and displaying, according to the IP address and the installation path, the running state of the corresponding Elasticsearch cluster.
11. A computer storage medium having stored thereon computer instructions executable by a processor to implement the Elasticsearch cluster protection method of any of claims 6 to 10.
12. A computer device, comprising:
a memory having a computer program stored thereon;
a processor for executing the computer program to implement the Elasticsearch cluster protection method of any of claims 6 to 10.
Application CN202011204062.6A, priority date 2020-11-02, filing date 2020-11-02, status Pending, published as CN112463825A (en): Elasticsearch cluster protection device, method, storage medium and computer equipment

Publications (1)

Publication Number Publication Date
CN112463825A (en) 2021-03-09

Family

ID=74834295

Country Status (1)

Country Link
CN (1) CN112463825A (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
RU2647616C1 (en) * 2016-12-21 2018-03-16 Общество с ограниченной ответственностью "ОНСЕК ИНК." Method of detecting brute force attack on web service
US20180102938A1 (en) * 2016-10-11 2018-04-12 Oracle International Corporation Cluster-based processing of unstructured log messages
CN108073465A (en) * 2017-12-29 2018-05-25 中国平安人寿保险股份有限公司 Dynamic current limiting method, Nginx servers, storage medium and device
CN110765126A (en) * 2019-09-10 2020-02-07 浙江大华技术股份有限公司 Data storage and query method, device and storage medium of distributed database
CN110795614A (en) * 2019-09-27 2020-02-14 广东浪潮大数据研究有限公司 Index automatic optimization method and device
CN111797096A (en) * 2020-06-29 2020-10-20 中国平安财产保险股份有限公司 Data indexing method and device based on ElasticSearch, computer equipment and storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
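An Mingyuan; Sun Xiuming; Sun Ninghui: "Dynamic Sharding Online Aggregation" (动态分片在线聚集), Journal of Computer Research and Development (计算机研究与发展), no. 11, 15 November 2010 (2010-11-15), pages 82 - 89 *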

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination