CN113986707B - Method for monitoring and controlling slow SQL based on big data kudu partition - Google Patents
Method for monitoring and controlling slow SQL based on big data kudu partition
- Publication number
- CN113986707B (application CN202111293515.1A)
- Authority
- CN
- China
- Prior art keywords
- partition
- kudu
- monitoring
- slow
- sql
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
- G06F11/30—Monitoring
- G06F11/34—Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation ; Recording or statistical evaluation of user activity, e.g. usability assessment
- G06F11/3466—Performance evaluation by tracing or monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/242—Query formulation
- G06F16/2433—Query languages
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/24—Querying
- G06F16/248—Presentation of query results
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
- G06F16/278—Data partitioning, e.g. horizontal or vertical partitioning
Abstract
The invention discloses a method for monitoring and controlling slow SQL based on big data kudu partitions. Large volumes of data in many categories are divided into physical partitions by cluster, so that the data state problems, storage volume, and partition count of each partitioned system can be displayed more intuitively. Slow SQL in the system can likewise be inspected directly: a time threshold is configured dynamically through a dictionary table, and the method queries a trend chart of the number of slow SQL statements exceeding that threshold together with a detailed list of the running state of each slow SQL statement. System maintainers can therefore see data problems at a glance and quickly resolve issues such as slow-link timeouts and unexpected interruptions caused by SQL. This addresses the SQL pain points found in many systems without requiring link tracing, and is convenient and efficient.
Description
Technical Field
The invention relates to the technical field of Kudu, and in particular to a method for monitoring and controlling slow SQL based on big data kudu partitions.
Background
In recent years, Kudu has become widely used in big data platforms, where it holds an irreplaceable position. With Kudu, a massive-data OLAP scenario generally does not need a preprocessing scheme such as Cube management in the style of eBay's Kylin, or predefined aggregation operations driven by business requirements in the style of Google Mesa. Moreover, the system builds the data channel itself, connecting real-time processing and batch processing in series so that each plays to its strengths. Kudu is positioned as a fast-analytics data warehouse for rapidly changing data. It aims to support, through its own capabilities, application scenarios that require both high throughput and random reads and writes (for example, time series analysis and real-time monitoring and analysis of log data), offering performance characteristics between those of HDFS and HBase, striking a balance between random read/write and batch scanning, and ensuring stable, predictable response latency. However, an effective method for kudu partition monitoring and slow SQL monitoring is currently lacking.
Disclosure of Invention
To address the shortcomings of the prior art, the invention aims to provide a method for monitoring and controlling slow SQL based on big data kudu partitions.
To achieve this purpose, the invention adopts the following technical scheme:
A method for monitoring and controlling slow SQL based on big data kudu partitions comprises the following specific processes:
The Kudu partition monitoring process is as follows: when hive partition scheduling is executed, the hive queues and tenants are configured, and the hdfs configuration is then initialized according to the different partitions configured for the hadoop cluster; next, kudu partition statistics are computed from the initialized hdfs configuration of the different cluster partitions to obtain kudu partition statistical information, that information is used to determine whether it corresponds to cm5 or cm6, and the corresponding partition tables and storage capacity are obtained according to the result; a kudu partition table is inserted, detailed statistical monitoring information and summary statistical monitoring information for the kudu partitions are acquired, and partition monitoring statistics for the different kudu partition tables are compiled and displayed according to the summary statistical monitoring information;
The process by which kudu monitors and schedules slow SQL information is as follows: when kudu executes monitoring scheduling, the Cloudera Manager information (Clouderagercontrolbuilder) is configured according to the hadoop cluster nodes; a slow SQL result set is then queried according to the slow query time threshold and filtering conditions configured in the dictionary table, the slow SQL occurring within a set time period is displayed, and its details can be inspected.
Further, the slow query time threshold, the filtering conditions, and the set time period configured in the dictionary table are all user-defined.
Further, after the slow SQL result set is obtained by the query, the results are inserted in batches into the Kudu monitoring table obtained through Kudu partition monitoring, the slow SQL details are obtained, the slow SQL occurring within the set time period is displayed, and its details can be inspected.
The invention has the beneficial effects that:
1. The method handles the slow SQL queries present in each application system more effectively and can list and analyze, one by one, the users who executed each slow query, so that maintainers can better resolve problems in the system, improve system performance, make system links respond faster, and avoid problems such as link timeouts, waiting, and interruptions caused by SQL.
2. For large volumes of data of different types, the invention uses partition statistics; the partition tables and storage volumes are obtained differently for different cm versions. The invention monitors the data of the different partitions, flags partitions with abnormal data, and counts storage volumes, partition numbers, and state data, so that a system maintainer can see more intuitively which partitions are abnormal as well as the storage volume and partition count of each partition. Abnormal partitions and data with large storage volumes can then be optimized, improving the quality of system data and keeping the system running stably.
The invention divides large volumes of data in many categories into physical partitions by cluster, and displays more intuitively the data state problems, storage volume, and partition count of each partitioned system. Slow SQL in the system can be inspected directly: the time threshold is configured dynamically through the dictionary table and, taking that threshold as the reference, the method queries a trend chart of the number of slow SQL statements exceeding it and a detailed list of the running state of each slow SQL statement. System maintainers can therefore see data problems at a glance and quickly resolve issues such as slow-link timeouts and unexpected interruptions caused by SQL. This addresses the SQL pain points found in many systems without requiring link tracing, and is convenient and efficient.
Drawings
FIG. 1 is a flow chart of a method according to an embodiment of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawing. Note that this embodiment is based on the technical scheme above and provides a detailed implementation and a specific operating process, but the protection scope of the invention is not limited to this embodiment.
This embodiment provides a method for monitoring and controlling slow SQL based on big data kudu partitions. As shown in FIG. 1, the specific process is as follows:
During Kudu partition monitoring: when hive partition scheduling is executed, the hive queues and tenants are configured, and the hdfs configuration is then initialized according to the different partitions configured for the hadoop cluster; next, kudu partition statistics are computed from the initialized hdfs configuration of the different cluster partitions to obtain kudu partition statistical information, that information is used to determine whether it corresponds to cm5 or cm6, and the corresponding partition tables and storage capacity are obtained according to the result; when a kudu partition table is inserted, detailed statistical monitoring information and summary statistical monitoring information for the kudu partitions are acquired, and partition monitoring statistics for the different kudu partition tables are compiled and displayed according to the summary statistical monitoring information.
When kudu monitors and schedules slow SQL information: when kudu executes monitoring scheduling, the Cloudera Manager information is configured according to the hadoop cluster nodes; a slow SQL result set is then queried according to the slow query time threshold and filtering conditions configured in the dictionary table; the slow SQL occurring within a set time period is displayed, and its details can be inspected.
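The step from detailed to summary partition statistics can be illustrated with a minimal Python sketch. The patent does not specify a schema, so the `PartitionDetail` fields and the `summarize_partitions` helper are hypothetical names chosen for illustration, and the rollup (partition count, total storage, and number of abnormal partitions per table) is only one plausible reading of "summary statistical monitoring information".

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class PartitionDetail:
    """One detailed monitoring row for a single kudu partition (hypothetical schema)."""
    cluster: str
    table: str
    partition: str
    size_bytes: int
    healthy: bool


def summarize_partitions(details):
    """Roll detailed rows up into one summary row per (cluster, table)."""
    summary = defaultdict(lambda: {"partitions": 0, "size_bytes": 0, "unhealthy": 0})
    for d in details:
        row = summary[(d.cluster, d.table)]
        row["partitions"] += 1            # partition count per table
        row["size_bytes"] += d.size_bytes  # total storage per table
        if not d.healthy:
            row["unhealthy"] += 1         # partitions flagged as abnormal
    return dict(summary)


if __name__ == "__main__":
    rows = [
        PartitionDetail("cluster_a", "orders", "p2021_10", 100, True),
        PartitionDetail("cluster_a", "orders", "p2021_11", 50, False),
    ]
    for key, value in summarize_partitions(rows).items():
        print(key, value)
```

Keying the summary by cluster and table keeps partitions from different clusters separate, which matches the per-cluster statistics described above.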
Note that the slow query time threshold, the filtering conditions, and the set time period configured in the dictionary table are user-defined. A query whose execution time exceeds the slow query time threshold is treated as a slow query.
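As a rough illustration of the threshold rule, the Python sketch below reads the threshold and one filtering condition from a dictionary-table stand-in and keeps only the queries that exceed the threshold. The key names `slow_query_threshold_ms` and `excluded_users`, the `QueryRecord` fields, and the choice of a user exclusion list as the filtering condition are all assumptions; the patent does not describe the dictionary table's layout.

```python
from dataclasses import dataclass


@dataclass
class QueryRecord:
    """One executed query as reported by the monitoring source (hypothetical schema)."""
    query_id: str
    user: str
    statement: str
    duration_ms: int


def find_slow_queries(records, dictionary):
    """Return records slower than the configured threshold, after filtering."""
    # Values are parsed from strings, assuming the dictionary table stores text.
    threshold_ms = int(dictionary["slow_query_threshold_ms"])
    excluded = {u for u in dictionary.get("excluded_users", "").split(",") if u}
    return [
        r for r in records
        if r.duration_ms > threshold_ms and r.user not in excluded
    ]


if __name__ == "__main__":
    config = {"slow_query_threshold_ms": "5000", "excluded_users": "etl"}
    history = [
        QueryRecord("q1", "alice", "SELECT 1", 1200),
        QueryRecord("q2", "bob", "SELECT 2", 7000),
    ]
    for record in find_slow_queries(history, config):
        print(record.query_id, record.user, record.duration_ms)
```

Because the threshold is read on every call, changing the dictionary entry takes effect on the next scheduled run without a code change, which is the point of configuring it dynamically.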
Further, while the slow SQL result set is being queried, the results are inserted in batches into the Kudu monitoring table obtained through Kudu partition monitoring, the slow SQL details are obtained, the slow SQL occurring within the set time period is displayed, and its details can be inspected.
Note that displaying the slow SQL occurring within the set time period and inspecting its details includes a trend chart of the number of slow SQL statements exceeding the slow query time threshold and a detailed list of the running state of each slow SQL statement.
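The data behind such a trend chart can be sketched in Python as a count of slow SQL statements per time bucket. The hourly bucket and the `slow_sql_trend` helper are assumptions made for illustration; the patent does not state the chart's granularity or how the series is computed.

```python
from collections import Counter
from datetime import datetime


def slow_sql_trend(start_times, bucket="%Y-%m-%d %H:00"):
    """Count slow SQL statements per time bucket, ordered chronologically."""
    counts = Counter(t.strftime(bucket) for t in start_times)
    # The default bucket label sorts lexicographically in time order.
    return sorted(counts.items())


if __name__ == "__main__":
    starts = [
        datetime(2021, 11, 3, 10, 5),
        datetime(2021, 11, 3, 10, 40),
        datetime(2021, 11, 3, 11, 1),
    ]
    for label, count in slow_sql_trend(starts):
        print(label, count)
```

Each `(label, count)` pair is one point on the chart; the detailed list would be the underlying slow SQL records themselves.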
Various changes and modifications can be made by those skilled in the art based on the above technical solutions and concepts, and all such changes and modifications should be included in the protection scope of the present invention.
Claims (3)
1. A method for monitoring and controlling slow SQL based on big data kudu partitions, characterized by comprising the following specific processes:
the Kudu partition monitoring process is: when hive partition scheduling is executed, the hive queues and tenants are configured, and the hdfs configuration is then initialized according to the different partitions configured for the hadoop cluster; next, kudu partition statistics are computed from the initialized hdfs configuration of the different cluster partitions to obtain kudu partition statistical information, that information is used to determine whether it corresponds to cm5 or cm6, and the corresponding partition table and storage capacity are obtained according to the result; a kudu partition table is inserted, detailed statistical monitoring information and summary statistical monitoring information for the kudu partitions are acquired, and partition monitoring statistics for the different kudu partition tables are compiled and displayed according to the summary statistical monitoring information;
the process by which kudu monitors and schedules slow SQL information is: when kudu executes monitoring scheduling, the Cloudera Manager information (Clouderagercontrolbuilder) is configured according to the hadoop cluster nodes; a slow SQL result set is then queried according to the slow query time threshold and filtering conditions configured in the dictionary table, the slow SQL occurring within a set time period is displayed, and its details can be inspected.
2. The method of claim 1, wherein the slow query time threshold, the filtering conditions, and the set time period configured in the dictionary table are all user-defined.
3. The method of claim 1, wherein after the slow SQL result set is obtained by the query, the results are inserted in batches into the Kudu monitoring table obtained through Kudu partition monitoring, the slow SQL details are obtained, the slow SQL occurring within the set time period is displayed, and its details can be inspected.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111293515.1A CN113986707B (en) | 2021-11-03 | 2021-11-03 | Method for monitoring and controlling slow SQL based on big data kudu partition |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111293515.1A CN113986707B (en) | 2021-11-03 | 2021-11-03 | Method for monitoring and controlling slow SQL based on big data kudu partition |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113986707A (en) | 2022-01-28 |
CN113986707B (en) | 2022-06-14 |
Family
ID=79746089
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111293515.1A Active CN113986707B (en) | 2021-11-03 | 2021-11-03 | Method for monitoring and controlling slow SQL based on big data kudu partition |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113986707B (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107273504A (en) * | 2017-06-19 | 2017-10-20 | 浪潮软件集团有限公司 | Data query method and device based on Kudu |
CN109871392A (en) * | 2019-02-18 | 2019-06-11 | 浪潮软件集团有限公司 | A kind of slow sql real-time data acquisition method under distribution application system |
CN111176954A (en) * | 2020-01-02 | 2020-05-19 | 浪潮软件股份有限公司 | Monitoring method of kudu |
CN111338770A (en) * | 2020-02-12 | 2020-06-26 | 咪咕文化科技有限公司 | Task scheduling method, server and computer readable storage medium |
CN112084211A (en) * | 2020-10-12 | 2020-12-15 | 北京高因科技有限公司 | Slow SQL statement processing system |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
GB2569678A (en) * | 2017-12-22 | 2019-06-26 | Warevalley Co Ltd | Automation of SQL tuning method and system using statistic SQL pattern analysis |
US10983895B2 (en) * | 2018-06-05 | 2021-04-20 | Unravel Data Systems, Inc. | System and method for data application performance management |
Non-Patent Citations (4)
Title |
---|
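"Basic configuration of kudu, sentry, hue, hdfs, hbase"; online author; https://blog.51cto.com/bigdata/2855138; 20210603; full text * |
"Introduction to kudu"; online author; https://www.jianshu.com/p/93c602b637a4; 20180719; full text * |
"Building presto+kudu+hive+hdfs from scratch (6 nodes)"; online author; https://blog.csdn.net/fly0512/article/details/100863889; 20200707; full text * |
"Traffic big data storage and analysis platform based on kudu+Impala"; 宁群仪, 周超; Database and Information Management (《数据库与信息管理》); 20181130; full text * |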
Also Published As
Publication number | Publication date |
---|---|
CN113986707A (en) | 2022-01-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN104951340A (en) | Information processing method and device | |
EP3324304A1 (en) | Data processing method, device and system | |
CN106815260B (en) | Index establishing method and equipment | |
EP2707812A1 (en) | Optimised data stream management system | |
US11734313B2 (en) | Systems and methods for intelligently grouping financial product users into cohesive cohorts | |
CN110995497A (en) | Method for unified operation and maintenance in cloud computing environment, terminal device and storage medium | |
CN105278879A (en) | Processing method and device of monitoring data | |
CN111914013B (en) | Data management method, system, terminal and medium based on pandas database and InfluxDB database | |
US20110106938A1 (en) | Multi-Level Offload of Model-Based Adaptive Monitoring for Systems Management | |
US11829377B2 (en) | Efficient storage method for time series data | |
CN105279226A (en) | Data monitoring method and equipment based on big data | |
CN113986707B (en) | Method for monitoring and controlling slow SQL based on big data kudu partition | |
CN114218211A (en) | Data processing system, method, computer device and readable storage medium | |
CN112631754A (en) | Data processing method, data processing device, storage medium and electronic device | |
CN105653654A (en) | Lucky draw qualification indexing system and method | |
CN114745275A (en) | Node updating method and device in cloud service environment and computer equipment | |
CN112181678A (en) | Service data processing method, device and system, storage medium and electronic device | |
KR20170130178A (en) | In-Memory DB Connection Support Type Scheduling Method and System for Real-Time Big Data Analysis in Distributed Computing Environment | |
CN111161818A (en) | Medical data exchange sharing system and method based on big data technology | |
CN115525603A (en) | Storage statistics method and device, computer readable storage medium and AI device | |
CN113886472A (en) | Data access system, access method, computer equipment and storage medium | |
CN106230618A (en) | A kind of system journal centralized processing system | |
WO2019196595A1 (en) | Method and apparatus for managing application program | |
CN111885159B (en) | Data acquisition method and device, electronic equipment and storage medium | |
WO2022088466A1 (en) | Query method and system for spare parts consumption data, and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||