CN113986707B - Method for monitoring and controlling slow SQL based on big data kudu partition - Google Patents

Method for monitoring and controlling slow SQL based on big data kudu partition

Publication number
CN113986707B
Authority
CN
China
Prior art keywords
partition
kudu
monitoring
slow
sql
Legal status
Active
Application number
CN202111293515.1A
Other languages
Chinese (zh)
Other versions
CN113986707A (en)
Inventor
于洋
高经郡
Current Assignee
Beijing Kejie Technology Co ltd
Original Assignee
Beijing Kejie Technology Co ltd
Priority date
2021-11-03
Filing date
2021-11-03
Publication date
2022-06-14
Application filed by Beijing Kejie Technology Co ltd
Priority to CN202111293515.1A
Publication of CN113986707A
Application granted
Publication of CN113986707B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00 Error detection; Error correction; Monitoring
    • G06F11/30 Monitoring
    • G06F11/34 Recording or statistical evaluation of computer activity, e.g. of down time, of input/output operation; Recording or statistical evaluation of user activity, e.g. usability assessment
    • G06F11/3466 Performance evaluation by tracing or monitoring
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/248 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • G06F16/278 Data partitioning, e.g. horizontal or vertical partitioning

Abstract

The invention discloses a method for monitoring and controlling slow SQL based on big data Kudu partitions. Large volumes of data spanning many categories are divided into physical partitions by cluster and category, so that the data state problems, storage volume, number of partitions and similar figures for each partitioned system are displayed more intuitively. Slow SQL present in the system can likewise be inspected visually: a time threshold is configured dynamically through a dictionary table, and the method queries a dynamic trend chart of the number of slow SQL statements exceeding that threshold together with a detail list of each slow SQL run. System maintainers can therefore see system data problems at a glance and quickly resolve slow-link timeouts, unexpected interruptions and similar problems caused by SQL in the system. This addresses the SQL pain points found in many systems without requiring link tracing, and is convenient and efficient.

Description

Method for monitoring and controlling slow SQL based on big data kudu partition
Technical Field
The invention relates to the technical field of Kudu, and in particular to a method for monitoring and controlling slow SQL based on big data Kudu partitions.
Background
In recent years, Kudu has become widely used in big data platforms and holds an irreplaceable position there. Given Kudu's characteristics, OLAP scenarios over massive data generally do not need a preprocessing scheme, such as cube management in the style of eBay's Kylin or predefined aggregation operations driven by business requirements in the style of Google's Mesa. In addition, the system builds its own data channel that links the real-time processing and batch processing systems in series, so that each can play to its own strengths. Kudu is positioned as a fast-analytics data warehouse for rapidly changing data. It aims to support, through the system's own capabilities, application scenarios that need both high throughput and random reads and writes (for example time series data analysis and real-time monitoring and analysis of log data). It offers performance characteristics between those of HDFS and HBase, strikes a balance between random reads and writes and batch scans, and keeps response latency stable and predictable. However, an effective method for Kudu partition monitoring and slow SQL monitoring is currently lacking.
Disclosure of Invention
In view of the shortcomings of the prior art, the invention aims to provide a method for monitoring and controlling slow SQL based on big data Kudu partitions.
To achieve this aim, the invention adopts the following technical scheme:
A method for monitoring and controlling slow SQL based on big data Kudu partitions comprises the following specific processes:
The process of Kudu partition monitoring is as follows: when hive partition scheduling is executed, the hive queues and tenants are configured, and the hdfs configuration is then initialized according to the different partitions configured for the hadoop cluster; next, Kudu partition statistics are computed from the initialized hdfs configuration of the different cluster partitions to obtain Kudu partition statistical information, this information is used to determine whether it corresponds to cm5 or cm6, and the corresponding partition tables and storage volume are obtained according to the result; the Kudu partition table is then inserted, detailed statistical monitoring information and summary statistical monitoring information for the Kudu partitions are obtained, and partition monitoring statistics for the different Kudu partition tables are compiled and displayed according to the summary statistical monitoring information;
The process by which Kudu monitors and schedules slow SQL information is as follows: when Kudu executes monitoring scheduling, the Cloudera Manager information is configured according to the hadoop cluster nodes; a slow SQL result set is then queried according to the slow query time threshold and filter conditions configured in the dictionary table, the slow SQL present within a set time period is displayed, and the details can be inspected.
Further, the slow query time threshold, the filter conditions and the set time period configured in the dictionary table are all defined by the user.
Further, after the slow SQL result set has been obtained by the query, it is inserted in batches into the Kudu monitoring table obtained from Kudu partition monitoring, the slow SQL details are obtained, the slow SQL present within the set time period is displayed, and the details can be inspected.
The invention has the beneficial effects that:
1. The method handles the slow SQL queries present in each application system more effectively, and can list and analyze, one by one, the user who executed each slow query. Maintainers can therefore better fix problems in the system, improve system performance, make system links respond faster, and avoid link timeouts, waits, interruptions and similar problems caused by SQL.
2. For large data volumes of differing data types, the invention uses partition statistics; the partition tables and storage volumes obtained differ between cm versions. The invention can monitor the data of different partitions, monitor partitions whose data shows abnormal problems, and count the storage volume, number of partitions and state data of each. System maintainers can thus see more intuitively which partitions are abnormal and what the storage volume and partition count of each partition are, optimize the data in abnormal partitions and the data with large storage volumes, improve the quality of the system data, and keep the system running stably.
The invention divides large volumes of data spanning many categories into physical partitions by cluster and category, and displays more intuitively the data state problems, storage volume, number of partitions and similar figures for each partitioned system. Slow SQL present in the system can be inspected visually: the time threshold is configured dynamically through the dictionary table, and, taking that threshold as the reference, the method queries a dynamic trend chart of the number of slow SQL statements exceeding it and a detail list of each slow SQL run. System maintainers can therefore see system data problems at a glance and quickly resolve slow-link timeouts, unexpected interruptions and similar problems caused by SQL in the system. This addresses the SQL pain points found in many systems without requiring link tracing, and is convenient and efficient.
Drawings
FIG. 1 is a flow chart of a method according to an embodiment of the present invention.
Detailed Description
The invention is further described below with reference to the accompanying drawing. Note that this embodiment builds on the technical scheme above and gives a detailed implementation and a specific operating process, but the scope of protection of the invention is not limited to this embodiment.
This embodiment provides a method for monitoring and controlling slow SQL based on big data Kudu partitions. As shown in FIG. 1, the specific process is as follows:
During Kudu partition monitoring: when hive partition scheduling is executed, the hive queues and tenants are configured, and the hdfs configuration is then initialized according to the different partitions configured for the hadoop cluster; next, Kudu partition statistics are computed from the initialized hdfs configuration of the different cluster partitions to obtain Kudu partition statistical information, this information is used to determine whether it corresponds to cm5 or cm6, and the corresponding partition tables and storage volume are obtained according to the result; when a Kudu partition table is inserted, detailed statistical monitoring information and summary statistical monitoring information for the Kudu partitions are obtained, and partition monitoring statistics for the different Kudu partition tables are compiled and displayed according to the summary statistical monitoring information;
When Kudu monitors and schedules slow SQL information: when Kudu executes monitoring scheduling, the Cloudera Manager information is configured according to the hadoop cluster nodes; a slow SQL result set is then queried according to the slow query time threshold and filter conditions configured in the dictionary table; the slow SQL present within a set time period is displayed, and the details can be inspected.
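The roll-up from detailed to summary partition statistics described above can be illustrated with a minimal Python sketch. It does not reproduce the patented implementation: the `PartitionDetail` record, its field names and the `"HEALTHY"` state label are hypothetical, and the detail rows are assumed to have already been collected from the cluster.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class PartitionDetail:
    """One detail row per partition (hypothetical schema)."""
    cluster: str      # cluster label, assumed
    table: str        # Kudu table name
    partition: str    # partition identifier
    size_bytes: int   # storage volume of this partition
    state: str        # assumed labels: "HEALTHY" or anything else


def summarize_partitions(details):
    """Roll detail rows up into one summary row per (cluster, table)."""
    summary = defaultdict(
        lambda: {"partitions": 0, "size_bytes": 0, "abnormal": 0}
    )
    for d in details:
        row = summary[(d.cluster, d.table)]
        row["partitions"] += 1
        row["size_bytes"] += d.size_bytes
        if d.state != "HEALTHY":
            row["abnormal"] += 1
    return dict(summary)
```

Each summary row carries the three figures the description says are displayed: number of partitions, storage volume, and how many partitions are in an abnormal state.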
Note that the slow query time threshold, the filter conditions and the set time period in the dictionary table are all configured by the user. A query whose execution time exceeds the slow query time threshold is treated as a slow query.
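As a sketch of the rule just stated, the Python snippet below filters query records against a threshold read from a dictionary-table row. The key names `slow_query_threshold_ms` and `excluded_users`, the `QueryRecord` fields, and the choice of a user exclusion list as the filter condition are all assumptions made for illustration; the patent does not specify the dictionary table's layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueryRecord:
    """One executed query (hypothetical schema)."""
    query_id: str
    user: str
    statement: str
    duration_ms: int


def find_slow_queries(records, dictionary):
    """Return records whose duration exceeds the configured threshold.

    `dictionary` stands in for the user-configured dictionary table;
    its key names are assumed for this sketch.
    """
    threshold = int(dictionary["slow_query_threshold_ms"])
    excluded_users = set(dictionary.get("excluded_users", ()))
    return [
        r for r in records
        if r.duration_ms > threshold and r.user not in excluded_users
    ]
```

The comparison is strictly greater than, matching "exceeds the slow query time threshold". Because the threshold is read on every call, changing the dictionary entry changes the result without a code change.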
Further, while the slow SQL result set is being queried, it is inserted in batches into the Kudu monitoring table obtained from Kudu partition monitoring, the slow SQL details are obtained, the slow SQL present within the set time period is displayed, and the details can be inspected.
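The batch insert step can be sketched as follows. This is an illustration only: it uses Python's built-in `sqlite3` module as a stand-in for the Kudu monitoring table, and the table name `kudu_monitor`, its columns and the batch size are hypothetical.

```python
import sqlite3


def insert_slow_sql_batch(conn, rows, batch_size=500):
    """Insert slow SQL rows into a monitoring table in fixed-size batches.

    sqlite3 stands in for the real monitoring store; table and column
    names are assumed. Returns the number of rows in the table afterwards.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS kudu_monitor ("
        "query_id TEXT PRIMARY KEY, run_user TEXT, "
        "statement TEXT, duration_ms INTEGER)"
    )
    for start in range(0, len(rows), batch_size):
        batch = rows[start:start + batch_size]
        # Re-running the job replaces rows with the same query_id
        # instead of failing on the primary key.
        conn.executemany(
            "INSERT OR REPLACE INTO kudu_monitor VALUES (?, ?, ?, ?)",
            batch,
        )
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM kudu_monitor").fetchone()[0]
```

Writing in batches keeps each statement small, and keying on the query identifier makes a repeated scheduling run idempotent.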
Note that displaying the slow SQL present within the set time period and inspecting the details includes a dynamic trend chart of the number of slow SQL statements exceeding the slow query time threshold and a detail list showing the running state of each slow SQL statement.
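The data behind such a trend chart can be sketched as a count of slow queries per time bucket within the set time period. The function name, the one-hour default bucket and the half-open period are assumptions for this sketch; the input is assumed to be the start times of queries already classified as slow.

```python
from collections import Counter
from datetime import datetime, timedelta


def slow_sql_trend(start_times, period_start, period_end,
                   bucket=timedelta(hours=1)):
    """Count slow queries per bucket within [period_start, period_end).

    Returns a sorted list of (bucket_start, count) pairs; buckets with
    no slow queries are omitted.
    """
    counts = Counter()
    for t in start_times:
        if period_start <= t < period_end:
            # timedelta // timedelta yields the integer bucket index.
            index = (t - period_start) // bucket
            counts[period_start + index * bucket] += 1
    return sorted(counts.items())
```

Plotting the returned pairs over time gives the trend line, while the per-query detail list comes from the underlying records themselves.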
Those skilled in the art can make various changes and modifications based on the above technical schemes and concepts, and all such changes and modifications shall fall within the scope of protection of the invention.

Claims (3)

1. A method for monitoring and controlling slow SQL based on big data Kudu partitions, characterized by comprising the following specific processes:
the process of Kudu partition monitoring is: when hive partition scheduling is executed, the hive queues and tenants are configured, and the hdfs configuration is then initialized according to the different partitions configured for the hadoop cluster; next, Kudu partition statistics are computed from the initialized hdfs configuration of the different cluster partitions to obtain Kudu partition statistical information, this information is used to determine whether it corresponds to cm5 or cm6, and the corresponding partition table and storage volume are obtained according to the result; the Kudu partition table is inserted, detailed statistical monitoring information and summary statistical monitoring information for the Kudu partitions are obtained, and partition monitoring statistics for the different Kudu partition tables are compiled and displayed according to the summary statistical monitoring information;
the process by which Kudu monitors and schedules slow SQL information is: when Kudu executes monitoring scheduling, the Cloudera Manager information is configured according to the hadoop cluster nodes; a slow SQL result set is then queried according to the slow query time threshold and filter conditions configured in the dictionary table, the slow SQL present within a set time period is displayed, and the details are inspected.
2. The method of claim 1, wherein the slow query time threshold, the filter conditions and the set time period configured in the dictionary table are all defined by a user.
3. The method of claim 1, wherein after the slow SQL result set has been obtained by the query, it is inserted in batches into the Kudu monitoring table obtained from Kudu partition monitoring, the slow SQL details are obtained, the slow SQL present within the set time period is displayed, and the details are inspected.
CN202111293515.1A 2021-11-03 2021-11-03 Method for monitoring and controlling slow SQL based on big data kudu partition Active CN113986707B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111293515.1A CN113986707B (en) 2021-11-03 2021-11-03 Method for monitoring and controlling slow SQL based on big data kudu partition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111293515.1A CN113986707B (en) 2021-11-03 2021-11-03 Method for monitoring and controlling slow SQL based on big data kudu partition

Publications (2)

Publication Number Publication Date
CN113986707A (en) 2022-01-28
CN113986707B (en) 2022-06-14

Family

ID=79746089

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111293515.1A Active CN113986707B (en) 2021-11-03 2021-11-03 Method for monitoring and controlling slow SQL based on big data kudu partition

Country Status (1)

Country Link
CN (1) CN113986707B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107273504A (en) * 2017-06-19 2017-10-20 浪潮软件集团有限公司 Data query method and device based on Kudu
CN109871392A (en) * 2019-02-18 2019-06-11 浪潮软件集团有限公司 A kind of slow sql real-time data acquisition method under distribution application system
CN111176954A (en) * 2020-01-02 2020-05-19 浪潮软件股份有限公司 Monitoring method of kudu
CN111338770A (en) * 2020-02-12 2020-06-26 咪咕文化科技有限公司 Task scheduling method, server and computer readable storage medium
CN112084211A (en) * 2020-10-12 2020-12-15 北京高因科技有限公司 Slow SQL statement processing system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2569678A (en) * 2017-12-22 2019-06-26 Warevalley Co Ltd Automation of SQL tuning method and system using statistic SQL pattern analysis
US10983895B2 (en) * 2018-06-05 2021-04-20 Unravel Data Systems, Inc. System and method for data application performance management

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
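"Basic configuration of kudu, sentry, hue, hdfs, hbase"; Internet user; https://blog.51cto.com/bigdata/2855138; 20210603; full text *
"Introduction to kudu"; Internet user; https://www.jianshu.com/p/93c602b637a4; 20180719; full text *
"Building presto+kudu+hive+hdfs from scratch (6 nodes)"; Internet user; https://blog.csdn.net/fly0512/article/details/100863889; 20200707; full text *
"Traffic big data storage and analysis platform based on kudu+Impala"; 宁群仪, 周超; 《数据库与信息管理》; 20181130; full text *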

Also Published As

Publication number Publication date
CN113986707A (en) 2022-01-28

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant