CN113220705A - Slow query identification method and device - Google Patents



Publication number
CN113220705A
Authority
CN
China
Prior art keywords
query
query request
historical
data
time
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010081806.3A
Other languages
Chinese (zh)
Inventor
刘华毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Original Assignee
Beijing Jingdong Century Trading Co Ltd
Beijing Wodong Tianjun Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Jingdong Century Trading Co Ltd, Beijing Wodong Tianjun Information Technology Co Ltd filed Critical Beijing Jingdong Century Trading Co Ltd
Priority to CN202010081806.3A priority Critical patent/CN113220705A/en
Publication of CN113220705A publication Critical patent/CN113220705A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/242Query formulation
    • G06F16/2425Iterative querying; Query formulation based on the results of a preceding query
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2458Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474Sequence data queries, e.g. querying versioned data

Abstract

The invention discloses a method and a device for slow query identification, and relates to the technical field of computers. One embodiment of the method comprises: receiving a user query request, and determining a query condition and a query starting and stopping time of the query request; querying the historical data volume of the historical query request matched with the query condition, and determining the data volume predicted value of the query request according to the historical data volume and the query starting and stopping time; and identifying whether the query request is slow query or not according to the data quantity predicted value of the query request. According to the embodiment, the data volume predicted value of the current query request is determined according to the query starting and ending time of the current query request and the historical data volume of the historical query request, and whether the current query request is slow query or not is identified according to the data volume predicted value, so that the identification accuracy of slow query can be improved.

Description

Slow query identification method and device
Technical Field
The invention relates to the technical field of computers, in particular to a method and a device for slow query identification.
Background
Slow queries are query requests that take a long time to execute. Referring to fig. 1, the currently mainstream slow query identification scheme is: define a slow query time threshold, acquire the database query log, judge whether the execution time of each query is greater than the threshold, and mark the query as a slow query if it is. The next time a user executes a query with the same query condition, it is judged whether a query record exists that has the same query condition as the current query request and is marked as a slow query; if such a record exists, the current query is determined to be a slow query.
In the process of implementing the invention, the inventor finds that at least the following problems exist in the prior art:
for a time series database, the execution time required by a query depends strongly on the time range, the number of tags and the like of the data being queried. When the query time range is long, the execution time is long and the query is marked as a slow query; when the query time range is short, the execution time is short and the query is not marked as a slow query. A later query with the same query condition but a different time range may therefore be misjudged, and the accuracy of slow query identification is low.
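The execution-time-based scheme and its weakness can be illustrated with a minimal sketch. The log layout, the 5-second threshold, and all function names below are assumptions made for illustration; they are not taken from any particular database.

```python
# Hypothetical sketch of the execution-time-based scheme described above.
# The log format, threshold value, and names are assumptions for illustration.
SLOW_THRESHOLD_SECONDS = 5.0


def mark_slow_conditions(query_log, threshold=SLOW_THRESHOLD_SECONDS):
    """Collect the query conditions whose logged execution time exceeded the threshold."""
    return {
        entry["condition"]
        for entry in query_log
        if entry["execution_seconds"] > threshold
    }


def is_slow_prior_art(condition, slow_conditions):
    """A later query is flagged purely by its condition; its time range is ignored."""
    return condition in slow_conditions


query_log = [
    # Long time range, long execution time: marked as slow.
    {"condition": "cpu.usage{cluster=lf}", "range_hours": 720, "execution_seconds": 12.0},
    # Short time range, short execution time: not marked.
    {"condition": "mem.usage{cluster=lf}", "range_hours": 1, "execution_seconds": 0.4},
]
slow_conditions = mark_slow_conditions(query_log)
```

Under this scheme a later one-hour query on `cpu.usage{cluster=lf}` would still be flagged, while a later 30-day query on `mem.usage{cluster=lf}` would not, because the decision never looks at the time range of the new query.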
Disclosure of Invention
In view of this, embodiments of the present invention provide a method and an apparatus for slow query identification, which can improve the accuracy of slow query identification.
To achieve the above object, according to an aspect of an embodiment of the present invention, there is provided a method of slow query identification, including:
receiving a user query request, and determining a query condition and a query starting and stopping time of the query request;
querying the historical data volume of the historical query request matched with the query condition, and determining the data volume predicted value of the query request according to the historical data volume and the query starting and stopping time;
and identifying whether the query request is slow query or not according to the data quantity predicted value of the query request.
Optionally, the historical data amount is a data point number stored in the historical query request in unit time, and is recorded as a unit data point number; the data volume predicted value is a predicted value of the number of data points corresponding to the query request;
determining a data volume predicted value of the query request according to the historical data volume and the query starting and stopping time, wherein the data volume predicted value comprises the following steps: determining a data time range of the query request according to the query start-stop time of the query request; and taking the product of the data time range of the query request and the unit data point number as the predicted value of the data point number corresponding to the query request.
Optionally, before determining the data volume prediction value of the query request according to the historical data volume and the query start-stop time, the method further includes:
determining a data time range of the historical query request according to the query condition and the query starting and stopping time of the historical query request; determining the number of data points corresponding to the historical query request according to the query log of the historical query request; and taking the quotient of the data point number corresponding to the historical query request and the data time range of the historical query request as the unit data point number.
Optionally, the query log is stored using the OpenTSDB storage model and comprises at least one row of time series data; each row of the time series data includes a row key and a row value; the row key includes a metric, a timestamp, and tag pairs; the row value includes at least one data point, and each data point includes a time offset from the timestamp and the metric value corresponding to that time offset;
determining the number of data points corresponding to the historical query request according to the query log of the historical query request, wherein the determining comprises the following steps: and taking the sum of the number of data points included in the time series data of each row corresponding to the historical query request as the number of data points corresponding to the historical query request.
Optionally, the data time range of the historical query request or the query request is determined according to the following steps:
determining the time difference between the query time now of the historical query request or the query request and its query end timestamp; judging whether the time difference is greater than or equal to a preset time length;
if yes, determining a first numerical value, namely the remainder of the query end timestamp modulo the preset time length, and determining a first difference between the query end timestamp and the first numerical value; determining a second numerical value, namely the remainder of the query start timestamp modulo the preset time length, and determining a second difference between the query start timestamp and the second numerical value; determining a third difference between the first difference and the second difference; and taking the sum of the third difference and the preset time length as the data time range of the historical query request or the query request;
otherwise, determining a fourth difference between the query time of the historical query request or the query request and the first difference; and taking the sum of the third difference and the fourth difference as the data time range of the historical query request or the query request.
Optionally, identifying whether the query request is a slow query according to the predicted data amount value of the query request includes: judging whether the predicted value of the data volume of the query request is greater than a preset threshold value; if yes, judging the query request to be slow query; otherwise, it is determined that the query request is not a slow query.
Optionally, after determining that the query request is a slow query, the method further includes: and updating the historical data volume of the historical query request with the data volume predicted value of the query request.
According to a second aspect of the embodiments of the present invention, there is provided an apparatus for slow query recognition, including:
the determining module is used for receiving a user query request, and determining query conditions and query starting and stopping time of the query request;
the prediction module is used for inquiring the historical data quantity of the historical inquiry request matched with the inquiry condition and determining the data quantity predicted value of the inquiry request according to the historical data quantity and the inquiry starting and stopping time;
and the identification module is used for identifying whether the query request is slow query or not according to the data quantity predicted value of the query request.
Optionally, the historical data amount is a data point number stored in the historical query request in unit time, and is recorded as a unit data point number; the data volume predicted value is a predicted value of the number of data points corresponding to the query request;
the prediction module determines a data volume prediction value of the query request according to the historical data volume and the query starting and stopping time, and comprises the following steps: determining a data time range of the query request according to the query start-stop time of the query request; and taking the product of the data time range of the query request and the unit data point number as the predicted value of the data point number corresponding to the query request.
Optionally, the prediction module is further configured to: before determining the data quantity predicted value of the query request according to the historical data quantity and the query starting and stopping time, determining the data time range of the historical query request according to the query condition and the query starting and stopping time of the historical query request; determining the number of data points corresponding to the historical query request according to the query log of the historical query request; and taking the quotient of the data point number corresponding to the historical query request and the data time range of the historical query request as the unit data point number.
Optionally, the query log is stored using the OpenTSDB storage model and comprises at least one row of time series data; each row of the time series data includes a row key and a row value; the row key includes a metric, a timestamp, and tag pairs; the row value includes at least one data point, and each data point includes a time offset from the timestamp and the metric value corresponding to that time offset;
the prediction module determines the number of data points corresponding to the historical query request according to the query log of the historical query request, and the method comprises the following steps: and taking the sum of the number of data points included in the time series data of each row corresponding to the historical query request as the number of data points corresponding to the historical query request.
Optionally, the prediction module is further configured to determine the data time range of the historical query request or the query request according to the following steps:
determining the time difference between the query time now of the historical query request or the query request and its query end timestamp; judging whether the time difference is greater than or equal to a preset time length;
if yes, determining a first numerical value, namely the remainder of the query end timestamp modulo the preset time length, and determining a first difference between the query end timestamp and the first numerical value; determining a second numerical value, namely the remainder of the query start timestamp modulo the preset time length, and determining a second difference between the query start timestamp and the second numerical value; determining a third difference between the first difference and the second difference; and taking the sum of the third difference and the preset time length as the data time range of the historical query request or the query request;
otherwise, determining a fourth difference between the query time of the historical query request or the query request and the first difference; and taking the sum of the third difference and the fourth difference as the data time range of the historical query request or the query request.
Optionally, identifying whether the query request is a slow query according to the predicted data amount value of the query request includes: judging whether the predicted value of the data volume of the query request is greater than a preset threshold value; if yes, judging the query request to be slow query; otherwise, it is determined that the query request is not a slow query.
Optionally, after determining that the query request is a slow query, the method further includes: and updating the historical data volume of the historical query request with the data volume predicted value of the query request.
According to a third aspect of embodiments of the present invention, there is provided an electronic device for slow query recognition, comprising:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method provided by the first aspect of the embodiments of the present invention.
According to a fourth aspect of embodiments of the present invention, there is provided a computer readable medium, on which a computer program is stored, which when executed by a processor, implements the method provided by the first aspect of embodiments of the present invention.
One embodiment of the above invention has the following advantages or benefits: the identification accuracy of the slow query can be improved by determining the data quantity predicted value of the current query request according to the query starting and ending time of the current query request and the historical data quantity of the historical query request and then identifying whether the current query request is the slow query according to the data quantity predicted value.
Further effects of the above-mentioned non-conventional alternatives will be described below in connection with the embodiments.
Drawings
The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:
FIG. 1 is a schematic diagram of a prior art slow query identification method;
FIG. 2 is a schematic diagram of the main flow of a method of slow query identification of an embodiment of the present invention;
FIG. 3 is a flow diagram of a method of slow query identification in an alternative embodiment of the invention;
FIG. 4 is a graphical illustration of data time ranges in some embodiments of the invention;
FIG. 5 is a graphical illustration of time ranges of data in further embodiments of the invention;
FIG. 6 is a schematic diagram of the main blocks of a slow query identification apparatus of an embodiment of the present invention;
FIG. 7 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;
fig. 8 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
According to one aspect of an embodiment of the present invention, a method of slow query identification is provided.
Fig. 2 is a schematic diagram of a main flow of a slow query identification method according to an embodiment of the present invention, and as shown in fig. 2, the slow query identification method includes: step S201, step S202, and step S203.
Step S201, receiving a user query request, and determining a query condition and a query start-stop time of the query request.
The query condition indicates the data object that the user needs to query. The query condition may contain a single field, such as a unique identifier of the data object, from which the data object to be queried is determined. The query condition may also include a plurality of fields that together uniquely determine the data object to be queried.
Optionally, each data object to be queried is stored using the OpenTSDB (an HBase-based distributed, scalable time series database) storage model. The query condition includes the metric to be queried, the tags (tagk) to be queried, and the value (tagv) corresponding to each tag. A tag is a key-value pair describing the monitored metric, such as cluster=lf and module=searcher, meaning that the machine room is lf and the module is searcher. One metric can carry multiple tags, and a user can filter and view time series data by different tags. Illustratively, the metric is the CPU utilization (cpu.usage), the tag is the location of the machine room, and the value corresponding to the tag is area A.
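To match a new request against historical requests, the query condition has to be comparable. The text does not say how matching is implemented; one possible approach, sketched here purely as an assumption, is to normalize the metric and tag pairs into a hashable key so that tag order does not matter. The function name `condition_key` is hypothetical.

```python
from typing import Dict, Tuple

ConditionKey = Tuple[str, Tuple[Tuple[str, str], ...]]


def condition_key(metric: str, tags: Dict[str, str]) -> ConditionKey:
    """Build an order-independent key from a metric and its tagk/tagv pairs."""
    # Sorting the tag pairs makes {"a": "1", "b": "2"} and {"b": "2", "a": "1"} equal.
    return (metric, tuple(sorted(tags.items())))


key_a = condition_key("cpu.usage", {"cluster": "lf", "module": "searcher"})
key_b = condition_key("cpu.usage", {"module": "searcher", "cluster": "lf"})
key_c = condition_key("cpu.usage", {"cluster": "lf", "module": "indexer"})
```

Here `key_a` and `key_b` compare equal and can index the same historical record, while `key_c` identifies a different query condition.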
The query start-stop time indicates the query start timestamp and the query end timestamp of the user query. Illustratively, when data between August 20, 2019 and August 21, 2019 needs to be queried, the query start timestamp is August 20, 2019 and the query end timestamp is August 21, 2019.
Step S202, inquiring the historical data quantity of the historical inquiry request matched with the inquiry condition, and determining the data quantity predicted value of the inquiry request according to the historical data quantity and the inquiry starting and stopping time.
The historical data amount is used to indicate the size of the data amount in the query log obtained by the historical query request. The historical data amount can take the size of the storage space occupied by the data contained in the query log as a measurement index, and also can take the number of the data contained in the query log as a measurement index.
Optionally, the historical data amount is a data point number stored in the historical query request in unit time, and is recorded as a unit data point number; and the data volume predicted value is the predicted value of the data point number corresponding to the query request. Determining a data volume predicted value of the query request according to the historical data volume and the query starting and stopping time, wherein the data volume predicted value comprises the following steps: determining a data time range of the query request according to the query start-stop time of the query request; and taking the product of the data time range of the query request and the unit data point number as the predicted value of the data point number corresponding to the query request.
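The prediction described here is a single multiplication. A minimal sketch follows, assuming the data time range is expressed in seconds and the unit data point number in points per second; both units and the function name are assumptions for illustration.

```python
def predict_data_points(data_time_range_seconds: int, unit_data_points: float) -> float:
    """Predicted number of data points = data time range x unit data point number."""
    return data_time_range_seconds * unit_data_points


# Assumed example: the matching historical request stored 2 points per second,
# and the current request covers a data time range of two hours.
predicted = predict_data_points(2 * 3600, 2.0)
```

With these assumed inputs the predicted value is 14400 data points, regardless of how long the historical query itself took to execute.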
In this embodiment, the number of data points of the current query request is estimated from the unit data point number of the historical query request that matches the current query condition, and whether the current query request is a slow query is then identified from that estimate. Therefore, even if the query start-stop time of the current query request differs from that of the historical query, whether the current query request is a slow query can still be identified accurately, which avoids the poor accuracy caused by using query execution time to identify slow queries.
In addition, the query start timestamp is not necessarily the same as the start time of the queried data, and the query end timestamp is not necessarily the same as the end time of the queried data. Illustratively, a user sends a query request on November 1, 2019, with a query start timestamp of October 1, 2019 and a query end timestamp of October 31, 2019. The end time of the queried data then falls on October 31, 2019, but need not coincide with the query end timestamp. Compared with calculating the number of data points from the raw time difference between the query start timestamp and the query end timestamp, determining the data time range of the query request from those timestamps and calculating the number of data points from that range can further improve the identification accuracy of slow queries.
Optionally, before determining the data volume predicted value of the query request according to the historical data volume and the query start-stop time, the method further includes: determining a data time range of the historical query request according to the query condition and the query start-stop time of the historical query request; determining the number of data points corresponding to the historical query request according to the query log of the historical query request; and taking the quotient of the number of data points corresponding to the historical query request and the data time range of the historical query request as the unit data point number. This way of calculating the unit data point number is simple.
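The quotient can be sketched as below. The units (seconds) and the guard against an empty range are assumptions added for illustration; the text only specifies the division itself.

```python
def unit_data_points(total_points: int, data_time_range_seconds: int) -> float:
    """Unit data point number = data points of the historical request / its data time range."""
    if data_time_range_seconds <= 0:
        # Assumed guard: an empty or negative range gives no meaningful rate.
        raise ValueError("data time range must be positive")
    return total_points / data_time_range_seconds


# Assumed example: a historical request returned 7200 points over a one-hour data range.
rate = unit_data_points(7200, 3600)
```

In this assumed example the historical request yields 2 data points per second, which is the value later multiplied by the data time range of a new request.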
Optionally, the query log is stored using the OpenTSDB storage model and comprises at least one row of time series data; each row of the time series data includes a row key and a row value; the row key includes a metric, a timestamp, and tag pairs; the row value includes at least one data point, and each data point includes a time offset from the timestamp and the metric value corresponding to that time offset. Determining the number of data points corresponding to the historical query request according to the query log of the historical query request includes: taking the sum of the number of data points included in each row of time series data corresponding to the historical query request as the number of data points corresponding to the historical query request.
The granularity of the timestamp can be set according to the actual situation. The timestamp may be at the hour level, i.e., one row of time series data stores all data points of one query condition within one hour. For example, the timestamps of the rows of time series data are 1:00, 2:00, 3:00, …, 24:00. The timestamp can also use other durations; for example, the timestamps of the rows of time series data are 3:00, 6:00, 9:00, …, 24:00, that is, one row of time series data stores all data points of one query condition within three hours. The minimum time unit of the timestamp can also be set according to the actual situation; for example, millisecond or second precision is supported, and a specific timestamp value is, for example, a specific year, month, day, hour, minute and second.
Each row of the data table in the OpenTSDB storage model represents one piece of time series data. Illustratively, Table 1 shows a query log corresponding to a historical query request A, which includes three rows of time series data.
TABLE 1 time series data in query logs
[Table 1 appears as an image in the original publication and is not reproduced here.]
In table 1, each row of time series data has two columns. The first column is the row key (rowkey), composed of a metric, a timestamp, and tag pairs (tagk1=tagv1, tagk2=tagv2, …, tagkn=tagvn), where the timestamp is at hour granularity and second-level precision is also supported. The second column is a set of data points; each data point comprises two parts: a time offset from the timestamp, ranging from 0 to 3599 seconds, and the metric value corresponding to that time offset. All data points of one query condition within one hour are stored in one row of time series data. The sum of the number of data points included in each row of time series data in table 1 is the number of data points corresponding to the historical query request A.
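The row structure and the data point count can be sketched with a simplified in-memory model. This is not the actual OpenTSDB/HBase byte layout; the `Row` class, its field names, and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Row:
    """One row of time series data: a row key plus a set of data points."""
    metric: str                      # row key part 1
    timestamp: int                   # row key part 2: hour-aligned epoch seconds
    tags: Dict[str, str]             # row key part 3: tagk -> tagv
    # Each data point is (offset in seconds from timestamp, 0..3599; metric value).
    points: List[Tuple[int, float]] = field(default_factory=list)


def count_data_points(rows: List[Row]) -> int:
    """Sum the number of data points over all rows of a query log."""
    return sum(len(row.points) for row in rows)


# Assumed query log with three rows, mirroring the shape described for Table 1.
rows = [
    Row("cpu.usage", 1566259200, {"cluster": "lf"}, [(0, 0.31), (60, 0.35), (120, 0.40)]),
    Row("cpu.usage", 1566262800, {"cluster": "lf"}, [(0, 0.42), (60, 0.38)]),
    Row("cpu.usage", 1566266400, {"cluster": "lf"}, [(30, 0.29)]),
]
total_points = count_data_points(rows)
```

For this assumed log the three rows hold 3, 2 and 1 data points, so the historical request corresponds to 6 data points in total.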
Optionally, the data time range of the historical query request or the query request is determined according to the following steps:
determining the time difference between the query time now of the historical query request or the query request and its query end timestamp; judging whether the time difference is greater than or equal to a preset time length;
if yes, determining a first numerical value, namely the remainder of the query end timestamp modulo the preset time length, and determining a first difference between the query end timestamp and the first numerical value; determining a second numerical value, namely the remainder of the query start timestamp modulo the preset time length, and determining a second difference between the query start timestamp and the second numerical value; determining a third difference between the first difference and the second difference; and taking the sum of the third difference and the preset time length as the data time range of the historical query request or the query request;
otherwise, determining a fourth difference between the query time of the historical query request or the query request and the first difference; and taking the sum of the third difference and the fourth difference as the data time range of the historical query request or the query request.
Referring to fig. 4, when the time difference is greater than or equal to a preset time length, the time range of the historical query request or the query request is:
scan_time=(end-end%T)-(start-start%T)+T;
referring to fig. 5, when the time difference is smaller than a preset time length, the time range of the historical query request or the query request is:
scan_time=(end-end%T)-(start-start%T)+[now-(end-end%T)];
in the formula, start represents the query start timestamp of the historical query request or the query request, end represents its query end timestamp, now represents its query time, and T represents the preset time length. The preset time length is the time difference between two adjacent query start timestamps or query end timestamps.
In the embodiments shown in fig. 4 and 5, the timestamp is at the hour level and the preset time length is one hour, i.e., T is 3600 seconds; end-end%T and start-start%T are the query end timestamp and the query start timestamp aligned down to a multiple of T.
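The two scan_time formulas can be sketched as one function. The formulas come from the text; the function signature, the use of integer seconds, and the sample timestamps (small integers measured from an hour-aligned origin) are assumptions for illustration.

```python
def scan_time(start: int, end: int, now: int, T: int = 3600) -> int:
    """Data time range of a query, following the two formulas above.

    start, end: query start / end timestamps (seconds)
    now:        time at which the query is issued (seconds)
    T:          preset time length, i.e. the span covered by one row (seconds)
    """
    aligned_start = start - start % T   # second difference
    aligned_end = end - end % T         # first difference
    span = aligned_end - aligned_start  # third difference
    if now - end >= T:
        # The row containing `end` is complete, so the whole row is scanned.
        return span + T
    # The row containing `end` is still being written; scan only up to `now`.
    return span + (now - aligned_end)


# Assumed timestamps, in seconds from an hour-aligned origin:
# start at 00:30:00, end at 02:30:00.
completed = scan_time(start=1800, end=9000, now=20000)  # now - end >= T
partial = scan_time(start=1800, end=9000, now=9600)     # now - end < T
```

With these assumed inputs, `completed` is 10800 seconds (three full rows, 00:00 to 03:00) and `partial` is 9600 seconds (from 00:00 up to the query time), both larger than the raw difference end - start of 7200 seconds.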
Step S203, identifying whether the query request is a slow query or not according to the data quantity predicted value of the query request. Slow queries refer to query requests that have a longer query time. For slow queries, the database system resources used by the backend can be limited so as not to affect other users.
Optionally, identifying whether the query request is a slow query according to the predicted data amount value of the query request includes: judging whether the predicted value of the data volume of the query request is greater than a preset threshold value; if yes, judging the query request to be slow query; otherwise, it is determined that the query request is not a slow query.
Optionally, after determining that the query request is a slow query, the method further includes: and updating the historical data volume of the historical query request with the data volume predicted value of the query request. By continuously updating the historical data volume, the timeliness of the historical data volume can be guaranteed, and the accuracy of slow query identification based on the historical data volume is improved.
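The threshold test and the history update can be sketched together. The in-memory dictionary, the threshold value, and the function names are assumptions for illustration; in particular, how the refreshed historical value is derived is left to the caller here. The comparison uses "greater than", following this paragraph.

```python
from typing import Dict


def is_slow_query(predicted_points: float, threshold: float) -> bool:
    """A query is slow when its predicted data volume exceeds the preset threshold."""
    return predicted_points > threshold


def refresh_history(history: Dict[str, float], condition: str, new_value: float) -> None:
    """Overwrite the stored historical data volume for a query condition."""
    history[condition] = new_value


# Assumed state: 2 points per second recorded for this condition.
history = {"cpu.usage{cluster=lf}": 2.0}
PRESET_THRESHOLD = 100_000  # assumed value

# A request covering a 24-hour data time range.
predicted = 24 * 3600 * history["cpu.usage{cluster=lf}"]
slow = is_slow_query(predicted, PRESET_THRESHOLD)
if slow:
    # Assumed refreshed value, e.g. recomputed after the query has run.
    refresh_history(history, "cpu.usage{cluster=lf}", 2.5)
```

In this assumed example the predicted value is 172800 data points, which exceeds the threshold, so the request is treated as a slow query and the stored value is refreshed for later predictions.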
FIG. 3 is a schematic diagram of the main flow of a method of slow query identification according to an embodiment of the present invention. As shown in fig. 3, the method of slow query identification comprises the steps of:
Step S301, receiving a user query request, and determining a query condition and a query start-stop time of the query request;
Step S302, judging whether a historical query request matching the query condition of the current query request exists; if yes, jumping to step S307; otherwise, jumping to step S303;
Step S303, executing the current query request and acquiring a query log of the current query request;
Step S304, determining the total number of data points of the current query request according to the query log;
Step S305, determining the number of data points stored per unit time according to the query start-stop time and the total number of data points of the current query request;
Step S306, recording the unit data point number corresponding to the query condition of the current query request. Illustratively, the query condition of the current query request and the corresponding unit data point number are recorded in MySQL (a relational database management system). The next time the same query condition is queried, the data time range is obtained from the query start-stop time and multiplied by the recorded unit data point number to estimate the number of data points of that query. If the estimated number of data points is greater than a preset threshold, that query can be judged to be a slow query and the query resources it uses at the backend can be limited. Meanwhile, a new unit data point number is calculated from the data time range and the query log of that query, so as to update the unit data point number recorded in MySQL;
Step S307, determining a data volume predicted value of the current query request according to the query start-stop time of the current query request and the unit data point number of the historical query request;
Step S308, judging whether the data volume predicted value of the current query request is greater than or equal to a preset threshold; if yes, jumping to step S309; otherwise, jumping to step S310;
Step S309, judging that the current query request is a slow query;
Step S310, judging that the current query request is not a slow query.
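The flow of steps S301 to S310 can be sketched as follows. This is only an illustration under stated assumptions: the in-memory dictionary unit_points stands in for the MySQL table, SLOW_THRESHOLD is a made-up threshold, run_query is a hypothetical callback that executes the query and returns the total number of data points found in its query log, and time_range is a simplified range that ignores timestamp alignment. Returning None for a first-time query reflects that the flow ends at S306 without classifying it.

```python
from typing import Callable, Dict, Optional, Tuple

SLOW_THRESHOLD = 1_000_000  # hypothetical preset threshold, in data points

# Hypothetical stand-in for the MySQL table:
# query condition -> unit data point number (data points per second)
unit_points: Dict[str, float] = {}


def time_range(start: int, end: int) -> int:
    """Simplified data time range in seconds (timestamp alignment is ignored)."""
    return max(end - start, 1)


def handle_query(
    condition: str,
    start: int,
    end: int,
    run_query: Callable[[str, int, int], int],
) -> Tuple[Optional[bool], float]:
    """Return (is_slow, data_points) for one query request.

    is_slow is None when no matching historical query request exists yet.
    """
    span = time_range(start, end)
    if condition not in unit_points:               # S302: no matching history
        total = run_query(condition, start, end)   # S303-S304: execute, count points
        unit_points[condition] = total / span      # S305-S306: record unit data points
        return None, float(total)
    predicted = span * unit_points[condition]      # S307: predicted data volume
    return predicted >= SLOW_THRESHOLD, predicted  # S308-S310: compare with threshold
```

With a backend that stores 500 points per second, a first query over 600 seconds records a unit data point number of 500. A later query with the same condition over 3600 seconds is predicted to touch 1,800,000 points and is classified as slow before it is executed.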
In this embodiment, for a given query condition, when the user queries for the first time, the number of data points stored per unit time for that query condition is calculated from the number of data points in the query log of that request and its query start-stop time. When the user queries the same query condition a second time, the number of data points of this query is estimated from the query start-stop time of the request and the unit data point number obtained from the previous query. When the estimated number of data points is greater than a preset threshold, the query is judged to be a slow query. Meanwhile, the unit data point number corresponding to the query condition is recalculated from this query and continuously updated. This embodiment can greatly improve the accuracy of slow query identification.
In a large distributed application system, a monitoring system is often built to track the running state of the system in real time (for example, whether it functions correctly and performs well). Because most monitoring data is large-scale time series data that requires aggregation, an OpenTSDB database is usually chosen. The embodiment of the invention is suitable for slow query identification when querying an OpenTSDB database.
According to a second aspect of the embodiments of the present invention, there is provided an apparatus for implementing the above method.
FIG. 6 is a schematic diagram of the main blocks of a slow query identification apparatus of an embodiment of the present invention. As shown in fig. 6, the slow query identification apparatus 600 comprises:
a determining module 601, configured to receive a user query request, and determine a query condition and a query start-stop time of the query request;
a prediction module 602, configured to query the historical data volume of the historical query request matching the query condition, and determine the data volume predicted value of the query request according to the historical data volume and the query start-stop time;
an identification module 603, configured to identify whether the query request is a slow query according to the data volume predicted value of the query request.
Optionally, the historical data amount is a data point number stored in the historical query request in unit time, and is recorded as a unit data point number; the data volume predicted value is a predicted value of the number of data points corresponding to the query request;
the prediction module determines a data volume prediction value of the query request according to the historical data volume and the query starting and stopping time, and comprises the following steps: determining a data time range of the query request according to the query start-stop time of the query request; and taking the product of the data time range of the query request and the unit data point number as the predicted value of the data point number corresponding to the query request.
Optionally, the prediction module is further configured to: before determining the data quantity predicted value of the query request according to the historical data quantity and the query starting and stopping time, determining the data time range of the historical query request according to the query condition and the query starting and stopping time of the historical query request; determining the number of data points corresponding to the historical query request according to the query log of the historical query request; and taking the quotient of the data point number corresponding to the historical query request and the data time range of the historical query request as the unit data point number.
Optionally, the query log is stored using the OpenTSDB storage model and comprises at least one row of time series data; each row of the time series data includes a row key and a row value; the row key includes a metric, a timestamp, and tag pairs; the row value includes at least one data point, and each data point includes a time offset relative to the timestamp and the metric value corresponding to that time offset;
the prediction module determines the number of data points corresponding to the historical query request according to the query log of the historical query request, and the method comprises the following steps: and taking the sum of the number of data points included in the time series data of each row corresponding to the historical query request as the number of data points corresponding to the historical query request.
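The row structure and the data point count described above can be illustrated with a simplified in-memory model. The class name Row, its field names, and the function count_data_points are hypothetical; the sketch mirrors only the logical layout (row key plus data points) and does not reproduce OpenTSDB's actual binary storage encoding.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Row:
    """One row of time series data: a row key plus the data points of one period."""
    metric: str            # row key: metric name
    base_timestamp: int    # row key: timestamp of the row, in seconds
    tags: Dict[str, str]   # row key: tag pairs
    # row value: (time offset from base_timestamp in seconds, metric value)
    points: List[Tuple[int, float]] = field(default_factory=list)


def count_data_points(rows: List[Row]) -> int:
    """Sum the number of data points over every row returned for a query."""
    return sum(len(row.points) for row in rows)
```

Dividing the returned count by the data time range of the historical query request then gives the unit data point number.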
Optionally, the prediction module is further configured to determine the data time range of the historical query request or the query request according to the following steps:
determining a time difference between the query time now of the historical query request or the query request and its query end timestamp; judging whether the time difference is greater than or equal to a preset duration;
if yes, determining a first value, namely the remainder of the query end timestamp modulo the preset duration, and determining a first difference between the query end timestamp and the first value; determining a second value, namely the remainder of the query start timestamp modulo the preset duration, and determining a second difference between the query start timestamp and the second value; determining a third difference between the first difference and the second difference; and taking the sum of the third difference and the preset duration as the data time range of the historical query request or the query request;
otherwise, determining a fourth difference between the query time now and the first difference (the first and third differences being determined as above); and taking the sum of the third difference and the fourth difference as the data time range of the historical query request or the query request.
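The numbered values and differences in these steps can be written compactly as a formula. The symbols v_1, v_2 and d_1 to d_4 are introduced here only for illustration, and reading the first and second values as remainders modulo T is an interpretation suggested by the relations _end = end - end % T and _start = start - start % T given for fig. 4 and 5.

```latex
\begin{aligned}
v_1 &= \mathit{end} \bmod T,   & d_1 &= \mathit{end} - v_1,\\
v_2 &= \mathit{start} \bmod T, & d_2 &= \mathit{start} - v_2,\\
d_3 &= d_1 - d_2,              & d_4 &= \mathit{now} - d_1,
\end{aligned}
\qquad
\text{range} =
\begin{cases}
d_3 + T,   & \mathit{now} - \mathit{end} \ge T,\\
d_3 + d_4, & \text{otherwise.}
\end{cases}
```

Under this reading, d_1 and d_2 are the end and start timestamps rounded down to a multiple of T, so d_3 is always a whole number of preset durations, and the last term adds either one full duration or only the part of it that has elapsed by the query time.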
Optionally, identifying whether the query request is a slow query according to the predicted data amount value of the query request includes: judging whether the predicted value of the data volume of the query request is greater than a preset threshold value; if yes, judging the query request to be slow query; otherwise, it is determined that the query request is not a slow query.
Optionally, after determining that the query request is a slow query, the method further includes: updating the historical data volume of the historical query request with the data volume predicted value of the query request.
According to a third aspect of embodiments of the present invention, there is provided an electronic device for slow query recognition, comprising:
one or more processors;
a storage device for storing one or more programs,
when the one or more programs are executed by the one or more processors, the one or more processors are caused to implement the method provided by the first aspect of the embodiments of the present invention.
According to a fourth aspect of embodiments of the present invention, there is provided a computer readable medium, on which a computer program is stored, which when executed by a processor, implements the method provided by the first aspect of embodiments of the present invention.
Fig. 7 illustrates an exemplary system architecture 700 of a slow query identification method or apparatus to which embodiments of the invention may be applied.
As shown in fig. 7, the system architecture 700 may include terminal devices 701, 702, 703, a network 704, and a server 705. The network 704 serves to provide a medium for communication links between the terminal devices 701, 702, 703 and the server 705. Network 704 may include various connection types, such as wired, wireless communication links, or fiber optic cables, to name a few.
A user may use the terminal devices 701, 702, 703 to interact with a server 705 over a network 704, to receive or send messages or the like. The terminal devices 701, 702, 703 may have installed thereon various communication client applications, such as a shopping-like application, a web browser application, a search-like application, an instant messaging tool, a mailbox client, social platform software, etc. (by way of example only).
The terminal devices 701, 702, 703 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop portable computers, desktop computers, and the like.
The server 705 may be a server providing various services, such as a background management server (for example only) providing support for shopping websites browsed by users using the terminal devices 701, 702, 703. The backend management server may analyze and perform other processing on the received data such as the product information query request, and feed back a processing result (for example, target push information, product information — just an example) to the terminal device.
It should be noted that the slow query identification method provided by the embodiment of the present invention is generally executed by the server 705, and accordingly, the slow query identification apparatus is generally disposed in the server 705.
It should be understood that the number of terminal devices, networks, and servers in fig. 7 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
Referring now to FIG. 8, shown is a block diagram of a computer system 800 suitable for use with a terminal device implementing an embodiment of the present invention. The terminal device shown in fig. 8 is only an example, and should not bring any limitation to the functions and the scope of use of the embodiments of the present invention.
As shown in fig. 8, the computer system 800 includes a Central Processing Unit (CPU)801 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM)802 or a program loaded from a storage section 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data necessary for the operation of the system 800 are also stored. The CPU 801, ROM 802, and RAM 803 are connected to each other via a bus 804. An input/output (I/O) interface 805 is also connected to bus 804.
The following components are connected to the I/O interface 805: an input section 806 including a keyboard, a mouse, and the like; an output section 807 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 808 including a hard disk and the like; and a communication section 809 including a network interface card such as a LAN card, a modem, or the like. The communication section 809 performs communication processing via a network such as the internet. A drive 810 is also connected to the I/O interface 805 as necessary. A removable medium 811 such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like is mounted on the drive 810 as necessary, so that a computer program read out therefrom is installed into the storage section 808 as necessary.
In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 809 and/or installed from the removable medium 811. The computer program executes the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 801.
It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor comprising: a determining module, configured to receive a user query request, and determine a query condition and a query start-stop time of the query request; a prediction module, configured to query the historical data volume of the historical query request matching the query condition, and determine the data volume predicted value of the query request according to the historical data volume and the query start-stop time; and an identification module, configured to identify whether the query request is a slow query according to the data volume predicted value of the query request. The names of these modules do not in some cases constitute a limitation on the modules themselves; for example, the identification module may also be described as a "module that identifies whether the query request is a slow query based on the data volume predicted value of the query request".
As another aspect, the present invention also provides a computer-readable medium that may be contained in the apparatus described in the above embodiments, or may exist separately without being incorporated into the apparatus. The computer readable medium carries one or more programs which, when executed by a device, cause the device to perform the following: receiving a user query request, and determining a query condition and a query start-stop time of the query request; querying the historical data volume of the historical query request matching the query condition, and determining the data volume predicted value of the query request according to the historical data volume and the query start-stop time; and identifying whether the query request is a slow query according to the data volume predicted value of the query request.
According to the technical scheme of the embodiment of the invention, the data quantity predicted value of the current query request is determined according to the query starting and stopping time of the current query request and the historical data quantity of the historical query request, and whether the current query request is slow query or not is identified according to the data quantity predicted value, so that the identification accuracy rate of the slow query can be improved.
The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method of slow query identification, comprising:
receiving a user query request, and determining a query condition and a query starting and stopping time of the query request;
querying the historical data volume of the historical query request matched with the query condition, and determining the data volume predicted value of the query request according to the historical data volume and the query starting and stopping time;
and identifying whether the query request is slow query or not according to the data quantity predicted value of the query request.
2. The method of claim 1, wherein the historical data amount is a number of data points stored in the historical query request per unit time, and is recorded as a unit number of data points; the data volume predicted value is a predicted value of the number of data points corresponding to the query request;
determining a data volume predicted value of the query request according to the historical data volume and the query starting and stopping time, wherein the data volume predicted value comprises the following steps: determining a data time range of the query request according to the query start-stop time of the query request; and taking the product of the data time range of the query request and the unit data point number as the predicted value of the data point number corresponding to the query request.
3. The method of claim 2, wherein prior to determining the data volume prediction value for the query request based on the historical data volume and the query start-stop time, further comprising:
determining a data time range of the historical query request according to the query condition and the query starting and stopping time of the historical query request; determining the number of data points corresponding to the historical query request according to the query log of the historical query request; and taking the quotient of the data point number corresponding to the historical query request and the data time range of the historical query request as the unit data point number.
4. The method of claim 3, wherein the query log is stored using the OpenTSDB storage model and includes at least one row of time series data; each row of the time series data includes a row key and a row value; the row key includes a metric, a timestamp, and tag pairs; the row value includes at least one data point, and each data point includes a time offset relative to the timestamp and the metric value corresponding to that time offset;
determining the number of data points corresponding to the historical query request according to the query log of the historical query request, wherein the determining comprises the following steps: and taking the sum of the number of data points included in the time series data of each row corresponding to the historical query request as the number of data points corresponding to the historical query request.
5. A method according to claim 2 or 3, wherein the data time range of the historical query request or the query request is determined according to the following steps:
determining a time difference between the query time now of the historical query request or the query request and its query end timestamp; judging whether the time difference is greater than or equal to a preset duration;
if so, determining a first value, namely the remainder of the query end timestamp modulo the preset duration, and determining a first difference between the query end timestamp and the first value; determining a second value, namely the remainder of the query start timestamp modulo the preset duration, and determining a second difference between the query start timestamp and the second value; determining a third difference between the first difference and the second difference; and taking the sum of the third difference and the preset duration as the data time range of the historical query request or the query request;
otherwise, determining a fourth difference between the query time now and the first difference (the first and third differences being determined as above); and taking the sum of the third difference and the fourth difference as the data time range of the historical query request or the query request.
6. The method of claim 1, wherein identifying whether the query request is a slow query based on a data volume prediction value of the query request comprises: judging whether the predicted value of the data volume of the query request is greater than a preset threshold value; if yes, judging the query request to be slow query; otherwise, it is determined that the query request is not a slow query.
7. The method of claim 6, wherein after determining that the query request is a slow query, further comprising: and updating the historical data volume of the historical query request with the data volume predicted value of the query request.
8. An apparatus for slow query identification, comprising:
the determining module is used for receiving a user query request, and determining query conditions and query starting and stopping time of the query request;
the prediction module is used for querying the historical data volume of the historical query request matching the query condition and determining the data volume predicted value of the query request according to the historical data volume and the query start-stop time;
and the identification module is used for identifying whether the query request is slow query or not according to the data quantity predicted value of the query request.
9. An electronic device for slow query recognition, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-7.
10. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-7.
CN202010081806.3A 2020-02-06 2020-02-06 Slow query identification method and device Pending CN113220705A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010081806.3A CN113220705A (en) 2020-02-06 2020-02-06 Slow query identification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010081806.3A CN113220705A (en) 2020-02-06 2020-02-06 Slow query identification method and device

Publications (1)

Publication Number Publication Date
CN113220705A true CN113220705A (en) 2021-08-06

Family

ID=77085550

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010081806.3A Pending CN113220705A (en) 2020-02-06 2020-02-06 Slow query identification method and device

Country Status (1)

Country Link
CN (1) CN113220705A (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023077823A1 (en) * 2021-11-05 2023-05-11 深圳前海微众银行股份有限公司 Slow query alarm method, electronic device, and storage medium
CN114880351A (en) * 2022-05-31 2022-08-09 中国电信股份有限公司 Slow query statement identification method and device, storage medium and electronic equipment
CN114880351B (en) * 2022-05-31 2024-02-06 中国电信股份有限公司 Recognition method and device of slow query statement, storage medium and electronic equipment

Similar Documents

Publication Publication Date Title
CN109299348B (en) Data query method and device, electronic equipment and storage medium
CN107798108B (en) Asynchronous task query method and device
CN109471783B (en) Method and device for predicting task operation parameters
CN111190888A (en) Method and device for managing graph database cluster
CN110019367B (en) Method and device for counting data characteristics
CN107291835B (en) Search term recommendation method and device
CN113220705A (en) Slow query identification method and device
CN114817651A (en) Data storage method, data query method, device and equipment
CN108985805B (en) Method and device for selectively executing push task
CN110737655B (en) Method and device for reporting data
CN110737691B (en) Method and apparatus for processing access behavior data
CN110598068A (en) Global identification generation method and device
CN115905322A (en) Service processing method and device, electronic equipment and storage medium
CN113779412B (en) Message touch method, node and system based on blockchain network
CN113722113A (en) Traffic statistic method and device
CN113760176A (en) Data storage method and device
CN113434754A (en) Method and device for determining recommended API (application program interface) service, electronic equipment and storage medium
CN113468354A (en) Method and device for recommending chart, electronic equipment and computer readable medium
CN112926613A (en) Method and device for positioning time sequence training start node
CN113535768A (en) Production monitoring method and device
CN113129473B (en) Data acquisition method, device and system
CN113362097B (en) User determination method and device
CN110019165A (en) A kind of method and apparatus for cleaning abnormal data
CN112734147A (en) Method and device for equipment evaluation management
CN117407196A (en) Data importing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination