CN117971881A - Data query method, device, electronic equipment and storage medium - Google Patents

Data query method, device, electronic equipment and storage medium

Info

Publication number
CN117971881A
Authority
CN
China
Prior art keywords
time period
target
query
data
storage node
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311865506.4A
Other languages
Chinese (zh)
Inventor
叶权
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing QIYI Century Science and Technology Co Ltd
Original Assignee
Beijing QIYI Century Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing QIYI Century Science and Technology Co Ltd
Priority to CN202311865506.4A
Publication of CN117971881A

Abstract

The application relates to a data query method, a device, an electronic device and a storage medium, applied to a data storage system. The data storage system comprises a plurality of storage nodes, and when data is generated, the storage nodes in a working state store the data simultaneously. The method comprises: acquiring a query time period; acquiring a working time period of each storage node in the data storage system; determining, according to the query time period and the working time period of each storage node, a plurality of target storage nodes to be queried and a target time period to be queried on each target storage node; querying the target data stored by each target storage node within its corresponding target time period to obtain a plurality of target data; and merging the plurality of target data to obtain query data corresponding to the query time period. In this way, different portions of the stored data can be queried from a plurality of storage nodes at the same time, which improves the efficiency and success rate of data queries.

Description

Data query method, device, electronic equipment and storage medium
Technical Field
The present application relates to the field of data processing technologies, and in particular, to a data query method, a data query device, an electronic device, and a storage medium.
Background
Prometheus is an open-source monitoring system. In cloud-native scenarios, Prometheus clusters are used more and more widely and at ever larger scale, and as the scale grows, performance, resource efficiency and cost receive increasing attention.
In the prior art, a Prometheus cluster may include multiple instances, all of which store the monitored data at the same time. However, in practical applications, one or more of the instances may be unable to work due to failure, network problems, equipment maintenance, etc. In that case, when a user wants to query the monitoring data, a single-instance query fails if the instance queried is one that was unable to work; a multi-instance query returns a large amount of duplicate data, which must be merged and then de-duplicated, wasting a large amount of CPU and storage resources.
Disclosure of Invention
The application provides a data query method, a data query device, an electronic device and a storage medium, to solve the following technical problem in the prior art: when data is queried, a single-instance query fails if the instance queried is one that was unable to work, whereas a multi-instance query returns a large amount of duplicate data that must be merged and then de-duplicated, wasting a large amount of CPU and storage resources.
In a first aspect, the present application provides a data query method applied to a data storage system, where the data storage system includes a plurality of storage nodes, and when data is generated, the storage nodes in a working state in the data storage system store the data at the same time, and the method includes:
acquiring a query time period;
acquiring a working time period of each storage node in the data storage system;
determining, according to the query time period and the working time period of each storage node, a plurality of target storage nodes to be queried and a target time period to be queried on each target storage node;
querying the target data stored by each target storage node within its corresponding target time period to obtain a plurality of target data;
and merging the plurality of target data to obtain query data corresponding to the query time period.
As one possible implementation manner, the acquiring a query period includes:
receiving a time period input by a user through a visual interface;
determining an initial time and an end time of the time period;
acquiring an initial timestamp of the initial time and an end timestamp of the end time;
and determining the continuous interval formed by the initial timestamp and the end timestamp as the query time period.
As one possible implementation manner, the acquiring the working period of each storage node in the data storage system includes:
determining whether the query time period belongs to a preset time range; sending a preset query instruction to each storage node of the data storage system in the case that the query time period belongs to the preset time range; and receiving the working time period within the preset time range fed back by each storage node;
or, alternatively,
sending a query instruction carrying the query time period to each storage node of the data storage system; and receiving, from each storage node, the working time period within a target time range that the storage node determines according to the query time period.
As one possible implementation manner, the determining a plurality of target storage nodes to be queried according to the querying time period and the working time period of each storage node, and the target time period queried by each target storage node includes:
traversing the working time period of each storage node, and determining whether there is a first storage node whose working time period contains the query time period;
if a first storage node exists, determining each first storage node as a target storage node, and dividing the query time period according to the first number of first storage nodes to obtain the target time period queried by each target storage node;
if no first storage node exists, dividing the query time period according to the second number of storage nodes of the data storage system to obtain a plurality of sub-query time periods, and determining, for each sub-query time period, the corresponding target storage nodes and the target time period queried by each target storage node.
As one possible implementation manner, the dividing the query time period according to the first number of the first storage nodes to obtain a target time period queried by each target storage node includes:
determining the query duration corresponding to the query time period;
dividing the query duration by the first number to obtain a target duration;
dividing the query time period according to the target duration to obtain a plurality of target time periods;
and assigning the target time periods to the target storage nodes in one-to-one correspondence, so as to obtain the target time period queried by each target storage node.
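The division steps above can be illustrated with a minimal Python sketch. The function name `split_period`, the use of integer second timestamps, and the choice to give any remainder to the last target time period are assumptions made for illustration; the application does not specify how a remainder is handled.

```python
from typing import List, Tuple

Period = Tuple[int, int]  # (start, end) timestamps in seconds; hypothetical representation


def split_period(start: int, end: int, count: int) -> List[Period]:
    """Split [start, end) into `count` consecutive target time periods.

    The target duration is the query duration divided by `count`.
    Assumption: when the division is not exact, the last period
    absorbs the remainder so the whole query period stays covered.
    """
    if count <= 0 or end <= start:
        raise ValueError("count must be positive and end must be after start")
    target_duration = (end - start) // count
    bounds = [start + i * target_duration for i in range(count)] + [end]
    return list(zip(bounds[:-1], bounds[1:]))


# One target time period per target storage node (one-to-one correspondence).
first_nodes = ["node-a", "node-b", "node-c"]  # hypothetical node names
assignment = dict(zip(first_nodes, split_period(0, 90, len(first_nodes))))
```

With three first storage nodes and a 90-second query time period, each node is assigned one 30-second target time period.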
As a possible implementation manner, the dividing the query time period according to the second number of storage nodes of the data storage system to obtain a plurality of sub-query time periods includes:
determining the query duration corresponding to the query time period;
dividing the query duration by the second number to obtain an average duration;
and dividing the query time period according to the average duration to obtain a plurality of sub-query time periods.
As one possible implementation manner, the determining the target storage node corresponding to the sub-query time period and the target time period queried by each target storage node includes:
traversing the working time period of each storage node, and determining whether there is a second storage node whose working time period contains the sub-query time period;
if second storage nodes exist, determining any N of the second storage nodes as target storage nodes, and dividing the sub-query time period according to N to obtain the target time period queried by each target storage node, where N is a positive integer;
if no second storage node exists, determining each storage node in the data storage system as a target storage node, and determining the sub-query time period as the target time period queried by each target storage node.
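The two-level selection described above (first storage nodes, then sub-query time periods with second storage nodes, then the fall-back to all storage nodes) can be sketched as follows. This is only an illustration under stated assumptions: the names `plan_query`, `contains` and `split_evenly` are hypothetical, working time periods are modelled as already-merged `(start, end)` integer timestamp pairs, and "any N" is interpreted as the first N candidates in dictionary order.

```python
from typing import Dict, List, Tuple

Period = Tuple[int, int]  # (start, end) timestamps in seconds


def contains(work_periods: List[Period], target: Period) -> bool:
    """True if a single working period fully covers the target period."""
    return any(s <= target[0] and target[1] <= e for s, e in work_periods)


def split_evenly(period: Period, parts: int) -> List[Period]:
    """Split a period into `parts` consecutive pieces; the last takes any remainder."""
    start, end = period
    step = (end - start) // parts
    bounds = [start + i * step for i in range(parts)] + [end]
    return list(zip(bounds[:-1], bounds[1:]))


def plan_query(query: Period, nodes: Dict[str, List[Period]],
               n: int = 2) -> List[Tuple[str, Period]]:
    """Return (node, target time period) pairs for the query time period."""
    # Case 1: some nodes worked during the whole query time period.
    first = [name for name, wp in nodes.items() if contains(wp, query)]
    if first:
        return list(zip(first, split_evenly(query, len(first))))
    # Case 2: split by the total number of nodes and plan each sub-query period.
    plan: List[Tuple[str, Period]] = []
    for sub in split_evenly(query, len(nodes)):
        second = [name for name, wp in nodes.items() if contains(wp, sub)][:n]
        if second:
            plan.extend(zip(second, split_evenly(sub, len(second))))
        else:
            # No node covers this sub-period alone: ask every node for it.
            plan.extend((name, sub) for name in nodes)
    return plan
```

In the fall-back branch every storage node is queried for the same sub-query time period, so the merged result for that sub-period may still contain duplicates; the sketch does not attempt to remove them.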
As one possible implementation manner, the determining a plurality of target storage nodes to be queried according to the querying time period and the working time period of each storage node, and the target time period queried by each target storage node includes:
traversing the working time period of each storage node, and finding sub-working time periods of M storage nodes that together constitute the query time period;
determining the M storage nodes as target storage nodes;
and, for each target storage node, determining its corresponding sub-working time period as the target time period queried by that target storage node.
In a second aspect, an embodiment of the present application provides a data query device applied to a data storage system, where the data storage system includes a plurality of storage nodes, and when data is generated, the storage nodes in a working state in the data storage system store the data at the same time, where the device includes:
The first acquisition module is used for acquiring the inquiry time period;
The second acquisition module is used for acquiring the working time period of each storage node in the data storage system;
The determining module is used for determining a plurality of target storage nodes to be queried and a target time period queried by each target storage node according to the query time period and the working time period of each storage node;
The query module is used for querying the target data stored by each target storage node in the corresponding target time period to obtain a plurality of target data;
and the merging module is used for merging the plurality of target data to obtain query data corresponding to the query time period.
As an optional implementation manner, the first obtaining module is specifically configured to:
receiving a time period input by a user through a visual interface;
Determining an initial time and an end time of the time period;
Acquiring an initial time stamp of the initial time and an end time stamp of the end time;
And determining a continuous interval formed by the initial timestamp and the ending timestamp as a query time period.
As an optional implementation manner, the second obtaining module is specifically configured to:
determining whether the query time period belongs to a preset time range; sending a preset query instruction to each storage node of the data storage system in the case that the query time period belongs to the preset time range; and receiving the working time period within the preset time range fed back by each storage node;
or, alternatively,
sending a query instruction carrying the query time period to each storage node of the data storage system; and receiving, from each storage node, the working time period within a target time range that the storage node determines according to the query time period.
As one possible implementation manner, the determining module includes:
a traversing sub-module, configured to traverse the working time period of each storage node, and determine whether there is a first storage node whose working time period includes the query time period;
A first determining submodule, configured to determine the first storage node as a target storage node if the first storage node exists; dividing the inquiry time period according to the first number of the first storage nodes to obtain a target time period inquired by each target storage node;
A second determining sub-module, configured to divide the query time period according to a second number of storage nodes of the data storage system if the first storage node does not exist, to obtain a plurality of sub-query time periods; and determining a target storage node corresponding to each sub-query time period and a target time period queried by each target storage node according to each sub-query time period.
As a possible implementation manner, the first determining submodule is specifically configured to:
determining the query duration corresponding to the query time period;
dividing the query duration by the first number to obtain a target duration;
dividing the query time period according to the target duration to obtain a plurality of target time periods;
and assigning the target time periods to the target storage nodes in one-to-one correspondence, so as to obtain the target time period queried by each target storage node.
As a possible implementation manner, the second determining submodule is specifically configured to:
determining the query duration corresponding to the query time period;
dividing the query duration by the second number to obtain an average duration;
and dividing the query time period according to the average duration to obtain a plurality of sub-query time periods.
As a possible implementation manner, the second determining submodule is specifically configured to:
Traversing the working time period of each storage node, and determining whether a second storage node with the working time period containing the sub-query time period exists;
If the second storage nodes exist, determining any N second storage nodes as target storage nodes; dividing the sub-query time period according to the N to obtain a target time period queried by each target storage node, wherein the N is a positive integer;
If the second storage node does not exist, determining each storage node in the data storage system as a target storage node; and determining the sub-query time period as a target time period queried by each target storage node.
As a possible implementation manner, the determining module is specifically configured to:
traversing the working time periods of each storage node, and searching sub-working time periods of M storage nodes to form the query time period;
Determining M storage nodes as target storage nodes;
For each target storage node, determining the corresponding sub-working time period as the target time period queried by the target storage node.
In a third aspect, an embodiment of the present application provides an electronic device, including: a processor and a memory, the processor being configured to execute a data query program stored in the memory, to implement the data query method according to any one of the first aspects.
In a fourth aspect, an embodiment of the present application provides a storage medium storing one or more programs executable by one or more processors to implement the data query method of any one of the first aspects.
According to the technical solution provided by the embodiments of the application, a query time period is acquired; a working time period of each storage node in the data storage system is acquired; a plurality of target storage nodes to be queried, and a target time period to be queried on each target storage node, are determined according to the query time period and the working time period of each storage node; the target data stored by each target storage node within its corresponding target time period is queried to obtain a plurality of target data; and the plurality of target data are merged to obtain query data corresponding to the query time period. Because the target storage nodes are determined according to the working time periods of the storage nodes, query failures caused by querying a storage node that holds no data for the query time period are avoided. Because each target storage node is assigned its own target time period, the data can be queried from a plurality of target storage nodes at the same time, which prevents query failures caused by overloading a single target storage node. Different portions of the stored data are thus queried from a plurality of storage nodes simultaneously, improving the efficiency and success rate of data queries.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the invention and together with the description, serve to explain the principles of the invention.
In order to more clearly illustrate the embodiments of the invention or the technical solutions of the prior art, the drawings which are used in the description of the embodiments or the prior art will be briefly described, and it will be obvious to a person skilled in the art that other drawings can be obtained from these drawings without inventive effort.
One or more embodiments are illustrated by way of example and not limitation in the figures of the accompanying drawings, in which like references indicate similar elements, and in which the figures of the drawings are not to be taken in a limiting sense, unless otherwise indicated.
FIG. 1 is a schematic diagram of a data storage system according to an embodiment of the present application;
FIG. 2 is a flowchart of an embodiment of a data query method according to an embodiment of the present application;
FIG. 3 is a flowchart of another embodiment of a data query method according to an embodiment of the present application;
FIG. 4 is a block diagram of an embodiment of a data query device according to an embodiment of the present application;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The following disclosure provides many different embodiments, or examples, for implementing different structures of the invention. In order to simplify the present disclosure, components and arrangements of specific examples are described below. They are, of course, merely examples and are not intended to limit the invention. Furthermore, the present invention may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed.
To solve the technical problem in the prior art that, when data is queried, a single-instance query fails if the instance queried is one that was unable to work, whereas a multi-instance query returns a large amount of duplicate data that must be merged and then de-duplicated, wasting a large amount of CPU and storage resources, the embodiments of the present application provide a data query method.
In order to facilitate understanding of the data query method provided by the application, an application scenario related to the application is first described in the following by way of example.
Referring to fig. 1, a schematic structure diagram of a data storage system according to an embodiment of the present application is provided. As shown in fig. 1, the data storage system 10 may include a terminal 11, a storage node 12, a storage node 13, and a storage node 14.
The data storage system 10 may be a Prometheus cluster or another data storage system. Prometheus is an open-source monitoring system; in cloud-native scenarios, Prometheus clusters are used more and more widely and at ever larger scale, and as the scale grows, performance, resource efficiency and cost receive increasing attention.
The terminal 11 may be hardware or software that supports network connections and provides various network services. When the terminal 11 is hardware, it may be any of various electronic devices with a display screen, including but not limited to a smart phone, a tablet computer, a laptop computer, a desktop computer, etc.; FIG. 1 shows the terminal 11 as a desktop computer only by way of example. When the terminal 11 is software, it may be installed in any of the electronic devices listed above.
The storage nodes 12, 13, and 14 are storage nodes in a cluster for storing data, and each storage node may be one server or a cluster formed by a plurality of servers, which is not limited in this embodiment of the present application. The storage nodes can work simultaneously in the data storage system, namely, the generated data can be stored simultaneously.
It should be noted that, the number of storage nodes included in the data storage system according to the embodiments of the present application is not limited, and may be two, three, or more than three.
In the prior art, a Prometheus cluster may include multiple instances (i.e., the storage nodes described above), and the multiple storage nodes store the monitored data simultaneously. However, in practical applications, one or more storage nodes may be unable to work due to failure, network problems, equipment maintenance, etc. In that case, when a user wants to query the monitoring data, querying a single storage node fails if that storage node was unable to work during the query time period; querying multiple storage nodes simultaneously returns a large amount of duplicate data, which must be merged and then de-duplicated, wasting a large amount of CPU and storage resources.
In this regard, the embodiments of the present application provide a data query method. When a user needs to query data, a corresponding query time period may be input to the terminal 11. By applying the data query method provided by the embodiments of the present application, the terminal 11 may determine the target storage nodes to be queried and the target time period to be queried on each target storage node. Data in different time periods can then be queried from a plurality of target storage nodes at the same time, so that each storage node returns a different portion of the stored data, which improves the efficiency and success rate of data queries.
For example, assume that the terminal 11 determines the storage node 12 and the storage node 13 as target storage nodes by using the data query method provided by the embodiment of the present application, and the target time period queried by the storage node 12 is t1 to t2, and the target time period queried by the storage node 13 is t3 to t4. Then, the data in the time period t1 to t2 may be queried from the storage node 12, and the data in the time period t3 to t4 may be queried from the storage node 13.
The data query method provided by the application is further explained with reference to the drawings by using specific embodiments, and the embodiments do not limit the embodiments of the application.
Referring to fig. 2, a flowchart of an embodiment of a data query method is provided in an embodiment of the present application. As one example, the flow shown in fig. 2 may be applied to the data storage system shown in fig. 1, which may include a plurality of storage nodes, and when data is generated, the storage nodes in the data storage system that are in operation may store the data at the same time. As shown in fig. 2, the process may include the steps of:
step 201, acquiring a query time period.
The query time period refers to the time period covered by the data that the user wants to query. The query time period may be a continuous interval formed by two different timestamps, which is not limited in the embodiments of the present application.
Alternatively, the query time period may be one continuous time period or a plurality of discontinuous time periods, which is not limited in this embodiment of the present application.
In an embodiment, when a user wants to query data, a time period may be selected or input in the visual interface, based on which the execution subject of the embodiment of the present application may receive the time period input by the user through the visual interface, and determine an initial time and an end time of the time period.
Then, an initial time stamp corresponding to the initial time and an end time stamp corresponding to the end time are obtained, and a continuous section formed by the initial time stamp and the end time stamp is determined as the inquiry time period.
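As a minimal sketch of this step, the following Python function turns two time strings into a query time period of integer timestamps. The function name `to_query_period`, the text format, and the assumption that the entered times are in UTC are all illustrative choices; the application does not specify how the visual interface encodes the times.

```python
from datetime import datetime, timezone
from typing import Tuple


def to_query_period(start_text: str, end_text: str,
                    fmt: str = "%Y-%m-%d %H:%M:%S") -> Tuple[int, int]:
    """Convert an initial time and an end time into (initial, end) timestamps.

    Assumption: the times entered by the user are interpreted as UTC.
    """
    start = datetime.strptime(start_text, fmt).replace(tzinfo=timezone.utc)
    end = datetime.strptime(end_text, fmt).replace(tzinfo=timezone.utc)
    start_ts, end_ts = int(start.timestamp()), int(end.timestamp())
    if start_ts >= end_ts:
        raise ValueError("the end time must be later than the initial time")
    # The continuous interval [start_ts, end_ts] is the query time period.
    return start_ts, end_ts
```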
Step 202, acquiring a working time period of each storage node in the data storage system.
The data storage system may include a plurality of storage nodes, and when data is generated, the storage nodes in an operating state in the storage system may store the generated data at the same time.
In practical applications, a storage node in the data storage system may be unable to work due to a fault, a network problem, or equipment maintenance. Therefore, in order to determine the working time of each storage node, in the embodiments of the present application, a time range index may be added to each storage node; that is, while the storage node is in a working state, the working time period of the storage node is recorded. The working time period may be time-series data represented in the form of timestamps, for example: 1688123434:1688427131-1688427131:1688440112-1688440112:1688441109.
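The example record above can be read as a sequence of `start:end` timestamp pairs joined by `-`. That reading is an assumption inferred from the example, not a format defined by the application. Under that assumption, a short Python sketch for parsing such a record and merging adjacent working time periods might look like this (the function names are hypothetical):

```python
from typing import List, Tuple

Period = Tuple[int, int]  # (start, end) timestamps in seconds


def parse_work_periods(raw: str) -> List[Period]:
    """Parse 'start:end-start:end-...' into a list of (start, end) pairs."""
    periods = []
    for segment in raw.split("-"):
        start, end = segment.split(":")
        periods.append((int(start), int(end)))
    return periods


def merge_adjacent(periods: List[Period]) -> List[Period]:
    """Merge working periods that touch or overlap into longer periods."""
    merged: List[Period] = []
    for start, end in sorted(periods):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


raw = "1688123434:1688427131-1688427131:1688440112-1688440112:1688441109"
periods = parse_work_periods(raw)
```

Merging matters when checking whether a node's working time contains a query time period, because the three recorded segments in the example are contiguous and together form one uninterrupted working time period.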
Based on this, when determining to perform data query, the execution body of the embodiment of the present application may first acquire a working period of each storage node in the data storage system to determine which storage node is specifically performing data storage, so as to prevent the queried storage node from not being in a working state in the query period, which results in query failure.
In one embodiment, the data storage system may be pre-configured with a time range (e.g., the last year or the last two years), and by default each storage node is queried for its working time periods within this preset time range. Based on this, it may be determined whether the query time period belongs to the preset time range, and in the case that the query time period belongs to the preset time range, a preset query instruction may be sent directly to each storage node of the data storage system.
After receiving the query instruction, the storage node may feed back the working time period within the preset range to the execution body according to the embodiment of the present application, so that the execution body according to the embodiment of the present application may receive the working time period within the preset time range fed back by each storage node.
Optionally, in the case that the query time period does not belong to the preset time range, a target time range to which the query time period belongs may be determined, and a query instruction carrying the target time range may be sent to each storage node.
After that, when each storage node receives the query instruction, the target time range carried by the query instruction can be determined, the working time period in the target time range is searched, and the working time period is fed back to the execution main body of the embodiment of the application. Thus, the execution body of the embodiment of the present application may receive the working time period within the target time range fed back by each storage node.
In another embodiment, the target time range to which the query time period belongs may be directly determined, and the query instruction carrying the target time range is sent to each storage node.
After that, when each storage node receives the query instruction, the target time range carried by the query instruction can be determined, the working time period in the target time range is searched, and the working time period is fed back to the execution main body of the embodiment of the application. Thus, the execution body of the embodiment of the present application may receive the working time period within the target time range fed back by each storage node.
In yet another embodiment, the query instruction carrying the above-described query period may be sent directly to each storage node of the data storage system. After receiving the query instruction, each storage node can analyze the query time period carried by the query instruction, determine the target time range to which the query time period belongs, and feed back the working time period in the target time range to the execution subject of the embodiment of the application.
Based on this, the execution body of the embodiment of the present application may receive the working time period within the target time range determined according to the query time period fed back by each storage node.
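On the storage node side, feeding back "the working time period within the target time range" amounts to clipping the recorded working time periods to that range. A minimal sketch, assuming working time periods are `(start, end)` timestamp pairs and using the hypothetical function name `clip_to_range`:

```python
from typing import List, Tuple

Period = Tuple[int, int]  # (start, end) timestamps in seconds


def clip_to_range(periods: List[Period], target_range: Period) -> List[Period]:
    """Keep only the parts of each working period that fall inside target_range."""
    low, high = target_range
    clipped = []
    for start, end in periods:
        clipped_start, clipped_end = max(start, low), min(end, high)
        if clipped_start < clipped_end:  # drop periods entirely outside the range
            clipped.append((clipped_start, clipped_end))
    return clipped
```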
Step 203, determining a plurality of target storage nodes to be queried and a target time period queried by each target storage node according to the query time period and the working time period of each storage node.
Step 204, querying target data stored in each target storage node in the corresponding target time period to obtain a plurality of target data.
Step 205, merging the plurality of target data to obtain query data corresponding to the query time period.
The following collectively describes steps 203 to 205:
the target storage node refers to a storage node for querying data in the query time period.
The target time period is a time period to be queried by the target storage node, and may be a certain sub-time period of the query time period or may be the whole query time period, which is not limited in the embodiment of the present application.
The query data refers to data in a query time period queried by a user, which may be all data corresponding to the query time period or may be part of data corresponding to the query time period, which is not limited in the embodiment of the present application.
In the embodiment of the application, in order to prevent a query failure from being caused if a queried storage node is not in a working state in a query time period when querying data and querying a storage node; when a plurality of storage nodes are queried simultaneously, a large amount of repeated data may exist in the queried data, and at this time, the repeated data is required to be deduplicated through computing resources and storage resources of the system, so that the computing resources and the storage resources are wasted.
Then, the target data in the target time period corresponding to each target storage node can be queried, so as to obtain a plurality of target data. Since the target time periods queried by the target storage nodes differ from one another, the target data they return also differ, that is, the plurality of target data do not duplicate each other. In addition, because the data in the query time period is split across a plurality of target storage nodes, these nodes share the query load, which avoids the query failure that overload might cause if a single storage node had to return all the data in the query time period.
And finally, merging the target data queried by each target storage node to obtain the query data corresponding to the query time period.
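Steps 204 and 205 can be illustrated with a minimal sketch. It assumes that each target storage node returns samples as (timestamp, value) tuples sorted by timestamp and that target time periods are half-open intervals; the `fetch` callable, the `STORE` dictionary and the node names are hypothetical stand-ins, not part of the application.

```python
import heapq
from typing import Callable, Dict, List, Tuple

Sample = Tuple[int, float]  # (timestamp, value); this shape is an assumption
Period = Tuple[int, int]    # half-open interval [start, end)


def query_and_merge(
    plan: Dict[str, Period],
    fetch: Callable[[str, Period], List[Sample]],
) -> List[Sample]:
    """Query every target node for its own target period, then merge by time."""
    # Step 204: one query per target storage node, limited to its target period.
    partial_results = [fetch(node, period) for node, period in plan.items()]
    # Step 205: the target periods do not overlap, so a k-way merge on the
    # timestamp yields the query data without a deduplication pass.
    return list(heapq.merge(*partial_results, key=lambda sample: sample[0]))


# Hypothetical in-memory stand-in for two storage nodes holding overlapping data.
STORE = {
    "node1": [(2, 0.2), (3, 0.3), (4, 0.4), (6, 0.6)],
    "node2": [(4, 0.4), (5, 0.5), (6, 0.6), (8, 0.8)],
}


def fake_fetch(node: str, period: Period) -> List[Sample]:
    start, end = period
    return [s for s in STORE[node] if start <= s[0] < end]


merged = query_and_merge({"node1": (2, 5), "node2": (5, 9)}, fake_fetch)
```

Although both hypothetical nodes hold the samples at timestamps 4 and 6, each sample appears only once in `merged`, because each node is asked only for its own target time period.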
In one embodiment, the target storage nodes and the target time period queried by each target storage node may be determined by combining working time periods of the storage nodes so that together they form the query time period.
As an exemplary embodiment, the working time period of each storage node can be traversed, and the sub-working time periods of M storage nodes can be searched to form a query time period, wherein M is a positive integer. Then, the M storage nodes may be determined as target storage nodes, and for each target storage node, the corresponding sub-working time period is determined as the target time period queried by the target storage node.
For example, assume the query time period is 2:00-12:00. By traversing the working time period of each storage node, it can be found that the sub-working time period 2:00-5:00 of storage node 1, the sub-working time period 5:00-9:00 of storage node 2, and the sub-working time period 9:00-12:00 of storage node 3 together constitute the above-mentioned query time period.
Thereafter, according to the above method, storage node 1, storage node 2, and storage node 3 may be determined as target storage nodes, the sub-working time period 2:00-5:00 is determined as the target time period queried by storage node 1, the sub-working time period 5:00-9:00 is determined as the target time period queried by storage node 2, and the sub-working time period 9:00-12:00 is determined as the target time period queried by storage node 3.
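One possible way to search for such sub-working time periods is sketched below. The application does not specify the search procedure; the greedy choice (always taking the working time period that reaches furthest), the half-open hour intervals and the function name `cover_query_period` are assumptions made only for illustration.

```python
from typing import Dict, List, Optional, Tuple

Period = Tuple[int, int]  # half-open interval [start, end), here in hours


def cover_query_period(
    query: Period, working: Dict[str, List[Period]]
) -> Optional[List[Tuple[str, Period]]]:
    """Return (node, sub-working period) pieces that together form `query`.

    Returns None when some part of the query period is covered by no node.
    """
    q_start, q_end = query
    cursor = q_start
    pieces: List[Tuple[str, Period]] = []
    while cursor < q_end:
        best_node, best_end = None, cursor
        for node, periods in working.items():
            for start, end in periods:
                # The node must already be working at `cursor`; prefer the
                # period that reaches furthest so that fewer pieces are needed.
                if start <= cursor < end and end > best_end:
                    best_node, best_end = node, end
        if best_node is None:
            return None  # gap: no node was working at `cursor`
        piece_end = min(best_end, q_end)
        pieces.append((best_node, (cursor, piece_end)))
        cursor = piece_end
    return pieces


# Working time periods chosen so that the result matches the example above.
working = {
    "storage node 1": [(0, 5)],
    "storage node 2": [(5, 9)],
    "storage node 3": [(9, 14)],
}
plan = cover_query_period((2, 12), working)
```

With these assumed working time periods, `plan` assigns 2:00-5:00 to storage node 1, 5:00-9:00 to storage node 2 and 9:00-12:00 to storage node 3.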
In another embodiment, the multiple target storage nodes to be queried and the target time period queried by each target storage node may be determined by the flow shown in fig. 3, which is not described in detail herein.
According to the technical scheme provided by the embodiment of the application, a query time period is acquired, the working time period of each storage node in the data storage system is acquired, a plurality of target storage nodes to be queried and the target time period queried by each target storage node are determined according to the query time period and the working time period of each storage node, the target data in the corresponding target time period stored by each target storage node is queried to obtain a plurality of target data, and the plurality of target data are merged to obtain the query data corresponding to the query time period. Because the target storage nodes are determined according to the working time periods of the storage nodes, query failure caused by a queried storage node not holding data for the query time period is avoided. Because each target storage node is assigned its own target time period, data can be queried from a plurality of target storage nodes simultaneously, which prevents query failure caused by overload of a single target storage node. Different data stored by the plurality of storage nodes are thus queried at the same time, which improves the efficiency and success rate of data query.
Referring to fig. 3, a flowchart of an embodiment of another data query method according to an embodiment of the present application is provided. The flow shown in fig. 3 describes how to determine a plurality of target storage nodes to be queried and a target time period queried by each target storage node based on the flow shown in fig. 2, specifically, based on the query time period and the operation time period of each storage node. As shown in fig. 3, the process may include the steps of:
Step 301, traversing the working time period of each storage node, determining whether there is a first storage node whose working time period includes the query time period, if so, executing step 302; if not, go to step 304.
The first storage node refers to a storage node in which the working period completely includes the query period.
In the embodiment of the present application, after the working time period of each storage node is acquired, the working time periods may be traversed to determine whether, among the storage nodes whose working time periods were acquired, there is a storage node whose working time period includes the above-mentioned query time period (hereinafter referred to as a first storage node for convenience of description). It should be noted that the storage nodes whose working time periods are acquired may be storage nodes currently in a working state, that is, storage nodes currently available for data query.
Optionally, step 302 may be performed if the first storage node described above exists.
Conversely, in the absence of the first storage node described above, step 304 may be performed.
Step 302, determining the first storage node as a target storage node.
The target storage node refers to a determined storage node capable of querying data.
In the embodiment of the present application, when there is a first storage node whose working time period includes the query time period, this indicates that the data in the query time period can be successfully queried from that storage node, so the first storage node can be determined as a target storage node.
Step 303, dividing the query time period according to the first number of the first storage nodes, to obtain a target time period queried by each target storage node.
The target time period refers to a time period to be queried by the target storage node.
In the embodiment of the application, after the first storage node is determined as the target storage node, in order to enable a plurality of storage nodes to bear the load of query data together and avoid query failure caused by overload of the queried storage nodes, the execution body of the embodiment of the application can divide the query time period and determine the queried target time period for each determined target storage node.
In an embodiment, the query time period may be divided according to the number of first storage nodes (hereinafter, referred to as the first number for convenience of description), so as to obtain the target time period queried by each target storage node.
As an exemplary embodiment, a query duration corresponding to the query time period may be determined, and the query duration may be divided by the first number to obtain the target duration.
Then, the query time period can be divided according to the target time length to obtain a plurality of target time periods, and the target time periods are in one-to-one correspondence with the target storage nodes to obtain the target time period queried by each target storage node.
For example, assume the query time period is 2:00-22:00, so that the corresponding query duration is 20 hours, and further assume that the first number is 4, specifically target storage node 1, target storage node 2, target storage node 3, and target storage node 4. The target duration obtained according to the above method is then 20 / 4 = 5 hours.
Then, the query time period is divided according to the target duration, so that the following 4 target time periods are obtained: 2:00-7:00, 7:00-12:00, 12:00-17:00, and 17:00-22:00. The 4 target time periods are placed in one-to-one correspondence with the 4 target storage nodes to determine the target time period corresponding to each target storage node. That is, the target time period corresponding to target storage node 1 may be 2:00-7:00, that corresponding to target storage node 2 may be 7:00-12:00, that corresponding to target storage node 3 may be 12:00-17:00, and that corresponding to target storage node 4 may be 17:00-22:00.
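The division in step 303 can be sketched as follows. The sketch assumes that times are expressed as minutes since midnight and that, when the query duration is not evenly divisible, slice lengths may differ by one unit; the application does not state how a remainder is handled, and the function names are hypothetical.

```python
from typing import Dict, List, Tuple

Period = Tuple[int, int]  # (start, end) in minutes since midnight; an assumed unit


def split_evenly(period: Period, parts: int) -> List[Period]:
    """Divide `period` into `parts` consecutive target time periods."""
    if parts <= 0:
        raise ValueError("parts must be a positive integer")
    start, end = period
    duration = end - start  # the query duration
    # Integer arithmetic keeps the boundaries exact; when the duration is not
    # divisible by `parts`, slice lengths differ by at most one unit.
    bounds = [start + duration * i // parts for i in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def assign(period: Period, nodes: List[str]) -> Dict[str, Period]:
    """Pair the target time periods one-to-one with the target storage nodes."""
    return dict(zip(nodes, split_evenly(period, len(nodes))))


def fmt(period: Period) -> str:
    """Render a period as H:MM-H:MM for display."""
    return "-".join(f"{m // 60}:{m % 60:02d}" for m in period)


nodes = [f"target storage node {i}" for i in range(1, 5)]
assignment = assign((2 * 60, 22 * 60), nodes)
readable = {node: fmt(period) for node, period in assignment.items()}
```

For the 2:00-22:00 query time period and four first storage nodes, `readable` maps the nodes to 2:00-7:00, 7:00-12:00, 12:00-17:00 and 17:00-22:00, matching the example above. The same `split_evenly` helper would also serve for the division by the second number in step 304.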
Step 304, dividing the query time period according to the second number of storage nodes of the data storage system to obtain a plurality of sub-query time periods.
In the embodiment of the application, when it is determined that no first storage node whose working time period includes the query time period exists, this indicates that none of the storage nodes currently in a working state was in a working state throughout the entire query time period. Therefore, in order to acquire the data of the query time period, the query time period may be divided to obtain a plurality of sub-query time periods, and a target storage node corresponding to each sub-query time period may be determined.
In an embodiment, the execution body of the embodiment of the present application may divide the query time period according to the number of storage nodes of the data storage system (hereinafter referred to as the second number for convenience of description), so as to obtain a plurality of sub-query time periods.
As an exemplary embodiment, the query duration corresponding to the query time period may be determined, and the query duration may be divided by the second number to obtain an average duration.
Then, the inquiry time period can be divided according to the average time length to obtain a plurality of sub-inquiry time periods.
For example, assume the query time period is 2:00-22:00, so that the corresponding query duration is 20 hours, and further assume that the second number is 4, specifically storage node 1, storage node 2, storage node 3, and storage node 4. The average duration obtained according to the above method is then 5 hours, and dividing the query time period according to the average duration yields the following 4 sub-query time periods: 2:00-7:00, 7:00-12:00, 12:00-17:00, and 17:00-22:00.
Step 305, for each sub-query time period, determining a target storage node corresponding to the sub-query time period, and the target time period queried by each target storage node.
In the embodiment of the application, in order to ensure that the data of the query time period is acquired, the target storage node corresponding to each sub-query time period and the target time period queried by each target storage node can be determined for each sub-query time period.
As an exemplary embodiment, the working period of each storage node may be traversed to determine whether there is a storage node (hereinafter referred to as a second storage node for ease of description) whose working period includes a sub-query period.
Optionally, if the second storage node exists, any N second storage nodes may be determined as target storage nodes, where N may be a positive integer. That is, the execution body of the embodiment of the present application may select any one of the second storage nodes to query the sub-query time period, or may select a plurality of the second storage nodes to query the sub-query time period.
Then, the sub-query time period may be divided according to the N, so as to obtain a target time period queried by each target storage node.
As an optional implementation manner, if a second storage node is selected as the target storage node, the sub-query time period may be directly determined as the target time period queried by the target storage node.
As another optional implementation manner, if a plurality of second storage nodes are selected as target storage nodes, the sub-query time period may be divided according to the N, so as to obtain a target time period queried by each target storage node.
As an exemplary embodiment, the sub-query duration of the sub-query time period may be determined, and the sub-query duration may be divided by N to obtain an average sub-query duration. And then dividing the sub-query time periods according to the average sub-query time length to obtain N target time periods.
Finally, the N target time periods and the N target storage nodes can be in one-to-one correspondence to obtain the target time period queried by each target storage node.
In contrast, if no second storage node exists, this indicates that, among the currently queriable storage nodes, there is no storage node whose working time period covers the entire sub-query time period. To avoid missing, because of an error, a storage node that does store data for the sub-query time period, every storage node may be asked for the data of the sub-query time period. Thus, each storage node in the data storage system may be determined to be a target storage node, and the sub-query time period may be determined to be the target time period queried by each target storage node.
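The handling of a single sub-query time period in step 305 can be sketched as follows. The sketch assumes that each storage node reports a list of working time periods; taking the first N qualifying nodes stands in for "any N", and the names `contains` and `assign_sub_query` are hypothetical.

```python
from typing import Dict, List, Tuple

Period = Tuple[int, int]  # (start, end), for example in hours


def contains(outer: Period, inner: Period) -> bool:
    """True when the working period `outer` fully includes `inner`."""
    return outer[0] <= inner[0] and inner[1] <= outer[1]


def assign_sub_query(
    sub_query: Period, working: Dict[str, List[Period]], n: int = 1
) -> Dict[str, Period]:
    """Map target storage nodes to the target time period each one queries."""
    # Second storage nodes: a working period includes the whole sub-query period.
    second_nodes = [
        node
        for node, periods in working.items()
        if any(contains(p, sub_query) for p in periods)
    ]
    if not second_nodes:
        # Fallback: every storage node is asked for the whole sub-query period.
        return {node: sub_query for node in working}
    chosen = second_nodes[:n]  # "any N"; taking the first N is an arbitrary choice
    start, end = sub_query
    k = len(chosen)
    bounds = [start + (end - start) * i // k for i in range(k + 1)]
    return {node: (bounds[i], bounds[i + 1]) for i, node in enumerate(chosen)}


working = {"A": [(0, 6)], "B": [(1, 7), (9, 12)], "C": [(8, 12)]}
shared = assign_sub_query((2, 6), working, n=2)  # A and B both include 2-6
fallback = assign_sub_query((6, 9), working)     # no node includes 6-9
```

In the fallback case several nodes may return overlapping data for the same sub-query time period, so the later merging step would need to tolerate duplicates for that period.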
According to the technical scheme provided by the embodiment of the application, the working time period of each storage node is traversed to determine whether there is a first storage node whose working time period includes the query time period. If the first storage node exists, the first storage node is determined to be a target storage node, and the query time period is divided according to the first number of first storage nodes to obtain the target time period queried by each target storage node. If the first storage node does not exist, the query time period is divided according to the second number of storage nodes of the data storage system to obtain a plurality of sub-query time periods, and for each sub-query time period, the target storage node corresponding to the sub-query time period and the target time period queried by each target storage node are determined. In this technical scheme, the target storage nodes are determined by checking whether the working time period of a storage node includes the query time period, and once a plurality of target storage nodes are determined, the query time period is divided into a plurality of target time periods that are distributed among them. Data in different time periods can therefore be queried from the plurality of target storage nodes simultaneously, which improves both the success rate and the efficiency of data query.
For ease of understanding the data query method provided by the present application, the following description takes as an example the case where the data storage system is a Prometheus monitoring system, which may include a plurality of instances:
1. A time range index recording when a metric exists is added to each Prometheus instance. It is represented as continuous timestamp intervals in the format: start time, end time. Normally, this additional data record has only a negligible effect on the storage and computation of the original instance.
2. The metric existence time range in the instance is synchronized to disk periodically, e.g. the disk record is updated once every 1 minute.
3. When the instance is restarted or the metric is interrupted, a new continuous timestamp interval is appended to the record.
4. A Gateway device is introduced to receive user queries directly. A user query carries a ts range (time series data) parameter indicating the range of the data query.
5. The Gateway sends a metric existence range query instruction to the different Prometheus instances. Each Prometheus instance quickly confirms its queriable range according to the time range index and returns it to the Gateway.
6. The Gateway dynamically cuts the ts range of the query according to the strategy below, based on the data time ranges returned by the different instances, and distributes the query instructions to the specific instances.
7. Dynamic cutting strategy:
7.1 Assume the query time range is {t0, tn} and the number of instances is C.
7.2 If there are N instances that each contain all of {t0, tn}, the query time can be allocated using (tn - t0)/N.
7.3 If no instance contains all of {t0, tn}, multiple time series slices can be obtained using (tn - t0)/C, with {ti, …, tj} as one time series slice.
7.4 If there are M instances containing the complete {ti, …, tj} time series data, one instance or M instances may be randomly selected to receive query requests, where the query time of each is a time series slice obtained using (tj - ti)/M.
7.5 If no instance contains the complete {ti, …, tj} time series data, a {ti, …, tj} query instruction is sent to all instances.
8. After the data is aggregated along the timeline, the query result is returned.
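The time range index described in items 1 to 3 can be sketched as a small bookkeeping class. This is not a Prometheus API: the class `TimeRangeIndex`, its method names, the JSON file format and the file name are all assumptions made for illustration, and in a real deployment `sync` would be called by a periodic timer (for example once per minute).

```python
import json
import tempfile
from pathlib import Path
from typing import List


class TimeRangeIndex:
    """Continuous [start, end] timestamp intervals during which data was stored."""

    def __init__(self, path: Path) -> None:
        self.path = path
        # Reload earlier intervals so that the history survives a restart.
        self.intervals: List[List[int]] = (
            json.loads(path.read_text()) if path.exists() else []
        )
        self._open = False  # after a (re)start no interval is being extended

    def observe(self, now: int) -> None:
        """Record that data exists at timestamp `now`."""
        if self._open:
            self.intervals[-1][1] = now          # extend the current interval
        else:
            self.intervals.append([now, now])    # item 3: append a new interval
            self._open = True

    def interrupt(self) -> None:
        """Mark an interruption; the next observation opens a new interval."""
        self._open = False

    def sync(self) -> None:
        """Item 2: persist the intervals to disk."""
        self.path.write_text(json.dumps(self.intervals))


with tempfile.TemporaryDirectory() as tmp:
    index_path = Path(tmp) / "time_ranges.json"  # hypothetical file name
    first_run = TimeRangeIndex(index_path)
    first_run.observe(100)
    first_run.observe(160)
    first_run.sync()
    restarted = TimeRangeIndex(index_path)       # simulates an instance restart
    restarted.observe(400)
    restarted.observe(460)
    restarted.sync()
    recorded = json.loads(index_path.read_text())
```

After the simulated restart, `recorded` holds two separate intervals, which is the information an instance would return to the Gateway in item 5 as its queriable range.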
Referring to fig. 4, a block diagram of an embodiment of a data query device according to an embodiment of the present application is provided. As an embodiment, the apparatus shown in fig. 4 may be applied to the data storage system shown in fig. 1, where the data storage system includes a plurality of storage nodes, and when data is generated, the storage nodes in a working state in the data storage system store the data at the same time. As shown in fig. 4, the apparatus may include:
A first obtaining module 41, configured to obtain a query time period;
A second obtaining module 42, configured to obtain an operating period of each storage node in the data storage system;
A determining module 43, configured to determine a plurality of target storage nodes to be queried, and a target time period queried by each of the target storage nodes, according to the query time period and the working time period of each of the storage nodes;
a query module 44, configured to query target data stored in each target storage node and corresponding to the target time period, to obtain a plurality of target data;
And a merging module 45, configured to merge the plurality of target data to obtain query data corresponding to the query time period.
As an optional implementation manner, the first obtaining module 41 is specifically configured to:
receiving a time period input by a user through a visual interface;
Determining an initial time and an end time of the time period;
Acquiring an initial time stamp of the initial time and an end time stamp of the end time;
And determining a continuous interval formed by the initial timestamp and the ending timestamp as a query time period.
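The conversion performed by the first obtaining module can be sketched as follows. The input format string, the use of UTC and the function name `to_query_period` are assumptions; the application does not specify how the visual interface encodes the time period.

```python
from datetime import datetime, timezone
from typing import Tuple


def to_query_period(
    start_text: str, end_text: str, fmt: str = "%Y-%m-%d %H:%M"
) -> Tuple[int, int]:
    """Turn user-entered initial and end times into a timestamp interval."""
    # The entered times are interpreted as UTC here; this is an assumption.
    start = datetime.strptime(start_text, fmt).replace(tzinfo=timezone.utc)
    end = datetime.strptime(end_text, fmt).replace(tzinfo=timezone.utc)
    if end <= start:
        raise ValueError("end time must be later than initial time")
    # The continuous interval between the two timestamps is the query time period.
    return int(start.timestamp()), int(end.timestamp())


query_period = to_query_period("2023-12-29 02:00", "2023-12-29 12:00")
duration_seconds = query_period[1] - query_period[0]
```

Here `duration_seconds` corresponds to a 10 hour query time period expressed in seconds.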
As an alternative implementation manner, the second obtaining module 42 is specifically configured to:
Determining whether the inquiry time period belongs to a preset time range; transmitting a preset query instruction to each storage node of the data storage system under the condition that the query time period belongs to the preset time range; receiving a working time period in the preset time range fed back by each storage node;
Or alternatively
Sending a query instruction carrying the query time period to each storage node of the data storage system; and receiving a working time period in a target time range which is fed back by each storage node and is determined according to the inquiry time period.
As a possible implementation manner, the determining module 43 includes:
a traversing sub-module, configured to traverse the working time period of each storage node, and determine whether there is a first storage node whose working time period includes the query time period;
A first determining submodule, configured to determine the first storage node as a target storage node if the first storage node exists; dividing the inquiry time period according to the first number of the first storage nodes to obtain a target time period inquired by each target storage node;
A second determining sub-module, configured to divide the query time period according to a second number of storage nodes of the data storage system if the first storage node does not exist, to obtain a plurality of sub-query time periods; and determining a target storage node corresponding to each sub-query time period and a target time period queried by each target storage node according to each sub-query time period.
As a possible implementation manner, the first determining submodule is specifically configured to:
determining the query time length corresponding to the query time period;
dividing the inquiry time length by the first quantity to obtain a target time length;
Dividing the inquiry time period according to the target time length to obtain a plurality of target time periods;
And the target time periods are in one-to-one correspondence with the target storage nodes, so that the target time period inquired by each target storage node is obtained.
As a possible implementation manner, the second determining submodule is specifically configured to:
determining the query time length corresponding to the query time period;
dividing the inquiry time length by the second number to obtain an average time length;
Dividing the inquiry time period according to the average duration to obtain a plurality of sub inquiry time periods.
As a possible implementation manner, the second determining submodule is specifically configured to:
Traversing the working time period of each storage node, and determining whether a second storage node with the working time period containing the sub-query time period exists;
If the second storage nodes exist, determining any N second storage nodes as target storage nodes; dividing the sub-query time period according to the N to obtain a target time period queried by each target storage node, wherein the N is a positive integer;
If the second storage node does not exist, determining each storage node in the data storage system as a target storage node; and determining the sub-query time period as a target time period queried by each target storage node.
As a possible implementation manner, the determining module 43 is specifically configured to:
traversing the working time periods of each storage node, and searching sub-working time periods of M storage nodes to form the query time period;
Determining M storage nodes as target storage nodes;
For each target storage node, determining the corresponding sub-working time period as the target time period queried by the target storage node.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 5, the electronic device includes a processor 51, a communication interface 52, a memory 53 and a communication bus 54, where the processor 51, the communication interface 52 and the memory 53 communicate with each other through the communication bus 54.
The memory 53 is configured to store a computer program.
In one embodiment of the present application, the processor 51 is configured to implement the data query method provided in any one of the foregoing method embodiments when executing the program stored in the memory 53, where the method includes:
acquiring a query time period;
Acquiring a working time period of each storage node in a data storage system;
determining a plurality of target storage nodes to be queried and a target time period queried by each target storage node according to the query time period and the working time period of each storage node;
inquiring target data stored in each target storage node in the corresponding target time period to obtain a plurality of target data;
and merging the plurality of target data to obtain query data corresponding to the query time period.
The embodiment of the application also provides a storage medium, on which a computer program is stored, which when executed by a processor implements the steps of the data query method provided in any of the method embodiments described above.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
From the above description of embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus a general purpose hardware platform, or may be implemented by hardware. Based on such understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the related art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to perform the method described in the respective embodiments or some parts of the embodiments.
It is to be understood that the terminology used herein is for the purpose of describing particular example embodiments only, and is not intended to be limiting. As used herein, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. The terms "comprises," "comprising," "includes," "including," and "having" are inclusive and therefore specify the presence of stated features, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, steps, operations, elements, components, and/or groups thereof. The method steps, processes, and operations described herein are not to be construed as necessarily requiring their performance in the particular order described or illustrated, unless an order of performance is explicitly stated. It should also be appreciated that additional or alternative steps may be used.
The foregoing is only a specific embodiment of the invention to enable those skilled in the art to understand or practice the invention. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the invention. Thus, the present invention is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (11)

1. A data query method, applied to a data storage system, the data storage system including a plurality of storage nodes, the storage nodes in an operating state in the data storage system storing data simultaneously when the data is generated, the method comprising:
acquiring a query time period;
Acquiring a working time period of each storage node in a data storage system;
determining a plurality of target storage nodes to be queried and a target time period queried by each target storage node according to the query time period and the working time period of each storage node;
inquiring target data stored in each target storage node in the corresponding target time period to obtain a plurality of target data;
and merging the plurality of target data to obtain query data corresponding to the query time period.
2. The method of claim 1, wherein the acquiring a query time period comprises:
receiving a time period input by a user through a visual interface;
Determining an initial time and an end time of the time period;
Acquiring an initial time stamp of the initial time and an end time stamp of the end time;
And determining a continuous interval formed by the initial timestamp and the ending timestamp as a query time period.
3. The method of claim 1, wherein the acquiring the operational time period for each storage node in the data storage system comprises:
Determining whether the inquiry time period belongs to a preset time range; transmitting a preset query instruction to each storage node of the data storage system under the condition that the query time period belongs to the preset time range; receiving a working time period in the preset time range fed back by each storage node;
Or alternatively
Sending a query instruction carrying the query time period to each storage node of the data storage system; and receiving a working time period in a target time range which is fed back by each storage node and is determined according to the inquiry time period.
4. The method of claim 1, wherein said determining a plurality of target storage nodes to be queried and a target time period queried by each of said target storage nodes based on said querying time period and said operating time period of each of said storage nodes comprises:
Traversing the working time period of each storage node, and determining whether a first storage node with the working time period containing the query time period exists;
If the first storage node exists, determining the first storage node as a target storage node; dividing the inquiry time period according to the first number of the first storage nodes to obtain a target time period inquired by each target storage node;
if the first storage node does not exist, dividing the query time period according to the second number of the storage nodes of the data storage system to obtain a plurality of sub-query time periods; and determining a target storage node corresponding to each sub-query time period and a target time period queried by each target storage node according to each sub-query time period.
5. The method of claim 4, wherein the dividing the query time period by the first number of the first storage nodes to obtain the target time period queried by each of the target storage nodes comprises:
determining the query time length corresponding to the query time period;
dividing the inquiry time length by the first quantity to obtain a target time length;
Dividing the inquiry time period according to the target time length to obtain a plurality of target time periods;
And the target time periods are in one-to-one correspondence with the target storage nodes, so that the target time period inquired by each target storage node is obtained.
6. The method of claim 4, wherein dividing the query time period by the second number of storage nodes of the data storage system results in a plurality of sub-query time periods, comprising:
determining the query time length corresponding to the query time period;
dividing the inquiry time length by the second number to obtain an average time length;
Dividing the inquiry time period according to the average duration to obtain a plurality of sub inquiry time periods.
7. The method of claim 4, wherein the determining the target storage node corresponding to the sub-query time period and the target time period queried by each of the target storage nodes comprises:
Traversing the working time period of each storage node, and determining whether a second storage node with the working time period containing the sub-query time period exists;
If the second storage nodes exist, determining any N second storage nodes as target storage nodes; dividing the sub-query time period according to the N to obtain a target time period queried by each target storage node, wherein the N is a positive integer;
If the second storage node does not exist, determining each storage node in the data storage system as a target storage node; and determining the sub-query time period as a target time period queried by each target storage node.
8. The method of claim 1, wherein the determining a plurality of target storage nodes to be queried and a target time period queried by each of the target storage nodes according to the query time period and the working time period of each of the storage nodes comprises:
traversing the working time period of each storage node, and searching for sub-working time periods of M storage nodes that together form the query time period;
determining the M storage nodes as target storage nodes; and
for each target storage node, determining the corresponding sub-working time period as the target time period queried by that target storage node.
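The claim does not state how the sub-working time periods are found. One possible approach, shown here purely as an assumption, is a greedy interval cover: starting at the beginning of the query time period, repeatedly pick the node whose working time period covers the current position and extends furthest. The name `cover_query_period` and the numeric timestamps are hypothetical.

```python
def cover_query_period(q_start, q_end, working_periods):
    """Greedy sketch: stitch the query period together from node working periods.

    working_periods maps node_id -> (work_start, work_end).
    Returns {node_id: (start, end)}; the segments are consecutive and
    together span [q_start, q_end).
    """
    plan = {}
    cursor = q_start
    while cursor < q_end:
        best_node, best_end = None, cursor
        for node, (ws, we) in working_periods.items():
            # Candidate must be working at `cursor` and extend past it.
            if ws <= cursor and we > best_end:
                best_node, best_end = node, we
        if best_node is None:
            raise ValueError("working periods do not cover the query period")
        seg_end = min(best_end, q_end)
        plan[best_node] = (cursor, seg_end)  # this node's sub-working period
        cursor = seg_end
    return plan

working = {"a": (0, 60), "b": (40, 120), "c": (100, 200)}
plan = cover_query_period(10, 150, working)
```

Here M is simply the number of entries in the returned plan. The greedy choice keeps M small, but any selection whose segments form the query time period would satisfy the wording of the claim.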
9. A data query device for use in a data storage system, the data storage system comprising a plurality of storage nodes, wherein, when data is generated, the storage nodes in a working state in the data storage system store the data simultaneously, the device comprising:
a first acquisition module, configured to acquire a query time period;
a second acquisition module, configured to acquire the working time period of each storage node in the data storage system;
a determining module, configured to determine a plurality of target storage nodes to be queried and a target time period queried by each target storage node according to the query time period and the working time period of each storage node;
a query module, configured to query the target data stored by each target storage node in the corresponding target time period to obtain a plurality of target data; and
a merging module, configured to merge the plurality of target data to obtain query data corresponding to the query time period.
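The query and merging modules can be sketched as follows, under loudly stated assumptions: each storage node is modelled as an in-memory list of (timestamp, value) records, target time periods are treated as half-open intervals so that boundary records are not returned twice, and the names `query_node` and `query_and_merge` are hypothetical. A thread pool stands in for querying the nodes concurrently.

```python
from concurrent.futures import ThreadPoolExecutor

def query_node(records, start, end):
    """Return the records of one node whose timestamp lies in [start, end)."""
    return [r for r in records if start <= r[0] < end]

def query_and_merge(node_records, plan):
    """Query every target node for its own target period, then merge.

    node_records maps node_id -> list of (timestamp, value).
    plan maps node_id -> (start, end), as produced by a determining step.
    """
    with ThreadPoolExecutor(max_workers=max(1, len(plan))) as pool:
        futures = [pool.submit(query_node, node_records[node], s, e)
                   for node, (s, e) in plan.items()]
        parts = [f.result() for f in futures]
    merged = [r for part in parts for r in part]
    merged.sort(key=lambda r: r[0])  # restore time order across nodes
    return merged

records = [(t, f"v{t}") for t in range(6)]
nodes = {"a": records, "b": list(records)}  # both nodes stored the same data
result = query_and_merge(nodes, {"a": (0, 3), "b": (3, 6)})
```

Because every working node stores the same data, each node only needs to return its own slice, and merging the slices reconstructs the full result for the query time period.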
10. An electronic device, comprising: a processor and a memory, the processor being configured to execute a data query program stored in the memory to implement the data query method of any one of claims 1 to 8.
11. A storage medium storing one or more programs executable by one or more processors to implement the data query method of any of claims 1-8.
CN202311865506.4A 2023-12-29 2023-12-29 Data query method, device, electronic equipment and storage medium Pending CN117971881A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311865506.4A CN117971881A (en) 2023-12-29 2023-12-29 Data query method, device, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311865506.4A CN117971881A (en) 2023-12-29 2023-12-29 Data query method, device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN117971881A (en) 2024-05-03

Family

ID=90848833

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311865506.4A Pending CN117971881A (en) 2023-12-29 2023-12-29 Data query method, device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN117971881A (en)

Similar Documents

Publication Publication Date Title
CN112910945B (en) Request link tracking method and service request processing method
US9703824B2 (en) Managing a distributed database
US10250451B1 (en) Intelligent analytic cloud provisioning
US8069224B2 (en) Method, equipment and system for resource acquisition
US10956990B2 (en) Methods and apparatuses for adjusting the distribution of partitioned data
CN106059825A (en) Distributed system and configuration method
US10235417B1 (en) Partitioned search of log events
EP3489825A1 (en) Method, apparatus and computer readable storage medium for processing service
CN103019853A (en) Method and device for dispatching job task
CN113032419B (en) Multi-source data aggregation search method, device, equipment and storage medium
CN108228322B (en) Distributed link tracking and analyzing method, server and global scheduler
WO2018177350A1 (en) Method and apparatus for providing serial number, electronic device and readable storage medium
CN111225029A (en) Dynamic message pushing method and system and automobile diagnosis server
CN110807145A (en) Query engine acquisition method, device and computer-readable storage medium
CN113065054B (en) Request processing method, request processing device, electronic equipment and storage medium
CN108696559B (en) Stream processing method and device
CN111552701B (en) Method for determining data consistency in distributed cluster and distributed data system
CN107515864B (en) Method and equipment for monitoring workflow
CN111835809B (en) Work order message distribution method, work order message distribution device, server and storage medium
CN117971881A (en) Data query method, device, electronic equipment and storage medium
CN111831503A (en) Monitoring method based on monitoring agent and monitoring agent device
CN114513469A (en) Traffic shaping method and device for distributed system and storage medium
CN112688982B (en) User request processing method and device
CN113407629A (en) Data synchronization method and device, electronic equipment and storage medium
CN112541038A (en) Time series data management method, system, computing device and storage medium

Legal Events

Date Code Title Description
PB01 Publication