CN116089414B - Time-series database write performance optimization method and device for mass-data scenarios - Google Patents


Info

Publication number
CN116089414B
CN116089414B (granted publication of application CN202310372120.3A)
Authority
CN
China
Prior art keywords: task, writing, write, data, time
Prior art date
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis)
Application number
CN202310372120.3A
Other languages
Chinese (zh)
Other versions
CN116089414A (en)
Inventor
罗实
陈昱宇
周宇
邵研
张君兰
王永恒
黄宝忠
Current Assignee (the listed assignee may be inaccurate; Google has not performed a legal analysis): Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang Lab filed Critical Zhejiang Lab
Priority to CN202310372120.3A
Publication of CN116089414A
Application granted
Publication of CN116089414B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval of structured data, e.g. relational data
    • G06F16/21 Design, administration or maintenance of databases
    • G06F16/215 Improving data quality; Data cleansing, e.g. de-duplication, removing invalid entries or correcting typographical errors
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2255 Hash tables
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G06F16/27 Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT]
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a time-series database write performance optimization method and device for mass-data scenarios. The method comprises the following steps: first, separating the read and write thread pools corresponding to database read/write requests based on Apache Calcite, and establishing an independent write-task thread pool; then, sharding the write tasks in the write-task thread pool through a hash policy, and scheduling the sharded tasks according to their load values so that each write task's data is written to a node in the cluster; finally, obtaining a write request from the node's write task, allocating a memory array for the request, dynamically allocating the memory required by the write task under a massive number of time lines, and flushing the write task's data to disk. The invention relieves the blocking of write threads by read threads in the write-heavy, read-light workloads typical of time-series databases, achieves better cluster load balancing, enhances task concurrency, helps reduce the flush frequency when writing massive data, and improves write efficiency.

Description

Time-series database write performance optimization method and device for mass-data scenarios
Technical Field
The invention belongs to the technical field of databases, and in particular relates to a time-series database write performance optimization method and device for mass-data scenarios.
Background
A time-series database (Time Series Database, TSDB) is a specialized database for storing and managing time-series data. It is characterized by write-heavy, read-light workloads, high-concurrency writes, and continuous ingestion of massive data, and is widely applied in the Internet of Things, economics and finance, industrial production, system monitoring, and other fields.
Time-series data is defined as a series of data indexed along the time dimension. Such data is ubiquitous in daily life; stock data is the most intuitive and typical example, as stock prices change continuously over time to form a time series. A time-series database typically involves the following concepts:
a. Table (Metric): corresponds to a table in a relational database and represents a set of similar time-series data;
b. Tag: describes a characteristic of the data source and typically does not change over time;
c. Timestamp: the point in time at which the data was generated;
d. Field: a measurement of the data source, which typically changes continuously over time;
e. Data Point: a single measurement generated by a data source at a given time; databases count writes and queries in units of data points;
f. Time Series: the values of one measurement of a data source over time form a time line; the combination of table and tags uniquely identifies a time line, and computation over time-series data is typically performed along the time-line dimension.
In recent years, with the continuous development and application of 5G networks and Internet of Things technology across industries, collecting and storing large-scale time-series data has become an increasingly troublesome practical problem. Mainstream time-series databases currently improve system capacity by upgrading disk media, expanding cluster scale, optimizing data structures, and the like; continuous technical iteration yields long-term performance gains and gradually matures the systems. In massive time-series data scenarios, however, such optimizations often suffer from long lead times, poor cost-effectiveness, and repeated waste of resources, so exploiting the service characteristics of large-scale time-series data to resolve the performance bottleneck has become a new research direction.
Disclosure of Invention
To address the shortcomings of the prior art, the invention provides a time-series database write performance optimization method and device for mass-data scenarios.
The aim of the invention is achieved by the following technical scheme. A first aspect of the embodiments of the invention provides a time-series database write performance optimization method for mass-data scenarios, comprising the following steps:
(1) Separating the read and write thread pools corresponding to database read/write requests based on Apache Calcite, and establishing an independent write-task thread pool;
(2) Sharding the write tasks in the write-task thread pool of step (1) through a hash policy, and scheduling the sharded tasks according to their load values, so that each write task's data is written to the corresponding node in the cluster;
(3) Obtaining a write request from the write task of the node written in step (2), allocating a memory array for the write request, dynamically allocating the memory required by the write task under a massive number of time lines, and then flushing the write task's data to disk.
Optionally, the processing flow of Apache Calcite comprises:
(a) Parsing: obtain and parse the SQL statement, converting it into an abstract syntax tree;
(b) Validation: perform syntax checking and semantic analysis; the syntax checking verifies the statement against the metadata information of the database, and the semantic analysis constructs an initial version of the logical plan from the abstract syntax tree and the metadata;
(c) Optimization: optimize the logical plan;
(d) Execution: generate a physical plan based on the optimized logical plan, and execute it.
Optionally, step (1) comprises the following sub-steps:
(1.1) On the first submission of a task, obtain and parse the task's SQL statement based on Apache Calcite and judge whether the statement contains the keyword insert; if so, the task type is a write task; otherwise, it is a read task;
(1.2) If the task type obtained in step (1.1) is a write task, submit the task to the write thread pool; if it is a read task, mark the statement corresponding to the read task to obtain its statementId, store the statementId in a hash set, bind the hash set to the TCP connection, and submit the task to the read thread pool;
(1.3) On subsequent submissions, judge whether the statementId of the current task is in the hash set obtained in step (1.2); if not, the current task is a write task and is submitted to the write thread pool; otherwise, it is a read task and is submitted to the read thread pool. The read and write thread pools of the tasks are thereby separated, and an independent write-task thread pool is established.
Optionally, step (2) comprises the following sub-steps:
(2.1) Obtain the meta information of each write task in the write-task thread pool of step (1), including the number of tags of the write task, the number of fields, and the time length;
(2.2) Compute the corresponding hash value according to the hash policy specified by the user; the hash policies are: compute the hash value from the table name, from the selected tags, or from the table name together with the tags;
(2.3) Shard the write task according to the hash value computed in step (2.2), and put it into the queue to be written, with the shard as the unit;
(2.4) Compute the load value from the meta information of the write task obtained in step (2.1);
(2.5) Schedule the write tasks in the queue of step (2.3) to the node with the smallest load value among the cluster nodes, and dynamically update the load value table;
(2.6) Write the data of the write task to the corresponding node according to the scheduling of step (2.5).
Optionally, the load value is expressed as:

Load = (∏_{i=1}^{n} vᵢ) × m × T

where i indexes a tag, n is the number of tags contained in the sharded task, vᵢ is the number of values of tag i, m is the number of fields, and T is the time length.
Optionally, dynamically updating the load value table in step (2.5) comprises: allocate the sharded write task to the node with the smallest load value among the cluster nodes, and add the shard's load value to that node's load value, thereby dynamically updating the load value table; when the sharded write task finishes writing, subtract the previously accumulated load value from the corresponding node, again dynamically updating the load value table.
Optionally, step (3) comprises the following sub-steps:
(3.1) Obtain a write request from the write task of the node written in step (2), allocate a memory array for the write request, and write the data;
(3.2) Judge whether the memory usage exceeds the memory threshold; if so, compute the effective data ratio, adjust the size of the memory array accordingly, and then flush the written data; otherwise, go to step (3.3);
(3.3) Judge whether the current memory array is full; if so, grow the memory array to twice its current size and then flush the written data; otherwise, go to step (3.4);
(3.4) Judge whether there are other write requests; if so, return to step (3.1); otherwise, the write task ends.
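The decision flow of steps (3.1)-(3.4) can be sketched as a sizing function (the numbers, names, and units below are hypothetical; the patent does not fix the actual thresholds or array layout):

```python
# Sketch of the dynamic memory-array policy: over the memory threshold,
# resize by the effective-data ratio and flush; when the array is full,
# double it; otherwise keep the current size.
def next_array_size(current_size, used, total_memory, memory_threshold):
    """Decide the array size for the next round of writes (hypothetical model)."""
    if total_memory > memory_threshold:
        ratio = used / current_size               # effective data ratio
        return max(1, int(current_size * ratio))  # shrink to fit actual usage, then flush
    if used >= current_size:                      # current array is full
        return current_size * 2                   # grow to twice the current size
    return current_size                           # keep writing into the same array

assert next_array_size(100, 50, 90, 80) == 50    # over threshold: resize by 50% ratio
assert next_array_size(100, 100, 10, 80) == 200  # full: double
assert next_array_size(100, 30, 10, 80) == 100   # neither: unchanged
```

The ratio-based shrink is what avoids flushing arrays that are mostly pre-allocated empty space.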
A second aspect of the embodiments of the invention provides a time-series database write performance optimization device for mass-data scenarios, comprising one or more processors configured to implement the above time-series database write performance optimization method for mass-data scenarios.
A third aspect of the embodiments of the invention provides a computer program comprising computer program instructions which, when executed by a processor, implement the above time-series database write performance optimization method for mass-data scenarios.
A fourth aspect of the embodiments of the invention provides a computer-readable storage medium on which a program is stored which, when executed by a processor, implements the above time-series database write performance optimization method for mass-data scenarios.
The invention has the following beneficial effects. By separating the read and write thread pools, it effectively avoids write failures caused by the thread pool being filled up by large queries, preferentially ensures that data is written successfully under heavy load, and improves database availability, making it especially suitable for write-heavy, read-light scenarios. By letting the user specify the hash policy, written data can be sharded reasonably for different business scenarios, and by defining a dynamic scheduling model based on service load value priority, the cluster fully achieves load balancing, task concurrency is enhanced, and write efficiency is improved. Through dynamic memory allocation, the invention effectively avoids over-provisioned pre-allocated space, improves memory utilization, reduces the flush frequency, and improves write efficiency.
Drawings
FIG. 1 is a read-write thread pool separation flow diagram of the present invention;
FIG. 2 is a flow chart of task scheduling based on traffic load value prioritization in accordance with the present invention;
FIG. 3 is a flow chart of dynamic memory allocation according to the present invention;
FIG. 4 is a flow chart of a method for optimizing the writing performance of a time sequence database based on a mass data scene;
fig. 5 is a schematic structural diagram of a time-series database writing performance optimizing device based on a mass data scene.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the invention. Rather, they are merely examples of apparatus and methods consistent with aspects of the invention as detailed in the accompanying claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, the information should not be limited by these terms. These terms are only used to distinguish information of the same type from one another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the invention. Depending on the context, the word "if" as used herein may be interpreted as "when", "upon", or "in response to determining".
The present invention will be described in detail with reference to the accompanying drawings. The features of the examples and embodiments described below may be combined with each other without conflict.
Referring to fig. 4, the time-series database write performance optimization method for mass-data scenarios of the present invention comprises the following steps:
(1) Separating the read and write thread pools corresponding to database read/write requests based on Apache Calcite, and establishing an independent write-task thread pool.
In this embodiment, Apache Calcite is a foundational framework for optimized query processing over heterogeneous data sources. It provides standard SQL (Structured Query Language) support, query optimization, and connectivity to various data sources, while leaving data storage and data management to the engine itself. Because Apache Calcite has a well-designed pluggable architecture, an SQL engine can be built on a subset of its functionality to provide SQL support in a database kernel and complete the development of database products.
The processing flow of Apache Calcite mainly comprises the following four stages:
(a) Parser (parsing) stage: obtain and parse the SQL statement, converting it into an abstract syntax tree.
(b) Validate (verification) stage: perform syntax checking and semantic analysis; syntax checking verifies the statement against the metadata information of the database, and semantic analysis constructs an initial version of the logical plan from the abstract syntax tree and the metadata.
(c) Optimize stage: optimize the logical plan, typically according to rule sets defined by the developer.
(d) Execute stage: generate a physical plan based on the optimized logical plan, and execute it.
Separating the read and write thread pools is an optimization of the SQL engine. When reads and writes share one thread pool, if a user issues a large number of wide-ranging queries in a short time, the threads in the pool may all be occupied by the large queries; in that case both reads and writes become unavailable, data writing is severely blocked, and a large data backlog and many write failures may result.
Typically, when an API (Application Program Interface) accesses a database, read and write tasks can be classified by API name, thereby separating the read and write thread pools of the tasks. In an SQL engine based on Apache Calcite, however, all SQL statements access the database through the same API, so the read and write thread pools cannot be separated by API name.
Therefore, based on Apache Calcite, the read-write thread pool corresponding to the database read-write request is separated, and an independent write task thread pool is built, as shown in FIG. 1, and specifically comprises the following steps:
(1.1) On the first submission of a task, obtain and parse the task's SQL statement based on Apache Calcite and judge whether the statement contains the keyword insert; if so, the task type is a write task; otherwise, it is a read task.
Specifically, when a task is submitted for the first time it enters the Parser stage and carries a complete SQL statement for subsequent syntax checking and semantic analysis, so the task's SQL statement can be obtained directly in the Parser stage. Whether the statement contains the keyword insert distinguishes write threads from read threads: if it does, the task type is a write task; otherwise, it is a read task.
It should be understood that tasks in the Parser, Validate, and Optimize stages all contain complete SQL statements, so the task type can be obtained in any of them according to whether the keyword is present.
(1.2) If the task type obtained in step (1.1) is a write task, submit the task to the write thread pool; if it is a read task, mark the statement corresponding to the read task to obtain its statementId, store the statementId in a hash set, bind the hash set to the TCP connection, and submit the task to the read thread pool.
In this embodiment, the task is submitted to the corresponding thread pool according to its type and executed as required by the subsequent steps. The task types include read tasks and write tasks.
It should be noted that because subsequent submissions carry only a batch of data pairs and lack the key information needed to distinguish read from write operations, the statement corresponding to a read task must be marked and its statementId stored in the hash set, so that subsequent submissions can be recognized as read or write tasks.
(1.3) On subsequent submissions, judge whether the statementId of the current task is in the hash set obtained in step (1.2); if not, the current task is a write task and is submitted to the write thread pool; otherwise, it is a read task and is submitted to the read thread pool. The read and write thread pools of the tasks are thereby separated, and an independent write-task thread pool is established.
After the semantic verification, plan generation, and optimization of the first submission are complete, subsequent submissions only need to reuse the state of the first task and supply new data. For example, the first task submits an SQL statement of the form: insert into table (a, b, c) values (…).
In this case subsequent submissions contain only a batch of data pairs and lack the key information to distinguish read from write operations, so the state to be reused must be marked. Therefore, after distinguishing read and write threads in the Parser stage, a hash set is maintained to store the statementIds of read threads and is bound to the corresponding TCP connection, ensuring that the task type of data injected over that TCP connection can later be confirmed from its statementId.
On subsequent submissions, it suffices to judge in the Execute stage whether the current statementId is in the hash set, thereby distinguishing read and write threads and dispatching the task to the corresponding thread pool. Note that if the current statementId is not in the hash set, the current task is a write task; otherwise, it is a read task.
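The routing logic described above can be sketched in Python (class and method names are illustrative assumptions; the patent's engine is Calcite-based and not shown here):

```python
# Sketch of read/write thread-pool separation: the first submission is
# classified by the "insert" keyword, and the statementIds of read statements
# are kept in a hash set bound to the connection so later submissions can be
# routed without re-parsing the SQL.
from concurrent.futures import ThreadPoolExecutor

write_pool = ThreadPoolExecutor(max_workers=4)  # independent write-task pool
read_pool = ThreadPoolExecutor(max_workers=4)

class Connection:
    def __init__(self):
        self.read_statement_ids = set()  # hash set bound to this TCP connection

    def route(self, statement_id, sql=None):
        if sql is not None:  # first submission: the full SQL statement is available
            if "insert" in sql.lower():
                return write_pool
            self.read_statement_ids.add(statement_id)
            return read_pool
        # subsequent submission: only the statementId is available
        return read_pool if statement_id in self.read_statement_ids else write_pool

conn = Connection()
assert conn.route(1, "INSERT INTO t (a) VALUES (1)") is write_pool
assert conn.route(2, "SELECT a FROM t") is read_pool
assert conn.route(2) is read_pool    # subsequent task routed via the hash set
assert conn.route(1) is write_pool
```

Binding the set to the connection object mirrors the patent's binding of the hash set to the TCP connection.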
(2) Shard the write tasks in the write-task thread pool of step (1) through a hash policy, and schedule the sharded tasks according to their load values, so that each write task's data is written to the corresponding node in the cluster.
Specifically, as shown in fig. 2, this comprises the following steps:
(2.1) Obtain the meta information of each write task in the write-task thread pool of step (1), including the number of tags of the write task, the number of fields, and the time length.
In this embodiment, data in the time-series database is stored and managed along the time-line dimension, and time lines need to be distributed reasonably across the write nodes of the database cluster to achieve load balancing of the distributed time-series database cluster. Here a node is a single server in the cluster.
(2.2) Compute the corresponding hash value according to the hash policy specified by the user. There are three hash policies: compute the hash value from the table name, from the selected tags, or from the table name together with the tags.
In this embodiment, the hash value is computed from the table name by default; the other two policies can be configured by the user according to the characteristics of the business data.
In a typical distributed time-series database design, the system shards time lines by table name (Metric), placing time lines belonging to the same table in the same bucket (storage space) and distributing the buckets across the nodes to achieve distributed storage. This sharding strategy has a drawback: when the write rates of two tables differ greatly, one node becomes very heavily loaded while another stays nearly idle, which clearly fails the load-balancing requirement. Therefore, a policy of hashing shards by tag values (Tags) can be added on top of the original table-name-based policy.
Further, a hash sharding policy based on both the table name and the tag values (Tags) facilitates load balancing of the distributed time-series database.
In this embodiment, the user generally has a clearer understanding of the business data and can designate the most suitable hash policy before writing: sharding by table name, by one or a group of tags identifying a time line, or by a combination of table name and tags. The system then computes the corresponding hash value based on the user's selection.
It should be appreciated that the hash value may be obtained by invoking a hash algorithm.
(2.3) Shard the write task according to the hash value computed in step (2.2), and put it into the queue to be written, with the shard as the unit. The queue uses a first-in-first-out scheduling policy.
In this embodiment, sharding the write task by the hash value computed in step (2.2) makes the split more reasonable and helps balance the cluster load, and putting the write task into the queue in shard units makes it convenient to schedule the tasks first-in-first-out. For example, suppose a monitoring data table and an air humidity table are created for two machine rooms, with monitoring data collected every 5 seconds and air humidity collected every half hour. If the data is sharded by table name, the node holding the monitoring data table is clearly loaded far more heavily than the node holding the air humidity table, and resources cannot be fully utilized. If instead the data is sharded by machine-room id, the data of machine room 1 and machine room 2 can be written separately, yielding nodes with almost equal load and achieving load balancing of the cluster.
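The three user-selectable hash policies can be sketched as follows (the function, its parameters, and the use of MD5 are assumptions for illustration; the patent only requires some stable hash algorithm):

```python
# Sketch of the three hash policies: by table name, by selected tags,
# or by table name together with tags.
import hashlib

def shard_id(metric, tags, policy="metric", selected=None, num_shards=8):
    """Compute a shard id under one of the three hash policies."""
    tag_part = "|".join(f"{k}={tags[k]}" for k in sorted(selected or tags))
    if policy == "metric":
        key = metric
    elif policy == "tags":
        key = tag_part
    elif policy == "metric+tags":
        key = metric + "|" + tag_part
    else:
        raise ValueError(f"unknown policy: {policy}")
    # A stable (non-randomized) hash so every node computes the same shard.
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards

# Sharding by machine-room id keeps each room's writes together across tables:
s1 = shard_id("monitor", {"room": "1"}, policy="tags", selected=["room"])
s2 = shard_id("humidity", {"room": "1"}, policy="tags", selected=["room"])
assert s1 == s2  # same room, same shard, even for different tables
```

Python's built-in `hash()` is randomized per process, which is why a deterministic digest is used here for cluster-wide agreement.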
(2.4) Compute the load value from the meta information of the write task obtained in step (2.1).
In this embodiment, the load value reflects how heavily a node is loaded with tasks during system operation: the higher a node's load value over a period of time, the larger the volume of write tasks it must handle and the busier it is.
Further, the load value is given by:

Load = TagComplexity × m × T

where m is the number of fields and T is the time length. The tag complexity is the product of the numbers of values of the tags contained in the sharded task:

TagComplexity = ∏_{i=1}^{n} vᵢ

where i indexes a tag, n is the number of tags contained in the sharded task, and vᵢ is the number of values of tag i.
For example, suppose a shard has two tags: region, with the two values 1 and 2, and machine room, with the three values 1, 2, and 3. The tag complexity of the shard is then 2 × 3 = 6. The number of tag values is not fixed: this shard covers regions 1 and 2, while the next shard may cover regions 3, 4, and 5, so the tag complexity differs from shard to shard. Suppose further that the shard has two fields, temperature and humidity, and records 30 s of data; the load value of the shard is then 6 × 2 × 30 = 360.
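The worked example above can be reproduced with a short calculation (function names are illustrative):

```python
# Load value = tag complexity * number of fields * time length,
# where tag complexity is the product of each tag's number of values.
from math import prod

def tag_complexity(values_per_tag):
    """Product of the numbers of values of the tags in the shard."""
    return prod(values_per_tag)

def load_value(values_per_tag, num_fields, duration_seconds):
    return tag_complexity(values_per_tag) * num_fields * duration_seconds

assert tag_complexity([2, 3]) == 6        # region: 2 values, machine room: 3 values
assert load_value([2, 3], 2, 30) == 360   # temperature + humidity, 30 s of data
```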
(2.5) Schedule the write tasks in the queue of step (2.3) to the node with the smallest load value among the cluster nodes, and dynamically update the load value table.
In this embodiment, after the load value of a shard is computed, the dynamic load value table of the cluster nodes (each node's load value is initially 0) is queried to find the node with the smallest load; the shard's task is allocated to that node, the shard's load value is added to the node's load value, and the load value table is updated. When the shard's write finishes, the accumulated load value is subtracted from the corresponding node and the table is likewise updated. This scheduling method fully accounts for the business complexity of large-scale time-series tasks in a distributed setting, further balancing the cluster load, enhancing task concurrency, and improving write efficiency.
(2.6) Writing the data of the write task into the corresponding node according to the scheduling mode in step (2.5).
Specifically, according to the scheduling manner of step (2.5), the data of the write task is written to the corresponding node, that is, to a single server in the corresponding cluster.
(3) Performing the write operation of the scheduled task on a specific node: acquiring a write request according to the write task of the write node in step (2), allocating a memory array for the write request, dynamically allocating the memory required by the write task based on the massive-timeline scenario, and then flushing the data of the write task to disk.
In this embodiment, dynamic memory allocation is an optimization for scenarios with many timelines. When there are many timelines, a large amount of meta information must be stored, so the memory limit is easily exceeded and a flush to disk is triggered; the effective data persisted by each flush is therefore very limited, and the next flush again has to regenerate a large amount of meta information, greatly reducing write efficiency. Memory therefore needs to be allocated dynamically based on the massive number of timelines.
It should be understood that field values are normally stored in pre-allocated fixed-size arrays, so in a scenario with many timelines the arrays storing field values occupy a relatively large share of memory, the memory threshold is reached quickly and a flush is triggered, and arrays that have received no data often leave large amounts of blank memory reserved for field values, causing serious resource waste.
In this embodiment, dynamic memory optimization reduces the proportion of empty memory by dynamically adjusting the size of the pre-allocated field-value arrays, avoiding resource waste. It should be understood that simply raising the memory threshold until it fits the current number of timelines would also avoid the waste, but this treats the symptom rather than the cause: on the one hand, the number of timelines is likely to grow further, making repeated threshold increases time-consuming and laborious; on the other hand, a larger threshold increases the CPU and memory load and degrades performance, so that approach is not very feasible.
Preferably, the memory is dynamically optimized by dynamically adjusting the memory size of the pre-allocated field value array.
This embodiment optimizes data memory allocation as follows. Before each flush, the sizes of the valid data in the data blocks are obtained and summed; the sum is divided by the number of arrays to give the average valid data per array, and that average is rounded up to a multiple of 8 to obtain a reference value. When the reference value is smaller than the current array size, the reference value is used as the memory size pre-allocated the next time an array is generated, and the corresponding parameter is updated. When the reference value equals the current array size, every field array can be written full, so the array size can reasonably be enlarged: twice the reference value is used as the next pre-allocated size, and the parameter is updated. The pre-allocated memory size eventually stabilizes at a fixed value, reducing the flush frequency and improving write efficiency.
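The sizing rule just described can be sketched as follows; the function name, parameters, and byte-based units are assumptions for illustration:

```python
import math

def next_prealloc_size(valid_sizes, num_arrays, current_size):
    """Compute the memory size pre-allocated the next time an array is generated.

    valid_sizes: valid-data sizes of the data blocks before a flush.
    The average valid data per array is rounded up to a multiple of 8 to
    obtain the reference value; shrink to it when it is below the current
    size, and double it when the arrays were written full (value equal).
    """
    average = sum(valid_sizes) / num_arrays
    reference = math.ceil(average / 8) * 8
    if reference < current_size:
        return reference      # shrink the next pre-allocation
    return 2 * reference      # arrays were written full: enlarge

print(next_prealloc_size([40, 24], 8, 128))  # -> 8  (shrink from 128 B)
print(next_prealloc_size([64, 64], 16, 8))   # -> 16 (arrays full, double)
```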
Specifically, as shown in fig. 3, the method comprises the following steps:
(3.1) Acquiring a write request according to the write task of the write node in step (2), allocating a memory array for the write request, and writing data.
(3.2) Judging whether the memory usage exceeds the memory threshold; if so, calculating the valid-data ratio, adjusting the memory array size according to that ratio, and then flushing the written data to disk; otherwise, going to step (3.3).
It should be understood that when memory usage exceeds the preset memory threshold, the memory array size can be adjusted according to the valid-data ratio, the adjusted size is used as the memory pre-allocated the next time an array is generated, and the data is then flushed so that it is actually written to disk.
(3.3) Judging whether the current memory array is full; if so, enlarging the memory array size to twice the current size, and then flushing the written data to disk; otherwise, going to step (3.4).
It should be understood that when the memory array can be written full, the array size can reasonably be enlarged: twice the current memory array size is used as the size pre-allocated the next time an array is generated, and the data is then flushed so that it is actually written to disk.
(3.4) Judging whether other write requests exist; if so, returning to step (3.1); otherwise, the write task ends.
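The per-request decision in steps (3.2)–(3.4) can be summarized in a short sketch; the function and action names are illustrative assumptions:

```python
def handle_write_request(mem_used, mem_threshold, array_full):
    """Decide what to do after writing one request's data into its array."""
    if mem_used > mem_threshold:
        # Step (3.2): resize arrays by the valid-data ratio, then flush.
        return "adjust-by-valid-ratio-and-flush"
    if array_full:
        # Step (3.3): double the array size for the next generation, then flush.
        return "double-and-flush"
    # Step (3.4): nothing to flush yet; take the next write request.
    return "continue"

print(handle_write_request(70 * 2**20, 64 * 2**20, False))
# -> adjust-by-valid-ratio-and-flush
```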
For example, suppose there are 1,000,000 fields, the memory threshold is 64 MB, and the original array size is 128 B. Before data is written, the memory is checked for an array already allocated to the current field; if one exists, the data is written into it, otherwise a new array is generated. Under the original allocation method, because the arrays are large, the memory to be pre-allocated reaches 128 B × 1,000,000 ≈ 128 MB, far above the memory threshold, so a flush is inevitably triggered before all arrays are allocated; at that point the arrays that have been allocated often hold only their first data point, causing serious space waste. Under the optimized dynamic allocation method, the array size after the first adjustment is only 8 B and the memory to be pre-allocated is about 8 MB, which does not reach the threshold; a flush is triggered only after the arrays fill up, so the memory space is fully utilized. Since the arrays are written full, the enlargement condition is met and the size is adjusted to 16 B the second time, and so on, so that the pre-allocated memory approaches the memory threshold and the memory space is used most effectively.
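The arithmetic in this example can be checked directly, using the sizes stated above:

```python
fields = 1_000_000            # 1,000,000 fields in the example
threshold = 64 * 2**20        # 64 MB memory threshold
original = 128 * fields       # original 128 B arrays: ~128 MB pre-allocated
adjusted = 8 * fields         # 8 B arrays after the first adjustment: ~8 MB

print(original > threshold)   # -> True: flush fires before arrays can fill
print(adjusted < threshold)   # -> True: arrays can fill before any flush
```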
In this embodiment, disk-flush behavior in timeline scenarios of different scales was measured and compared under the two allocation methods before and after optimization. Experimental environment: a single TSDB node configured with a 4-core CPU, 16 GB of memory, and a 480 GB disk. The experiment writes 100 million time-series data points into the database over 100,000 / 400,000 / 700,000 timelines respectively and counts the average flush time, the number of flushes, and the total time. Each timeline has 3 tags and 3 measurement values, the 3 measurement values being of data types BIGINT, DOUBLE, and VARCHAR respectively. The experimental results are shown in Table 1.
Table 1: comparison of brushing conditions in different scale timeline scenes
As can be seen from Table 1, in scenarios with few timelines the flush times of the present method are similar to those of the original design; as the number of timelines grows, the average time between flushes under the present method becomes markedly longer than under the original design, that is, flushes occur less often, improving write efficiency and effectively relieving data write pressure in massive-timeline scenarios.
Corresponding to the embodiment of the time sequence database writing performance optimizing method based on the mass data scene, the invention also provides an embodiment of the time sequence database writing performance optimizing device based on the mass data scene.
Referring to fig. 5, the device for optimizing the writing performance of a time-series database based on a mass data scene provided by the embodiment of the invention includes one or more processors configured to implement the method for optimizing the writing performance of a time-series database based on a mass data scene in the above embodiment.
The embodiment of the device for optimizing the writing performance of a time-series database based on a mass data scene can be applied to any device with data processing capability, such as a computer. The device embodiment may be implemented by software, or by hardware or a combination of hardware and software. Taking software implementation as an example, the device in a logical sense is formed by the processor of the device with data processing capability reading corresponding computer program instructions from nonvolatile memory into memory. In terms of hardware, fig. 5 shows a hardware structure diagram of a device with data processing capability on which the time-series database write performance optimizing device based on a mass data scene is located; besides the processor, memory, network interface, and nonvolatile memory shown in fig. 5, the device in the embodiment generally also includes other hardware according to its actual function, which is not described here.
The implementation process of the functions and roles of each unit in the above device is specifically shown in the implementation process of the corresponding steps in the above method, and will not be described herein again.
Since the device embodiments essentially correspond to the method embodiments, reference is made to the description of the method embodiments for the relevant points. The device embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units, that is, they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purposes of the present invention. Those of ordinary skill in the art will understand and implement the present invention without undue burden.
The embodiment of the invention also provides a computer program comprising computer program instructions which, when executed by a processor, implement the method for optimizing the writing performance of a time-series database based on a mass data scene in the above embodiment.
The embodiment of the invention also provides a computer readable storage medium, on which a program is stored, which when executed by a processor, implements the method for optimizing the writing performance of a time sequence database based on a mass data scene in the embodiment.
The computer-readable storage medium may be an internal storage unit, such as a hard disk or memory, of any of the devices with data processing capability described in the previous embodiments. It may also be an external storage device of such a device, for example a plug-in hard disk, Smart Media Card (SMC), SD card, or Flash memory card (Flash Card) provided on the device. Further, the computer-readable storage medium may include both the internal storage unit and an external storage device of the device. The computer-readable storage medium is used for storing the computer program and other programs and data required by the device, and may also be used to temporarily store data that has been output or is to be output.
The above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (8)

1. The time sequence database writing performance optimization method based on the mass data scene is characterized by comprising the following steps of:
(1) Separating read-write thread pools corresponding to database read-write requests based on Apache Calcite, and establishing independent write task thread pools;
(2) Fragmenting the write task in the write task thread pool in the step (1) through a hash strategy, and scheduling the fragmented task according to a load value so as to write the data of the write task into a corresponding node in the cluster;
(3) Acquiring a write request according to the write task of the write node in step (2), allocating a memory array for the write request, dynamically allocating the memory required by the write task based on massive timelines, and then flushing the data of the write task to disk;
said step (1) comprises the sub-steps of:
(1.1) submitting a task for the first time, acquiring and analyzing an SQL statement of the task based on Apache Calcite, judging whether the SQL statement contains a keyword insert, and if so, determining the task type as a writing task; otherwise, the task type is a reading task;
(1.2) if the task type acquired in the step (1.1) is a writing task, submitting the task to a writing thread pool; if the task type obtained in the step (1.1) is a read task, marking a statement corresponding to the read task to obtain a statement id corresponding to the read task, storing the statement id into a hash set, binding the hash set to a TCP connection, and submitting the task to a read thread pool;
(1.3) submitting a subsequent task, and judging whether the statement id corresponding to the current task is in the hash set obtained in step (1.2); if not, the current task is a write task and is submitted to the write thread pool; otherwise, the current task is a read task and is submitted to the read thread pool; thereby separating the read and write thread pools corresponding to the tasks and establishing an independent write task thread pool.
2. The method for optimizing writing performance of a time series database based on a mass data scenario as claimed in claim 1, wherein said step (2) comprises the sub-steps of:
(2.1) acquiring meta information of the writing task according to the writing task in the writing task thread pool in the step (1), wherein the meta information comprises the number of labels of the writing task, the number of measured values and the time length;
(2.2) calculating a corresponding hash value according to the hash strategy specified by the user; the hash policy includes: calculating a hash value according to the table name, calculating a hash value according to the selected tag, and calculating a hash value according to the table name and the tag;
(2.3) slicing the writing task according to the hash value calculated in the step (2.2), and putting the writing task into a queue to be written in a slicing unit;
(2.4) calculating a load value according to the meta information of the write task obtained in the step (2.1);
(2.5) scheduling the writing task to be written in the queue in the step (2.3) to a node with the minimum load value in the cluster nodes, and dynamically updating a load value table;
(2.6) writing the data of the write task into the corresponding node according to the scheduling mode in the step (2.5).
3. The method for optimizing writing performance of a time-series database based on a mass data scenario as claimed in claim 2, wherein the expression of the load value is:
load value = tag complexity × number of measurement values × time length
wherein the tag complexity is the product of the numbers of values of the tags contained in the slicing task, tag complexity = v_1 × v_2 × … × v_n, where i represents a tag, n represents the number of tags included in the slicing task, and v_i is the number of values of tag i.
4. The method for optimizing writing performance of a time-series database based on a mass data scenario of claim 2, wherein said dynamically updating the load value table in step (2.5) comprises: distributing the writing task of the fragment to a node with the smallest load value in the cluster nodes, and accumulating the load value to the load value of the node so as to dynamically update a load value table; when the writing of the writing task of the slice is finished, the accumulated load value of the corresponding node is subtracted to dynamically update the load value table.
5. The method for optimizing writing performance of a time series database based on a mass data scenario as claimed in claim 1, wherein said step (3) comprises the sub-steps of:
(3.1) acquiring a write request according to the write task of the write node in the step (2), distributing a memory array for the write request, and writing data;
(3.2) judging whether the memory usage exceeds a memory threshold; if so, calculating the valid-data ratio, adjusting the size of the memory array according to the valid-data ratio, and then flushing the written data to disk; otherwise, going to step (3.3);
(3.3) judging whether the current memory array is full; if so, enlarging the size of the memory array to twice the current size, and then flushing the written data to disk; otherwise, going to step (3.4);
(3.4) judging whether other write requests exist, and if so, returning to step (3.1); otherwise, the write task ends.
6. A time-series database writing performance optimizing device based on mass data scenes, which is characterized by comprising one or more processors for realizing the time-series database writing performance optimizing method based on mass data scenes according to any one of claims 1-5.
7. A computer program comprising computer program instructions which, when executed by a processor, implement the mass data scenario-based time-series database write performance optimization method of any one of claims 1-5.
8. A computer readable storage medium, having stored thereon a program which, when executed by a processor, is adapted to carry out the method of optimizing writing performance of a time-series database based on a mass data scenario as claimed in any one of claims 1 to 5.
CN202310372120.3A 2023-04-10 2023-04-10 Time sequence database writing performance optimization method and device based on mass data scene Active CN116089414B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310372120.3A CN116089414B (en) 2023-04-10 2023-04-10 Time sequence database writing performance optimization method and device based on mass data scene

Publications (2)

Publication Number Publication Date
CN116089414A CN116089414A (en) 2023-05-09
CN116089414B true CN116089414B (en) 2023-09-08

Family

ID=86199467

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310372120.3A Active CN116089414B (en) 2023-04-10 2023-04-10 Time sequence database writing performance optimization method and device based on mass data scene

Country Status (1)

Country Link
CN (1) CN116089414B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116992065B (en) * 2023-09-26 2024-01-12 之江实验室 Graph database data importing method, system, electronic equipment and medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108075989A (en) * 2016-11-11 2018-05-25 南京南瑞继保电气有限公司 A kind of load balancing network middleware implementation method based on extensible protocol
CN112134909A (en) * 2019-06-24 2020-12-25 同方威视科技江苏有限公司 Time sequence data processing method, device, system, server and readable storage medium
CN113778689A (en) * 2021-09-22 2021-12-10 重庆允成互联网科技有限公司 Distributed timed task operation method, system, equipment and storage medium
CN115129438A (en) * 2021-03-26 2022-09-30 北京沃东天骏信息技术有限公司 Method and device for task distributed scheduling
CN115525631A (en) * 2022-10-31 2022-12-27 华润数字科技有限公司 Database data migration method, device, equipment and storage medium
WO2023279801A1 (en) * 2021-07-08 2023-01-12 华为云计算技术有限公司 Shard adjustment method and apparatus for time series database, device, and readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Distributed framework design of Apache IoTDB; Li Tian'an; Huang Xiangdong; Wang Jianmin; Mao Dongfang; Xu Yi; Yuan Jun; SCIENTIA SINICA Informationis (Issue 05); 5-20 *

Also Published As

Publication number Publication date
CN116089414A (en) 2023-05-09

Similar Documents

Publication Publication Date Title
US10073888B1 (en) Adjusting partitioning policies of a database system in view of storage reconfiguration
US20170083573A1 (en) Multi-query optimization
US10712967B2 (en) Transferring data between memories utilizing logical block addresses
CN110166282B (en) Resource allocation method, device, computer equipment and storage medium
US10922316B2 (en) Using computing resources to perform database queries according to a dynamically determined query size
CN103942342A (en) Memory database OLTP and OLAP concurrency query optimization method
US20220164345A1 (en) Managed query execution platform, and methods thereof
CN112287182A (en) Graph data storage and processing method and device and computer storage medium
US20070250517A1 (en) Method and Apparatus for Autonomically Maintaining Latent Auxiliary Database Structures for Use in Executing Database Queries
US11709835B2 (en) Re-ordered processing of read requests
CN116089414B (en) Time sequence database writing performance optimization method and device based on mass data scene
CN108874930A (en) File attribute information statistical method, device, system, equipment and storage medium
El Alami et al. Supply of a key value database redis in-memory by data from a relational database
US11645273B2 (en) Query execution utilizing probabilistic indexing
US20240004858A1 (en) Implementing different secondary indexing schemes for different segments stored via a database system
CN116756150B (en) Mpp database large table association acceleration method
US20220121711A1 (en) Delaying exceptions in query execution
Elmeiligy et al. An efficient parallel indexing structure for multi-dimensional big data using spark
US8660974B2 (en) Inference over semantic network with some links omitted from indexes
US20220215021A1 (en) Data Query Method and Apparatus, Computing Device, and Storage Medium
US20170031982A1 (en) Maintaining Performance in the Presence of Insertions, Deletions, and Streaming Queries
US20170031909A1 (en) Locality-sensitive hashing for algebraic expressions
CN110928839A (en) Storage method and system of international freight rate data
CN117689451B (en) Flink-based stream vector search method, device and system
JP2016184272A (en) Database management system, method of buffering the same, and computer program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant