CN117349323A - Database data processing method and device, storage medium and electronic equipment - Google Patents

Database data processing method and device, storage medium and electronic equipment

Info

Publication number
CN117349323A
Authority
CN
China
Prior art keywords
data
query
sub
service
database
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202311655917.0A
Other languages
Chinese (zh)
Other versions
CN117349323B (en)
Inventor
孙辽东
李世刚
张书博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Metabrain Intelligent Technology Co Ltd
Original Assignee
Suzhou Metabrain Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Metabrain Intelligent Technology Co Ltd
Priority to CN202311655917.0A
Publication of CN117349323A
Application granted
Publication of CN117349323B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G06F16/24532 Query optimisation of parallel queries
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/248 Presentation of query results
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the present application provide a data processing method and device of a database, a storage medium and electronic equipment. The method comprises the following steps: in response to an acquired service query instruction, determining a plurality of first data fragments in a designated database in which the service data queried by the service query instruction is stored; splitting the service query instruction into a plurality of sub-query instructions according to the time period corresponding to each of the plurality of first data fragments; using each of the plurality of sub-query instructions to query the corresponding first data fragment in parallel to obtain a plurality of sub-query results, wherein each sub-query result comprises the service data queried from the corresponding first data fragment; and merging the plurality of sub-query results into a target query result and sending the target query result to a target device, wherein the target device is the device that sent the service query instruction. The method and the device improve the query efficiency of service data.

Description

Database data processing method and device, storage medium and electronic equipment
Technical Field
The embodiment of the application relates to the technical field of big data, in particular to a data processing method and device of a database, a storage medium and electronic equipment.
Background
Time-series data is a series of data generated continuously over time, that is, data carrying a timestamp. A time-series database (Time Series Database, abbreviated TSDB) is a database optimized for ingesting, processing and storing timestamped data; such data may include metrics from servers and applications, readings from Internet of Things sensors, user interactions on websites or applications, and the like.
In an actual service scenario, a time-series database must support the storage of massive data and the export of data across time periods. Deploying a plurality of physical nodes for the time-series database to process massive data improves its load capacity, but incurs high operation and maintenance costs and wastes resources. A time-series database in stand-alone mode is easy to deploy and maintain, but its query response is slow and cannot meet the need for fast queries over massive data.
Therefore, the data processing method of a database in the related art has the technical problem of slow data query response in the stand-alone case.
Disclosure of Invention
The embodiment of the application provides a data processing method and device of a database, a storage medium and electronic equipment, and aims to at least solve the technical problem that the data processing efficiency is low in a single machine mode in the data processing method of the database in the related technology.
According to an embodiment of the present application, there is provided a data processing method of a database, including: determining a plurality of first data fragments in which service data queried by the service query instruction are stored in a designated database in response to the acquired service query instruction, wherein the service data in the designated database are respectively stored in corresponding data fragments according to the time period to which the service data belong, each data fragment in the designated database is used for storing the service data in one time period, and the time range covered by the plurality of first data fragments comprises the designated time period to which the service data queried by the service query instruction belong; splitting the service query instruction into a plurality of sub-query instructions according to a time period corresponding to each of the plurality of first data fragments, wherein the sub-query instructions in the plurality of sub-query instructions are in one-to-one correspondence with the first data fragments in the plurality of first data fragments; using each sub-query instruction in the plurality of sub-query instructions to query corresponding first data fragments in parallel to obtain a plurality of sub-query results, wherein each sub-query result in the plurality of sub-query results comprises service data queried from the corresponding first data fragments; and merging the plurality of sub-query results into a target query result, and sending the target query result to target equipment, wherein the target equipment is equipment for sending the service query instruction.
According to still another embodiment of the present application, there is provided a data processing apparatus of a database, including: a determining unit, configured to determine, in response to an acquired service query instruction, a plurality of first data fragments in a specified database in which the service data queried by the service query instruction is stored, where the service data in the specified database is stored in corresponding data fragments according to the time period to which the service data belongs, each data fragment in the specified database is used to store the service data of one time period, and the time range covered by the plurality of first data fragments includes the specified time period to which the service data queried by the service query instruction belongs; a splitting unit, configured to split the service query instruction into a plurality of sub-query instructions according to the time period corresponding to each of the plurality of first data fragments, where the sub-query instructions are in one-to-one correspondence with the first data fragments; a query unit, configured to query the corresponding first data fragments in parallel by using each of the plurality of sub-query instructions to obtain a plurality of sub-query results, where each sub-query result includes the service data queried from the corresponding first data fragment; and a merging unit, configured to merge the plurality of sub-query results into a target query result and send the target query result to a target device, where the target device is the device that sent the service query instruction.
According to a further aspect of the embodiments of the present application, there is provided a computer readable storage medium comprising a stored program, wherein the program when run performs the steps of any of the method embodiments described above.
According to a further aspect of the embodiments of the present application, there is provided an electronic device comprising a memory in which a computer program is stored and a processor arranged to perform the steps of any of the method embodiments described above by means of the computer program.
According to the embodiments of the present application, in response to an acquired service query instruction, the plurality of first data fragments in the designated database that store the queried service data and correspond to the time period to which that data belongs are determined, so that the specific storage area of the queried service data in the designated database can be determined; the service query instruction is split, according to the time periods corresponding to the plurality of first data fragments, into a plurality of sub-query instructions respectively corresponding to the first data fragments, which facilitates the subsequent parallel query; the plurality of sub-query instructions are used to query the corresponding first data fragments in parallel to obtain a plurality of sub-query results containing the service data queried from the corresponding first data fragments, so that the query efficiency of the service data is improved by querying in parallel; and the sub-query results are merged into a target query result and sent to the target device, so that the target query result corresponding to the service query instruction is returned while the query efficiency is improved. In this way, the query efficiency of service data is improved when a single database processes massive data, and the technical problem in the related art that the data query response is slow in the case of a single database is solved.
Drawings
FIG. 1 is a hardware block diagram of a computer terminal for a data processing method of a database according to an embodiment of the present application;
FIG. 2 is a flow chart of a data processing method of a database according to an embodiment of the present application;
FIG. 3 is a flow chart of another data processing method of a database according to an embodiment of the present application;
FIG. 4 is a flow chart of yet another data processing method of a database according to an embodiment of the present application;
FIG. 5 is a schematic device interaction flow diagram of a data writing process according to an embodiment of the present application;
FIG. 6 is a schematic diagram of the module interaction of a data processing apparatus of a database according to an embodiment of the present application;
FIG. 7 is a block diagram of an alternative data processing apparatus of a database according to an embodiment of the present application;
FIG. 8 is a block diagram of an alternative computer system of an electronic device according to an embodiment of the present application.
Detailed Description
Embodiments of the present application will be described in detail below with reference to the accompanying drawings in conjunction with the embodiments.
It should be noted that the terms "first," "second," and the like in the description and the claims of the embodiments of the present application and the above figures are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used herein is for the purpose of describing embodiments of the present application only and is not intended to be limiting of the present application.
Before the embodiments of the present application are described in further detail, the terms and expressions used in the embodiments are explained; the following explanations apply to them.
1. InfluxDB: an open-source time-series database focused on high-performance reads and writes, efficient storage and real-time analysis of massive time-series data.
2. Retention Policies: a retention policy in InfluxDB defines how long data is retained and how the data is partitioned into shards (data fragments).
3. Telegraf: a metrics collection component that collects data and writes it into InfluxDB.
4. Object Relational Mapping (abbreviated as ORM): a mapping between a relational database and business entity objects, so that operating on a specific business object requires no complex SQL statements, only simple operations on the attributes and methods of the object.
5. Structured Query Language (abbreviated as SQL): a special-purpose database query and programming language used to access data and to query, update and manage relational database systems.
The method embodiments provided in the embodiments of the present application may be performed in a mobile terminal, a computer terminal or similar computing device. Taking a computer terminal as an example, fig. 1 is a block diagram of a hardware structure of a computer terminal of a data processing method of a database according to an embodiment of the present application. As shown in fig. 1, a computer terminal may include one or more (only one is shown in fig. 1) processors 102 (the processors 102 may include, but are not limited to, a microprocessor, a processing arrangement such as a programmable logic device) and a memory 104 for storing data, wherein the computer terminal may also include a transmission device 106 for communication functions and an input-output device 108. It will be appreciated by those skilled in the art that the configuration shown in fig. 1 is merely illustrative and is not intended to limit the configuration of the computer terminal described above. For example, the computer terminal may also include more or fewer components than shown in FIG. 1, or have a different configuration than shown in FIG. 1.
The memory 104 may be used to store computer programs, for example, software programs and modules of application software, such as the computer program corresponding to the data processing method of a database in the embodiments of the present application. The processor 102 executes the computer program stored in the memory 104 to perform various functional applications and data processing, that is, to implement the method described above. The memory 104 may include high-speed random access memory, and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 104 may further include memory located remotely from the processor 102, which may be connected to the computer terminal via a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The transmission device 106 is used to receive or transmit data via a network. Specific examples of the network may include a wireless network provided by the communication provider of the computer terminal. In one example, the transmission device 106 includes a network adapter (Network Interface Controller, abbreviated as NIC) that can connect to other network devices through a base station so as to communicate with the Internet. In one example, the transmission device 106 may be a radio frequency (RF) module, which communicates with the Internet wirelessly.
In some embodiments, the database in the embodiments of the present application may be implemented as a time-series database, and may specifically be an InfluxDB database.
InfluxDB is an open-source distributed time-series, event and metrics database, written in the Go language with no external dependencies. The design goal of InfluxDB is to be distributed and horizontally scalable. InfluxDB is commonly used in artificial intelligence training platforms to store monitoring data and report data. In stand-alone mode, InfluxDB itself cannot support workloads with certain characteristics, for example, more than 750,000 field writes per second, more than 100 moderate queries per second (complex queries with multiple aggregations), or a series cardinality of more than 10 million.
In the related art, InfluxDB offers an enterprise edition aimed at massive data, which can solve the problems of storing and querying large volumes of data and provides a high-availability scheme; however, the enterprise edition is expensive and less maintainable, and it must be deployed on a plurality of physical nodes, which wastes resources and operation and maintenance costs.
In the related art, a plurality of InfluxDB instances may also be deployed to balance the request load, which not only improves the load capacity of InfluxDB but also provides a high-availability scheme; however, such methods consume more resources and require secondary development on top of the InfluxDB open-source version, so that the result cannot stay consistent with the InfluxDB release version.
In an actual service scenario, the storage of massive data, multidimensional report statistics and data export across time periods must all be supported. To address the above problems, the embodiments of the present application provide a data processing method of a database, which not only saves research and development costs and customer costs, but also improves the scalability, stability and security of the product.
According to an aspect of the embodiments of the present application, a data processing method of a database is provided that can improve the query efficiency of service data in the case of a single database. Taking execution of the data processing method by a computer terminal as an example, FIG. 2 is a schematic flow chart of the data processing method of a database according to an embodiment of the present application. As shown in FIG. 2, the flow includes the following steps:
in step S202, in response to the acquired service query instruction, a plurality of first data fragments in which the service data queried by the service query instruction in the specified database is stored are determined, where the service data in the specified database is stored in the corresponding data fragments according to the time period to which the service data belongs, each data fragment in the specified database is used to store the service data in one time period, and the time range covered by the plurality of first data fragments includes the specified time period to which the service data queried by the service query instruction belongs.
To respond to the service query instruction quickly, after the service query instruction is received, a plurality of first data fragments in which the queried service data is stored are determined. The first data fragments are data fragments of the specified database (a time-series database), which stores service data by time period: each data fragment in the specified database stores the service data of one preset time period, and the plurality of first data fragments include the data fragments corresponding to the specified time period to which the queried service data belongs.
Here, data fragmentation (data sharding) is a technique for storing data distributed across a plurality of nodes: a large data set is divided into smaller data blocks, and each data block is allocated to a different node for storage and processing. Data fragmentation aims to improve the scalability and performance of a system, avoid single points of failure, and improve the security and reliability of data.
In some embodiments, the specified database may be a time-series database (Time Series Database, abbreviated as TSDB); for example, InfluxDB may be used to store time-series data, and the service data queried by the service query instruction may be time-series data (data carrying a timestamp). The time-series database stores service data in units of libraries (databases), and the time period over which each library is allowed to store service data is longer than the time period over which each data fragment is allowed to store the same kind of service data. A data fragment may be understood as a fragment file in the time-series database; the data fragment can be located through its fragment position (the position of the data fragment in the specified database, for example, its position in a data table of the specified database), so that service data is written and read per data fragment. Here, the time-series database is a database for ingesting, processing and storing timestamped data; such data may include metrics from servers and applications, readings from Internet of Things sensors, and real-time data generated on websites or applications.
For example, the specified database may be InfluxDB, and the service query instruction may be directed to storage-hardware-related performance data and related multidimensional report data managed in a large-scale cluster of an artificial intelligence development platform.
By the embodiment provided by the application, after the service query instruction is received, a plurality of first data fragments in which the service data queried by the service query instruction are specifically stored can be determined in advance, so that the service query instruction can be split later.
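The determination in step S202 can be illustrated with a minimal sketch. It assumes that each data fragment records a half-open time period [start, end) and that a fragment is selected whenever its period overlaps the queried time range; the `DataFragment` class, the `find_first_fragments` function and the sample periods are hypothetical names and values introduced only for illustration and do not come from the application.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class DataFragment:
    """Hypothetical record of one data fragment and the time period it stores."""
    fragment_id: int
    start: datetime  # inclusive
    end: datetime    # exclusive


def find_first_fragments(fragments, query_start, query_end):
    """Return the fragments whose period overlaps [query_start, query_end)."""
    return [f for f in fragments if f.start < query_end and query_start < f.end]


# Three hourly fragments (assumed layout).
fragments = [
    DataFragment(1, datetime(2023, 12, 1, 0), datetime(2023, 12, 1, 1)),
    DataFragment(2, datetime(2023, 12, 1, 1), datetime(2023, 12, 1, 2)),
    DataFragment(3, datetime(2023, 12, 1, 2), datetime(2023, 12, 1, 3)),
]

# A query covering 00:30 to 01:30 touches fragments 1 and 2 only.
first_fragments = find_first_fragments(
    fragments, datetime(2023, 12, 1, 0, 30), datetime(2023, 12, 1, 1, 30)
)
```

The time range covered by the selected fragments (00:00 to 02:00) contains the queried period, which matches the condition stated for the first data fragments.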
In step S204, the service query instruction is split into a plurality of sub-query instructions according to a time period corresponding to each of the plurality of first data slices, where the sub-query instructions in the plurality of sub-query instructions are in one-to-one correspondence with the first data slices in the plurality of first data slices.
To improve the query efficiency of a single-node database holding massive data, the service query instruction is split, according to the time period corresponding to each first data fragment in which the queried service data is stored, into sub-query instructions that respectively correspond to the first data fragments, which facilitates the subsequent parallel query and improves the query efficiency.
In some embodiments, when data to be written is written into the database, the database classifies the data to be written according to its data characteristics (for example, its acquisition frequency) to obtain the plurality of first data fragments corresponding to the data to be written, where the first data fragments carry a unique identifier, a database name identifier, a database table name identifier and a fragment identifier.
In some embodiments, the time period corresponding to a first data fragment runs from the start time of the data to be written (that is, the time at which data acquisition starts) to the deadline of the data to be written (that is, the time at which data acquisition stops); in other words, the time period corresponding to the first data fragment is determined by a preset data classification policy.
Here, the data acquisition frequency is defined at acquisition time, and the collected data to be written is classified according to that frequency; for example, the data classification policy may be second level (data is collected in units of seconds), minute level (data is collected in units of minutes), hour level (data is collected in units of hours), day level (data is collected in units of days), and so on. For example, when the data classification policy is hour level, one data fragment (with a data fragment identifier) is generated per hour, together with the start time and deadline (the time period) corresponding to that data fragment identifier, while the acquisition interval itself may be at the second level, the minute level, or the like.
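How a classification level could translate into fragment boundaries can be sketched as follows. This is only an illustration under stated assumptions: fragment periods are assumed to be aligned to fixed-size windows counted from a reference epoch, and the names `FRAGMENT_DURATION`, `EPOCH` and `fragment_period` are hypothetical; the application does not specify the alignment rule.

```python
from datetime import datetime, timedelta

# Hypothetical mapping from classification level to fragment duration.
FRAGMENT_DURATION = {
    "minute": timedelta(minutes=1),
    "hour": timedelta(hours=1),
    "day": timedelta(days=1),
}

# Assumed reference point to which fragment windows are aligned.
EPOCH = datetime(1970, 1, 1)


def fragment_period(timestamp, level):
    """Return (start time, deadline) of the fragment that contains timestamp."""
    duration = FRAGMENT_DURATION[level]
    index = (timestamp - EPOCH) // duration  # whole windows elapsed since EPOCH
    start = EPOCH + index * duration
    return start, start + duration


# A sample collected at 10:42:07 under the hour-level policy belongs to
# the fragment covering 10:00 to 11:00.
start, deadline = fragment_period(datetime(2023, 12, 1, 10, 42, 7), "hour")
```

With the hour-level policy, samples collected every few seconds or minutes within the same hour all map to the same fragment period.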
By the embodiment provided by the application, the service query instruction can be split, according to the time periods corresponding to the first data fragments, into a plurality of sub-query instructions corresponding to the first data fragments, so that the subsequent parallel query can be carried out conveniently.
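The splitting in step S204 can be sketched as clipping the queried time range to the period of each first data fragment, which yields one sub-query instruction per fragment. The sketch below assumes the sub-query instructions are InfluxQL-style text with RFC 3339 timestamps; the `split_query` function, the measurement name `cpu` and the sample periods are hypothetical and serve only as illustration.

```python
from datetime import datetime


def split_query(measurement, query_start, query_end, fragment_periods):
    """Build one sub-query per fragment, clipped to the requested time range."""
    sub_queries = []
    for frag_start, frag_end in fragment_periods:
        lower = max(query_start, frag_start)
        upper = min(query_end, frag_end)
        if lower < upper:  # skip fragments that do not overlap the query
            sub_queries.append(
                f'SELECT * FROM "{measurement}" '
                f"WHERE time >= '{lower.isoformat()}Z' "
                f"AND time < '{upper.isoformat()}Z'"
            )
    return sub_queries


# Two hourly fragments and a query from 00:30 to 01:30 (assumed values).
periods = [
    (datetime(2023, 12, 1, 0), datetime(2023, 12, 1, 1)),
    (datetime(2023, 12, 1, 1), datetime(2023, 12, 1, 2)),
]
subs = split_query(
    "cpu", datetime(2023, 12, 1, 0, 30), datetime(2023, 12, 1, 1, 30), periods
)
```

Because each sub-query is bounded by the period of exactly one fragment, the sub-queries stay in one-to-one correspondence with the first data fragments and their time ranges do not overlap.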
In step S206, the corresponding first data fragment is queried in parallel by using each sub-query instruction in the plurality of sub-query instructions, so as to obtain a plurality of sub-query results, where each sub-query result in the plurality of sub-query results includes the service data queried from the corresponding first data fragment.
In the related art, when a single-node time-series database responds to a service query instruction, it can only query the designated database with the service query instruction as a whole, so the query response is slow when massive data is stored in the database.
In the embodiment of the application, the service query instruction can be split, according to the time periods corresponding to the first data fragments, into the plurality of sub-query instructions corresponding to the first data fragments, so that the corresponding first data fragments in the designated database are queried in parallel by using the plurality of sub-query instructions, a plurality of sub-query results are obtained, and the data query efficiency is improved.
Here, compared with querying data in units of libraries, writing and reading data in units of data fragments speeds up data location. The plurality of first data fragments each store part of the service data queried by the service query instruction, and each sub-query result includes the service data queried from the corresponding first data fragment; by using the plurality of sub-query instructions to query the corresponding first data fragments in parallel, different parts of the required service data can be queried simultaneously, which improves the data query efficiency.
According to the embodiment of the application, the sub-query results can be obtained from the plurality of first data fragments respectively by carrying out parallel query on the plurality of sub-queries, so that the data query efficiency is improved.
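The parallel query in step S206 can be sketched with a thread pool. This is a minimal illustration, not the execution engine of the application: the in-memory `FRAGMENTS` dictionary stands in for the data fragments of a single-node database, a sub-query is modelled as a `(fragment_id, t_min, t_max)` tuple, and `run_sub_query` and `query_in_parallel` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory stand-in for data fragments: id -> [(time, value), ...]
FRAGMENTS = {
    1: [(1, 0.20), (2, 0.25)],
    2: [(3, 0.31), (4, 0.28)],
    3: [(5, 0.40)],
}


def run_sub_query(sub_query):
    """Query one fragment for rows with t_min <= time < t_max."""
    fragment_id, t_min, t_max = sub_query
    return [(t, v) for t, v in FRAGMENTS[fragment_id] if t_min <= t < t_max]


def query_in_parallel(sub_queries, max_workers=4):
    """Run every sub-query concurrently and return one result per sub-query."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() keeps input order, so results line up with sub_queries.
        return list(pool.map(run_sub_query, sub_queries))


sub_results = query_in_parallel([(1, 2, 3), (2, 3, 5), (3, 5, 6)])
```

A thread pool suits this sketch because a real sub-query would mostly wait on database I/O; whether threads, processes or asynchronous calls are used in practice is a design choice the application leaves open.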
In step S208, the multiple sub-query results are combined into a target query result, and the target query result is sent to the target device, where the target device is a device that sends the service query instruction.
In order to improve the query efficiency and return the target query result corresponding to the service query instruction, a plurality of sub-query results need to be combined to obtain the target query result corresponding to the service query instruction, and the target query result is returned to the target device.
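One possible way to merge the sub-query results is sketched below. It assumes that each sub-query result is already sorted by timestamp and that the target query result should be ordered by time; neither assumption is stated in the application, and `merge_sub_results` is a hypothetical name.

```python
import heapq


def merge_sub_results(sub_results):
    """Merge per-fragment results, each sorted by time, into one sorted list."""
    # heapq.merge performs a k-way merge of already sorted inputs.
    return list(heapq.merge(*sub_results, key=lambda row: row[0]))


# Rows are (time, value) pairs; the sample values are illustrative only.
target_result = merge_sub_results([
    [(1, "a"), (4, "d")],
    [(2, "b"), (3, "c")],
])
```

When the fragments cover disjoint, consecutive time periods, simple concatenation in fragment order would give the same result; the k-way merge also handles results that interleave in time.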
In the related art, various types of data, such as CPU performance data and GPU performance data, need to be stored in a service scenario, and the data structure of each performance data is different; therefore, the database concurrent query controller dynamically returns different objects of different performance data for the scene, for example, different concurrent query controllers are defined according to different data structures (data types), but such methods cannot dynamically expand new functions (performances), and the maintenance cost of codes is increased due to the requirement of intrusion of service code logic, or a unified concurrent query controller is used, but data is unified, map (Map is used for storing data with mapping relation, map set stores two groups of values, one group of values is used for storing keys in Map, and the other group of values is used for storing value in Map, both keys and value can be any reference type data) structural storage, but such methods cannot provide north interface docking more intuitively, and service attribute information is ambiguous.
In this embodiment of the present application, the service object and the storage attribute information in the relational database (for example, influxdb) are maintained in the configuration file of the database according to the database name and the database table name (performance data), so as to obtain a service data conversion file, so that a plurality of sub-query results obtained by querying can be converted in an execution engine of the database, so as to obtain data in a unified format (for example, JSON format and object concept in Java), and to combine the plurality of sub-query results, so as to obtain a target query result (that is, a target service data conversion file).
In some embodiments, when the database is implemented as an InfluxDB database, JSON-formatted data may be written to the database through a web interface (HTTP API) or an InfluxDB client library. Meanwhile, the InfluxDB database also supports querying and reading data in JSON format, and can return query results in JSON format. That is, JSON data may be stored in InfluxDB, and read and operated on through the query language and interface (API) of InfluxDB.
According to the embodiment provided by the application, the plurality of sub-query results are converted into the specific data format and summarized to obtain the target query result, so that the target query result corresponding to the service query instruction can be returned while the query efficiency is improved.
Through the steps of the embodiments of the present application, in response to an obtained service query instruction, determining a plurality of first data fragments in which service data queried by the service query instruction is stored in a specified database, where the service data in the specified database is respectively stored in corresponding data fragments according to a time period to which the service data belongs, each data fragment in the specified database is used for storing service data in a time period, and a time range covered by the plurality of first data fragments includes a specified time period to which the service data queried by the service query instruction belongs; splitting the service query instruction into a plurality of sub-query instructions according to a time period corresponding to each of the plurality of first data fragments, wherein the sub-query instructions in the plurality of sub-query instructions are in one-to-one correspondence with the first data fragments in the plurality of first data fragments; using each sub-query instruction in the plurality of sub-query instructions to query corresponding first data fragments in parallel to obtain a plurality of sub-query results, wherein each sub-query result in the plurality of sub-query results comprises service data queried from the corresponding first data fragments; and merging the plurality of sub-query results into a target query result, and sending the target query result to target equipment, wherein the target equipment is equipment for sending the service query instruction. Under the condition that a single database processes mass data, the query efficiency of service data is improved, and the technical problem that the data query response speed is low under the condition of a single database in the related art is solved.
In one exemplary embodiment, in response to the acquired service query instruction, determining a plurality of first data slices in a specified database in which service data queried by the service query instruction is stored, includes:
S11, in response to the acquired service query instruction, determining a first data table to be queried by the service query instruction in the specified database and a designated time period.
And S12, determining a plurality of first data fragments in which the service data queried by the service query instruction is stored according to the time period corresponding to each data fragment contained in the first data table and the designated time period.
After the service query instruction is acquired, in order to facilitate the subsequent splitting of the service query instruction by time period, it is necessary to determine the first data table that contains the data fragments storing the queried service data, together with the designated time period of that service data. The time period corresponding to each data fragment contained in the first data table is then matched against the designated time period of the service data queried by the service query instruction, so as to determine the plurality of first data fragments in which the queried service data is stored.
In some embodiments, the database includes a plurality of data tables (tables) in which traffic data is stored, the data tables being composed of rows (row) and columns (column), and being a two-dimensional grid structure, each column being a field. The field consists of a field name and the data type of the field, and some constraints.
In some embodiments, the specified time period is a time interval between a start time and an expiration time in the data collection process, where the specified time period is a time interval (corresponding to the first data slice) when the service data queried by the service query instruction is written (collected) to the database.
By the embodiment provided by the application, when the service query instruction is acquired, the first data table of the service data corresponding to the service query instruction and the designated time period are firstly determined so as to facilitate the subsequent determination of a plurality of first data fragments in which the queried service data is stored.
In an exemplary embodiment, the data fragments in the first data table sequentially store service data according to the sequence of time periods, and according to the time period corresponding to each data fragment included in the first data table and the designated time period, determining a plurality of first data fragments in which the service data queried by the service query instruction is stored includes:
S21, determining a first target data slice in the first data table, wherein the corresponding time period comprises the starting time of the designated time period, and a second target data slice in the first data table, wherein the corresponding time period comprises the deadline of the designated time period.
S22, determining a plurality of data fragments from the first target data fragment to the second target data fragment in the first data table as a plurality of first data fragments.
Because the data fragments of the first data table store service data sequentially in time-period order, it is possible to determine the first target data fragment, whose time period contains the start time of the designated time period, and the second target data fragment, whose time period contains the deadline of the designated time period. The data fragments from the first target data fragment to the second target data fragment are then determined as the plurality of first data fragments in which the service data queried by the service query instruction is stored.
In some embodiments, the time periods may be divided according to the frequency of acquisition of the traffic data, such as a second-level time period, a minute-level time period, an hour-level time period, etc., where the data acquisition interval within each time period may be on the order of seconds. For example, traffic data is collected every second during a minute-level period.
In some embodiments, the database and the data table (the first data table) into which collected service data is written are determined during service data collection; that is, the service data stored in the database carries information such as the corresponding database name, data table name, and writing time period.
According to the embodiment provided by the application, the plurality of first data fragments in which the service data queried by the service query instruction is stored can be determined according to the start time and the end time of the corresponding time periods in the first data table, so that the data query efficiency is improved.
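As a minimal sketch of steps S21 and S22, the following Java code selects the run of data fragments between the fragment containing the start time and the fragment containing the deadline. The class and field names (ShardLocatorSketch, Shard) are hypothetical, and the sketch assumes the fragments are held in memory, sorted by time period, and non-overlapping; it is not the implementation of the application.

```java
import java.time.LocalDateTime;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: names and in-memory layout are illustrative assumptions.
class ShardLocatorSketch {

    // One data fragment: an identifier plus the inclusive time period it covers.
    static final class Shard {
        final int id;
        final LocalDateTime start;
        final LocalDateTime end;

        Shard(int id, LocalDateTime start, LocalDateTime end) {
            this.id = id;
            this.start = start;
            this.end = end;
        }
    }

    // Returns the fragments from the one containing queryStart (first target
    // fragment) to the one containing queryEnd (second target fragment).
    // Assumes 'sorted' is ordered by time period and fragments do not overlap.
    static List<Shard> locate(List<Shard> sorted, LocalDateTime queryStart, LocalDateTime queryEnd) {
        int first = -1;
        int last = -1;
        for (int i = 0; i < sorted.size(); i++) {
            Shard s = sorted.get(i);
            if (first < 0 && !queryStart.isBefore(s.start) && !queryStart.isAfter(s.end)) {
                first = i; // fragment whose period contains the start time
            }
            if (!queryEnd.isBefore(s.start) && !queryEnd.isAfter(s.end)) {
                last = i; // fragment whose period contains the deadline
            }
        }
        if (first < 0 || last < first) {
            return new ArrayList<>(); // a boundary is not covered by any fragment
        }
        return new ArrayList<>(sorted.subList(first, last + 1));
    }
}
```

Because the fragments are ordered, a binary search over the start times could replace the linear scan when a data table contains many fragments.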
In one exemplary embodiment, in response to the acquired service query instruction, determining a plurality of first data slices in a specified database in which service data queried by the service query instruction is stored, includes:
S31, in response to the acquired service query instruction, parsing a set of instruction parameter information from the service query instruction.
The set of instruction parameter information comprises a specified database identifier of a specified database, a first data table identifier of a first data table in the specified database and a specified time period identifier of a specified time period.
S32, querying the relational database by using the appointed database identification, the first data table identification and the appointed time period identification to obtain data fragment identifications of the plurality of first data fragments.
The relational database is used for storing the corresponding relation between the data fragments contained in the data table in the database and the time periods to which the service data stored by the data fragments belong.
In some embodiments, the service query instruction carries a set of instruction parameter information including a database identifier of a designated database corresponding to the queried service data, a first data table identifier of a first data table in the designated database, and a designated time period identifier of a designated time period; here, the corresponding plurality of first data fragment identifications can be queried through the database identification, the first data table identification and the designated time period identification.
In some embodiments, the instruction parameter information carries a plurality of sets of corresponding specified database identifiers, first data table identifiers and specified time period identifiers, and each set of identifiers can query a plurality of corresponding first data fragments.
In some embodiments, the relational database stores the correspondence between the data fragments contained in the data table in the database and the time periods to which the service data stored by the data fragments belong, and the data structure of the relational database refers to table 1:
TABLE 1
The unique identifier is an identifier of the data fragment, the fragment identifiers in different data tables can be repeated, and a time period to which the service data belongs is a time period from the starting time to the deadline.
According to the embodiment provided by the application, the data fragments contained in the data table in the relational database (time sequence database) and the corresponding relation between the time periods of the service data stored by the data fragments are stored, so that a plurality of first data fragments in which the queried service data are stored can be determined according to the service query instruction.
In one exemplary embodiment, splitting the service query instruction into a plurality of sub-query instructions according to a time period corresponding to each of the plurality of first data slices includes:
S41, taking the intersection of the time period corresponding to each first data fragment and the designated time period as the time period to be queried, and generating the sub-query instruction corresponding to each first data fragment, to obtain the plurality of sub-query instructions.
The sub-query instruction corresponding to each first data fragment carries a data fragment identifier of each first data fragment.
The plurality of first data fragments store the queried service data according to the time periods to which it belongs, but a first data fragment may also store other data. Therefore, the intersection of the designated time period of the service data and the time period corresponding to the first data fragment is taken as the time period to be queried, and the sub-query instruction corresponding to that first data fragment is generated according to the time period to be queried.
In some embodiments, after the service query instruction is obtained: first, the key information of the database statement (the SQL statement to be executed) is parsed, including the database name, the table name, and the time range; second, the relational database is read according to the database name and the table name to obtain the data fragment information; third, the service query statement is cut according to the time ranges in the data fragment information and the time range of the database statement to be executed (that is, the designated time period to which the queried service data belongs), to obtain a plurality of sub-query statements (the cut SQL statements to be executed); finally, the plurality of sub-query statements are placed in a cache queue (for example, an ArrayList in Java).
For example, the business query instructions (to execute SQL statements) may be:
“select * from databaseX.tableX where time >=2021-11-26 01:10:00 and time <=2021-11-26 02:30:59”。
retrieving the relational database and returning to obtain a plurality of first data fragments, referring to Table 2:
TABLE 2
Respectively taking the intersection of the time period corresponding to each first data fragment and the designated time period as the time period to be queried, generating sub-query instructions corresponding to each first data fragment, and obtaining a plurality of sub-query instructions, wherein the plurality of sub-query instructions are as follows:
select * from databaseX.tableX where time >=2021-11-26 01:10:00 and time <=2021-11-26 01:59:59 on shard = 1
select * from databaseX.tableX where time >=2021-11-26 02:00:00 and time <=2021-11-26 02:30:59 on shard = 2
According to the embodiment provided by the application, the business query instruction can be split according to the time period to which the business query instruction belongs according to the corresponding relation between the data fragments contained in the data table in the database stored in the relational database and the time period to which the business data stored in the data fragments belong, so that a plurality of sub-query instructions are obtained, and the query efficiency is improved.
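The splitting in step S41 can be sketched as follows, assuming the data fragment information has already been read from the relational database. The class names are hypothetical, and the "on shard = N" suffix simply mirrors the notation of the example sub-query instructions in this description; it is not standard InfluxQL.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of cutting one query into per-fragment sub-queries.
class QuerySplitterSketch {

    static final DateTimeFormatter FMT = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // One data fragment with its inclusive time period.
    static final class Shard {
        final int id;
        final LocalDateTime start;
        final LocalDateTime end;

        Shard(int id, LocalDateTime start, LocalDateTime end) {
            this.id = id;
            this.start = start;
            this.end = end;
        }
    }

    // For each fragment, intersect its period with the designated period
    // [queryStart, queryEnd] and emit one sub-query for the intersection.
    static List<String> split(String db, String table, List<Shard> shards,
                              LocalDateTime queryStart, LocalDateTime queryEnd) {
        List<String> subQueries = new ArrayList<>();
        for (Shard s : shards) {
            LocalDateTime from = queryStart.isAfter(s.start) ? queryStart : s.start;
            LocalDateTime to = queryEnd.isBefore(s.end) ? queryEnd : s.end;
            if (from.isAfter(to)) {
                continue; // empty intersection: nothing to query in this fragment
            }
            subQueries.add("select * from " + db + "." + table
                    + " where time >=" + FMT.format(from)
                    + " and time <=" + FMT.format(to)
                    + " on shard = " + s.id);
        }
        return subQueries;
    }
}
```

With fragments covering 01:00:00 to 01:59:59 (identifier 1) and 02:00:00 to 02:59:59 (identifier 2), which are assumed boundaries chosen to match the example above, the designated period 01:10:00 to 02:30:59 yields the two sub-query instructions listed in this embodiment.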
In an exemplary embodiment, before determining, in response to the acquired service query instruction, a plurality of first data slices in the specified database in which the service data queried by the service query instruction is stored, the method further includes:
S51, intercepting instructions through the instruction interceptor to obtain the intercepted service query instruction.
The instruction interceptor is used for intercepting instructions that read the service data stored in the specified database.
In order to split the service query instruction, instruction interception needs to be performed through the instruction interceptor when the service query instruction is acquired, so as to obtain the intercepted service query instruction.
In some embodiments, when the database is implemented as an InfluxDB database, the instruction interceptor may be implemented as an Aspect interceptor, through which all services involving InfluxDB reads are intercepted in the service layer code.
By the embodiment provided by the application, the instruction interceptor is arranged to intercept instructions that read the service data stored in the specified database, which avoids the slow query response that results when the service query instruction is executed directly by the specified database.
In one exemplary embodiment, the instruction interceptor is configured to intercept instructions in an intercepting manner that is at least one of: intercepting according to the file, intercepting according to the method name, and intercepting according to the specified file wildcard; the instruction interceptor cuts into the appointed service in a surrounding notification mode so as to be decoupled with the appointed service, and service data inquired by the service inquiry instruction are service data of the appointed service.
In some embodiments, when the database is implemented as an InfluxDB database, the instruction interceptor may be implemented as an Aspect interceptor. Intercepting all services that read InfluxDB in the service layer code through the Aspect interceptor mainly comprises the following steps: defining a cut-in point (for example, uniform interception defined by file, interception defined by method name, or interception defined by file wildcard); core service handling (cutting into the specific service using the surround notification, that is, around advice, in Aspect, so as to decouple from the actual service); and invoking the database statement (SQL) execution logic through a common method in the surround notification.
In some embodiments, when the database is implemented as an InfluxDB database, the instruction interceptor may be implemented as an Aspect interceptor. Aspect is a concept in the Spring framework (the aspect concept of AOP, a technique that achieves unified maintenance of program functions through precompilation and run-time dynamic proxies). It can be understood as the platform defining a global method that contains the intercepted addresses: every actual service execution is intercepted by this method, and control returns after the interception processing is completed. For example, the flow may be: before(), in which the Aspect interceptor performs the processing it needs to handle; then the actual service(), which is captured directly by the Spring framework. Intercepting service instructions through the Aspect interceptor makes the whole process non-intrusive to the service code.
According to the embodiment provided by the application, the instruction interceptor intercepts the service query instruction in the specified interception manner, so that interception is achieved without intruding into the service code and while remaining decoupled from the specified service.
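The non-intrusive, around-style interception described above can be illustrated with a JDK dynamic proxy. This is only a stand-in for the Spring Aspect interceptor named in this embodiment: it uses java.lang.reflect.Proxy from the standard library rather than Spring AOP, and the MetricsService interface and the method-name rule are hypothetical.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: around-style interception without touching service code.
class AroundInterceptorSketch {

    // Stand-in for a service-layer interface that reads the database.
    interface MetricsService {
        String query(String sql);
    }

    // Records every intercepted statement so the effect is observable.
    static final List<String> intercepted = new ArrayList<>();

    // Wraps the target so that each call passes through the handler first.
    static MetricsService wrap(MetricsService target) {
        InvocationHandler handler = (proxy, method, args) -> {
            // Cut-in point selected by method name (one of the interception manners).
            if (method.getName().startsWith("query")) {
                intercepted.add((String) args[0]); // processing before the actual service
            }
            // Proceed to the actual service; its code is unchanged.
            return method.invoke(target, args);
        };
        return (MetricsService) Proxy.newProxyInstance(
                MetricsService.class.getClassLoader(),
                new Class<?>[] { MetricsService.class },
                handler);
    }
}
```

In a real interceptor the "before" step would hand the statement to the splitting and parallel-execution logic instead of merely recording it; exceptions thrown by the target would also need unwrapping from InvocationTargetException.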
In an exemplary embodiment, using each sub-query instruction of the plurality of sub-query instructions to query a corresponding first data fragment in parallel, a plurality of sub-query results are obtained, including:
S61, calling an instruction execution engine in parallel by using each sub-query instruction to obtain a sub-query result which is returned by the instruction execution engine and corresponds to each sub-query instruction.
The instruction execution engine is used for searching service data from the appointed database by using the acquired query instructions, and the sub-query results comprise sub-query results corresponding to each sub-query instruction.
In some embodiments, the instruction execution engine performs service data parallel retrieval from a specified database based on a specified time period corresponding to the service data and a plurality of first data fragment information carried in the sub-query instruction, so as to obtain a sub-query result corresponding to each sub-query instruction.
In some embodiments, the instruction execution engine is a database statement execution engine in a specified database that performs fast retrieval primarily through the time frame of the data slicing information and the queried business data.
In one exemplary embodiment, referring to FIG. 3, after the instruction execution engine is invoked in parallel using each sub-query instruction, the method further comprises:
S71, in response to each acquired sub-query instruction, executing a data retrieval operation through the instruction execution engine with each sub-query instruction as the current sub-query instruction.
For each sub-query instruction, the data retrieval operation can be executed by the instruction execution engine by taking each sub-query instruction as the current sub-query instruction, so as to obtain a sub-query result corresponding to each sub-query instruction. The data retrieval operation may include the following steps, where the first data slice corresponding to the current sub-query instruction is a current data slice, the current data slice is a data slice included in a first data table in the specified database, and a time period to which the service data to be queried by the current sub-query instruction belongs is a current time period:
S711, extracting the specified database identifier of the specified database, the first data table identifier of the first data table, the current data fragment identifier of the current data fragment and the current time period identifier of the current time period from the current sub-query instruction;
S712, locating the current disk file in which the service data to be queried by the current sub-query instruction is stored according to the specified database identifier, the first data table identifier, the current data fragment identifier and the current time period identifier;
S713, querying the specified database by using the specified database identifier, the first data table identifier, the file identifier of the current disk file and the current time period identifier to obtain the current sub-query result corresponding to the current sub-query instruction.
When the database is implemented as an InfluxDB database, InfluxDB itself does not support retrieval directly by data fragment information (in the related art, the specified database can only be searched through the service query instruction). Therefore, in order to decouple the data fragment information from the native database statement in the service layer, a query interceptor (that is, an instruction interceptor) is added to the InfluxDB database. The disk file that InfluxDB needs to search is located through the data fragment information (data fragment identifier, database identifier, and data table identifier), and the specified database identifier, the first data table identifier, the file identifier of the current disk file, and the current time period identifier are then passed into InfluxDB's own method for querying the specified database, which continues executing to obtain the current sub-query result corresponding to the current sub-query instruction.
By the embodiment provided by the application, concurrent query of data fragments based on the time sequence database (InfluxDB) is realized, so that the sub-query result corresponding to each sub-query instruction is obtained. The whole implementation is decoupled from the actual service layer code (the service is not invaded), which improves the data query efficiency and facilitates maintenance.
In one exemplary embodiment, querying a specified database using a specified database identifier, a first data table identifier, a file identifier of a current disk file, and a current time period identifier to obtain a sub-query result corresponding to a current sub-query instruction includes:
S81, generating target query information according to a hypertext transfer protocol by using the specified database identifier, the first data table identifier, the file identifier of the current disk file and the current time period identifier;
S82, calling a hypertext transfer protocol interface of the specified database, and transmitting the target query information into the specified database to obtain the current sub-query result returned by the specified database, wherein the current sub-query result is JSON data.
In order to improve the usability of the sub-query result, the conversion from data in the database to page data mainly needs to be completed in the process of returning the sub-query result.
In some embodiments, the call instruction execution engine generates target query information according to a hypertext transfer protocol (HyperText Transfer Protocol, abbreviated as HTTP) by using a specified database identifier corresponding to the queried service data, a first data table identifier, a file identifier of a current disk file, and a current time period identifier; and calling a hypertext transfer protocol interface of the database through the instruction execution engine, transmitting target query information into the appointed database, and returning a current sub-query result (namely JSON data).
For example, the program code for directly calling the interface of InfluxDB using the HTTP protocol may be:
GET 'http://localhost:8086/query?pretty=true' --data-urlencode "db=databaseX" --data-urlencode "q=SELECT * FROM \"tableX\" WHERE time >= 2021-11-26 01:10:00 and time <= 2021-11-26 01:59:59" --shard "1"
Here, invoking the hypertext transfer protocol interface of the specified database is a standard request mode of the Influxdb database, so that the Influxdb database returns JSON data based on the target query information.
In InfluxDB, JSON-formatted data can be written to the database through the hypertext transfer protocol interface or an InfluxDB client library. Meanwhile, InfluxDB also supports querying and reading data in JSON format, and can return query results in JSON format. Thus, JSON data may be stored in InfluxDB, and read and operated on through the query language and interface of InfluxDB.
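Step S81 can be sketched as assembling a URL for the InfluxDB 1.x /query endpoint, whose db and q parameters are part of the standard HTTP API. The file parameter below is a hypothetical stand-in for the file identifier of the current disk file, since stock InfluxDB has no such parameter; the class name and the quoting of timestamps are likewise assumptions of this sketch.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical sketch of building the target query information as an HTTP URL.
class QueryUrlSketch {

    static String build(String baseUrl, String db, String table,
                        String from, String to, String fileId) {
        // InfluxQL statement restricted to the current time period.
        String q = "SELECT * FROM \"" + table + "\" WHERE time >= '" + from
                + "' AND time <= '" + to + "'";
        return baseUrl + "/query?pretty=true"
                + "&db=" + encode(db)
                + "&q=" + encode(q)
                + "&file=" + encode(fileId); // hypothetical extension parameter
    }

    // URL-encodes one query parameter value as UTF-8.
    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

The resulting string could be passed to any HTTP client; the response body would then be the JSON data referred to in step S82.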
In an exemplary embodiment, after invoking the hypertext transfer protocol interface of the specified database and transmitting the target query information into the specified database to obtain the current sub-query result returned by the specified database, the method further includes:
S91, converting the current sub-query result into the objectified JSON data to obtain a converted current sub-query result.
The objectified JSON data is JSON data that is allowed to be presented through pages.
In order to improve the usability of the sub-query result, the data in the database mainly needs to be converted into page data in the process of returning the sub-query result, so as to obtain the objectified JSON data.
In some embodiments, after the returned sub-query result (JSON data) is obtained, the instruction execution engine converts the sub-query result into objectified JSON data according to the relationship file between the queried service data and the database attributes (the service data conversion file).
According to the embodiment provided by the application, the sub-query results returned by the instruction execution engine can be converted into objectified JSON data, so that the usability of the data query results is improved and the plurality of sub-query results can be combined subsequently.
In one exemplary embodiment, before merging the plurality of sub-query results into the target query result, the method further comprises:
S101, performing a data conversion operation on each sub-query result respectively, to obtain each converted sub-query result.
The data conversion operation is a conversion operation of converting service data stored in a specified database into page data which is allowed to be displayed through pages.
In some embodiments, the objectified JSON data is data that can be presented on a page, and the database data refers to the data in InfluxDB. This process mainly completes the conversion from data in InfluxDB to page data, and mainly involves unit conversion, retention of decimal places, and the like.
Here, JSON is a lightweight data exchange format with two structures, a set of key-value pairs and an array. JSON is a common format that can be recognized in both front-end and back-end languages, e.g., JSON data is input into HTML for page presentation.
Through the embodiment provided by the application, the sub-query result returned by the instruction execution engine can be converted into objectified JSON data, so that the sub-query result can be visualized and the user experience is improved.
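The unit conversion and decimal retention mentioned for the data conversion operation might look like the following sketch. The field names (mem_used_bytes, cpu_usage, memUsedGiB, cpuUsagePercent) and the chosen units are hypothetical; the description does not specify which fields or units are converted.

```java
import java.math.BigDecimal;
import java.math.RoundingMode;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of converting one stored row into page data.
class PageDataConverterSketch {

    private static final BigDecimal BYTES_PER_GIB = BigDecimal.valueOf(1024L * 1024L * 1024L);

    static Map<String, Object> toPageRow(Map<String, Object> raw) {
        Map<String, Object> page = new LinkedHashMap<>();
        page.put("time", raw.get("time"));

        // Unit conversion: bytes to GiB, keeping two decimal places.
        long usedBytes = ((Number) raw.get("mem_used_bytes")).longValue();
        page.put("memUsedGiB",
                BigDecimal.valueOf(usedBytes).divide(BYTES_PER_GIB, 2, RoundingMode.HALF_UP));

        // Unit conversion: ratio to percent, keeping one decimal place.
        double cpuRatio = ((Number) raw.get("cpu_usage")).doubleValue();
        page.put("cpuUsagePercent",
                BigDecimal.valueOf(cpuRatio * 100).setScale(1, RoundingMode.HALF_UP));
        return page;
    }
}
```

The resulting map can be serialized to JSON by whichever JSON library the service already uses.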
In an exemplary embodiment, using each sub-query instruction of the plurality of sub-query instructions to query a corresponding first data fragment in parallel, a plurality of sub-query results are obtained, including:
S111, writing each sub-query instruction into the instruction cache queue corresponding to each sub-query instruction.
The instruction cache queues corresponding to each sub-query instruction are cache queues set for the first data fragments corresponding to each sub-query instruction, and instructions in different instruction cache queues are executed in parallel;
S112, in the case that the instruction currently to be executed in the instruction cache queue corresponding to each sub-query instruction is that sub-query instruction, extracting each sub-query instruction from its corresponding instruction cache queue, and querying the corresponding first data fragment by using each sub-query instruction, to obtain the sub-query result corresponding to each sub-query instruction.
Wherein the plurality of sub-query results includes sub-query results corresponding to each sub-query instruction.
In order to perform parallel query on multiple sub-query instructions, after splitting a service query instruction into multiple sub-query instructions, each sub-query instruction needs to be written into a corresponding instruction cache queue (instructions in different instruction cache queues are executed in parallel) respectively, so that subsequent concurrent control of sub-query statements is facilitated.
In some embodiments, a Java client using an Influxdb is connected to an Influxdb database, a Java concurrent programming mechanism is used for initiating parallel query, each sub-query instruction is extracted from an instruction cache queue corresponding to each sub-query instruction, and a first data fragment corresponding to each sub-query instruction is queried to obtain a sub-query result corresponding to each sub-query instruction.
Here, Java's thread pool is used to manage the threads of concurrent queries, so as to increase the efficiency of concurrent queries.
According to the embodiment provided by the application, the plurality of sub-query instructions are respectively placed in the cache queue, so that the parallel execution of the plurality of sub-query instructions is conveniently carried out through the cache queue, and the concurrency quantity of data query is improved.
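Steps S111 and S112, together with the final merge, can be sketched with one queue per data fragment and a Java thread pool. The engine argument is a hypothetical stand-in for the instruction execution engine (here any function from a sub-query statement to a result string), and merging by fragment order is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Function;

// Hypothetical sketch of per-fragment queues drained in parallel.
class ParallelShardQuerySketch {

    // subQueriesByShard maps a fragment identifier to its sub-query statement.
    static List<String> runAll(Map<Integer, String> subQueriesByShard,
                               Function<String, String> engine) {
        // One instruction cache queue per data fragment.
        Map<Integer, Queue<String>> queues = new LinkedHashMap<>();
        for (Map.Entry<Integer, String> e : subQueriesByShard.entrySet()) {
            queues.computeIfAbsent(e.getKey(), k -> new ConcurrentLinkedQueue<>()).add(e.getValue());
        }

        ExecutorService pool = Executors.newFixedThreadPool(Math.max(1, queues.size()));
        try {
            // Different queues are drained by different pool threads in parallel.
            List<Future<List<String>>> futures = new ArrayList<>();
            for (Queue<String> queue : queues.values()) {
                futures.add(pool.submit(() -> {
                    List<String> results = new ArrayList<>();
                    String sql;
                    while ((sql = queue.poll()) != null) {
                        results.add(engine.apply(sql));
                    }
                    return results;
                }));
            }
            // Merge sub-query results in the order the fragments were supplied.
            List<String> merged = new ArrayList<>();
            for (Future<List<String>> f : futures) {
                merged.addAll(f.get());
            }
            return merged;
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException("sub-query failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Collecting the futures in submission order keeps the merged result in time order when the fragments are supplied in time order, regardless of which sub-query finishes first.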
In one exemplary embodiment, referring to FIG. 4, each data slice in the designated database corresponds to one designated acquisition frequency of a set of designated acquisition frequencies, and the time period corresponding to each data slice in the designated database matches the designated acquisition frequency corresponding to each data slice in the designated database; the method further comprises the following steps:
S121, acquiring a data writing request, wherein the data writing request is used for requesting writing the acquired target service data into the specified database;
S122, in the case that it is determined, according to the target acquisition frequency and the current time of the target service data, that no data fragment corresponding to the target service data exists in the specified database, writing the target service data into a second data fragment created for the target service data in the specified database.
The designated acquisition frequency corresponding to the second data slice is the designated acquisition frequency closest to the target acquisition frequency in the set of designated acquisition frequencies.
In some embodiments, each data slice in the designated database corresponds to one designated acquisition frequency of a set of designated acquisition frequencies, e.g., second level, minute level, hour level, day level, etc., where the data acquisition interval within each acquisition frequency may be in seconds. For example, at an hour-level acquisition frequency, the service data is collected every second (i.e., at 1-second intervals).
When the collected target service data is to be written into the designated database (a time-series database) and no data slice corresponding to the target service data exists in the designated database (data table), a second data slice is created according to the target acquisition frequency of the target service data and the current time, and the collected target service data is written into that second data slice.
In some embodiments, whether the corresponding data slice exists in the designated database is determined according to the identifier of the designated database to be written to, which is carried in the write data request, and the data table identifier in that database.
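The write path of steps S121 and S122 can be sketched as follows. This is a minimal illustration, not the patented implementation: the interval values in `DESIGNATED_INTERVALS`, the shard-key format, and the in-memory `shards` dictionary standing in for the designated database are all assumptions made for the example.

```python
from datetime import datetime

# Assumed set of designated acquisition frequencies, as intervals in seconds.
DESIGNATED_INTERVALS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}

def nearest_designated(interval_s: float) -> str:
    """Return the designated frequency whose interval is closest to interval_s."""
    return min(DESIGNATED_INTERVALS,
               key=lambda name: abs(DESIGNATED_INTERVALS[name] - interval_s))

def shard_key(freq: str, now: datetime) -> str:
    """Hypothetical slice label: one slice per hour, or per day for day-level data."""
    fmt = "%Y-%m-%d" if freq == "day" else "%Y-%m-%d-%H"
    return f"{freq}:{now.strftime(fmt)}"

def write(shards: dict, interval_s: float, now: datetime, point) -> bool:
    """Write point into its slice; return True if the slice had to be created."""
    key = shard_key(nearest_designated(interval_s), now)
    created = key not in shards          # S122: no matching slice exists yet
    shards.setdefault(key, []).append(point)
    return created
```

A second write within the same hour reuses the existing slice instead of creating a new one; the duplicate-data check of the later embodiment is left out of this sketch.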
In one exemplary embodiment, after obtaining the write data request, the method further includes:
S131, in a case where it is determined, according to the target acquisition frequency of the target service data and the current write time, that a third data slice corresponding to the target service data exists in the designated database and that the target service data does not exist in the third data slice, writing the target service data into the third data slice.
The designated acquisition frequency corresponding to the third data slice is the designated acquisition frequency closest to the target acquisition frequency in the set of designated acquisition frequencies.
In some embodiments, whether the corresponding data slice exists in the designated database is determined according to the identifier of the designated database to be written to, which is carried in the write data request, and the data table identifier in that database.
For example, referring to Table 3, when the designated database is DatabaseX, the data table is TableX, and the current write time is 2021-11-26-02:10:10, the data slice corresponding to unique identifier 2 (i.e., the third data slice, with time period 2021-11-26 02:10 to 2021-11-26-02:59) already exists, so no change is made to the data table.
Table 3
Through the embodiment provided in the present application, when a data slice corresponding to the target service data to be written already exists and the target service data is not yet present in it, the target service data is written directly into that data slice without creating a new data slice, which keeps the database (data table) simple.
In one exemplary embodiment, the second data shard is a data shard contained in a second data table in the specified database; after writing the target service data into the second data fragment generated for the target service data in the specified database, the method further comprises:
S141, adding target data slice information corresponding to the second data slice to the relational database.
The target data slicing information is used for indicating the corresponding relation among the appointed database, the second data table, the second data slicing and the time period corresponding to the second data slicing.
For example, referring to Table 4, when the designated database is DatabaseX, the data table is TableX, and the current write time is 2021-11-26-03:10:10, a record with unique identifier 4 is added to the data table of the relational database.
Table 4
Here, the target data slice information with unique identifier 4 indicates the correspondence among the designated database (i.e., DatabaseX), the second data table (TableX), the second data slice located in the second data table of the designated database (the slice identifier in the TableX data table of the DatabaseX database is 3), and the time period corresponding to the second data slice (2021-11-26-03:00 to 2021-11-26-03:59).
In some embodiments, referring to FIG. 5, in response to a user-initiated write data request, the Telegraf output plug-in (Telegraf OutputPlugin) queries the relational database to check whether the data to be written already exists. If it exists, a notification that the data to be written exists is returned (processing is complete). If it does not exist, the slicing logic is initiated and a data slice is generated; the data is written into the InfluxDB database through the generated data slice, the slice information is recorded in the relational database, and a processing-complete notification is then returned to the user.
Through the embodiment provided in the present application, after a data slice is generated in the designated database, a record with a unique identifier is generated in the relational database to record the correspondence of the target data slice in the designated database (as shown in Table 4).
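Step S141 can be illustrated with a small relational table. The sketch below uses an in-memory SQLite database purely as a stand-in for the relational database; the table name `shard_info` and its column names are hypothetical and are not taken from the original tables.

```python
import sqlite3

# In-memory SQLite stands in for the relational database (an assumption).
conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE shard_info (
           id INTEGER PRIMARY KEY,
           db_name TEXT, table_name TEXT, shard_id INTEGER,
           period_start TEXT, period_end TEXT,
           UNIQUE (db_name, table_name, shard_id))"""
)

def register_shard(db_name, table_name, shard_id, start, end):
    """Record the database / table / slice / time-period correspondence (S141)."""
    cur = conn.execute(
        "INSERT INTO shard_info "
        "(db_name, table_name, shard_id, period_start, period_end) "
        "VALUES (?, ?, ?, ?, ?)",
        (db_name, table_name, shard_id, start, end),
    )
    conn.commit()
    return cur.lastrowid  # unique identifier of the new record

row_id = register_shard("DatabaseX", "TableX", 3,
                        "2021-11-26-03:00", "2021-11-26-03:59")
```

The uniqueness constraint makes a second registration of the same slice fail, which is one simple way to keep the mapping consistent.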
In one exemplary embodiment, after obtaining the write data request, the method further includes:
S151, determining, according to a preset correspondence between acquisition frequencies and data retention durations, a target data retention duration corresponding to the target acquisition frequency of the target service data.
The target data retention duration is the length of time for which the target service data is allowed to be retained after being persisted in the designated database;
S152, setting a data retention time for the target service data according to the current time and the target data retention duration.
The data retention time is the time for which the target service data is allowed to be retained in the designated database.
In some embodiments, before the target service data is written into the designated database, the target data retention duration (retention policy) is defined according to data characteristics (e.g., acquisition frequency or expected retention time), and the relationship among the database identifier, the data table identifier, the slice identifier, and the time period of the data slice corresponding to the target service data is maintained in the relational database.
Here, the target data retention duration (retention policy) may be the length of time for which data is kept after persistence. For example, with a retention duration of 10 days, on day 11 the database system automatically reclaims the data whose 10-day retention has expired (e.g., by direct physical deletion). In an InfluxDB database, for example, this may be implemented with the statement: CREATE RETENTION POLICY "policy name" ON "database name_table name" DURATION <duration> REPLICATION <number of copies>.
Through the embodiment provided in the present application, after the write data request is acquired, the target retention duration of the data is defined (and recorded in the relational database), which prevents the database from being overloaded by excessive data.
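A minimal sketch of steps S151 and S152 follows. Only the 10-day figure appears in the text; the other values in `RETENTION_DAYS`, the policy name, and the helper names are assumptions chosen for illustration. The generated statement follows the InfluxQL `CREATE RETENTION POLICY ... ON ... DURATION ... REPLICATION ...` form.

```python
from datetime import datetime, timedelta

# Assumed correspondence between acquisition frequency and retention duration (days).
RETENTION_DAYS = {"second": 10, "minute": 30, "hour": 90, "day": 365}

def retention_deadline(freq: str, now: datetime) -> datetime:
    """S151 + S152: look up the retention duration and derive the retention time."""
    return now + timedelta(days=RETENTION_DAYS[freq])

def retention_policy_statement(policy: str, database: str, freq: str,
                               replication: int = 1) -> str:
    """Build a CREATE RETENTION POLICY statement for the given frequency."""
    return (f'CREATE RETENTION POLICY "{policy}" ON "{database}" '
            f"DURATION {RETENTION_DAYS[freq]}d REPLICATION {replication}")
```

Keeping the frequency-to-duration mapping in one table means a new acquisition frequency only needs one new entry rather than new code.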
In one exemplary embodiment, acquiring the collected service data and writing the service data into the data slices in the designated database are performed by a designated collection component; before the write data request is acquired, the method further comprises:
S161, dynamically loading an output plug-in according to the output mode indicated by the deployment script of the designated collection component, to obtain a target output plug-in.
The target output plug-in is a packaged binary executable file, and a data judging code for judging whether the business data exist in the appointed database and a fragment generating code for generating data fragments in the appointed database are written in the binary executable file.
In some embodiments, when the database is implemented as an InfluxDB database, the designated collection component may be Telegraf. In this embodiment, secondary development is performed on the InfluxDB Output plug-in native to Telegraf (the collection component) to add logic such as data judgment and slice generation; the secondarily developed code is then packaged into a binary executable file; finally, the Telegraf deployment script is modified to configure the output mode (for example, /bin/telegraf -output-filter myInfluxdbOutput), and the original output plug-in is replaced through Telegraf's own dynamic loading mechanism.
According to the embodiment provided by the application, the data judging code (used for judging whether the specified data exists in the specified database) and the slicing generation code (used for generating the data slicing in the specified database) are added in the acquisition component of the database, so that convenience in data acquisition is improved.
In one exemplary embodiment, the designated database is a time-series database, the time-series database stores service data in units of a database, and a time period to which each database in the time-series database allows storing service data is longer than a time period to which each data fragment in the designated database allows storing the same service data; and/or the service query instruction is a structured query language SQL query statement; and/or splitting the business query instruction into a plurality of sub-query instructions is performed by an instruction splitting component for performing query instruction splitting; and/or, the parallel querying of the corresponding first data fragment using each sub-query instruction is performed by an SQL execution engine.
Referring to FIG. 6, the embodiment of the present application mainly includes three components: a Retention Policies dynamic definition component, an InfluxDB parsing ORM component, and an InfluxDB concurrent query control component.
The Retention Policies dynamic definition component is mainly used for defining the data write policy and maintaining the relationship between sliced data and time periods. For example, before data is written into InfluxDB, the data retention policy is defined according to the characteristics of the data (such as second-level, hour-level, or day-level data); the relationship among the InfluxDB database, table, slice, and time period is maintained in a relational database; and the Telegraf OutputPlugin is extended.
The Retention Policies dynamic definition component includes a data retention policy. The data collected by Telegraf is stored in InfluxDB, the retention policy of the data is planned according to the collection interval, and the data slicing is planned according to the grading policy.
In some embodiments, the grading policy includes grading by hour and grading by day. For example, grading by hour generates one slice identifier per hour, and grading by day generates one slice identifier per day (the collected data being second-level data), as shown in Table 5:
TABLE 5
After the data slicing is completed, the native InfluxDB interface is called to create the slice information in InfluxDB, and the corresponding data is maintained in the relational database at the same time. This solves the problem that Telegraf cannot specify the data slice in which data is stored (Telegraf can only write data according to a single policy, and the time slices it writes are in units of weeks, which affects the throughput of service queries): Telegraf can now write into different slices according to a preset write policy, reducing the read/write load on any single slice. This also provides the basis for the subsequent implementation of InfluxDB parallel queries, and decoupling from Telegraf and InfluxDB is achieved through the Telegraf OutputPlugin.
Referring to FIG. 6, after the Telegraf plug-in performs periodic collection, the collected data is written to the Telegraf output plug-in (Telegraf OutputPlugin) via the Retention Policies dynamic definition component. Whether a grading policy has already been created is determined according to the database name and data table name of the collected data. If it has been created, the process ends. If it has not been created, the grading policy is determined according to the collection interval (for example, the hour unit or the day unit, where the hour unit covers the second level, minute level, and hour level, and the day unit covers the day level and other day-based levels). After the grading policy corresponding to the collected data to be written is determined, the policy is created in InfluxDB and the corresponding logical relationship is maintained in the relational database.
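The policy check described above can be sketched as a small lookup-or-create routine. The one-day threshold in `grading_unit` and the `policies` dictionary (standing in for both InfluxDB and the relational database) are assumptions for illustration only.

```python
def grading_unit(interval_s: float) -> str:
    """Hour unit for second/minute/hour-level data, day unit for day-level and coarser.

    The 86400-second threshold is an assumption of this sketch.
    """
    return "day" if interval_s >= 86400 else "hour"

def ensure_policy(policies: dict, db: str, table: str, interval_s: float):
    """Return (unit, created) for (db, table); create the policy only if missing."""
    key = (db, table)
    if key in policies:                 # policy already created: nothing to do
        return policies[key], False
    policies[key] = grading_unit(interval_s)
    return policies[key], True
```

Keying the lookup on the database name and data table name mirrors the check in the flow of FIG. 6.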
The InfluxDB parsing ORM component is mainly used for reading from InfluxDB through a programming language (for example, the Java language), and includes an Aspect interceptor, an SQL splitting component (used for automatically splitting a single SQL statement into a plurality of SQL statements), and an SQL execution engine.
The InfluxDB concurrent query control component is mainly used for concurrent query control and data aggregation over the plurality of SQL statements, and includes a concurrent execution policy component, a concurrent subtask coordination component, and a data merging component. The whole implementation is based on the creation and destruction of a Java thread pool, barrier-based scheduling of thread subtasks, and data callbacks.
The embodiment of the present application provides a device that uses parallel queries to solve the problem of slow query response over massive data on a single InfluxDB node. It improves query efficiency and applies the idea of data slicing to InfluxDB concurrent queries, while the whole implementation is decoupled from the actual service-layer code and does not affect upgrades of Telegraf and InfluxDB. The scheme is integrated into the product as a plug-in, does not intrude on the original service, and reduces the impact on Telegraf and InfluxDB. Based on this scheme, the query efficiency over massive data on a single InfluxDB node is improved, maintenance cost and technical risk are reduced, and the competitiveness of the AI platform among similar products is improved.
Referring to FIG. 6, after an initiated query service is received, the Aspect interceptor of the InfluxDB parsing ORM component intercepts, in the service-layer code, all services that involve reading from InfluxDB. The SQL splitting component splits the statement first by data slice (shard) and then by time range, yielding a list of split SQL statements. The list is passed to the concurrent execution policy component of the InfluxDB concurrent query control component. The SQL execution engine in the InfluxDB parsing ORM component actually invokes and executes the split SQL statements, i.e., queries data according to the time range of each data slice. The concurrent subtask coordination component in the InfluxDB concurrent query control component sets the subtask waiting time and the timeout handling policy, and the data merging component completes the deployment of the service data conversion file. Finally, the service query data is returned (after interception).
Here, the InfluxDB Retention Policies dynamic definition (component) is used to: define a data slicing structure that specifies the correspondence among the database, the data table name, and the time period, and complete its initialization in the relational database; define the correspondence between the collection frequency and the slicing grading policy (for example, the second level, minute level, and hour level use the hour unit, and the day level uses the day unit); and add new logic code (such as data judgment and slice generation) to the collection component (Telegraf), package the secondarily developed code into a binary executable file, modify the Telegraf deployment script to configure the output mode, and restart the collection component.
The InfluxDB parsing ORM component is used for deploying the Aspect interceptor and the SQL execution engine: the Aspect interceptor information is embedded into the core service code that uses InfluxDB, the interceptable service requests are defined according to file paths, and the component that automatically splits a single SQL statement into a plurality of SQL statements is deployed; the SQL execution engine is deployed, and the service data conversion file in the InfluxDB concurrent query control is dynamically loaded and started.
The InfluxDB concurrent query control (component) includes an SQL statement concurrent execution policy component, a concurrent subtask coordination component, and a data merging component. The SQL statement concurrent execution policy component is used for defining the thread pool, for example a core pool size of 50, a maximum pool size of 500, a request queue of 1000, and a re-queuing policy. The concurrent subtask coordination component is used for defining the subtask waiting timeout (e.g., 30 seconds) and the handling policy after a timeout (e.g., retry three times; if the query still fails, it is reported as failed). The data merging component is used for completing the deployment of the service data conversion file.
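The concurrent execution, subtask coordination, and merging roles can be sketched with a thread pool. The text describes a Java thread pool; the Python version below is only an analogy, and Python's `ThreadPoolExecutor` has a single `max_workers` setting rather than separate core size, maximum size, and queue length. The `execute` callable is a hypothetical stand-in for the SQL execution engine, and "retry three times" is interpreted here as at most three attempts.

```python
from concurrent.futures import ThreadPoolExecutor

def run_with_retry(execute, sql, attempts=3):
    """Run one sub-query; retry on failure and re-raise after the last attempt."""
    for attempt in range(attempts):
        try:
            return execute(sql)
        except Exception:
            if attempt == attempts - 1:
                raise

def query_concurrently(execute, statements, max_workers=50, timeout_s=30):
    """Run all sub-queries in parallel and merge their rows in statement order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_with_retry, execute, sql) for sql in statements]
        merged = []
        for future in futures:
            # Bounds how long we wait for each result; it does not cancel the thread.
            merged.extend(future.result(timeout=timeout_s))
    return merged
```

Collecting results in submission order keeps the merged output sorted by time period when the sub-queries are listed in slice order.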
Through the embodiment provided in the present application, data query and data aggregation are completed without intruding on the service code, the splitting of SQL query statements is completed while the query results are still returned quickly, and the scheme is decoupled from the InfluxDB version, which reduces the likelihood of service function problems caused by InfluxDB version upgrades.
It should be noted that, for simplicity of description, the foregoing method embodiments are all expressed as a series of action combinations, but it should be understood by those skilled in the art that the present application is not limited by the order of actions described, as some steps may be performed in other order or simultaneously in accordance with the present application. Further, those skilled in the art will also appreciate that the embodiments described in the specification are all preferred embodiments, and that the acts and modules referred to are not necessarily required in the present application.
According to another aspect of the embodiments of the present application, a data processing device for a database is further provided. The device is used to implement the data processing method for a database provided in the foregoing embodiments, and what has already been described is not repeated here. As used below, the term "module" may be a combination of software and/or hardware that implements a predetermined function. While the device described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.
FIG. 7 is a block diagram of an alternative database data processing apparatus according to an embodiment of the present application, where, as shown in FIG. 7, the apparatus includes:
a determining unit 702, configured to determine, in response to the obtained service query instruction, a plurality of first data slices in which service data queried by the service query instruction in a specified database is stored, where the service data in the specified database is stored in corresponding data slices according to a time period to which the service data belongs, each data slice in the specified database is used to store service data in a time period, and a time range covered by the plurality of first data slices includes a specified time period to which the service data queried by the service query instruction belongs;
a splitting unit 704, configured to split the service query instruction into a plurality of sub-query instructions according to a time period corresponding to each of the plurality of first data slices, where the sub-query instructions in the plurality of sub-query instructions are in one-to-one correspondence with the first data slices in the plurality of first data slices;
a using unit 706, configured to query the corresponding first data fragment in parallel using each sub-query instruction in the plurality of sub-query instructions to obtain a plurality of sub-query results, where each sub-query result in the plurality of sub-query results includes service data queried from the corresponding first data fragment;
a merging unit 708, configured to merge the plurality of sub-query results into a target query result, and send the target query result to a target device, where the target device is the device that sent the service query instruction.
It should be noted that the determining unit 702 in this embodiment may be used to perform step S202, the splitting unit 704 may be used to perform step S204, the using unit 706 may be used to perform step S206, and the merging unit 708 may be used to perform step S208.
According to this embodiment of the present application, in response to an acquired service query instruction, a plurality of first data fragments storing the service data queried by the service query instruction in the specified database are determined, wherein the service data in the specified database is stored in corresponding data fragments according to the time periods to which the service data belongs, each data fragment in the specified database is used for storing service data within one time period, and the time range covered by the plurality of first data fragments includes the specified time period to which the queried service data belongs; the service query instruction is split into a plurality of sub-query instructions according to the time period corresponding to each of the first data fragments, the sub-query instructions being in one-to-one correspondence with the first data fragments; the corresponding first data fragments are queried in parallel using the sub-query instructions to obtain a plurality of sub-query results, each of which includes the service data queried from the corresponding first data fragment; and the sub-query results are merged into a target query result, which is sent to the target device, i.e., the device that sent the service query instruction. This improves the query efficiency for service data and solves the technical problem in the related art that database data processing methods respond slowly to data queries in a single-machine setting.
In one exemplary embodiment, the determining unit includes:
a first determining module, configured to determine, in response to the acquired service query instruction, a first data table in the specified database to be queried by the service query instruction and the specified time period;
a second determining module, configured to determine, according to the time period corresponding to each data fragment contained in the first data table and the specified time period, the plurality of first data fragments in which the service data queried by the service query instruction is stored.
In an exemplary embodiment, the data fragments in the first data table sequentially store service data according to the sequence of the time periods, and the second determining module includes:
a first determining sub-module, configured to determine a first target data fragment in the first data table whose corresponding time period contains the start time of the specified time period, and a second target data fragment in the first data table whose corresponding time period contains the end time of the specified time period;
and the second determining submodule is used for determining a plurality of data fragments from the first target data fragment to the second target data fragment in the first data table as a plurality of first data fragments.
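The two sub-modules above amount to locating the fragment containing the start time and the fragment containing the end time, then taking everything in between. A minimal sketch, assuming the fragments are kept as a list of hypothetical `(fragment_id, start, end)` tuples sorted by start time with no gaps:

```python
from bisect import bisect_right
from datetime import datetime

def covering_fragments(fragments, query_start, query_end):
    """Return the fragments from the one containing query_start
    through the one containing query_end.

    fragments: list of (fragment_id, start, end), sorted by start, contiguous.
    """
    starts = [f[1] for f in fragments]
    first = max(bisect_right(starts, query_start) - 1, 0)
    last = max(bisect_right(starts, query_end) - 1, 0)
    return fragments[first:last + 1]
```

Binary search works here because the data fragments store service data in time-period order.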
In one exemplary embodiment, the determining unit includes:
The analysis module is used for responding to the acquired service inquiry instruction and analyzing a group of instruction parameter information from the service inquiry instruction, wherein the group of instruction parameter information comprises a designated database identifier of a designated database, a first data table identifier of a first data table in the designated database and a designated time period identifier of a designated time period;
the usage module is used for querying the relational database by using the appointed database identification, the first data table identification and the appointed time period identification to obtain data slicing identifications of the plurality of first data slicing, wherein the relational database is used for storing the corresponding relation between the data slicing contained in the data table in the database and the time period to which the service data stored by the data slicing belong.
In one exemplary embodiment, the splitting unit includes:
the generation module is used for respectively taking the intersection of the time period corresponding to each first data fragment and the designated time period as the time period to be queried, generating a sub-query instruction corresponding to each first data fragment, and obtaining a plurality of sub-query instructions, wherein the sub-query instruction corresponding to each first data fragment carries the data fragment identification of each first data fragment.
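The intersection step of the generation module can be sketched as follows. The tuple layout, the `SELECT *` statement shape, and the use of same-format RFC 3339 timestamp strings (which compare correctly as plain strings) are assumptions of this example; how the fragment identifier is actually carried in the sub-query instruction is not specified in the text, so it is simply returned alongside each statement.

```python
def split_query(measurement, query_start, query_end, fragments):
    """Return one (fragment_id, sql) pair per fragment.

    Each statement's time range is the intersection of the fragment's
    period and the queried period. fragments: iterable of
    (fragment_id, start, end) with timestamps in one common string format.
    """
    sub_queries = []
    for fragment_id, start, end in fragments:
        lo, hi = max(start, query_start), min(end, query_end)
        if lo > hi:
            continue  # fragment does not overlap the queried period
        sql = (f'SELECT * FROM "{measurement}" '
               f"WHERE time >= '{lo}' AND time <= '{hi}'")
        sub_queries.append((fragment_id, sql))
    return sub_queries
```

Clipping each range to the intersection ensures that the first and last fragments are not asked for data outside the requested period.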
In an exemplary embodiment, the above apparatus further includes:
an intercepting unit, configured to perform instruction interception through an instruction interceptor to obtain the intercepted service query instruction, wherein the instruction interceptor is used for intercepting instructions that read service data stored in the specified database.
In one exemplary embodiment, the instruction interceptor intercepts instructions in at least one of the following ways: interception by file, interception by method name, and interception by a specified file wildcard; the instruction interceptor cuts into the specified service by means of around advice so as to be decoupled from the specified service, and the service data queried by the service query instruction is the service data of the specified service.
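Around advice wraps the original call so that the service code itself stays unchanged. A rough Python analogue is a decorator, sketched below; the `splitter` and `executor` callables, and the simplification that the wrapped service function returns its query string, are assumptions of this example rather than details from the text.

```python
import functools

def intercept_influx_reads(splitter, executor):
    """Around-advice style decorator for a service function that returns a query."""
    def decorator(build_query):
        @functools.wraps(build_query)
        def around(*args, **kwargs):
            query = build_query(*args, **kwargs)   # proceed with the original call
            sub_queries = splitter(query)          # split into per-fragment queries
            parts = [executor(q) for q in sub_queries]
            return [row for part in parts for row in part]  # merged result
        return around
    return decorator

# Hypothetical service function; the splitter and executor are trivial stand-ins.
@intercept_influx_reads(splitter=lambda q: [q + " /* fragment 2 */",
                                            q + " /* fragment 3 */"],
                        executor=lambda q: [q])
def cpu_usage(host):
    return f"SELECT usage FROM cpu WHERE host = '{host}'"
```

The service function contains no splitting or merging logic of its own, which is the decoupling the around-advice approach is meant to provide.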
In one exemplary embodiment, the usage unit includes:
the first use module is used for calling the instruction execution engine in parallel by using each sub-query instruction to obtain a sub-query result which is returned by the instruction execution engine and corresponds to each sub-query instruction, wherein the instruction execution engine is used for searching service data from a specified database by using the obtained query instruction, and the plurality of sub-query results comprise the sub-query result corresponding to each sub-query instruction.
In an exemplary embodiment, the above apparatus further includes:
the operation unit is used for responding to each obtained sub-query instruction, and respectively executing the following data retrieval operation by using each sub-query instruction as a current sub-query instruction through the instruction execution engine, wherein a first data fragment corresponding to the current sub-query instruction is a current data fragment, the current data fragment is a data fragment contained in a first data table in a designated database, and a time period to which service data to be queried of the current sub-query instruction belongs is a current time period:
extracting a designated database identifier of a designated database, a first data table identifier of a first data table, a current data fragment identifier of a current data fragment and a current time period identifier of a current time period from a current sub-query instruction;
locating a current disk file in which the service data to be queried currently of the current sub-query instruction is stored according to the specified database identifier, the first data table identifier, the current data fragment identifier and the current time period identifier;
and querying the appointed database by using the appointed database identification, the first data table identification, the file identification of the current disk file and the current time period identification to obtain a current sub-query result corresponding to the current sub-query instruction.
In one exemplary embodiment, the operation unit includes:
the second use module is used for generating target query information according to a hypertext transfer protocol by using the appointed database identifier, the first data table identifier, the file identifier of the current disk file and the current time period identifier;
and the calling module is used for calling the hypertext transfer protocol interface of the appointed database, transmitting the target query information into the appointed database, and obtaining the current sub-query result returned by the appointed database, wherein the current sub-query result is JSON data.
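As a hedged illustration of the second use module, the sketch below builds a request URL for an InfluxDB 1.x style `/query` endpoint, which accepts the database name as `db` and the statement as `q`. The base URL is hypothetical, and the file identifier of the current disk file mentioned in the text is omitted because how it is passed over HTTP is not described.

```python
from urllib.parse import urlencode

def build_query_url(base_url: str, database: str, sql: str) -> str:
    """Build a GET URL for an InfluxDB 1.x style /query endpoint."""
    return f"{base_url}/query?" + urlencode({"db": database, "q": sql})

url = build_query_url("http://localhost:8086", "DatabaseX",
                      'SELECT * FROM "TableX"')
```

Using `urlencode` keeps quotes and spaces in the statement safely escaped in the query string.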
In an exemplary embodiment, the above apparatus further includes:
a first conversion unit, configured to convert the current sub-query result into target JSON data to obtain the converted current sub-query result, wherein the target JSON data is JSON data that is allowed to be displayed on a page.
In an exemplary embodiment, the above apparatus further includes:
and the second conversion unit is used for respectively executing data conversion operation on each sub-query result before combining the plurality of sub-query results into the target query result to obtain each converted sub-query result, wherein the data conversion operation is a conversion operation for converting service data stored in a specified database into page data which is allowed to be displayed through a page.
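One possible form of the data conversion operation is flattening the JSON returned by the database into row objects that a page can render directly. The sketch assumes the InfluxDB 1.x response layout (`results`, `series`, `columns`, `values`); the output shape and the added `measurement` key are assumptions for illustration.

```python
import json

def to_page_rows(raw_json: str):
    """Flatten an InfluxDB 1.x style query response into a list of row dicts."""
    rows = []
    for result in json.loads(raw_json).get("results", []):
        for series in result.get("series", []):
            for values in series.get("values", []):
                row = dict(zip(series["columns"], values))
                row["measurement"] = series.get("name")
                rows.append(row)
    return rows
```

Because every sub-query result is converted to the same row shape, merging the results afterwards reduces to concatenating lists.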
In one exemplary embodiment, the usage unit includes:
the writing module is used for writing each sub-query instruction into an instruction cache queue corresponding to each sub-query instruction, wherein the instruction cache queue corresponding to each sub-query instruction is a cache queue set for the first data fragment corresponding to each sub-query instruction, and the instructions in different instruction cache queues are executed in parallel;
an extraction module, configured to, in a case where the instruction currently to be executed in the instruction cache queue corresponding to each sub-query instruction is that sub-query instruction, extract that sub-query instruction from its instruction cache queue and query the corresponding first data fragment with it to obtain a sub-query result corresponding to that sub-query instruction, wherein the plurality of sub-query results include the sub-query result corresponding to each sub-query instruction.
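The writing and extraction modules can be sketched with one FIFO queue and one worker thread per data fragment, so that different queues are drained in parallel while instructions within a queue run in order. The `execute` callable is a hypothetical stand-in for querying a fragment.

```python
import queue
import threading

def run_per_fragment_queues(sub_queries, execute):
    """sub_queries: list of (fragment_id, sql). Returns {fragment_id: rows}."""
    queues = {}
    for fragment_id, sql in sub_queries:
        # Writing module: enqueue into the cache queue of the target fragment.
        queues.setdefault(fragment_id, queue.Queue()).put(sql)

    results, lock = {}, threading.Lock()

    def drain(fragment_id, q):
        # Extraction module: take the instruction at the head of the queue.
        while True:
            try:
                sql = q.get_nowait()
            except queue.Empty:
                return
            rows = execute(sql)
            with lock:
                results.setdefault(fragment_id, []).extend(rows)

    workers = [threading.Thread(target=drain, args=item) for item in queues.items()]
    for worker in workers:
        worker.start()
    for worker in workers:
        worker.join()
    return results
```

One queue per fragment serializes access to each fragment while still letting different fragments be read at the same time.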
In one exemplary embodiment, each data slice in the specified database corresponds to one of a set of specified acquisition frequencies, a time period corresponding to each data slice in the specified database matches the specified acquisition frequency corresponding to each data slice in the specified database; the device further comprises:
The acquisition unit is used for acquiring a data writing request, wherein the data writing request is used for requesting to write the acquired target service data into the appointed database;
the first writing unit is used for writing the target service data into a second data fragment created for the target service data in the appointed database under the condition that the data fragment corresponding to the target service data does not exist in the appointed database according to the target acquisition frequency and the current time of the target service data, wherein the appointed acquisition frequency corresponding to the second data fragment is the appointed acquisition frequency closest to the target acquisition frequency in a group of appointed acquisition frequencies.
In an exemplary embodiment, the above apparatus further includes:
and the second writing unit is used for writing the target service data into the third data fragment when the third data fragment corresponding to the target service data exists in the specified database and the target service data does not exist in the third data fragment according to the target acquisition frequency and the current writing time of the target service data after the writing data request is acquired, wherein the specified acquisition frequency corresponding to the third data fragment is the specified acquisition frequency closest to the target acquisition frequency in a group of specified acquisition frequencies.
In one exemplary embodiment, the second data fragment is a data fragment contained in a second data table in the specified database; the apparatus further includes:
an adding unit, configured to, after the target service data is written into the second data fragment created for the target service data in the specified database, add target data fragment information corresponding to the second data fragment to a relational database, where the target data fragment information indicates the correspondence among the specified database, the second data table, the second data fragment, and the time period corresponding to the second data fragment.
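As an illustration of the fragment information kept in the relational database, the following sketch uses an in-memory SQLite database. The table name `shard_info`, its columns, and the helper functions are assumptions made for the example; the application does not specify a schema. Time periods are treated as half-open intervals `[period_start, period_end)`.

```python
import sqlite3


def open_registry():
    """Create an in-memory registry of fragment metadata (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE shard_info ("
        "db_name TEXT, table_name TEXT, shard_id TEXT, "
        "period_start INTEGER, period_end INTEGER, "
        "PRIMARY KEY (db_name, table_name, shard_id))"
    )
    return conn


def register_shard(conn, db_name, table_name, shard_id, period_start, period_end):
    """Record the database / table / fragment / time-period correspondence."""
    conn.execute(
        "INSERT OR IGNORE INTO shard_info VALUES (?, ?, ?, ?, ?)",
        (db_name, table_name, shard_id, period_start, period_end),
    )
    conn.commit()


def find_shards(conn, db_name, table_name, query_start, query_end):
    """Return fragment ids whose period overlaps [query_start, query_end)."""
    rows = conn.execute(
        "SELECT shard_id FROM shard_info "
        "WHERE db_name = ? AND table_name = ? "
        "AND period_start < ? AND period_end > ? "
        "ORDER BY period_start",
        (db_name, table_name, query_end, query_start),
    ).fetchall()
    return [row[0] for row in rows]
```

The lookup in `find_shards` corresponds to the query-side use of the same records, where the database identifier, data table identifier, and time period are used to obtain fragment identifiers.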
In an exemplary embodiment, the above apparatus further includes:
a retention unit, configured to, after the data write request is acquired, determine the target data retention duration corresponding to the target acquisition frequency of the target service data according to a preset correspondence between acquisition frequencies and data retention durations, where the target data retention duration is the length of time for which the target service data is allowed to be retained after being persisted in the specified database;
a setting unit, configured to set a data retention time for the target service data according to the current time and the target data retention duration, where the data retention time is the time for which the target service data is allowed to be retained in the specified database.
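A minimal sketch of the retention logic follows. It assumes that the data retention time is represented as an expiry timestamp equal to the current time plus the retention duration; this reading, the table `RETENTION_BY_INTERVAL` with its values, and the function names are assumptions for illustration only.

```python
DAY = 86400

# Assumed preset correspondence: collection interval (s) -> retention duration (s).
RETENTION_BY_INTERVAL = {1: 7 * DAY, 60: 90 * DAY, 3600: 365 * DAY}


def retention_deadline(target_interval, now):
    """Return the timestamp up to which the data may be retained."""
    duration = RETENTION_BY_INTERVAL[target_interval]
    return now + duration


def is_expired(deadline, now):
    """True once the current time has reached the retention deadline."""
    return now >= deadline
```

In this sketch, higher-frequency data gets a shorter retention duration, which is a common trade-off but is not stated in the source.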
In an exemplary embodiment, acquiring the collected service data and writing the service data into the data fragments of the specified database are performed by a specified collection component; the apparatus further includes:
a loading unit, configured to, before the data write request is acquired, dynamically load an output plug-in according to the output mode indicated by the deployment script of the specified collection component to obtain a target output plug-in, where the target output plug-in is a packaged binary executable file containing data judgment code for judging whether service data exists in the specified database and fragment generation code for generating data fragments in the specified database.
In one exemplary embodiment, the specified database is a time-series database that stores service data in units of databases, and the time period of service data that each database in the time-series database is allowed to store is longer than the time period of the same kind of service data that each data fragment in the specified database is allowed to store; and/or the service query instruction is a Structured Query Language (SQL) query statement; and/or splitting the service query instruction into the plurality of sub-query instructions is performed by an instruction splitting component used for splitting query instructions; and/or querying the corresponding first data fragment in parallel using each sub-query instruction is performed by an SQL execution engine.
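The overall query path (split the service query by fragment time period, query the fragments in parallel, merge the results) can be sketched as below. This is an illustration, not the instruction splitting component or SQL execution engine of the application: sub-queries are plain dictionaries rather than SQL statements, time periods are half-open intervals, and `split_query`, `run_query`, and `query_shard` are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor


def split_query(shards, query_start, query_end):
    """Build one sub-query per fragment that overlaps the queried period.

    shards: list of (shard_id, period_start, period_end), in time order.
    Each sub-query covers the intersection of the fragment's period and
    the queried period, and carries the fragment identifier.
    """
    sub_queries = []
    for shard_id, period_start, period_end in shards:
        lo = max(period_start, query_start)
        hi = min(period_end, query_end)
        if lo < hi:
            sub_queries.append({"shard_id": shard_id, "start": lo, "end": hi})
    return sub_queries


def run_query(shards, query_start, query_end, query_shard):
    """Run sub-queries in parallel and merge the results in fragment order.

    query_shard is a caller-supplied callable(sub_query) -> list of rows.
    """
    sub_queries = split_query(shards, query_start, query_end)
    if not sub_queries:
        return []
    with ThreadPoolExecutor(max_workers=len(sub_queries)) as pool:
        partial_results = list(pool.map(query_shard, sub_queries))
    merged = []
    for rows in partial_results:
        merged.extend(rows)
    return merged
```

Because `pool.map` returns results in submission order, the merged result keeps the time order of the fragments even though the sub-queries run concurrently.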
According to a further aspect of the embodiments of the present application, there is also provided a computer readable storage medium having stored therein a computer program, wherein the computer program is arranged to perform the steps of any of the method embodiments described above when run.
In one exemplary embodiment, the computer-readable storage medium may include, but is not limited to: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, or any other medium capable of storing a computer program.
According to one aspect of the present application, a computer program product is provided, comprising a computer program/instructions containing program code for performing the method shown in the flowchart. In such an embodiment, referring to Fig. 8, the computer program can be downloaded and installed from a network through the communication section 809 and/or installed from the removable medium 811. When executed by the central processing unit 801, the computer program performs the various functions provided by the embodiments of the present application. The serial numbers of the foregoing embodiments are for description only and do not indicate the relative merits of the embodiments.
Referring to fig. 8, fig. 8 is a block diagram of a computer system of an alternative electronic device according to an embodiment of the present application.
Fig. 8 schematically shows a block diagram of a computer system for implementing an electronic device according to an embodiment of the present application. As shown in fig. 8, the computer system 800 includes a central processing unit 801 (Central Processing Unit, simply referred to as CPU) which can execute various appropriate actions and processes according to a program stored in a Read-Only Memory 802 (ROM) or a program loaded from a storage section 808 into a random access Memory 803 (Random Access Memory, simply referred to as RAM). In the random access memory 803, various programs and data required for system operation are also stored. The central processing unit 801, the read only memory 802, and the random access memory 803 are connected to each other through a bus 804. An Input/Output interface 805 (I/O interface for short) is also connected to the bus 804.
The following components are connected to the input/output interface 805: an input section 806 including a keyboard, a mouse, and the like; an output section 807 including a display, such as a cathode ray tube (CRT) or a liquid crystal display (LCD), and a speaker; a storage section 808 including a hard disk or the like; and a communication section 809 including a network interface card such as a local area network card or a modem. The communication section 809 performs communication processing via a network such as the Internet. A drive 810 is also connected to the input/output interface 805 as needed. A removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 810 as needed, so that a computer program read from it can be installed into the storage section 808 as needed.
In particular, according to embodiments of the present application, the processes described in the various method flowcharts may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method shown in the flowcharts. In such an embodiment, the computer program may be downloaded and installed from a network via the communication section 809, and/or installed from the removable media 811. The computer programs, when executed by the central processor 801, perform the various functions defined in the system of the present application.
It should be noted that, the computer system 800 of the electronic device shown in fig. 8 is only an example, and should not impose any limitation on the functions and the application scope of the embodiments of the present application.
According to a further aspect of embodiments of the present application, there is also provided an electronic device comprising a memory, in which a computer program is stored, and a processor arranged to run the computer program to perform the steps of any of the method embodiments described above.
In an exemplary embodiment, the electronic device may further include a transmission device connected to the processor, and an input/output device connected to the processor.
Specific examples in this embodiment may refer to the examples described in the foregoing embodiments and the exemplary implementation, and this embodiment is not described herein.
It will be appreciated by those skilled in the art that the modules or steps of the embodiments of the present application described above may be implemented by a general-purpose computing device. They may be concentrated on a single computing device or distributed across a network of multiple computing devices. They may be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device. In some cases, the steps shown or described may be performed in an order different from the one given here. Alternatively, the modules or steps may each be fabricated as an individual integrated circuit module, or several of them may be fabricated as a single integrated circuit module. Thus, the embodiments of the present application are not limited to any specific combination of hardware and software.
The foregoing is merely a preferred embodiment of the present application and is not intended to limit the embodiment of the present application, and various modifications and variations may be made to the embodiment of the present application by those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the principles of the embodiments of the present application should be included in the protection scope of the embodiments of the present application.

Claims (22)

1. A data processing method for a database, characterized by comprising:
in response to an acquired service query instruction, determining a plurality of first data fragments in a specified database in which the service data queried by the service query instruction is stored, wherein the service data in the specified database is stored in corresponding data fragments according to the time period to which it belongs, each data fragment in the specified database is used for storing the service data of one time period, and the time range covered by the plurality of first data fragments contains the specified time period to which the service data queried by the service query instruction belongs;
splitting the service query instruction into a plurality of sub-query instructions according to the time period corresponding to each of the plurality of first data fragments, wherein the sub-query instructions in the plurality of sub-query instructions are in one-to-one correspondence with the first data fragments in the plurality of first data fragments;
querying the corresponding first data fragments in parallel using each sub-query instruction in the plurality of sub-query instructions to obtain a plurality of sub-query results, wherein each sub-query result in the plurality of sub-query results comprises the service data queried from the corresponding first data fragment;
merging the plurality of sub-query results into a target query result, and sending the target query result to a target device, wherein the target device is the device that sent the service query instruction.
2. The method of claim 1, wherein the determining, in response to the acquired service query instruction, a plurality of first data fragments in a specified database in which the service data queried by the service query instruction is stored comprises:
determining, in response to the acquired service query instruction, a first data table in the specified database to be queried by the service query instruction and the specified time period;
determining the plurality of first data fragments in which the service data queried by the service query instruction is stored according to the time period corresponding to each data fragment contained in the first data table and the specified time period.
3. The method of claim 2, wherein the data fragments in the first data table store service data sequentially in time-period order;
the determining the plurality of first data fragments in which the service data queried by the service query instruction is stored according to the time period corresponding to each data fragment contained in the first data table and the specified time period comprises:
determining a first target data fragment in the first data table whose corresponding time period contains the start time of the specified time period, and a second target data fragment in the first data table whose corresponding time period contains the end time of the specified time period;
determining the data fragments from the first target data fragment to the second target data fragment in the first data table as the plurality of first data fragments.
4. The method of claim 1, wherein the determining, in response to the acquired service query instruction, a plurality of first data fragments in a specified database in which the service data queried by the service query instruction is stored comprises:
parsing, in response to the acquired service query instruction, a set of instruction parameter information from the service query instruction, wherein the set of instruction parameter information comprises a specified database identifier of the specified database, a first data table identifier of a first data table in the specified database, and a specified time period identifier of the specified time period;
querying a relational database using the specified database identifier, the first data table identifier, and the specified time period identifier to obtain the data fragment identifiers of the plurality of first data fragments, wherein the relational database is used for storing the correspondence between the data fragments contained in the data tables of a database and the time periods to which the service data stored in those data fragments belongs.
5. The method of claim 1, wherein the splitting the service query instruction into a plurality of sub-query instructions according to the time period corresponding to each of the plurality of first data fragments comprises:
taking the intersection of the time period corresponding to each first data fragment and the specified time period as the time period to be queried, and generating the sub-query instruction corresponding to each first data fragment to obtain the plurality of sub-query instructions, wherein the sub-query instruction corresponding to each first data fragment carries the data fragment identifier of that first data fragment.
6. The method of claim 1, wherein before the determining, in response to the acquired service query instruction, a plurality of first data fragments in a specified database in which the service data queried by the service query instruction is stored, the method further comprises:
intercepting instructions through an instruction interceptor to obtain the intercepted service query instruction, wherein the instruction interceptor is used for intercepting instructions that read service data stored in the specified database.
7. The method of claim 6, wherein the instruction interceptor intercepts instructions in at least one of the following interception manners: interception by file, interception by method name, and interception by specified file wildcard; the instruction interceptor is woven into a specified service by means of around advice so as to be decoupled from the specified service, and the service data queried by the service query instruction is the service data of the specified service.
8. The method of claim 1, wherein the querying the corresponding first data fragments in parallel using each sub-query instruction in the plurality of sub-query instructions to obtain a plurality of sub-query results comprises:
invoking an instruction execution engine in parallel using each sub-query instruction to obtain the sub-query result returned by the instruction execution engine for each sub-query instruction, wherein the instruction execution engine is used for retrieving service data from the specified database using the query instruction it receives, and the plurality of sub-query results comprise the sub-query result corresponding to each sub-query instruction.
9. The method of claim 8, wherein after the invoking the instruction execution engine in parallel using each sub-query instruction, the method further comprises:
in response to each acquired sub-query instruction, performing, by the instruction execution engine, the following data retrieval operations with that sub-query instruction as the current sub-query instruction, wherein the first data fragment corresponding to the current sub-query instruction is the current data fragment, the current data fragment is a data fragment contained in a first data table in the specified database, and the time period to which the service data to be queried by the current sub-query instruction belongs is the current time period:
extracting, from the current sub-query instruction, a specified database identifier of the specified database, a first data table identifier of the first data table, a current data fragment identifier of the current data fragment, and a current time period identifier of the current time period;
locating, according to the specified database identifier, the first data table identifier, the current data fragment identifier, and the current time period identifier, the current disk file in which the service data to be queried by the current sub-query instruction is stored;
querying the specified database using the specified database identifier, the first data table identifier, the file identifier of the current disk file, and the current time period identifier to obtain the current sub-query result corresponding to the current sub-query instruction.
10. The method of claim 9, wherein the querying the specified database using the specified database identifier, the first data table identifier, the file identifier of the current disk file, and the current time period identifier to obtain the current sub-query result corresponding to the current sub-query instruction comprises:
generating target query information conforming to the Hypertext Transfer Protocol using the specified database identifier, the first data table identifier, the file identifier of the current disk file, and the current time period identifier;
calling a Hypertext Transfer Protocol interface of the specified database and passing the target query information into the specified database to obtain the current sub-query result returned by the specified database, wherein the current sub-query result is JSON data.
11. The method of claim 10, wherein after the calling the Hypertext Transfer Protocol interface of the specified database and passing the target query information into the specified database to obtain the current sub-query result returned by the specified database, the method further comprises:
converting the current sub-query result into objectified JSON data to obtain the converted current sub-query result, wherein the objectified JSON data is JSON data that is allowed to be displayed on a page.
12. The method of claim 1, wherein before the merging the plurality of sub-query results into a target query result, the method further comprises:
performing a data conversion operation on each sub-query result to obtain each converted sub-query result, wherein the data conversion operation converts service data stored in the specified database into page data that is allowed to be displayed on a page.
13. The method of claim 1, wherein the querying the corresponding first data fragments in parallel using each sub-query instruction in the plurality of sub-query instructions to obtain a plurality of sub-query results comprises:
writing each sub-query instruction into the instruction cache queue corresponding to that sub-query instruction, wherein the instruction cache queue corresponding to each sub-query instruction is a cache queue set up for the first data fragment corresponding to that sub-query instruction, and instructions in different instruction cache queues are executed in parallel;
when the instruction currently to be executed in the instruction cache queue corresponding to each sub-query instruction is that sub-query instruction, extracting the sub-query instruction from its instruction cache queue and querying the corresponding first data fragment with it to obtain the sub-query result corresponding to that sub-query instruction, wherein the plurality of sub-query results comprise the sub-query result corresponding to each sub-query instruction.
14. The method of claim 1, wherein each data fragment in the specified database corresponds to one of a set of specified acquisition frequencies, and the time period corresponding to each data fragment in the specified database matches the specified acquisition frequency corresponding to that data fragment;
the method further comprises:
acquiring a data write request, wherein the data write request is used for requesting that collected target service data be written into the specified database;
when it is determined according to the target acquisition frequency of the target service data and the current time that no data fragment corresponding to the target service data exists in the specified database, writing the target service data into a second data fragment created for the target service data in the specified database, wherein the specified acquisition frequency corresponding to the second data fragment is the specified acquisition frequency closest to the target acquisition frequency in the set of specified acquisition frequencies.
15. The method of claim 14, wherein after the acquiring the data write request, the method further comprises:
when it is determined according to the target acquisition frequency of the target service data and the current write time that a third data fragment corresponding to the target service data exists in the specified database and the target service data does not yet exist in the third data fragment, writing the target service data into the third data fragment, wherein the specified acquisition frequency corresponding to the third data fragment is the specified acquisition frequency closest to the target acquisition frequency in the set of specified acquisition frequencies.
16. The method of claim 14, wherein the second data fragment is a data fragment contained in a second data table in the specified database; after the writing the target service data into the second data fragment created for the target service data in the specified database, the method further comprises:
adding target data fragment information corresponding to the second data fragment to a relational database, wherein the target data fragment information indicates the correspondence among the specified database, the second data table, the second data fragment, and the time period corresponding to the second data fragment.
17. The method of claim 14, wherein after the acquiring the data write request, the method further comprises:
determining the target data retention duration corresponding to the target acquisition frequency of the target service data according to a preset correspondence between acquisition frequencies and data retention durations, wherein the target data retention duration is the length of time for which the target service data is allowed to be retained after being persisted in the specified database;
setting a data retention time for the target service data according to the current time and the target data retention duration, wherein the data retention time is the time for which the target service data is allowed to be retained in the specified database.
18. The method of claim 14, wherein acquiring the collected service data and writing the service data into the data fragments of the specified database are performed by a specified collection component; before the acquiring the data write request, the method further comprises:
dynamically loading an output plug-in according to the output mode indicated by the deployment script of the specified collection component to obtain a target output plug-in, wherein the target output plug-in is a packaged binary executable file containing data judgment code for judging whether service data exists in the specified database and fragment generation code for generating data fragments in the specified database.
19. The method according to any one of claims 1 to 18, wherein the specified database is a time-series database that stores service data in units of databases, and the time period of service data that each database in the time-series database is allowed to store is longer than the time period of the same kind of service data that each data fragment in the specified database is allowed to store; and/or the service query instruction is a Structured Query Language (SQL) query statement; and/or splitting the service query instruction into the plurality of sub-query instructions is performed by an instruction splitting component used for splitting query instructions; and/or querying the corresponding first data fragment in parallel using each sub-query instruction is performed by an SQL execution engine.
20. A data processing apparatus for a database, characterized by comprising:
a determining unit, configured to determine, in response to an acquired service query instruction, a plurality of first data fragments in a specified database in which the service data queried by the service query instruction is stored, wherein the service data in the specified database is stored in corresponding data fragments according to the time period to which it belongs, each data fragment in the specified database is used for storing the service data of one time period, and the time range covered by the plurality of first data fragments contains the specified time period to which the service data queried by the service query instruction belongs;
a splitting unit, configured to split the service query instruction into a plurality of sub-query instructions according to the time period corresponding to each of the plurality of first data fragments, wherein the sub-query instructions in the plurality of sub-query instructions are in one-to-one correspondence with the first data fragments in the plurality of first data fragments;
a using unit, configured to query the corresponding first data fragments in parallel using each sub-query instruction in the plurality of sub-query instructions to obtain a plurality of sub-query results, wherein each sub-query result in the plurality of sub-query results comprises the service data queried from the corresponding first data fragment;
a merging unit, configured to merge the plurality of sub-query results into a target query result and send the target query result to a target device, wherein the target device is the device that sent the service query instruction.
21. A computer-readable storage medium, characterized in that
a computer program is stored in the computer-readable storage medium, wherein the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 19.
22. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that
the processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 19.
CN202311655917.0A 2023-12-05 2023-12-05 Database data processing method and device, storage medium and electronic equipment Active CN117349323B (en)


Publications (2)

Publication Number Publication Date
CN117349323A true CN117349323A (en) 2024-01-05
CN117349323B CN117349323B (en) 2024-02-27

Family

ID=89367038



Also Published As

Publication number Publication date
CN117349323B (en) 2024-02-27

Similar Documents

Publication Publication Date Title
US11941017B2 (en) Event driven extract, transform, load (ETL) processing
JP6617117B2 (en) Scalable analysis platform for semi-structured data
US10713247B2 (en) Executing queries for structured data and not-structured data
US10528599B1 (en) Tiered data processing for distributed data
US11514032B2 (en) Splitting a query into native query operations and post-processing operations
CN105122243B (en) Expansible analysis platform for semi-structured data
US10963839B2 (en) Nested hierarchical rollups by level using a normalized table
CN111177178B (en) Data processing method and related equipment
US11379482B2 (en) Methods, systems, and computer readable mediums for performing an aggregated free-form query
CN117349323B (en) Database data processing method and device, storage medium and electronic equipment
US9229961B2 (en) Database management delete efficiency
US10855750B2 (en) Centralized management of webservice resources in an enterprise
CN111897867A (en) Database log statistical method, system and related device
EP3425534B1 (en) Selecting backing stores based on data request
CN113515564A (en) Data access method, device, equipment and storage medium based on J2EE
CN113297057A (en) Memory analysis method, device and system
CN114443599A (en) Data synchronization method and device, electronic equipment and storage medium
US11354304B1 (en) Stored procedures for incremental updates to internal tables for materialized views
CN113297245A (en) Method and device for acquiring execution information
CN106802922B (en) Tracing storage system and method based on object
KR101648401B1 (en) Database apparatus, storage unit and method for data management and data analysis
CN114817300A (en) Log query method based on SQL (structured query language) statements and application thereof
CN113448957A (en) Data query method and device
US10558637B2 (en) Modularized data distribution plan generation
CN113032430B (en) Data processing method, device, medium and computing equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant