CN117573703A - Universal retrieval method, system, equipment and storage medium for time sequence data - Google Patents

Info

Publication number
CN117573703A
Authority
CN
China
Prior art keywords
universal
index
data
time sequence
universal index
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202410057573.1A
Other languages
Chinese (zh)
Other versions
CN117573703B (en)
Inventor
游浣权
王勇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kelai Network Technology Co ltd
Original Assignee
Kelai Network Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kelai Network Technology Co ltd
Priority to CN202410057573.1A
Publication of CN117573703A
Application granted
Publication of CN117573703B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2455 Query execution
    • G06F16/24552 Database cache management
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/24569 Query processing with adaptation to specific hardware, e.g. adapted for using GPUs or SSDs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a general retrieval method, system, device and storage medium for time series data. The method comprises: constructing a universal index for the time series data table of each time point; compressing the universal index in blocks to obtain a compressed universal index, storing the compressed universal index on a disk first, and then storing the time series data of the time series data table on the disk; and reading the universal index from the disk, then filtering and retrieving the time series data according to the universal index. Using the position information in the universal index, the invention retrieves time series data quickly and accurately. The universal index has a simple structure and suits all time series data tables; it also avoids wasted disk read/write IO, consumes little CPU and memory, and thereby improves the effective utilization of the time series data.

Description

Universal retrieval method, system, equipment and storage medium for time sequence data
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a general retrieval method, system, device and storage medium for time series data.
Background
With the advent of the Internet of Things and the 5G era, time series scenarios such as the Internet of Things, application monitoring, and the industrial Internet are growing explosively, so large volumes of time series data must be processed in real time. Time series data is data indexed along the time dimension; because it effectively reflects how a data metric changes over time, it is widely used in big data analysis scenarios, including user behavior analysis, user tag calculation, and network statistics engineering. These scenarios usually produce a large number of data tables, and working with those tables requires both full data queries and data retrieval: a data query returns all data of a statistical table at a given time point or over a time period, whereas data retrieval returns only the data in that table that satisfies certain conditions.
A conventional retrieval scheme for time series data first traverses time to find the time points to be searched, then reads the data for those time points from disk and filters it by the retrieval conditions to obtain the required data. Some schemes use a time projection index to filter out invalid time points and so improve retrieval performance, but they can filter only at the time level and cannot locate the data precisely, which wastes IO resources and yields a low effective utilization of the data. Other schemes build a fast index and obtain the exact position of the data by reading it, but the cost of building the fast index is too high: each data table needs a large amount of disk space to store the index, and building the index consumes a large amount of CPU and memory, so the approach does not suit all data tables. A time series data query and retrieval scheme that effectively improves retrieval efficiency and has a wide range of application is therefore needed.
Disclosure of Invention
In view of the above shortcomings of the prior art, the present invention provides a general retrieval method, system, device and storage medium for time series data, which effectively address the low retrieval efficiency, low effective data utilization, and narrow range of application of prior-art time series data retrieval methods.
In a first aspect, the present invention provides a general retrieval method of time series data, the method comprising the steps of:
respectively constructing a universal index according to the time sequence data table of each time point;
the universal index is compressed in blocks to obtain a compressed universal index, the compressed universal index is stored in a magnetic disk, and then time sequence data in the time sequence data table are stored in the magnetic disk;
and reading the universal index from the disk, and filtering and retrieving the time sequence data according to the universal index.
Further, constructing a universal index according to the time series data table of each time point specifically includes:
allocating a cache according to the lengths of the key-value fields configured for the time series data table;
copying the key-value data of the time series data table to the start address of the cache;
sorting the key values of the rows of the time series data table, and recording each row's sequence value after sorting behind the key-value data in the allocated cache;
combining the key-value data and the sequence value generated for each row of the time series data table to form the universal index.
Further, the size of the cache is calculated using the following formula:
$$T = \sum_{i=1}^{n} H_i + Z$$
where $T$ is the size of the cache, $n$ is the total number of key-value fields, $H_i$ is the size of the $i$-th key-value field, and $Z$ is the byte size of the sequence value.
Further, the block compressing the universal index to obtain a compressed universal index specifically includes:
dividing the universal index according to a first set value to obtain data blocks with the same size;
and compressing the data blocks to form compressed data blocks, and recombining the compressed data blocks according to the sequence of the data blocks before compression to obtain a compressed universal index for storing in a disk.
Further, the first set value is set according to the size of the compressed data block, and the size of the compressed data block is set according to the parameters of the disk.
Further, reading the universal index from the disk, filtering and retrieving time sequence data according to the universal index specifically includes:
reading a universal index in a time range to be searched from a disk, and filtering according to the universal index;
if filtering finds no matching time series data, ending the retrieval and returning a result indicating that the time series data does not exist;
and if filtering finds matching time series data, reading the time series data according to the position in the universal index, and returning the read time series data as the retrieval result.
Further, filtering according to the universal index specifically includes:
if the universal index has sequence values, constructing an array whose size is the number of rows in the universal index, scanning the sequence values in the universal index, locating the array position that corresponds to each sequence value and filling it with the row number of that universal index row, thereby obtaining an ordered array of the universal index, and searching the ordered array for the filter value by binary search;
if the universal index has no sequence values, traversing the universal index and matching the filter value until the whole universal index has been traversed.
In a second aspect, the present invention provides a general retrieval system for time series data, the system comprising:
the index construction module is used for respectively constructing a universal index according to the time series data table of each time point;
the data storage module is used for carrying out block compression on the universal index to obtain a compressed universal index, storing the compressed universal index into a magnetic disk, and then storing time sequence data in the time sequence data table into the magnetic disk;
and the data retrieval module is used for reading the universal index from the disk, and filtering and retrieving the time sequence data according to the universal index.
In a third aspect, the invention provides an electronic device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the general retrieval method of time series data according to the first aspect of the invention.
In a fourth aspect, the present invention provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the general purpose retrieval method of time series data according to the first aspect of the present invention.
The general retrieval method, system, device and storage medium for time series data provided by the invention construct a universal index and store it before the time series data is stored. The universal index is aggregated by time point and contains key-value data and record row position information, so the position of a record of the corresponding time series data table on the disk can be located precisely, and key values that do not exist at a time point can be filtered out in advance. The universal index has a simple structure and suits all time series data tables. The compressed universal index is written to disk, and time series data is then retrieved quickly and accurately using the position information in the universal index, which avoids wasted disk read/write IO, consumes little CPU and memory, and improves the effective utilization of the time series data.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope, and other related drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a general retrieval method of time series data provided by an embodiment of the invention;
FIG. 2 is a schematic diagram of the in-memory data organization of the universal index according to an embodiment of the present invention;
FIG. 3 is a block diagram of a general retrieval system for time series data according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs. The terminology used in the description herein is for the purpose of describing particular embodiments only and is not intended to limit the invention. The term "and/or" as used herein includes any and all combinations of one or more of the associated listed items.
A conventional retrieval scheme for time series data first traverses time to find the time points to be searched, then reads the data for those time points from disk and filters it by the retrieval conditions to obtain the required data. Some schemes use a time projection index to filter out invalid time points and so improve retrieval performance, but they can filter only at the time level and cannot locate the data precisely, which wastes IO resources and yields a low effective utilization of the data. Other schemes build a fast index and obtain the exact position of the data by reading it, but the cost of building the fast index is too high: each data table needs a large amount of disk space to store the index, and building the index consumes a large amount of CPU and memory, so the approach does not suit all data tables.
In view of the above shortcomings of the prior art, the present invention provides a general retrieval method, system, device and storage medium for time series data, which effectively address the low retrieval efficiency, low effective data utilization, and narrow range of application of prior-art time series data retrieval methods.
An embodiment of the present invention provides a general retrieval method of time series data. FIG. 1 is a flowchart of this method; as shown in FIG. 1, the method includes the following steps:
Step S100: respectively construct a universal index according to the time series data table of each time point.
The universal index of each time point's time series data table is constructed independently, through the following steps:
First, a cache is allocated according to the lengths of the key-value fields configured for the time series data table. The size of the cache is calculated as:
$$T = \sum_{i=1}^{n} H_i + Z$$
where $T$ is the size of the cache, $n$ is the total number of key-value fields, $H_i$ is the size of the $i$-th key-value field, and $Z$ is the byte size of the sequence value; in the embodiment of the invention, $Z$ is 4 bytes.
Next, the key-value data of the time series data table is copied to the start address of the cache, the key values of the rows of the table are sorted, and each row's sequence value after sorting is recorded behind the key-value data in the allocated cache. The 4 bytes denoted by Z in the cache size formula hold this post-sorting sequence value of the row.
Each row of the time series data table thus produces one piece of key-value data and one 4-byte sequence value, and the data of all rows is combined to form the universal index of the time series data table at that time point. FIG. 2 is a schematic diagram of the in-memory data organization of the universal index in the embodiment of the invention.
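The construction in step S100 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the cache size T is read as the size of one row's index record (the sum of the key-value field sizes plus Z), key-value fields are treated as fixed-width byte strings, the sequence value is stored as a little-endian unsigned 32-bit integer counted from 0, and the names `record_size` and `build_universal_index` are hypothetical.

```python
import struct

SEQ_SIZE = 4  # Z: bytes reserved for the sequence value (4 in the embodiment)


def record_size(field_sizes):
    """Bytes for one index record: sum of key-value field sizes plus Z."""
    return sum(field_sizes) + SEQ_SIZE


def build_universal_index(rows, field_sizes):
    """Build the index for one time point.

    rows: list of tuples of bytes, one fixed-width value per key-value field.
    Returns the concatenated records (key bytes followed by sequence value).
    """
    keys = []
    for row in rows:
        for value, size in zip(row, field_sizes):
            if len(value) != size:
                raise ValueError("key-value field has unexpected width")
        keys.append(b"".join(row))

    # Sort row numbers by key; a row's sequence value is its sorted position.
    order = sorted(range(len(keys)), key=lambda r: keys[r])
    rank = [0] * len(keys)
    for position, row_number in enumerate(order):
        rank[row_number] = position

    size = record_size(field_sizes)
    index = bytearray(size * len(keys))
    for row_number, key in enumerate(keys):
        start = row_number * size
        index[start:start + len(key)] = key                 # key-value data first
        struct.pack_into("<I", index, start + len(key), rank[row_number])
    return bytes(index)
```

Records stay in the table's original row order, so a record's offset in the index identifies its row; only the appended sequence value encodes the sorted order.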
Step S200: compress the universal index in blocks to obtain a compressed universal index, store the compressed universal index on the disk first, and then store the time series data of the time series data table on the disk. Specifically:
The universal index is divided according to the first set value into data blocks of the same size. Compressing each data block saves disk space, so the same disk capacity lasts longer; each data block is therefore compressed into a compressed data block before being written to disk. The first set value is chosen according to the target size of a compressed data block, which in turn is chosen according to the parameters of the disk. On the adapted hardware, disk writes are most efficient at sizes of 1-2 MB. To keep compressed data blocks at 1-2 MB, the first set value in the embodiment of the invention is 4 MB: the whole universal index is divided into 4 MB data blocks, most of which compress to 1-2 MB.
The compressed data blocks are reassembled in the order the data blocks had before compression to form the compressed universal index. The compressed universal index is stored on the disk first, and the time series data of the time series data table is stored after it. This exploits the fact that a magnetic disk performs best on sequential reads and writes, so the read/write scheme reaches its optimal state. On retrieval, the universal index must be read first and the time series data read afterwards, so after reading the universal index the disk does not need to move its head and simply continues reading the time series data that follows sequentially on the disk. Likewise, when the time series data of the time series data table at the current time point is written, the universal index is written first and the time series data second.
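The block compression and the index-before-data write order of step S200 can be sketched as follows. The patent names neither a compression algorithm nor an on-disk layout, so `zlib`, the block-count header, and the per-block length prefix are assumptions made for illustration; the function names are hypothetical. The last block may be shorter than the others because the sketch does not pad it.

```python
import io
import struct
import zlib

BLOCK_SIZE = 4 * 1024 * 1024  # first set value: 4 MB in the embodiment


def compress_index(index, block_size=BLOCK_SIZE):
    """Split the index into fixed-size blocks and compress each, in order."""
    return [zlib.compress(index[i:i + block_size])
            for i in range(0, len(index), block_size)]


def write_time_point(out, index, data, block_size=BLOCK_SIZE):
    """Write the compressed index first, then the time series data."""
    blocks = compress_index(index, block_size)
    out.write(struct.pack("<I", len(blocks)))     # assumed header: block count
    for block in blocks:
        out.write(struct.pack("<I", len(block)))  # assumed length prefix
        out.write(block)
    out.write(data)                               # data follows the index


def read_index(inp):
    """Read and decompress the index; the stream is left at the data."""
    (count,) = struct.unpack("<I", inp.read(4))
    parts = []
    for _ in range(count):
        (length,) = struct.unpack("<I", inp.read(4))
        parts.append(zlib.decompress(inp.read(length)))
    return b"".join(parts)


# Usage: an in-memory stream stands in for the disk file.
buf = io.BytesIO()
write_time_point(buf, b"k" * 10, b"rows", block_size=4)
buf.seek(0)
restored = read_index(buf)
```

After `read_index` returns, the stream position sits at the first byte of the time series data, which mirrors the sequential read described above: no seek is needed between reading the index and reading the data.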
Step S300: read the universal index from the disk, and filter and retrieve the time series data according to the universal index. Specifically:
The universal index within the time range to be searched is read from the disk and used for filtering. If the universal index has sequence values, an array whose size is the number of rows in the universal index is constructed, and the sequence values in the universal index are scanned. For each sequence value, the corresponding array position is located and filled with the row number of that universal index row; for example, if the sequence value is 5, the 5th element of the array is filled with that row's row number. Processing all rows in turn yields the ordered array of the universal index, in which the filter value is then found by binary search. If the universal index has no sequence values, the universal index is traversed and each entry is matched against the filter value until the whole universal index has been traversed.
If filtering finds no matching time series data, the retrieval ends and the returned result indicates that the time series data does not exist.
If filtering finds matching time series data, the time series data is read according to the position in the universal index; the data read is the time series data that satisfies the filter condition and is returned as the retrieval result.
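The filtering in step S300 can be sketched as follows, again as an illustration under assumptions rather than the patented code: records are fixed-width, the sequence value is a little-endian unsigned 32-bit integer counted from 0, filtering is an exact match on the whole key, and `parse_index` and `find_rows` are hypothetical names. An empty result corresponds to the "data does not exist" case; otherwise the returned row numbers are the positions used to read the time series data.

```python
import struct
from bisect import bisect_left

SEQ_SIZE = 4  # bytes of the sequence value (Z in the embodiment)


def parse_index(index, key_size, has_sequence):
    """Split a decompressed index into keys and (optional) sequence values."""
    step = key_size + (SEQ_SIZE if has_sequence else 0)
    keys, seqs = [], []
    for start in range(0, len(index), step):
        keys.append(index[start:start + key_size])
        if has_sequence:
            seqs.append(struct.unpack_from("<I", index, start + key_size)[0])
    return keys, seqs


def find_rows(index, key_size, wanted, has_sequence=True):
    """Return the row numbers whose key equals `wanted` (empty if none)."""
    keys, seqs = parse_index(index, key_size, has_sequence)
    if not has_sequence:
        # No sequence values: traverse every record and compare.
        return [row for row, key in enumerate(keys) if key == wanted]

    # ordered[s] holds the row number of the record with sequence value s.
    ordered = [0] * len(keys)
    for row, seq in enumerate(seqs):
        ordered[seq] = row

    # Binary search over the keys taken in sorted order.
    sorted_keys = [keys[row] for row in ordered]
    pos = bisect_left(sorted_keys, wanted)
    hits = []
    while pos < len(sorted_keys) and sorted_keys[pos] == wanted:
        hits.append(ordered[pos])
        pos += 1
    return hits
```

Filling the array from the stored sequence values takes one linear pass and no comparisons, so the sort performed at write time is not repeated at query time.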
The embodiment of the invention also provides a general retrieval system for time series data. FIG. 3 is a structural diagram of this system; as shown in FIG. 3, the system comprises:
an index construction module 310, configured to respectively construct a universal index according to the time series data table of each time point;
the data storage module 320 is configured to compress the universal index in blocks to obtain a compressed universal index, store the compressed universal index in a disk, and store the time-series data in the time-series data table in the disk;
and the data retrieval module 330 is configured to read the universal index from the disk, and filter and retrieve time-series data according to the universal index.
Based on the same conception, the embodiment of the invention also provides an electronic device. As shown in FIG. 4, the electronic device may include: a processor 410, a communication interface (Communications Interface) 420, a memory 430 and a communication bus 440, wherein the processor 410, the communication interface 420 and the memory 430 communicate with each other via the communication bus 440. The processor 410 may invoke logic instructions in the memory 430 to perform the steps of the general retrieval method of time series data described in the above embodiments. Examples include:
step S100, respectively constructing a general index according to a time sequence data table of each time point;
step S200, performing block compression on the universal index to obtain a compressed universal index, firstly storing the compressed universal index into a magnetic disk, and then storing time sequence data in a time sequence data table into the magnetic disk;
step S300, reading the universal index from the disk, and filtering and retrieving time sequence data according to the universal index.
The processor 410 may be a central processing unit (Central Processing Unit, CPU). The processor may also be any other general purpose processor, digital signal processor (Digital Signal Processor, DSP), application specific integrated circuit (Application Specific Integrated Circuit, ASIC), field programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any combination thereof.
Further, the logic instructions in the memory 430 described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes over the prior art, or a part of the technical solution, may be embodied in the form of a software product that is stored in a storage medium and comprises several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or other media capable of storing program code.
Memory 430 may include a storage program area that may store an operating system, at least one application program required for functionality, and a storage data area; the storage data area may store data created by the processor, etc. In addition, the memory may include high-speed random access memory, and may also include non-transitory memory, such as at least one magnetic disk storage device, flash memory device, or other non-transitory solid state storage device. In some implementations, the memory optionally includes memory remotely located relative to the processor, the remote memory being connectable to the processor through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Based on the same conception, the embodiments of the present invention also provide a computer readable storage medium storing a computer program, the computer program containing at least one piece of code executable by a master control device to control the master control device to implement the steps of the general retrieval method of time series data according to the above embodiments. Examples include:
step S100, respectively constructing a general index according to a time sequence data table of each time point;
step S200, performing block compression on the universal index to obtain a compressed universal index, firstly storing the compressed universal index into a magnetic disk, and then storing time sequence data in a time sequence data table into the magnetic disk;
step S300, reading the universal index from the disk, and filtering and retrieving time sequence data according to the universal index.
Based on the same technical concept, the embodiment of the present invention also provides a computer program, which is used to implement the above-mentioned method embodiment when the computer program is executed by the master control device.
The program may be stored in whole or in part on a storage medium that is packaged with the processor, or in part or in whole on a memory that is not packaged with the processor.
Based on the same technical concept, the embodiment of the invention also provides a processor, which is used for realizing the embodiment of the method. The processor may be a chip.
In summary, the general retrieval method, system, device and storage medium for time series data provided by the invention construct a universal index and store it before the time series data is stored. The universal index is aggregated by time point and contains key-value data and record row position information, so the position of a record of the corresponding time series data table on the disk can be located precisely, and key values that do not exist at a time point can be filtered out in advance. The universal index has a simple structure and suits all time series data tables. The compressed universal index is written to disk, and time series data is then retrieved quickly and accurately using the position information in the universal index, which avoids wasted disk read/write IO, consumes little CPU and memory, and improves the effective utilization of the time series data.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
The foregoing examples illustrate only a few embodiments of the invention and are described in detail herein without thereby limiting the scope of the invention. It should be noted that it will be apparent to those skilled in the art that several variations and modifications can be made without departing from the spirit of the invention, which are all within the scope of the invention. Accordingly, the scope of protection of the present invention is to be determined by the appended claims.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims (10)

1. A universal retrieval method for time series data, the method comprising the steps of:
constructing a universal index for the time series data table of each time point;
compressing the universal index in blocks to obtain a compressed universal index, storing the compressed universal index on a disk, and then storing the time series data of the time series data table on the disk;
and reading the universal index from the disk, and filtering and retrieving the time series data according to the universal index.
2. The universal retrieval method for time series data according to claim 1, wherein constructing a universal index for the time series data table of each time point specifically comprises:
allocating a cache according to the lengths of the key value fields configured for the time series data table;
copying the key value data of the time series data table to the start address of the cache;
sorting the key values of the rows of the time series data table, and recording each row's sorted sequence value in the allocated cache after its key value data;
combining the key value data and the sequence value generated for each row of the time series data table to form the universal index.
3. The universal retrieval method for time series data according to claim 2, wherein the size of the cache is calculated using the following formula:
$$T = \sum_{i=1}^{n} H_i + Z$$
where T is the size of the cache, n is the total number of key value fields, H_i is the size of the i-th key value field, and Z is the byte size of the sequence value.
4. The universal retrieval method for time series data according to claim 1, wherein compressing the universal index in blocks to obtain the compressed universal index specifically comprises:
dividing the universal index according to a first set value to obtain data blocks of equal size;
and compressing the data blocks to form compressed data blocks, and recombining the compressed data blocks in the order the data blocks had before compression to obtain the compressed universal index for storage on the disk.
5. The universal retrieval method for time series data according to claim 4, wherein the first set value is set according to the size of the compressed data block, and the size of the compressed data block is set according to a parameter of the disk.
6. The universal retrieval method for time series data according to claim 1, wherein reading the universal index from the disk, and filtering and retrieving the time series data according to the universal index specifically comprises:
reading the universal index within the time range to be searched from the disk, and filtering according to the universal index;
if the filtering finds no time series data, ending the search and returning a result indicating that no matching time series data exists;
and if the filtering finds time series data, reading the time series data according to the position in the universal index, and returning the read time series data as the retrieval result.
7. The universal retrieval method for time series data according to claim 1, wherein filtering according to the universal index specifically comprises:
if the universal index has sequence values, constructing an array whose size is the number of rows of the universal index, scanning the sequence values in the universal index, locating the array position corresponding to each sequence value and filling that position with the row number of the universal index to obtain an ordered array of the universal index, and searching for the filter value in the ordered array by binary search;
if the universal index has no sequence values, traversing the universal index and matching the filter value until the entire universal index has been traversed.
8. A universal retrieval system for time series data, the system comprising:
an index construction module, configured to construct a universal index for the time series data table of each time point;
a data storage module, configured to compress the universal index in blocks to obtain a compressed universal index, store the compressed universal index on a disk, and then store the time series data of the time series data table on the disk;
and a data retrieval module, configured to read the universal index from the disk, and to filter and retrieve the time series data according to the universal index.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the universal retrieval method for time series data according to any one of claims 1 to 7.
10. A computer readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the universal retrieval method for time series data according to any one of claims 1 to 7.
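The filtering step of claim 7 can be sketched in Python as follows. This is an illustrative reading of the claim language, not the patentee's code: `find_rows`, the 4-byte little-endian sequence value, and the fixed-length entry layout (key bytes followed by the sequence value) are assumptions introduced here, and the returned row numbers are assumed to stand in for record row positions.

```python
import struct

SEQ_FMT = "<I"                       # assumption: 4-byte unsigned sequence value
SEQ_SIZE = struct.calcsize(SEQ_FMT)


def find_rows(index, key_size, target, has_seq=True):
    """Return the row numbers of index entries whose key equals target."""
    entry_size = key_size + (SEQ_SIZE if has_seq else 0)
    n = len(index) // entry_size

    def key_at(row):
        start = row * entry_size
        return index[start:start + key_size]

    if not has_seq:
        # No sequence values: traverse every entry and match the filter value.
        return [row for row in range(n) if key_at(row) == target]

    # Rebuild the ordered array: the slot at position seq holds the row number.
    ordered = [0] * n
    for row in range(n):
        (seq,) = struct.unpack_from(SEQ_FMT, index, row * entry_size + key_size)
        ordered[seq] = row

    # Binary search for the leftmost entry whose key is >= target.
    lo, hi = 0, n
    while lo < hi:
        mid = (lo + hi) // 2
        if key_at(ordered[mid]) < target:
            lo = mid + 1
        else:
            hi = mid

    # Collect every row that shares the target key.
    hits = []
    while lo < n and key_at(ordered[lo]) == target:
        hits.append(ordered[lo])
        lo += 1
    return hits
```

An empty result corresponds to the early exit of claim 6, where the search ends without reading any time series data from disk. Rebuilding the ordered array costs one linear pass over the index, after which each lookup is logarithmic in the number of rows.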
CN202410057573.1A 2024-01-16 2024-01-16 Universal retrieval method, system, equipment and storage medium for time sequence data Active CN117573703B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410057573.1A CN117573703B (en) 2024-01-16 2024-01-16 Universal retrieval method, system, equipment and storage medium for time sequence data

Publications (2)

Publication Number Publication Date
CN117573703A (en) 2024-02-20
CN117573703B (en) 2024-04-09

Family

ID=89864720

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410057573.1A Active CN117573703B (en) 2024-01-16 2024-01-16 Universal retrieval method, system, equipment and storage medium for time sequence data

Country Status (1)

Country Link
CN (1) CN117573703B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103092844A (en) * 2011-10-28 2013-05-08 腾讯科技(深圳)有限公司 Index creating method, index creating system, searching method and searching system
CN111125120A (en) * 2019-12-30 2020-05-08 广州数锐智能科技有限公司 Stream data-oriented fast indexing method, device, equipment and storage medium
CN112650756A (en) * 2020-12-29 2021-04-13 成都科来网络技术有限公司 Time projection indexing method and system based on time sequence data
CN113268636A (en) * 2021-06-22 2021-08-17 成都科来网络技术有限公司 Rapid retrieval method and device based on time sequence data
CN114647658A (en) * 2022-03-30 2022-06-21 新华三信息技术有限公司 Data retrieval method, device, equipment and machine-readable storage medium

Family Cites Families (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5918225A (en) * 1993-04-16 1999-06-29 Sybase, Inc. SQL-based database system with improved indexing methodology
US5613110A (en) * 1995-01-05 1997-03-18 International Business Machines Corporation Indexing method and apparatus facilitating a binary search of digital data
CN102446184B (en) * 2010-10-12 2013-06-19 上海可鲁系统软件有限公司 Industrial data storage and index method based on time series
CN103488709B (en) * 2013-09-09 2017-06-16 东软集团股份有限公司 A kind of index establishing method and system, search method and system
CN104866502B (en) * 2014-02-25 2020-10-13 深圳市中兴微电子技术有限公司 Data matching method and device
CN106991102B (en) * 2016-01-21 2021-06-08 腾讯科技(深圳)有限公司 Processing method and processing system for key value pairs in inverted index
CN106844664B (en) * 2017-01-20 2020-04-17 北京理工大学 Time series data index construction method based on abstract
CN107871022B (en) * 2017-12-20 2018-12-11 清华大学 A kind of storage of time series data column, querying method and system
CN109164980B (en) * 2018-08-03 2024-02-02 北京涛思数据科技有限公司 Aggregation optimization processing method for time sequence data
CN109325032B (en) * 2018-09-18 2020-10-27 厦门市美亚柏科信息股份有限公司 Index data storage and retrieval method, device and storage medium
CN110580253B (en) * 2019-09-10 2022-05-31 网易(杭州)网络有限公司 Time sequence data set loading method and device, storage medium and electronic equipment
CN110765138B (en) * 2019-10-31 2023-01-20 北京达佳互联信息技术有限公司 Data query method, device, server and storage medium
CN110888886B (en) * 2019-11-29 2022-11-11 华中科技大学 Index structure, construction method, key value storage system and request processing method
CN111782659B (en) * 2020-07-10 2023-10-17 东北大学 Database index creation method, device, computer equipment and storage medium
CN112286867B (en) * 2020-10-27 2022-03-01 山东鼎滏软件科技有限公司 Oil-gas field time sequence data storage method, oil-gas field time sequence data query device and storage medium
CN113312313B (en) * 2021-01-29 2023-09-29 淘宝(中国)软件有限公司 Data query method, nonvolatile storage medium and electronic device
CN113656397A (en) * 2021-07-02 2021-11-16 阿里巴巴新加坡控股有限公司 Index construction and query method and device for time series data
CN113360551B (en) * 2021-08-11 2021-11-16 南京赛宁信息技术有限公司 Method and system for storing and rapidly counting time sequence data in shooting range
CN114398373A (en) * 2022-01-16 2022-04-26 瞰客信息科技(上海)有限公司 File data storage and reading method and device applied to database storage
CN115510339A (en) * 2022-09-29 2022-12-23 杭州海康威视数字技术股份有限公司 Method, device and equipment for space-time query and storage medium
CN116126864A (en) * 2023-01-18 2023-05-16 阿里云计算有限公司 Index construction method, data query method and related equipment
CN116501760A (en) * 2023-04-04 2023-07-28 杭州电子科技大学 Efficient distributed metadata management method combining memory and prefix tree
CN117235069A (en) * 2023-09-12 2023-12-15 京东科技信息技术有限公司 Index creation method, data query method, device, equipment and storage medium
CN117312308A (en) * 2023-09-18 2023-12-29 上海沄熹科技有限公司 Method and device for creating index in time sequence database and time sequence database system

Also Published As

Publication number Publication date
CN117573703B (en) 2024-04-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant