CN112286867B - Oil-gas field time sequence data storage method, oil-gas field time sequence data query device and storage medium - Google Patents

Publication number
CN112286867B
Authority
CN
China
Prior art keywords
data
index
file
time
block
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011160744.1A
Other languages
Chinese (zh)
Other versions
CN112286867A (en
Inventor
刘骏
王德生
张斌
于景洋
赵仁翔
李长笑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Dingfu Software Technology Co ltd
Original Assignee
Shandong Dingfu Software Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Dingfu Software Technology Co ltd filed Critical Shandong Dingfu Software Technology Co ltd
Priority to CN202011160744.1A priority Critical patent/CN112286867B/en
Publication of CN112286867A publication Critical patent/CN112286867A/en
Application granted granted Critical
Publication of CN112286867B publication Critical patent/CN112286867B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots
    • G06F16/113 Details of archiving
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/14 Details of searching files based on file metadata
    • G06F16/148 File search processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/162 Delete operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/174 Redundancy elimination performed by the file system
    • G06F16/1744 Redundancy elimination performed by the file system using compression, e.g. sparse files

Abstract

The invention belongs to the technical field of oilfield production data processing and discloses an oil and gas field time series data storage method, a query method, corresponding devices, and a storage medium. The invention can reduce server storage overhead, avoid storing redundant data, relieve the write pressure that millions of devices place on a single-machine system, and improve data query efficiency.

Description

Oil-gas field time sequence data storage method, oil-gas field time sequence data query device and storage medium
Technical Field
The invention relates to the technical field of oilfield production data processing, in particular to an oil and gas field time sequence data storage method, an oil and gas field time sequence data query device and a storage medium.
Background
With the continued advancement of the internet in the field of oil and gas field production, oil well production monitoring platforms have generated more and more time series-based data, referred to as time series data.
Monitoring a single object requires a large number of sensors, which means a large amount of data must be stored. As monitored indicators accumulate, a large volume of historical data has to be kept on storage devices, and if the data is not handled properly, a large amount of storage is wasted.
Time series data is often stored and processed in a relational database, but the storage format of a relational database cannot store time series data efficiently. The format of time series data calls for a dedicated storage scheme: a time series database can efficiently store and rapidly process massive time series data, and it typically has to absorb real-time writes from millions or even tens of millions of monitoring devices.
The relational database has the following disadvantages when storing time series data:
1. High storage consumption: the relational database stores redundant data and occupies a large amount of storage resources;
2. Poor write performance: a single-machine system can hardly withstand the write pressure of millions or even tens of millions of devices;
3. Low query efficiency: relational databases are suited to transactional workloads and perform poorly when analyzing time series data points.
Disclosure of Invention
The invention aims to solve at least one of the technical problems in the prior art, and provides a time series data storage method, a query method, corresponding devices, and a storage medium, which can reduce server storage overhead, avoid storing redundant data, relieve the write pressure that millions of devices place on a single-machine system, and improve data query efficiency.
According to a first aspect of the invention, there is provided a method for storing oil and gas field time series data, the method comprising:
acquiring the time series data to be stored from oil and gas field production in each time period, wherein the attributes of the time series data comprise a metric, an indicator, a timestamp, and a data value;
dividing the time series data into a plurality of data files according to a preset time period for storage, wherein each data file comprises a plurality of data blocks, each data block stores the data values of one combination of metric and indicator within one time period, each data file contains the time series data of all combinations of metric and indicator within the time period, and the data files are numbered according to the timestamp;
establishing index files, numbering the index files according to the timestamp, and establishing in each index file an index guide Guider corresponding to each data block, wherein each Guider has a fixed length in bytes and points to the storage address of the corresponding data block in the data file, every N Guiders form a guide set Muster, the N Guiders of a Muster point to the data blocks of one combination of metric and indicator in N different data areas, each Muster corresponds to N data files, and N is a positive integer;
constructing a guide set positioning table Locates based on the metrics and indicators of the time series data acquired each time, wherein the guide set positioning table comprises keys and values, each key is a distinct combination of the numerical identifier of a metric and the numerical identifier of an indicator, and each value is the storage address, in the index file, of the guide set Muster corresponding to that combination;
and calculating, from the timestamp of each piece of time series data to be stored, the corresponding index file number and data file number, locating a Guider through the Locates and the Muster of the index file with that number together with the timestamp, thereby finding the data block that the Guider points to in the data file with the corresponding number, and storing the time series data into that data block.
According to a second aspect of the present invention, there is provided a storage device for oil and gas field time series data, the storage device comprising:
the acquisition unit is used for acquiring time sequence data to be stored in oil and gas field production in each time period, wherein the attributes of the time sequence data comprise measurement, indexes, timestamps and data value attributes;
the dividing unit is used for dividing the time series data into a plurality of data files according to a preset time period for storage, each data file comprises a plurality of data blocks, each data block correspondingly stores the data value of one combination of the metric and the index in one time period, each data file comprises the time series data of all the combinations of the metric and the index in the time period, and the data files are numbered according to the time stamps;
an index establishing unit, configured to establish index files, number the index files according to the timestamp, and establish in each index file an index guide Guider corresponding to each data block, where each Guider has a fixed length in bytes and points to the storage address of the corresponding data block in the data file, every N Guiders form a guide set Muster, the N Guiders of a Muster point to the data blocks of one combination of metric and indicator in N different data areas, and each Muster corresponds to N data files, where N is a positive integer;
a guide set positioning table establishing unit, configured to establish a guide set positioning table Locates based on the metrics and indicators of the time series data acquired each time, where the guide set positioning table includes keys and values, each key is a distinct combination of the numerical identifier of a metric and the numerical identifier of an indicator, and each value is the storage address, in the index file, of the guide set Muster corresponding to that combination;
and a storage unit, configured to calculate, from the timestamp of each piece of time series data to be stored, the corresponding index file number and data file number, locate a Guider through the Locates and the Muster of the index file with that number together with the timestamp, thereby find the data block that the Guider points to in the data file with the corresponding number, and store the time series data into that data block.
According to a third aspect of the invention, there is provided a method for querying time series data of an oil and gas field, the method comprising the steps of:
receiving a query request for time series data, wherein the query parameters in the query request comprise a time period and the metric, indicator, and timestamp attributes of the time series data;
and, based on the query parameters, locating the corresponding data blocks in the storage produced by the storage method of the first aspect, and extracting the corresponding data values.
According to a fourth aspect of the present invention, there is provided an oil and gas field time series data query device, the device comprising:
a receiving unit, configured to receive a query request for time series data, where the query parameters in the query request comprise a time period and the metric, indicator, and timestamp attributes of the time series data;
and a query unit, configured to locate, based on the query parameters, the corresponding data blocks in the storage produced by the storage method of the first aspect, and to extract the corresponding data values.
According to a fifth aspect of the present invention, there is provided a computer-readable storage medium having stored therein at least one instruction, at least one program, set of codes, or set of instructions; the at least one instruction, the at least one program, the set of codes or the set of instructions is loaded and executed by a processor to implement a method of storing field time series data as described in the above first aspect or a method of querying field time series data as described in the above third aspect.
The technical solutions provided by the embodiments of the application bring at least the following beneficial effects: the oil and gas field time series data is divided into data blocks for storage, which reduces storage consumption; a guide set positioning table is built from the metric and indicator attributes of the time series data, so that a large amount of data can be stored as key-value pairs, flexible conditional queries are supported, and the query efficiency of time series data is improved; and because the time series data is stored sequentially after being partitioned into data areas, a data value can be located by calculating its offset within the data block, so the data in a data block needs to be indexed only once, which greatly improves storage capacity and query efficiency.
Drawings
The invention is further described below with reference to the accompanying drawings and examples;
FIG. 1 is a flow chart of a method for storing time series data of an oil and gas field according to an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating a storage form of index files according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a combination of metrics and indicators as a key according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating a compression algorithm for time series data of the oil and gas field according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating a task list file established according to an embodiment of the present invention;
FIG. 6 is a schematic view of an oil and gas field time series data storage device according to an embodiment of the invention;
FIG. 7 is a schematic diagram of a compression unit according to an embodiment of the present invention;
FIG. 8 is a schematic flow chart of a method for querying time series data of an oil and gas field according to an embodiment of the present invention;
FIG. 9 is a schematic flow chart of an apparatus for querying time series data of an oil and gas field according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to the present preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout. The specific embodiments described herein are merely illustrative of the invention and are not to be construed as limiting the invention.
Fig. 1 is a flow chart of a method for storing time series data of an oil and gas field, which is provided by an embodiment of the invention and is used in the field of oil and gas field production, and the method can be executed by a time series data storage device, wherein the device can be implemented by software and/or hardware and can be generally integrated in an electronic device. As shown in fig. 1, the method includes:
Step 110, acquiring the time series data to be stored from oil and gas field production in each time period, wherein the attributes of the time series data comprise a metric, an indicator, a timestamp, and a data value.
The time-series data is a time-series recorded in chronological order, and changes depending on time, and the degree of change may be reflected in a numerical value. The acquired time sequence data source can be time sequence data acquired at fixed frequency in the petroleum production field and can also be time sequence data in other industrial fields. The time period refers to the period of measurement acquisition. The time series data has at least four attributes, a Metric, an indicator Target, a timestamp Stamp, and the data value itself.
It is understood that the Metric in the embodiments of the present application represents what the data belongs to, and the Target is the quantity measured at a specified frequency. For example, in "voltage of oil well No. 1" and "average temperature in Beijing", oil well No. 1 and Beijing are metric attributes, while voltage and average temperature are indicator attributes. The timestamp Stamp is the total number of milliseconds (a positive integer) elapsed since 00:00:00.000 on January 1, 1970, Greenwich Mean Time (08:00:00.000 on January 1, 1970, Beijing time). The time series data of the embodiments of the present application may have a number of different metrics and indicators.
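The four attributes can be pictured as a small record type. The following is a minimal sketch only; the names `Point`, `to_stamp`, and `EPOCH` are illustrative assumptions and do not appear in the patent.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

# Stamp0: 00:00:00.000 on January 1, 1970, Greenwich Mean Time.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


@dataclass(frozen=True)
class Point:
    """One time series data point (hypothetical layout)."""
    metric: str   # what the data belongs to, e.g. "oil well No. 1"
    target: str   # indicator that is measured, e.g. "voltage"
    stamp: int    # milliseconds since EPOCH
    value: float  # the measured data value


def to_stamp(moment: datetime) -> int:
    """Convert a timezone-aware datetime to a millisecond timestamp."""
    return (moment - EPOCH) // timedelta(milliseconds=1)


beijing = timezone(timedelta(hours=8))
p = Point("oil well No. 1", "voltage",
          to_stamp(datetime(2020, 1, 1, 8, tzinfo=beijing)), 380.0)
```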
Step 120, dividing the time series data into a plurality of data files according to a preset time period, and storing, wherein each data file comprises a plurality of data blocks, each data block correspondingly stores the data value of one combination of the metric and the index in one time period, each data file comprises the time series data of all the combinations of the metric and the index in the time period, and the data files are numbered according to the time stamps.
Specifically, when time series data is to be written to storage, it is divided along the time dimension into a plurality of data areas (Regions). A data area covers a long time span, for example 7 days, and each data area Region is stored in a single data file. Each data area contains the time series data of all metric and indicator combinations within that time span. A plurality of data blocks (Blocks) are defined in the data area Region; each data block holds the time series data of one combination of a Metric and a Target, so the number of data blocks in a data area equals the number of metric and indicator combinations. For example, if the time series data in one data file has metric attributes M1 and M2 and indicator attributes T1 and T2, there are four combinations (M1T1, M1T2, M2T1, and M2T2), and the data file contains four data blocks, each storing the data values of one combination within the time span.
It can be understood that, in this embodiment, the time series data is divided into different data files for storage and the data files are numbered, so that a specified data file can be located quickly. Assuming the time interval of a data area is 7 days, the data from 1970-01-01 00:00:00:000 to 1970-01-07 00:00:00:000 is assigned to one data area, that is, one data file. A data file number is then assigned based on the timestamp.
For example, with a time interval of 7 days, the data file number is the integer part of the quotient obtained by dividing the millisecond timestamp of the start of the data area by the number of milliseconds in 7 days.
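The numbering rule can be sketched as an integer division. This is an illustrative sketch, assuming a 7-day data area and millisecond timestamps; the function and constant names are hypothetical.

```python
DAY_MS = 86_400_000       # milliseconds in one day
REGION_MS = 7 * DAY_MS    # assumed time interval of one data area (7 days)


def data_file_number(stamp_ms: int, region_ms: int = REGION_MS) -> int:
    """Integer part of the timestamp divided by the data area length."""
    return stamp_ms // region_ms


def region_start(number: int, region_ms: int = REGION_MS) -> int:
    """Millisecond timestamp at which the data area with this number begins."""
    return number * region_ms
```

Every timestamp inside the same 7-day window maps to the same number, so the data file can be found without any lookup.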
Step 130, establishing index files, numbering the index files according to the timestamp, and establishing in each index file an index guide Guider corresponding to each data block, wherein each Guider has a fixed length in bytes and points to the storage address of the corresponding data block in the data file, every N Guiders form a guide set Muster, the N Guiders of a Muster point to the data blocks of one combination of metric and indicator in N different data areas, each Muster corresponds to N data files, and N is a positive integer.
In an embodiment of the present invention, the index file number is calculated as follows: take the difference between the current timestamp Stamp and the Greenwich Mean Time origin Stamp0 (00:00:00.000 on January 1, 1970), divide this difference by the number of milliseconds in N time intervals, and round the result down (floor) to obtain the index file number.
For example, when N is 8 and the time interval is 7 days, the index file number is: floor ((Stamp-Stamp0)/(8 × 7 × 86400000)).
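The formula above translates directly into integer arithmetic. The sketch below assumes N = 8, a 7-day interval, and Stamp0 = 0; the function name and parameter names are illustrative.

```python
DAY_MS = 86_400_000  # milliseconds in one day


def index_file_number(stamp_ms: int, n: int = 8, region_days: int = 7,
                      stamp0_ms: int = 0) -> int:
    """floor((Stamp - Stamp0) / (N * interval in milliseconds))."""
    return (stamp_ms - stamp0_ms) // (n * region_days * DAY_MS)
```

With these defaults one index file covers 8 consecutive data files, that is, 56 days of data.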
Illustratively, as shown in fig. 2, the storage form of the created index file is represented schematically, the Y axis TS-Y represents the index file number, and the X axis Muster-X represents the storage location of the Muster in the index file.
Step 140, constructing a guide set positioning table Locates based on each metric and index of the time series data acquired each time, where the guide set positioning table includes a key and a value, the key is a different combination of a numerical identifier of each metric and a numerical identifier of each index, and the value is a storage address of the index set Muster corresponding to each combination in the index file.
In a specific embodiment of the present invention, when each Metric is established, an integer identifier KM (for example, 4 bytes long) is generated for it; likewise, when each Target is established, an integer identifier KT (for example, 4 bytes long) is generated for it. For example, the integer identifier of a Target is generated as follows: the first Target created gets the identifier KT0 with value 0, the second gets KT1 with value 1, and the n-th gets KTn with value n. The purpose of the integer identifiers is to make it easy to combine metric and indicator. The guide set positioning table Locates in this embodiment is a list of entries, each consisting of a key (LocateKey), which is a combination of KM and KT, and a value. For example, as shown in fig. 3, one way to combine KM and KT into a key (LocateKey) is to shift the KM of the Metric left and combine it with the KT of the Target by a logical OR, which yields an 8-byte long integer.
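One plausible reading of this key construction is sketched below, assuming both identifiers are unsigned 4-byte integers and that KM is shifted left by 32 bits so it occupies the high half of the key. The shift width and the helper names are assumptions, not stated in the text.

```python
def locate_key(km: int, kt: int) -> int:
    """Pack KM (high 4 bytes) and KT (low 4 bytes) into one 8-byte key."""
    if not (0 <= km < 2**32 and 0 <= kt < 2**32):
        raise ValueError("KM and KT must each fit in 4 bytes")
    return (km << 32) | kt


def split_key(key: int) -> tuple:
    """Recover (KM, KT) from a key built by locate_key."""
    return key >> 32, key & 0xFFFFFFFF
```

Because the two halves never overlap, every (KM, KT) pair produces a distinct key and the pair can be recovered from the key.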
In this embodiment, the value corresponding to a key is the storage address, in the index file, of the guide set for that combination of metric and indicator. One way to allocate the storage addresses is as follows: since the index guide Guider has a fixed length, suppose it contains 6 fields of 8 bytes each (48 bytes), and a guide set Muster consists of 8 Guiders, so that a guide set occupies 384 bytes. Suppose the Metric Metric1 is combined with the indicators Target1 and Target2, and addresses are allocated in the order in which the indicators were created. Then, counting from zero, the guide set Muster for Metric1 and Target1 occupies bytes 0 to 383 of the index file, and the guide set Muster for Metric1 and Target2 occupies bytes 384 to 767.
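The address allocation can be sketched with a dictionary that hands out fixed-size slots in creation order. This is a sketch under the sizes used in the example (a 48-byte Guider, 8 Guiders per Muster) and with byte addresses counted from zero; the class and function names are hypothetical.

```python
GUIDER_SIZE = 6 * 8                              # assumed: six 8-byte fields
GUIDERS_PER_MUSTER = 8                           # N in the example
MUSTER_SIZE = GUIDER_SIZE * GUIDERS_PER_MUSTER   # 384 bytes


class Locates:
    """Maps a LocateKey to the start address of its Muster in the index file."""

    def __init__(self) -> None:
        self._table = {}

    def address_of(self, key: int) -> int:
        # A new combination gets the next free fixed-size slot.
        if key not in self._table:
            self._table[key] = len(self._table) * MUSTER_SIZE
        return self._table[key]


def guider_address(muster_address: int, slot: int) -> int:
    """Start address of the Guider at position `slot` inside a Muster."""
    return muster_address + slot * GUIDER_SIZE
```

Because every Muster and Guider has a fixed size, an address is found by multiplication rather than by searching the index file.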
It can be understood that, when time series data is stored in the above manner, the index guide Guider and the guide set Muster both have a fixed length, which avoids a second index calculation, saves a large amount of search time, and keeps the time complexity at O(1). The data blocks are stored in no particular order, so that storage is not occupied by invalid data blocks.
Step 150, calculating, from the timestamp of each piece of time series data to be stored, the corresponding index file number and data file number, locating a Guider through the Locates and the Muster of the index file with that number together with the timestamp, thereby finding the data block that the Guider points to in the data file with the corresponding number, and storing the time series data into that data block.
Specifically, the key is calculated from the numerical identifiers KM and KT of the metric and indicator of the time series data to be stored, and the value corresponding to that key, that is, the storage location of the corresponding Muster, is looked up through the Locates of the index file with the calculated number. Then the timestamp of the time series data is divided by the time span of a data file, and the result is taken modulo the number of Guiders contained in each Muster to select the Guider; the data block in which the time series data is to be stored is obtained from the storage address that the Guider points to.
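The arithmetic of this lookup can be sketched as follows, assuming N = 8 Guiders per Muster, a 7-day data file, and timestamps counted from Stamp0 = 0. The function name `locate` is illustrative.

```python
DAY_MS = 86_400_000
REGION_MS = 7 * DAY_MS  # assumed time span of one data file


def locate(stamp_ms: int, n: int = 8, region_ms: int = REGION_MS) -> tuple:
    """Return (index file number, data file number, Guider slot)."""
    data_file = stamp_ms // region_ms   # which data area the point falls in
    index_file = data_file // n         # equals stamp_ms // (n * region_ms)
    slot = data_file % n                # position of the Guider in the Muster
    return index_file, data_file, slot
```

The slot selects one of the N Guiders in the Muster found through Locates, and that Guider holds the address of the data block.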
Further, storing the time series data into the corresponding data block, further comprises: and calculating the storage offset of the time sequence data in the data block according to the time stamp of the time sequence data and the time period, and writing the time sequence data into the data point position specified by the offset.
For example, if the data area Region is 7 days and the collection period is 1 hour, a data Block holds 168 (7 × 24) data points. Assume the data Block contains the data in the time span from 2020-01-01 00:00:00:000 to 2020-01-07 00:00:00:000. When a data point to be stored has the timestamp 2020-01-01 08:00:00:000, the offset is the time difference between 2020-01-01 00:00:00:000 and 2020-01-01 08:00:00:000 divided by the period of 1 hour, which equals 8. Since data is stored in bytes, when the unit length of the data type is 4 bytes, the data point occupies the storage space from the 32nd byte to the 36th byte.
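The offset calculation in this example can be sketched as below. The sketch assumes 4-byte little-endian floats as the data type and treats the byte range as half-open (start included, end excluded); the function names are illustrative.

```python
import struct

HOUR_MS = 3_600_000  # collection period of 1 hour, in milliseconds


def point_offset(stamp_ms: int, block_start_ms: int, period_ms: int) -> int:
    """Index of the data point inside its data block."""
    return (stamp_ms - block_start_ms) // period_ms


def byte_range(offset: int, unit: int = 4) -> tuple:
    """Half-open byte range [start, end) occupied by the data point."""
    return offset * unit, (offset + 1) * unit


def write_point(block: bytearray, stamp_ms: int, block_start_ms: int,
                period_ms: int, value: float) -> None:
    """Write the value at the position computed from its timestamp."""
    start, _ = byte_range(point_offset(stamp_ms, block_start_ms, period_ms))
    struct.pack_into("<f", block, start, value)


block_start = 1577836800000          # 2020-01-01 00:00:00.000 GMT
block = bytearray(168 * 4)           # 168 points of 4 bytes each
write_point(block, block_start + 8 * HOUR_MS, block_start, HOUR_MS, 1.5)
```

No per-point index is stored: the position of a value follows from its timestamp alone.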
In an embodiment of the invention, when time series data needs to be stored in batches, the data is divided into a plurality of segments according to the time period before being written. If a segment occupies only part of a data block, the data block is located through the Guider that points to it, and the data points of the segment are stored into that data block. If a segment spans a whole block, the data of the segment is appended to the corresponding data file, and the storage position is recorded in the corresponding Guider.
For example, suppose a storage task in batch mode stores the data from 8:29 on September 20, 2020 to 10:20 on September 20, 2020. In this case the time series data spans a plurality of data files. The Guiders of the index file's guide set Muster that are involved are located first: assuming the time span of a data file is 30 minutes, 5 Guiders can be calculated from the start time and the end time. The corresponding data blocks are found through these Guiders. The points at 8:29 and 8:30 are stored into the first data block, the second, third, fourth, and fifth data blocks are appended to the data files, and the corresponding Guiders are updated.
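Splitting a batch into per-block segments can be sketched as follows. The sketch assumes half-open block boundaries, so a point lying exactly on a boundary goes to the later block, which may differ from how the example above treats the 8:30 point. The function name and the use of milliseconds since midnight are illustrative assumptions.

```python
MINUTE_MS = 60_000


def split_batch(start_ms: int, end_ms: int, block_ms: int) -> list:
    """Split the inclusive range [start_ms, end_ms] at block boundaries.

    Returns a list of (block index, segment start, segment end) tuples,
    where both segment bounds are inclusive.
    """
    segments = []
    cursor = start_ms
    while cursor <= end_ms:
        index = cursor // block_ms
        block_end = (index + 1) * block_ms       # first ms of the next block
        segments.append((index, cursor, min(end_ms, block_end - 1)))
        cursor = block_end
    return segments


# 8:29 to 10:20 on the same day, with a 30-minute block span.
start = (8 * 60 + 29) * MINUTE_MS
end = (10 * 60 + 20) * MINUTE_MS
segments = split_batch(start, end, 30 * MINUTE_MS)
```

For this range the split yields 5 segments, matching the 5 Guiders mentioned in the example; only the first and last segments cover part of a block.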
As time passes, the volume of stored time series data grows, and historical time series data is often of limited applicability, so the time series data accumulated over time needs to be compressed.
Further, on the basis of the above embodiment, the present invention adds a compression step for the time series data: given a preset time point t0, all data files before t0 are compressed in sequence into compressed files. As shown in fig. 4, steps 160 to 190 are added after step 150, as follows:
Step 160, acquiring the data files to be compressed that precede the preset time point t0;
Step 170, sequentially writing the identification BlockKey of each data block in each data file into a pre-generated task list file, wherein the BlockKey of each data block is the identification ID generated when the data block was established;
Specifically, fig. 5 is a schematic diagram of an established task list file. The task list file includes a Header and a Body: the Header holds the task pointer, which records the current execution position of the compression task, and the Body records the BlockKey of each data Block;
Step 180, compressing the data blocks in sequence with a compression algorithm according to the task list and writing them into the corresponding compressed file; after each Block is compressed, updating the task pointer so that it points to the next BlockKey, and updating the corresponding index guide Guider with the start and end positions of the Block in the compressed file.
In the present embodiment, the compression algorithm used is a data compression algorithm from the prior art, including but not limited to deflate, bzip2, etc.
Step 190, deleting each data file once all Blocks in it have been compressed, and proceeding in sequence to the next data file before t0.
It can be understood that, in this embodiment, if the compression task is interrupted unexpectedly, the Block compression task in the task list pointed to by the task pointer is continuously executed next time the task is started. When all the data blocks Block of the data file are compressed, searching the data area file before the preset time point t0 for compression in sequence. In this embodiment, since the compression task is not a core task of storing data, a single-thread block-wise compression manner is adopted, so that too many resources occupied by a heavy compression task can be avoided.
In this embodiment, since the index guide records the start and end positions of each Block in the compressed file, each time data in the compressed file is queried, only part of the content of the corresponding compressed file needs to be read and decompressed, and the address of the compressed data Block is taken out from the client, thereby greatly increasing the query efficiency and saving the computing resources.
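Reading a single Block out of a compressed file can be sketched as below. The sketch assumes each Block is compressed independently with Python's `zlib` module (a deflate implementation) and that the recorded range is half-open; the function names are illustrative, and an in-memory buffer stands in for the compressed file.

```python
import io
import zlib


def append_block(compressed_file, raw: bytes) -> tuple:
    """Compress one Block, append it, and return its (start, end) position."""
    compressed_file.seek(0, io.SEEK_END)
    start = compressed_file.tell()
    compressed_file.write(zlib.compress(raw))
    return start, compressed_file.tell()


def read_block(compressed_file, start: int, end: int) -> bytes:
    """Read and decompress only the bytes in [start, end)."""
    compressed_file.seek(start)
    return zlib.decompress(compressed_file.read(end - start))


zdt = io.BytesIO()                       # stands in for a compressed file
pos_a = append_block(zdt, b"a" * 100)
pos_b = append_block(zdt, b"b" * 50)
second = read_block(zdt, *pos_b)         # the first Block is never touched
```

Since each Block is an independent compressed stream, a query only pays the decompression cost of the Blocks it actually needs.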
As an example of this embodiment, assuming that 3 data files are included before the preset time point t0, and each data file contains 3 data blocks, a total of 9 blocks need to be compressed. The compression task execution comprises the following steps:
1. Acquire the data files to be compressed, namely 3 data files such as 2501.DAT, 2502.DAT, and 2503.DAT, which together contain 9 data blocks: B2501-A, B2501-B, B2501-C, and so on;
2. Create a new task list file 2501.DTL to record the progress of compressing the data file 2501.DAT. Write the BlockKey values of the 3 Blocks, B2501-A, B2501-B, and B2501-C, into the task list file Body in sequence, and write the current pointer value 0 into the file Header so that it points to the first Block, B2501-A.
3. Execute the compression algorithm to compress B2501-A, and write the compressed data into the corresponding compressed file 2501.ZDT. The index guide Guider corresponding to B2501-A is overwritten with the storage position of the Block in 2501.ZDT, that is, the start byte and end byte are recorded, for example 0-125. The Header pointer moves down, increasing to 1, and points to the next Block (B2501-B).
4. Repeat the compression process of step 3 until all Blocks in the Body of 2501.DTL are compressed.
5. Delete 2501.DTL, create 2502.DTL, repeat the processes in steps 2 and 3, and execute the compression task for the 2502.DAT data area.
6. Before the next compression task starts, check whether an undeleted task list file exists; if so, continue executing the compression task recorded in that task list file, so that compression resumes from the point of interruption.
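The resumable compression loop can be sketched in memory as follows. The sketch assumes `zlib` (deflate) as the compression algorithm and models the task list, data blocks, and Guiders as plain Python objects; a real implementation would persist the task pointer to the task list file after each Block. All names are hypothetical.

```python
import io
import zlib


class TaskList:
    """Header (task pointer) plus Body (ordered BlockKeys)."""

    def __init__(self, block_keys) -> None:
        self.pointer = 0
        self.body = list(block_keys)

    def done(self) -> bool:
        return self.pointer >= len(self.body)


def run_compression(task, blocks, out, guiders, max_blocks=None) -> bool:
    """Compress Blocks from the task pointer onward.

    `max_blocks` simulates an interruption after that many Blocks.
    Returns True when the whole task list has been processed.
    """
    processed = 0
    while not task.done():
        if max_blocks is not None and processed >= max_blocks:
            return False                         # interrupted; pointer is kept
        key = task.body[task.pointer]
        out.seek(0, io.SEEK_END)
        start = out.tell()
        out.write(zlib.compress(blocks[key]))
        guiders[key] = (start, out.tell())       # start and end in the file
        task.pointer += 1                        # now points to the next Block
        processed += 1
    return True


blocks = {"B2501-A": b"a" * 64, "B2501-B": b"b" * 64, "B2501-C": b"c" * 64}
task, out, guiders = TaskList(blocks), io.BytesIO(), {}
first_run = run_compression(task, blocks, out, guiders, max_blocks=1)
second_run = run_compression(task, blocks, out, guiders)  # resumes at B2501-B
```

Because the pointer only advances after a Block has been written, restarting the task never skips a Block and never compresses a finished Block again.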
Please refer to fig. 6, which is a schematic diagram of an embodiment of a time-series data storage device according to the present invention. Since the apparatus embodiments are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for relevant points. The device embodiments described below are merely illustrative.
The time series data storage device of this embodiment includes:
the acquisition unit 210 is configured to acquire time series data to be stored in each time period, where attributes of the time series data include a metric, an index, a timestamp, and a data value attribute;
a dividing unit 220, configured to divide the time-series data into a plurality of data files according to a predetermined time period, where each data file includes a plurality of data blocks, each data block correspondingly stores the data value of one combination of the metric and the index in one time period, each data file includes time-series data of all combinations of the metric and the index in the time period, and the data files are numbered according to the timestamp;
an index creating unit 230, configured to create an index file, number the index file according to the timestamp, and create an index Guider corresponding to each data block in each index file, where the index Guider has a fixed byte length and points to the storage address of the corresponding data block in the data file, every N Guiders form a guide set Muster, the N Guiders point to the set of the combination of the metric and the index in N different data areas, and each Muster corresponds to N data files, where N is a positive integer;
a guide set positioning table establishing unit 240, configured to establish a guide set positioning table Locates based on each metric and index of the time series data acquired each time, where the guide set positioning table includes a key and a value, the key is a distinct combination of the numerical identifier of a metric and the numerical identifier of an index, and the value is the storage address, in the index file, of the guide set Muster corresponding to that combination;
the storage unit 250 is configured to calculate, for each piece of time series data to be stored, the corresponding index file number and data file number according to the timestamp, locate the Guider according to the Locates and the Muster of the index file with that number in combination with the timestamp, find the data block pointed to by the Guider in the data file with the corresponding number, and store the time series data in that data block.
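The cooperation of units 230 to 250 can be illustrated with a small in-memory sketch. The numbering formulas used here (data file number = timestamp // PERIOD, index file number = data file number // N, Guider slot = data file number % N) are assumptions chosen for illustration, as are the constants PERIOD and N and the names IndexFile, locate and store; a data block is simplified to a dictionary from timestamp to value.

```python
from dataclasses import dataclass, field

PERIOD = 3600  # seconds covered by one data file (assumed)
N = 4          # number of Guiders per Muster (assumed)


@dataclass
class IndexFile:
    # Locates: (metric id, index id) -> position of the Muster
    locates: dict = field(default_factory=dict)
    # each Muster is a list of N Guiders (block address or None)
    musters: list = field(default_factory=list)


def locate(timestamp):
    """Return (index file number, data file number, Guider slot)."""
    data_no = timestamp // PERIOD
    return data_no // N, data_no, data_no % N


def store(index_files, data_files, metric_id, index_id, timestamp, value):
    """Store one data point and return (index_no, data_no, block address)."""
    index_no, data_no, slot = locate(timestamp)
    idx = index_files.setdefault(index_no, IndexFile())
    key = (metric_id, index_id)
    if key not in idx.locates:               # first point of this combination
        idx.locates[key] = len(idx.musters)
        idx.musters.append([None] * N)
    muster = idx.musters[idx.locates[key]]
    blocks = data_files.setdefault(data_no, [])
    if muster[slot] is None:                 # allocate the data block lazily
        muster[slot] = len(blocks)
        blocks.append({})
    blocks[muster[slot]][timestamp] = value
    return index_no, data_no, muster[slot]
```

With these assumed constants, a point with timestamp 3600 lands in data file 1 via the second Guider of its Muster, while a point with timestamp 4 * 3600 moves on to index file 1.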
As an embodiment of the present invention, as shown in fig. 7, the time-series data storage device further includes a compressing unit, configured to compress all the data files before a preset time point t0 into compressed files, where the compressing unit specifically includes:
an obtaining unit 310, configured to obtain the data files to be compressed before the preset time point t0;
the task list file establishing unit 320 is configured to sequentially write the identifier BlockKey of each data block in each data file into a pre-generated task list file, where the identifier BlockKey of each data block is the identifier ID generated when the data block was created; the task list file comprises a file Header and a file Body, where the Header is the task pointer position and records the current compression task execution position, and the Body records the identifier BlockKey of each data Block;
a compression algorithm execution unit 330, configured to sequentially compress the data blocks according to the task list by using a compression algorithm and write the compressed data blocks into the corresponding compressed files, and, after the compression of each Block is finished, to update the task pointer so that it points to the next BlockKey and to update the start and end positions of the Block in the compressed file into the corresponding index Guider;
and the deleting unit 340, configured to delete the data file when all Blocks in each data file have been compressed, and to proceed in sequence to compress the data files before the next t0.
The parts of this embodiment that are the same as the time series data storage method of the first embodiment are not repeated here; please refer to the corresponding parts of the first embodiment.
Corresponding to the storage method for time series data, the application also provides a query method for time series data. Please refer to fig. 8, which is a flowchart of a method for querying time series data according to the present application. Similarly, the parts of this embodiment that are the same as the first embodiment are not repeated; please refer to the corresponding parts of the first embodiment. The query method for time series data comprises the following steps:
Step 410, receiving a query request for time series data, where the query parameters in the query request include the metric, the index, a start timestamp and an end timestamp of the time series data, and the start timestamp and the end timestamp delimit the time period of the data to be queried.
Step 420, based on the query parameters, finding the corresponding data block in the manner of the time series data storage method of the above embodiment, and then extracting the corresponding data value.
When time series data is queried in batches, the storage position of the data block pointed to by the Guider is found in the manner of the batch storage method of the above embodiment of the present invention, and the corresponding data values are then extracted in batches.
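Steps 410 and 420 can be sketched for a time-range query as follows. The sketch is not the patented implementation: the Locates, Muster and Guider lookup is flattened into a single dictionary, the data file number is assumed to be timestamp // PERIOD, and the names query, guiders and blocks are hypothetical.

```python
PERIOD = 3600  # seconds covered by one data file (assumed)


def query(guiders, blocks, metric_id, index_id, start_ts, end_ts):
    """Return [(timestamp, value), ...] with start_ts <= timestamp <= end_ts.

    guiders: dict (metric_id, index_id, data file number) -> block address,
             a flattened stand-in for Locates -> Muster -> Guider
    blocks:  dict block address -> list of (timestamp, value) sorted by time
    """
    out = []
    # Visit only the data files whose time period overlaps the query range.
    for data_no in range(start_ts // PERIOD, end_ts // PERIOD + 1):
        addr = guiders.get((metric_id, index_id, data_no))
        if addr is None:  # nothing stored for this period
            continue
        out.extend((t, v) for t, v in blocks[addr] if start_ts <= t <= end_ts)
    return out
```

Only the data blocks of the requested metric and index combination in the overlapping periods are read; all other blocks are skipped without being opened.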
Corresponding to the query method for the time series data, the application also provides a query device for the time series data. Referring to fig. 9, the apparatus includes:
a receiving unit 510, configured to receive a query request for time series data, where query parameters in the query request include an attribute metric, an index, a start timestamp, and an end timestamp of the time series data;
the query unit 520 is configured to find, based on the query parameters, the corresponding data block in the manner of the time series data storage method of the above embodiment, and then extract the corresponding data value.
As another embodiment of the present invention, a computer-readable storage medium is provided, in which at least one instruction, at least one program, a set of codes, or a set of instructions is stored; the at least one instruction, the at least one program, the set of codes, or the set of instructions are loaded and executed by a processor to implement a method of storing time series data as described above, or a method of querying time series data as described above.
The embodiments of the present invention have been described in detail with reference to the accompanying drawings, but the present invention is not limited to the above embodiments, and various changes can be made within the knowledge of those skilled in the art without departing from the gist of the present invention.

Claims (13)

1. A method for storing time series data of an oil and gas field is characterized by comprising the following steps:
acquiring time series data to be stored in oil and gas field production in each time period, wherein the attributes of the time series data comprise a metric, an index, a timestamp and a data value;
dividing the time series data into a plurality of data files according to a preset time period for storage, wherein each data file comprises a plurality of data blocks, each data block correspondingly stores the data value of one combination of the metric and the index in one time period, each data file comprises all the time series data of the combination of the metric and the index in the time period, and the data files are numbered according to the time stamps;
establishing an index file, numbering the index file according to the timestamp, and establishing an index Guider corresponding to each data block in each index file, wherein the index Guider has a fixed byte length, the index Guider points to the storage address of the corresponding data block in the data file, every N Guiders form a guide set Muster, the N Guiders point to the set of the combination of the metric and the index in N different data areas, each Muster corresponds to N data files, and N is a positive integer;
constructing a guide set positioning table Locates based on each metric and index of the time series data acquired each time, wherein the guide set positioning table comprises a key and a value, the key is a distinct combination of the numerical identifier of a metric and the numerical identifier of an index, and the value is the storage address, in the index file, of the guide set Muster corresponding to that combination;
calculating, for each piece of time series data to be stored, the corresponding index file number and data file number according to the timestamp, positioning to a Guider according to the Locates and the Muster corresponding to the index file with the number and the timestamp, so as to find the data block pointed to by the Guider in the data file with the corresponding number, and storing the time series data into the corresponding data block;
the method further comprises presetting a time point t0 and sequentially compressing all the data files before each t0 into compressed files, specifically comprising the following steps:
acquiring the data files to be compressed before the preset time point t0;
sequentially writing the identifier BlockKey of each data Block in each data file into a pre-generated task list file, wherein the identifier BlockKey of each data Block is the identifier ID generated when the data Block was created, the task list file comprises a file Header and a file Body, the Header is the task pointer position and records the current compression task execution position, and the Body records the identifier BlockKey of each data Block;
sequentially compressing the data blocks by adopting a compression algorithm according to the task list, writing the compressed data blocks into the corresponding compressed files, updating the task pointer after the compression of each Block is finished so that it points to the next BlockKey, and updating the start position and the end position of the Block in the compressed file into the corresponding index Guider;
and when all Blocks in each data file have been compressed, deleting the data file, and proceeding in sequence to compress the data files before the next t0.
2. The method according to claim 1, wherein positioning to the Guider according to the Locates and the Muster corresponding to the index file with the number and the timestamp, so as to find the data block pointed to by the Guider in the data file with the corresponding number, specifically comprises:
finding the storage position of the corresponding Muster in the Locates of the numbered index file based on the numerical identifiers of the metric and the index of the time series data, wherein the numerical identifier of the metric is an integer numerical identifier generated when each metric is established, and the numerical identifier of the index is an integer numerical identifier generated when each index is established;
and obtaining the Guider according to a modulo operation on the timestamp of the time series data and the value of N, and obtaining the data block in which the time series data is to be stored according to the storage address pointed to by the Guider.
3. The method for storing time series data of oil and gas fields according to claim 1, wherein the storing the time series data into the corresponding data block further comprises: and calculating the storage offset of the time sequence data in the data block according to the time stamp of the time sequence data and the time period, and writing the time sequence data into the storage position specified by the offset.
4. The method for storing time series data of oil and gas fields according to claim 1, further comprising: when the time series data needs to be stored in batches, dividing the time series data into a plurality of sections according to the time period before writing; if a section occupies only part of a data block, finding the data block pointed to by the Guider and storing the data points of the section in that data block; and if a section spans a whole block, adding the data of the section to different data files and recording the storage position pointed to by the corresponding Guider.
5. The oil and gas field time series data storage method according to any one of claims 1 to 4, characterized in that when a compression task has been interrupted and the next compression task is started, the task list of the previous compression is found, the compression task execution position is located, and execution of the compression task is continued.
6. An oil and gas field time series data storage device, characterized in that, this storage device includes:
an acquisition unit, configured to acquire time series data to be stored in oil and gas field production in each time period, wherein the attributes of the time series data comprise a metric, an index, a timestamp and a data value;
the dividing unit is used for dividing the time series data into a plurality of data files according to a preset time period for storage, each data file comprises a plurality of data blocks, each data block correspondingly stores the data value of one combination of the metric and the index in one time period, each data file comprises the time series data of all the combinations of the metric and the index in the time period, and the data files are numbered according to the time stamps;
an index establishing unit, configured to establish an index file, number the index file according to the timestamp, and establish an index Guider corresponding to each data block in each index file, where the index Guider has a fixed byte length, the index Guider points to the storage address of the corresponding data block in the data file, every N Guiders form a guide set Muster, the N Guiders point to the set of the combination of the metric and the index in N different data areas, and each Muster corresponds to N data files, where N is a positive integer;
a guide set positioning table establishing unit, configured to establish a guide set positioning table Locates based on each metric and index of the time series data acquired each time, where the guide set positioning table includes a key and a value, the key is a distinct combination of the numerical identifier of a metric and the numerical identifier of an index, and the value is the storage address, in the index file, of the guide set Muster corresponding to that combination;
a storage unit, configured to calculate, for each piece of time series data to be stored, the corresponding index file number and data file number according to the timestamp, position to a Guider according to the Locates and the Muster corresponding to the index file with the number and the timestamp, so as to find the data block pointed to by the Guider in the data file with the corresponding number, and store the time series data into the corresponding data block;
the storage device further comprises a compression unit, configured to compress all the data files before a preset time point t0 into compressed files, and the compression unit specifically comprises:
an obtaining unit, configured to obtain the data files to be compressed before the preset time point t0;
a task list file establishing unit, configured to sequentially write the identifier BlockKey of each data block in each data file into a pre-generated task list file, wherein the BlockKey is the identifier ID generated when each data block was created; the task list file comprises a file Header and a file Body, wherein the Header is the task pointer position and records the current compression task execution position, and the Body records the identifier BlockKey of each data Block;
a compression algorithm execution unit, configured to sequentially compress the data blocks according to the task list by adopting a compression algorithm and write the compressed data blocks into the corresponding compressed files, and, after the compression of each Block is finished, to update the task pointer so that it points to the next BlockKey and to update the start and end positions of the Block in the compressed file into the corresponding index Guider;
and a deleting unit, configured to delete the data file when all Blocks in each data file have been compressed, and to proceed in sequence to compress the data files before the next t0.
7. The oil and gas field time series data storage device according to claim 6, wherein positioning to the Guider according to the Locates and the Muster corresponding to the index file with the number and the timestamp, so as to find the data block pointed to by the Guider in the data file with the corresponding number, specifically comprises:
finding the storage position of the corresponding Muster in the Locates of the numbered index file based on the numerical identifiers of the metric and the index of the time series data, wherein the numerical identifier of the metric is an integer numerical identifier generated when each metric is established, and the numerical identifier of the index is an integer numerical identifier generated when each index is established;
and obtaining the Guider according to a modulo operation on the timestamp of the time series data and the value of N, and obtaining the data block in which the time series data is to be stored according to the storage address pointed to by the Guider.
8. An oil and gas field time series data storage device as claimed in claim 6, wherein said storing said time series data into corresponding said data blocks comprises: and calculating the storage offset of the time sequence data in the data block according to the time stamp of the time sequence data and the time period, and writing the time sequence data into the storage position specified by the offset.
9. The oil and gas field time series data storage device according to any one of claims 6 to 8, further comprising an interruption execution unit for finding the task list compressed last time and finding the compression task execution position to continue executing the compression task when the compression task is interrupted and started next time.
10. A method for querying time series data of an oil and gas field, characterized by comprising the following steps:
receiving a query request for time series data, wherein the query parameters in the query request comprise the metric, the index, a start timestamp and an end timestamp of the time series data;
finding, based on the query parameters, the corresponding data block in accordance with the oil and gas field time series data storage method according to any one of claims 1-3, and further extracting the corresponding data value.
11. The method according to claim 10, wherein when the time series data is queried in batches, the storage position of the data block pointed to by the Guider is found in accordance with the oil and gas field time series data storage method according to claim 4, and the corresponding data values are extracted in batches.
12. An oil and gas field time series data query device, characterized in that the device includes:
a receiving unit, configured to receive a query request for time series data, wherein the query parameters in the query request comprise the metric, the index, a start timestamp and an end timestamp of the time series data;
a query unit, configured to find, based on the query parameters, the corresponding data block in accordance with the oil and gas field time series data storage method according to any one of claims 1 to 4, and further extract the corresponding data value.
13. A computer readable storage medium having stored therein at least one instruction, at least one program, a set of codes, or a set of instructions; the at least one instruction, the at least one program, the set of codes or the set of instructions being loaded and executed by a processor to implement the oil and gas field time series data storage method according to any one of claims 1 to 5 or the oil and gas field time series data query method according to any one of claims 10 to 11.
CN202011160744.1A 2020-10-27 2020-10-27 Oil-gas field time sequence data storage method, oil-gas field time sequence data query device and storage medium Active CN112286867B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011160744.1A CN112286867B (en) 2020-10-27 2020-10-27 Oil-gas field time sequence data storage method, oil-gas field time sequence data query device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011160744.1A CN112286867B (en) 2020-10-27 2020-10-27 Oil-gas field time sequence data storage method, oil-gas field time sequence data query device and storage medium

Publications (2)

Publication Number Publication Date
CN112286867A CN112286867A (en) 2021-01-29
CN112286867B true CN112286867B (en) 2022-03-01

Family

ID=74372693

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011160744.1A Active CN112286867B (en) 2020-10-27 2020-10-27 Oil-gas field time sequence data storage method, oil-gas field time sequence data query device and storage medium

Country Status (1)

Country Link
CN (1) CN112286867B (en)

Families Citing this family (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112463803A (en) * 2021-02-01 2021-03-09 山东柏源技术有限公司 Time sequence data storage method, device and equipment for petroleum production
CN112579834B (en) * 2021-02-22 2021-09-03 北京工业大数据创新中心有限公司 Industrial equipment data storage method and system
CN113032453B (en) * 2021-02-25 2024-03-01 广州虎牙科技有限公司 Data storage and decompression method and device, electronic equipment and storage medium
CN113515576A (en) * 2021-07-13 2021-10-19 北京字节跳动网络技术有限公司 Data processing method and device, electronic equipment and computer readable medium
CN113360551B (en) * 2021-08-11 2021-11-16 南京赛宁信息技术有限公司 Method and system for storing and rapidly counting time sequence data in shooting range
CN114630030A (en) * 2022-03-10 2022-06-14 陕西安控科技有限公司 Oil and gas field mobile measure operation equipment monitoring system and method thereof
CN114615306B (en) * 2022-05-10 2022-07-29 中南林业科技大学 Efficient file system of sink node in Internet of things and processing method thereof
CN116304390B (en) * 2023-04-13 2024-02-13 北京基调网络股份有限公司 Time sequence data processing method and device, storage medium and electronic equipment
CN117573703B (en) * 2024-01-16 2024-04-09 科来网络技术股份有限公司 Universal retrieval method, system, equipment and storage medium for time sequence data

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106776967A (en) * 2016-12-05 2017-05-31 哈尔滨工业大学(威海) Mass small documents real-time storage method and device based on sequential aggregating algorithm
CN108256088A (en) * 2018-01-23 2018-07-06 清华大学 A kind of storage method and system of the time series data based on key value database
CN109164980A (en) * 2018-08-03 2019-01-08 北京涛思数据科技有限公司 A kind of optimizing polymerization processing method of time series data
CN111309720A (en) * 2018-12-11 2020-06-19 北京京东尚科信息技术有限公司 Time sequence data storage method, time sequence data reading method, time sequence data storage device, time sequence data reading device, electronic equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10904192B2 (en) * 2016-07-27 2021-01-26 Sap Se Time series messaging persistence and publication

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
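Optimized management and fast query of distributed-storage dynamic big data of oil and gas production; Wang Hongliang; Petroleum Exploration and Development; 2019-10-31; Vol. 46 (No. 5); 959-965 *
Research on a monitoring data storage method based on a time series database; Lin Zhida et al.; Electronic Components and Information Technology; 2020-01-20 (No. 01); full text *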

Also Published As

Publication number Publication date
CN112286867A (en) 2021-01-29

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant