CN112286867A - Oil-gas field time sequence data storage method, oil-gas field time sequence data query device and storage medium - Google Patents
- Publication number
- CN112286867A (CN202011160744.1A)
- Authority
- CN
- China
- Prior art keywords
- data
- index
- file
- time
- block
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/11—File system administration, e.g. details of archiving or snapshots
- G06F16/113—Details of archiving
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/13—File access structures, e.g. distributed indices
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/14—Details of searching files based on file metadata
- G06F16/148—File search processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/16—File or folder operations, e.g. details of user interfaces specifically adapted to file systems
- G06F16/162—Delete operations
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/17—Details of further file system functions
- G06F16/174—Redundancy elimination performed by the file system
- G06F16/1744—Redundancy elimination performed by the file system using compression, e.g. sparse files
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Library & Information Science (AREA)
- Human Computer Interaction (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention belongs to the technical field of oilfield production data processing and discloses a storage method and a query method for oil and gas field time series data, together with corresponding devices and a storage medium. The invention can reduce server storage overhead, avoid storing redundant data, relieve the write pressure that millions of devices place on a single-machine system, and improve data query efficiency.
Description
Technical Field
The invention relates to the technical field of oilfield production data processing, in particular to an oil and gas field time sequence data storage method, an oil and gas field time sequence data query device and a storage medium.
Background
With the continued advancement of the internet in the field of oil and gas field production, oil well production monitoring platforms generate more and more time-based data, referred to as time series data.
Monitoring a single object requires a large number of sensors, so a large amount of data has to be stored. As monitored indexes accumulate, a large amount of historical data must be kept on the data storage equipment, and if the data is not handled properly, storage resources are wasted.
Time series data is often stored and processed in a relational database, but the storage format of a relational database cannot store time series data efficiently. The format of time series data calls for a dedicated storage mode: a time series database can efficiently store and rapidly process massive time series data, and it usually has to sustain real-time writes from millions or even tens of millions of detection devices.
A relational database has the following disadvantages when storing time series data:
1. Large storage consumption: the relational database stores redundant data and occupies a large amount of storage resources;
2. Poor write performance: a single-machine system can hardly sustain the write pressure of millions or even tens of millions of devices;
3. Low query efficiency: relational databases are suited to transactional service scenarios, and their analysis performance on time series data points is poor.
Disclosure of Invention
The invention aims to solve at least one of the technical problems in the prior art, and provides a time series data storage method, a query method, corresponding devices and a storage medium, which can reduce server storage overhead, avoid storing redundant data, relieve the write pressure that millions of devices place on a single-machine system, and improve data query efficiency.
According to a first aspect of the invention, there is provided a method for storing oil and gas field time series data, the method comprising:
acquiring time series data to be stored in oil and gas field production in each time period, wherein the attributes of the time series data comprise a metric, an index, a timestamp and a data value;
dividing the time series data into a plurality of data files according to a preset time period for storage, wherein each data file comprises a plurality of data blocks, each data block stores the data values of one combination of the metric and the index in one time period, each data file comprises the time series data of all combinations of the metric and the index in the time period, and the data files are numbered according to the timestamps;
establishing index files, numbering the index files according to the timestamps, and establishing in each index file an index guide Guider corresponding to each data block, wherein the Guider has a fixed length in bytes and points to the storage address of the corresponding data block in the data file, every N Guiders form a guide set Muster, the N Guiders of a Muster point to the data blocks of one combination of the metric and the index in N different data areas, each Muster corresponds to N data files, and N is a positive integer;
constructing a guide set positioning table Locates based on each metric and index of the time series data acquired each time, wherein the guide set positioning table comprises keys and values, each key is a combination of the numerical identifier of a metric and the numerical identifier of an index, and each value is the storage address, in the index file, of the guide set Muster corresponding to that combination;
and, for each piece of time series data to be stored, calculating the corresponding index file number and data file number from its timestamp, locating a Guider from Locates and the Muster of the index file with that number in combination with the timestamp, finding the data block pointed to by the Guider in the data file with the corresponding number, and storing the time series data into that data block.
According to a second aspect of the present invention, there is provided a storage device for oil and gas field time series data, the storage device comprising:
an acquisition unit, configured to acquire time series data to be stored in oil and gas field production in each time period, wherein the attributes of the time series data comprise a metric, an index, a timestamp and a data value;
a dividing unit, configured to divide the time series data into a plurality of data files according to a preset time period for storage, wherein each data file comprises a plurality of data blocks, each data block stores the data values of one combination of the metric and the index in one time period, each data file comprises the time series data of all combinations of the metric and the index in the time period, and the data files are numbered according to the timestamps;
an index establishing unit, configured to establish index files, number the index files according to the timestamps, and establish in each index file an index guide Guider corresponding to each data block, wherein the Guider has a fixed length in bytes and points to the storage address of the corresponding data block in the data file, every N Guiders form a guide set Muster, the N Guiders of a Muster point to the data blocks of one combination of the metric and the indicator in N different data areas, each Muster corresponds to N data files, and N is a positive integer;
a guide set positioning table establishing unit, configured to establish a guide set positioning table Locates based on each metric and index of the time series data acquired each time, wherein the guide set positioning table comprises keys and values, each key is a combination of the numerical identifier of a metric and the numerical identifier of an index, and each value is the storage address, in the index file, of the guide set Muster corresponding to that combination;
and a storage unit, configured to calculate, for each piece of time series data to be stored, the corresponding index file number and data file number from its timestamp, locate a Guider from Locates and the Muster of the index file with that number in combination with the timestamp, find the data block pointed to by the Guider in the data file with the corresponding number, and store the time series data into that data block.
According to a third aspect of the invention, there is provided a method for querying time series data of an oil and gas field, the method comprising the steps of:
receiving a query request for time series data, wherein the query parameters in the query request comprise a time period and the metric, index and timestamp attributes of the time series data;
and locating, based on the query parameters, the corresponding data blocks stored by the storage method of the first aspect, and extracting the corresponding data values.
According to a fourth aspect of the present invention, there is provided an oil and gas field time series data query device, the device comprising:
a receiving unit, configured to receive a query request for time series data, wherein the query parameters in the query request comprise a time period and the metric, index and timestamp attributes of the time series data;
and a query unit, configured to locate, based on the query parameters, the corresponding data blocks stored by the storage method of the first aspect, and extract the corresponding data values.
According to a fifth aspect of the present invention, there is provided a computer-readable storage medium having stored therein at least one instruction, at least one program, set of codes, or set of instructions; the at least one instruction, the at least one program, the set of codes or the set of instructions is loaded and executed by a processor to implement a method of storing field time series data as described in the above first aspect or a method of querying field time series data as described in the above third aspect.
The beneficial effects brought by the technical scheme provided by the embodiments of the application at least comprise the following. The time series data of the oil and gas field is divided into data blocks for storage, which reduces storage consumption. A guide set positioning table is constructed from the metric and index attributes of the time series data, so that a large amount of data can be stored by key and value, flexible conditional queries are supported, and the query efficiency of time series data is improved. The time series data is divided into regions and then stored sequentially, and a data value can be obtained by calculating its offset within the data block, so the data in a data block needs to be indexed only once, which greatly improves storage capacity and query efficiency.
Drawings
The invention is further described below with reference to the accompanying drawings and examples;
FIG. 1 is a flow chart of a method for storing time series data of an oil and gas field according to an embodiment of the present invention;
FIG. 2 is a schematic diagram illustrating a storage form of index files according to an embodiment of the present invention;
FIG. 3 is a diagram illustrating a combination of metrics and indicators as a key according to an embodiment of the present invention;
FIG. 4 is a flowchart illustrating a compression algorithm for time series data of the oil and gas field according to an embodiment of the present invention;
FIG. 5 is a diagram illustrating a task list file established according to an embodiment of the present invention;
FIG. 6 is a schematic view of an oil and gas field time series data storage device according to an embodiment of the invention;
FIG. 7 is a schematic diagram of a compression unit according to an embodiment of the present invention;
FIG. 8 is a schematic flow chart of a method for querying time series data of an oil and gas field according to an embodiment of the present invention;
FIG. 9 is a schematic flow chart of an apparatus for querying time series data of an oil and gas field according to an embodiment of the present invention.
Detailed Description
Reference will now be made in detail to the present preferred embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to like elements throughout. The specific embodiments described herein are merely illustrative of the invention and are not to be construed as limiting the invention.
Fig. 1 is a flow chart of a method for storing time series data of an oil and gas field, which is provided by an embodiment of the invention and is used in the field of oil and gas field production, and the method can be executed by a time series data storage device, wherein the device can be implemented by software and/or hardware and can be generally integrated in an electronic device. As shown in fig. 1, the method includes:
Step 110, acquiring time series data to be stored in oil and gas field production in each time period, wherein the attributes of the time series data comprise a metric, an index, a timestamp and a data value.
Time series data is a sequence of values recorded in chronological order; it changes over time, and the degree of change can be reflected numerically. The time series data may be acquired at a fixed frequency in the petroleum production field, or may be time series data from other industrial fields. The time period refers to the acquisition period of the measurements. The time series data has at least four attributes: a metric Metric, an indicator Target, a timestamp Stamp, and the data value itself.
It is understood that Metric in the embodiment of the present application represents the attribution of the data, and Target is a measurement value acquired at a specified frequency. For example, in "voltage of oil well No. 1" and "average temperature in Beijing", oil well No. 1 and Beijing are metric attributes, while voltage and average temperature are index attributes. The timestamp Stamp is the total number of milliseconds (a positive integer) elapsed since 00:00:00.000 on January 1, 1970, Greenwich time (08:00:00.000 on January 1, 1970, Beijing time). The time series data of embodiments of the present application may have a number of different metrics and indexes.
Specifically, when the time series data is to be written into storage, it is divided along the time dimension into a plurality of data areas (Regions). A data area covers a long time period, for example 7 days, and each data area Region is stored in a single data file. Each data area contains the time series data of all metric and index combinations during its time period. A plurality of data blocks (Blocks) are defined in the data area Region; each data Block holds the time series data of one combination of a Metric and an index Target, so the number of data blocks in a data area equals the number of metric and index combinations. For example, if the time series data in one data file has metric attributes M1 and M2 and index attributes T1 and T2, there are four combinations of metric and index, namely M1T1, M1T2, M2T1 and M2T2, and the data file contains four data blocks, each storing the data values of one combination within the time period.
It can be understood that, in this embodiment, the time series data is divided into different data files for storage, and the data files are numbered so that a specified data file can be located quickly. Assuming that the time interval of the data area is 7 days, the data from 1970-01-01 00:00:00.000 to 1970-01-07 00:00:00.000 is placed in one data area, i.e., one data file. A data file number is then assigned based on the timestamp.
For example, the data file number is the integer part of the millisecond timestamp at the start of the data area divided by the number of milliseconds in 7 days.
In an embodiment of the present invention, the index file number is calculated as follows: take the difference between the current timestamp Stamp and the Greenwich time origin Stamp0 (00:00:00.000 on January 1, 1970), divide this difference by the number of milliseconds in N × the time interval, and round the result down (floor) to obtain the index file number.
For example, when N is 8 and the time interval is 7 days, the index file number is: floor((Stamp - Stamp0) / (8 × 7 × 86400000)).
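The two numbering rules can be sketched as follows. This is a minimal illustration rather than the patented implementation; the function names `data_file_number` and `index_file_number` are hypothetical, and the sketch assumes Stamp0 is 0 (the Unix epoch in milliseconds).

```python
from datetime import datetime, timezone

DAY_MS = 86_400_000  # milliseconds in one day


def data_file_number(stamp_ms, interval_days=7):
    """Integer part of the timestamp divided by the data-area length."""
    return stamp_ms // (interval_days * DAY_MS)


def index_file_number(stamp_ms, n=8, interval_days=7, stamp0_ms=0):
    """floor((Stamp - Stamp0) / (N * interval)), as in the formula above."""
    return (stamp_ms - stamp0_ms) // (n * interval_days * DAY_MS)


# Millisecond timestamp of 2020-01-01 00:00:00 UTC (illustrative input).
stamp = int(datetime(2020, 1, 1, tzinfo=timezone.utc).timestamp() * 1000)
print(data_file_number(stamp))   # 2608
print(index_file_number(stamp))  # 326
```

With N = 8, one index file therefore covers eight consecutive data files, which is why the index file number here equals the data file number divided by 8 and rounded down.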
Illustratively, as shown in fig. 2, the storage form of the created index file is represented schematically, the Y axis TS-Y represents the index file number, and the X axis Muster-X represents the storage location of the Muster in the index file.
Step 140, constructing a guide set positioning table Locates based on each metric and index of the time series data acquired each time, where the guide set positioning table includes keys and values, each key is a combination of the numerical identifier of a metric and the numerical identifier of an index, and each value is the storage address, in the index file, of the guide set Muster corresponding to that combination.
In a specific embodiment of the present invention, when each Metric is established, an integer identifier KM (for example, 4 bytes in length) is generated for it; when each index Target is established, an integer identifier KT (for example, 4 bytes in length) is likewise generated for it. For example, the integer identifiers of the index Targets are generated sequentially: the identifier KT0 of the first created index is 0, the identifier KT1 of the second created index is 1, and the identifier KTn of the n-th created index is n. The purpose of generating integer identifiers is to make it easy to combine a metric with an index. The guide set positioning table Locates in this embodiment is a list of keys and values, where each key (LocateKey) is a combination of KM and KT. For example, as shown in fig. 3, one way to combine KM and KT into a key (LocateKey) is to shift the KM of the Metric to the left and apply a logical OR with the KT of the index Target, which yields an 8-byte long integer.
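A minimal sketch of this key construction is given below. The 32-bit shift width is an assumption inferred from the 4-byte identifiers and the 8-byte result; the names `locate_key` and `split_locate_key` are hypothetical and do not appear in the source.

```python
def locate_key(km, kt):
    """Pack a 4-byte metric id and a 4-byte index id into one 8-byte key.

    Assumption: KM occupies the high 32 bits and KT the low 32 bits.
    """
    if not (0 <= km < 2**32 and 0 <= kt < 2**32):
        raise ValueError("identifiers must fit in 4 unsigned bytes")
    return (km << 32) | kt


def split_locate_key(key):
    """Recover (KM, KT) from a key built by locate_key."""
    return key >> 32, key & 0xFFFFFFFF


key = locate_key(1, 2)
print(key)                    # 4294967298
print(split_locate_key(key))  # (1, 2)
```

Because both halves have a fixed width, the combination is reversible and two different (KM, KT) pairs can never produce the same key.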
In this embodiment, the value corresponding to a key is the storage address, in the index file, of the guide set Muster for that combination of metric and index. Storage addresses may be allocated as follows. Since the index guide Guider has a fixed length, assume it contains 6 data items of 8 bytes each (48 bytes), and a guide set Muster contains 8 Guiders, so that a guide set occupies 384 bytes. Assuming that Metric1 is combined with the indexes Target1 and Target2 and that addresses are allocated in the order in which the indexes were created, the guide set Muster for the combination of Metric1 and Target1 occupies bytes 0-383 of the index file, and the guide set Muster for the combination of Metric1 and Target2 occupies bytes 384-767.
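The allocation rule can be illustrated with a small in-memory table. This is only a sketch under the sizes assumed in the example (6 × 8-byte fields per Guider, 8 Guiders per Muster); the class name `Locates` mirrors the table in the text, but its methods are hypothetical, and addresses are expressed as zero-based byte offsets, so the second Muster starts at offset 384.

```python
GUIDER_BYTES = 6 * 8                               # 48 bytes per Guider
GUIDERS_PER_MUSTER = 8                             # N in the text
MUSTER_BYTES = GUIDER_BYTES * GUIDERS_PER_MUSTER   # 384 bytes per Muster


class Locates:
    """Maps a LocateKey to the byte offset of its Muster in an index file."""

    def __init__(self):
        self._table = {}

    def address_of(self, key):
        # A new key gets the next free fixed-size slot, in creation order.
        if key not in self._table:
            self._table[key] = len(self._table) * MUSTER_BYTES
        return self._table[key]


locates = Locates()
m1_t1 = (1 << 32) | 0   # Metric1 with Target1 (illustrative identifiers)
m1_t2 = (1 << 32) | 1   # Metric1 with Target2
print(locates.address_of(m1_t1))  # 0
print(locates.address_of(m1_t2))  # 384
```

Since every Muster has the same size, the address is a single multiplication and no search through the index file is required.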
It can be understood that, when time series data is stored in the above manner, the index guide Guider and the guide set Muster have fixed lengths, so no secondary index calculation is needed, a large amount of search time is saved, and the time complexity of a lookup is O(1). The data blocks themselves are stored in an unordered manner, which avoids reserving storage for invalid data blocks.
Specifically, the key is calculated from the numerical identifiers KM and KT of the metric and the index of the time series data to be stored, and the value corresponding to that key, that is, the storage location of the corresponding Muster, is looked up in Locates for the index file with the calculated number. The timestamp of the time series data is then divided by the time span of a data file, and the integer quotient is taken modulo the number of Guiders contained in each Muster to obtain the Guider; the data block in which the time series data is to be stored is obtained from the storage address pointed to by that Guider.
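The Guider selection can be sketched as below, assuming 8 Guiders of 48 bytes per Muster and a 7-day data file span as in the earlier examples. The helper names `guider_slot` and `guider_address` are hypothetical.

```python
DAY_MS = 86_400_000
GUIDER_BYTES = 48          # assumed fixed Guider length (6 x 8 bytes)


def guider_slot(stamp_ms, n=8, interval_days=7):
    """Position of the Guider inside its Muster: data file number mod N."""
    return (stamp_ms // (interval_days * DAY_MS)) % n


def guider_address(muster_address, stamp_ms, n=8, interval_days=7):
    """Byte offset of the Guider in the index file."""
    return muster_address + guider_slot(stamp_ms, n, interval_days) * GUIDER_BYTES


stamp = 1_577_836_800_000 + 7 * DAY_MS   # 2020-01-08 00:00:00 UTC
print(guider_slot(stamp))                # 1
print(guider_address(384, stamp))        # 432
```

Both steps are plain arithmetic on fixed-size records, which is what allows the lookup to run in constant time.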
Further, storing the time series data into the corresponding data block, further comprises: and calculating the storage offset of the time sequence data in the data block according to the time stamp of the time sequence data and the time period, and writing the time sequence data into the data point position specified by the offset.
For example, if the data area Region is 7 days and the collection time period is 1 hour, then a data Block holds 168 (7 × 24) data points. Assume the data Block contains data for the period from 2020-01-01 00:00:00.000 to 2020-01-07 00:00:00.000. When a data point has the timestamp 2020-01-01 08:00:00.000, the offset is the time difference between 2020-01-01 00:00:00.000 and 2020-01-01 08:00:00.000 divided by the period of 1 hour, which equals 8. Since data is stored in bytes, when the unit length of the data type is 4 bytes, the data point occupies the 32nd to 36th bytes.
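The offset calculation in this example can be reproduced with the sketch below. It assumes 4-byte little-endian floats as the data type, which the source does not specify; `point_offset` and `write_point` are hypothetical names.

```python
import struct

HOUR_MS = 3_600_000
UNIT_BYTES = 4  # assumed unit length of the data type


def point_offset(stamp_ms, block_start_ms, period_ms):
    """Index of the data point inside its block."""
    return (stamp_ms - block_start_ms) // period_ms


def write_point(block, stamp_ms, block_start_ms, period_ms, value):
    """Write one value at its computed position; return the byte range used."""
    start = point_offset(stamp_ms, block_start_ms, period_ms) * UNIT_BYTES
    struct.pack_into("<f", block, start, value)
    return start, start + UNIT_BYTES


block_start = 1_577_836_800_000        # 2020-01-01 00:00:00 UTC
block = bytearray(168 * UNIT_BYTES)    # 7 days x 24 hourly points
span = write_point(block, block_start + 8 * HOUR_MS, block_start, HOUR_MS, 12.5)
print(span)                                   # (32, 36)
print(struct.unpack_from("<f", block, 32)[0])  # 12.5
```

Reading a point back uses the same arithmetic, so no per-point index entry is needed inside a block.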
In an embodiment of the invention, when time series data needs to be stored in batch, the data is divided before writing into a plurality of sections according to the time period. If a section occupies only part of a data block, the data block is found through the Guider and the data points of the section are stored into it. If a section spans a whole block, the data of the section is appended to the respective data file, and the storage position is recorded in the corresponding guide Guider.
For example, a storage task in batch mode stores the data from 8:29 to 10:20 on September 20, 2020. In this case the time series data spans several data files. The Guiders in the guide set Muster of the index file are located first: assuming that the time period of a data file is 30 minutes, 5 Guiders can be calculated from the start time and the end time. The corresponding data blocks are found through the Guiders; the data points from 8:29 to 8:30 are stored into the first data block, the second, third, fourth and fifth data blocks are appended to the data files, and the corresponding Guiders are updated.
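The splitting of a batch into per-file sections can be sketched as follows. The function `split_by_data_file` is hypothetical, and it assumes each data file covers a half-open interval (a point exactly on a boundary belongs to the later file), a detail the source leaves open. Times are given in milliseconds since midnight for brevity.

```python
MIN_MS = 60_000


def split_by_data_file(start_ms, end_ms, interval_ms):
    """Split the inclusive range [start_ms, end_ms] at data file boundaries.

    Returns a list of (file_number, section_start, section_end) tuples.
    """
    sections = []
    cursor = start_ms
    while cursor <= end_ms:
        number = cursor // interval_ms
        last_ms_of_file = (number + 1) * interval_ms - 1
        section_end = min(last_ms_of_file, end_ms)
        sections.append((number, cursor, section_end))
        cursor = section_end + 1
    return sections


start = (8 * 60 + 29) * MIN_MS    # 08:29
end = (10 * 60 + 20) * MIN_MS     # 10:20
sections = split_by_data_file(start, end, 30 * MIN_MS)
print(len(sections))                # 5
print([s[0] for s in sections])     # [16, 17, 18, 19, 20]
```

The first and last sections are partial and go into existing blocks, while the three middle sections each cover a whole 30-minute file.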
As time passes, the volume of stored time series data grows, and historical time series data is typically seldom used, so the accumulated historical data needs to be compressed.
Further, on the basis of the above embodiment, the present invention adds a compression step for the time series data: with a preset time point t0, all data files before t0 are compressed in turn into compressed files. As shown in fig. 4, steps 160 to 190 are added after step 150, as follows:
Specifically, fig. 5 is a schematic diagram of an established task list file. The task list file includes a header and a body, where the header holds the task pointer, which records the current execution position of the compression task, and the body records the identifier BlockKey of each data Block;
Step 180, compressing the data blocks in turn with a compression algorithm according to the task list and writing them into the corresponding compressed file; after the compression of each Block is finished, updating the task pointer so that it points to the next BlockKey, and updating the start and end positions of the Block in the compressed file into the corresponding index guide Guider.
In the present embodiment, the compression algorithm used is a data compression algorithm in the prior art, including but not limited to deflate, bzip2, etc.
Step 190, deleting each data file once all Blocks in it have been compressed, and compressing in turn the data files before the next t0.
It can be understood that, in this embodiment, if the compression task is interrupted unexpectedly, the Block compression task pointed to by the task pointer in the task list continues to be executed the next time the task is started. When all the data Blocks of a data file have been compressed, the remaining data area files before the preset time point t0 are found and compressed in sequence. In this embodiment, since compression is not a core task of data storage, a single-threaded, block-by-block compression mode is adopted, which prevents a heavy compression task from occupying too many resources.
In this embodiment, since the index guide Guider records the start and end positions of each Block in the compressed file, each query on compressed data only needs to read and decompress the relevant part of the corresponding compressed file, and the address of the compressed data Block is taken out from the client, thereby greatly increasing the query efficiency and saving computing resources.
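The benefit of recording start and end positions can be illustrated with Python's `zlib` module, which implements deflate. This is a sketch only: the compressed file is modelled as an in-memory `bytearray`, and `append_block` and `read_block` are hypothetical helpers rather than functions named in the source.

```python
import zlib


def append_block(compressed, raw):
    """Compress one block, append it, and return its (start, end) offsets."""
    start = len(compressed)
    compressed += zlib.compress(raw)
    return start, len(compressed)


def read_block(compressed, start, end):
    """Decompress only the byte range recorded for one block."""
    return zlib.decompress(bytes(compressed[start:end]))


compressed_file = bytearray()
positions = {}   # stands in for the positions stored in each Guider
positions["B2501-A"] = append_block(compressed_file, b"\x00" * 672)
positions["B2501-B"] = append_block(compressed_file, b"\x01" * 672)

# Only block B is read and decompressed; block A is never touched.
start, end = positions["B2501-B"]
print(read_block(compressed_file, start, end) == b"\x01" * 672)  # True
```

Each block is compressed as an independent stream, so a query touches only the bytes between the recorded offsets instead of inflating the whole file.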
As an example of this embodiment, assuming that 3 data files are included before the preset time point t0, and each data file contains 3 data blocks, a total of 9 blocks need to be compressed. The compression task execution comprises the following steps:
1. Acquire the data files to be compressed, namely 3 data files such as 2501.DAT, 2502.DAT and 2503.DAT, containing 9 data blocks in total, such as B2501-A, B2501-B, B2501-C … and the like;
2. Create a new task list file 2501.DTL to record the progress of compressing the data file 2501.DAT. The identifier BlockKey values of the 3 Blocks, B2501-A, B2501-B and B2501-C, are written into the task list file Body in sequence, and the current pointer value 0 is written into the file Header, pointing to the first Block, namely B2501-A.
3. Execute the compression algorithm to compress B2501-A and write the compressed data into the corresponding compressed file 2501.ZDT. The index guide Guider corresponding to B2501-A is overwritten with the storage position of the Block in 2501.ZDT, that is, its start byte and end byte, for example 0-125. The Header pointer moves down, increasing to 1, and points to the next Block (B2501-B).
4. Repeat the compression process of step 3 until all Blocks in the Body of 2501.DTL are compressed.
5. Delete 2501.DTL, create 2502.DTL, and repeat steps 2 and 3 to execute the compression task for the 2502.DAT data area.
6. Before the next compression task starts, check whether an undeleted task list file exists; if so, continue executing the compression task in that task list file, thereby achieving resumable compression from the breakpoint.
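The steps above can be sketched as a small resumable loop. Several details are assumptions made for illustration: the task list is stored as JSON rather than as a binary header and body, the compressed file is an in-memory `bytearray`, `zlib` stands in for the compression algorithm, and the `stop_after` parameter exists only to simulate an interruption. A production version would also need to handle a crash between appending a block and advancing the pointer.

```python
import json
import os
import tempfile
import zlib


def save_task_list(path, task):
    with open(path, "w") as fh:
        json.dump(task, fh)


def load_or_create_task_list(path, block_keys):
    """Reuse an undeleted task list (resume) or start a new one at pointer 0."""
    if os.path.exists(path):
        with open(path) as fh:
            return json.load(fh)
    task = {"pointer": 0, "keys": list(block_keys)}
    save_task_list(path, task)
    return task


def compress_data_file(task_path, blocks, compressed, guiders, stop_after=None):
    """Compress blocks one by one; return True when the data file is finished."""
    task = load_or_create_task_list(task_path, blocks.keys())
    done = 0
    while task["pointer"] < len(task["keys"]):
        if stop_after is not None and done == stop_after:
            return False                      # simulated interruption
        key = task["keys"][task["pointer"]]
        start = len(compressed)
        compressed += zlib.compress(blocks[key])
        guiders[key] = (start, len(compressed))  # start/end in compressed file
        task["pointer"] += 1                     # move pointer to next BlockKey
        save_task_list(task_path, task)
        done += 1
    os.remove(task_path)                      # task list deleted when complete
    return True


with tempfile.TemporaryDirectory() as workdir:
    dtl = os.path.join(workdir, "2501.DTL")
    blocks = {"B2501-A": b"a" * 100, "B2501-B": b"b" * 100, "B2501-C": b"c" * 100}
    zdt, guiders = bytearray(), {}
    finished_first = compress_data_file(dtl, blocks, zdt, guiders, stop_after=1)
    finished_second = compress_data_file(dtl, blocks, zdt, guiders)  # resumes at B2501-B
    print(finished_first, finished_second, list(guiders))
```

The second call finds the leftover task list, reads the pointer, and continues with the next uncompressed block instead of starting over.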
Please refer to fig. 6, which is a schematic diagram of an embodiment of a time-series data storage device according to the present invention. Since the apparatus embodiments are substantially similar to the method embodiments, they are described in a relatively simple manner, and reference may be made to some of the descriptions of the method embodiments for relevant points. The device embodiments described below are merely illustrative.
The storage device for time series data of this embodiment comprises:
the acquisition unit 210 is configured to acquire time series data to be stored in each time period, where attributes of the time series data include a metric, an index, a timestamp, and a data value attribute;
a dividing unit 220, configured to divide the time-series data into a plurality of data files according to a predetermined time period, where each data file includes a plurality of data blocks, each data block correspondingly stores the data value of one combination of the metric and the index in one time period, each data file includes time-series data of all combinations of the metric and the index in the time period, and the data files are numbered according to the timestamp;
an index creating unit 230, configured to create index files, number the index files according to the timestamps, and create in each index file an index guide Guider corresponding to each data block, where the Guider has a fixed length in bytes and points to the storage address of the corresponding data block in the data file, every N Guiders form a guide set Muster, the N Guiders of a Muster point to the data blocks of one combination of the metric and the indicator in N different data areas, each Muster corresponds to N data files, and N is a positive integer;
a guide set positioning table establishing unit 240, configured to establish a guide set positioning table Locates based on each metric and indicator of the time series data acquired each time, where the guide set positioning table includes keys and values, each key is a combination of the numerical identifier of a metric and the numerical identifier of an indicator, and each value is the storage address, in the index file, of the guide set Muster corresponding to that combination;
the storage unit 250, configured to calculate, for each piece of time series data to be stored, the corresponding index file number and data file number from its timestamp, locate a Guider from Locates and the Muster of the index file with that number in combination with the timestamp, find the data block pointed to by the Guider in the data file with the corresponding number, and store the time series data into that data block.
As an embodiment of the present invention, as shown in fig. 7, the time-series data storage device further includes a compressing unit, configured to compress all the data files before a preset time point t0 into compressed files, where the compressing unit specifically includes:
an obtaining unit 310, configured to obtain the data files to be compressed before a preset time point t0;
a task list file establishing unit 320, configured to sequentially write the identifier BlockKey of each data block in each data file into a pre-generated task list file, where the BlockKey of each data block is the identifier ID generated when the data block was created; the task list file includes a file header and a file body, where the header holds the task pointer and records the current execution position of the compression task, and the body records the BlockKey of each data block;
a compression algorithm execution unit 330, configured to compress the data blocks sequentially according to the task list using a compression algorithm and write them into the corresponding compressed files, and, after each block is compressed, update the task pointer to point to the next BlockKey and update the start and end positions of the block in the compressed file into the corresponding index guide Guider;
and a deleting unit 340, configured to delete each data file once all blocks in it have been compressed, and then proceed in sequence to compress the data files before the next t0.
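The task list mechanism can be sketched as follows in Python. This is only an illustration under stated assumptions: the header is taken to be a 4-byte counter of blocks already compressed, each BlockKey an 8-byte integer, zlib stands in for the unspecified compression algorithm, and in-memory buffers and dictionaries replace the real files and Guiders. The function names are hypothetical.

```python
import struct
import zlib

HEADER_FMT = "<I"   # assumed header: index of the next BlockKey to compress
KEY_FMT = "<Q"      # assumed BlockKey: 8-byte unsigned identifier
HEADER_SIZE = struct.calcsize(HEADER_FMT)
KEY_SIZE = struct.calcsize(KEY_FMT)


def build_manifest(block_keys):
    """Create a task list: header (task pointer = 0) followed by BlockKeys."""
    manifest = bytearray(struct.pack(HEADER_FMT, 0))
    for key in block_keys:
        manifest += struct.pack(KEY_FMT, key)
    return manifest


def run_compression(manifest, blocks, compressed, guiders, max_blocks=None):
    """Compress blocks starting at the task pointer; return how many were done.

    blocks:     dict BlockKey -> raw block bytes (stand-in for the data file)
    compressed: bytearray standing in for the compressed file
    guiders:    dict BlockKey -> (start, end) position in the compressed file
    max_blocks: optional limit, used here to simulate an interrupted run
    """
    (pointer,) = struct.unpack_from(HEADER_FMT, manifest, 0)
    total = (len(manifest) - HEADER_SIZE) // KEY_SIZE
    done = 0
    while pointer < total and (max_blocks is None or done < max_blocks):
        (key,) = struct.unpack_from(
            KEY_FMT, manifest, HEADER_SIZE + pointer * KEY_SIZE
        )
        start = len(compressed)
        compressed += zlib.compress(blocks[key])
        guiders[key] = (start, len(compressed))   # record start/end positions
        pointer += 1
        struct.pack_into(HEADER_FMT, manifest, 0, pointer)  # advance pointer
        done += 1
    return done
```

Since the task pointer is persisted after every block, calling `run_compression` again with the same task list continues from the first block that has not yet been compressed, which is the behaviour needed to resume an interrupted compression task.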
The parts of this embodiment that are the same as the time-series data storage method are not repeated here; please refer to the corresponding parts of the first embodiment.
Corresponding to the storage method for time-series data, the present application also provides a query method for time-series data. Please refer to fig. 8, which is a flowchart of the query method for time-series data provided by the present application. Likewise, the parts of this embodiment that are the same as the first embodiment are not repeated; please refer to the corresponding parts of the first embodiment. The query method for time-series data comprises the following steps:
When time-series data is queried in batches, the storage positions of the data blocks pointed to by the Guiders are found in accordance with the batch storage method of the above embodiment of the present invention, and the corresponding data values are then extracted in batches.
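A batch query, like a batch write, first has to split the requested time range along the time-period boundaries so that each segment falls into exactly one data file. The following Python sketch shows one way to do this; the period length and the half-open interval convention are assumptions for illustration.

```python
PERIOD = 3600  # assumed length of one time period, in seconds


def split_by_period(start_ts, end_ts, period=PERIOD):
    """Split [start_ts, end_ts) into per-period segments.

    Returns a list of (data_file_no, seg_start, seg_end) tuples, where each
    segment is the half-open interval [seg_start, seg_end) and lies entirely
    within the period covered by data file number data_file_no.
    """
    segments = []
    cursor = start_ts
    while cursor < end_ts:
        file_no = cursor // period
        seg_end = min((file_no + 1) * period, end_ts)
        segments.append((file_no, cursor, seg_end))
        cursor = seg_end
    return segments
```

Each resulting segment can then be resolved to a data block through its Guider and read or written in one pass.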
Corresponding to the query method for the time series data, the application also provides a query device for the time series data. Referring to fig. 9, the apparatus includes:
a receiving unit 510, configured to receive a query request for time-series data, where the query parameters in the query request include the metric, the index, a start timestamp, and an end timestamp of the time-series data;
a query unit 520, configured to find, based on the query parameters, the corresponding data block in accordance with the time-series data storage method of the above embodiment, and to extract the corresponding data value.
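Once the data block has been found, extracting the values for a time range reduces to offset arithmetic inside the block. The Python sketch below assumes that a block holds one fixed-width value per sampling interval and that the queried range lies within a single time period; the sampling interval, the 8-byte float encoding and the function names are illustrative assumptions, not part of the described method.

```python
import struct

PERIOD = 3600        # assumed length of one time period, in seconds
INTERVAL = 60        # assumed sampling interval, in seconds
VALUE_FMT = "<d"     # assumed fixed-width value: 8-byte float
VALUE_SIZE = struct.calcsize(VALUE_FMT)


def value_offset(timestamp):
    """Byte offset of the value for this timestamp inside its data block."""
    return ((timestamp % PERIOD) // INTERVAL) * VALUE_SIZE


def read_values(block, start_ts, end_ts):
    """Return [(timestamp, value)] for every sample slot in [start_ts, end_ts].

    Both timestamps are assumed to fall within the period of this block.
    """
    results = []
    ts = -(-start_ts // INTERVAL) * INTERVAL   # round up to a slot boundary
    while ts <= end_ts:
        (value,) = struct.unpack_from(VALUE_FMT, block, value_offset(ts))
        results.append((ts, value))
        ts += INTERVAL
    return results
```

The same `value_offset` calculation would serve the write path, where a value is stored at the position derived from its timestamp and the time period.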
As another embodiment of the present invention, a computer-readable storage medium is provided, in which at least one instruction, at least one program, a set of codes, or a set of instructions is stored; the at least one instruction, the at least one program, the set of codes, or the set of instructions are loaded and executed by a processor to implement a method of storing time series data as described above, or a method of querying time series data as described above.
The embodiments of the present invention have been described in detail with reference to the accompanying drawings, but the present invention is not limited to the above embodiments, and various changes can be made within the knowledge of those skilled in the art without departing from the gist of the present invention.
Claims (15)
1. A method for storing time series data of an oil and gas field, characterized by comprising the following steps:
acquiring the time-series data to be stored from oil and gas field production in each time period, wherein the attributes of the time-series data comprise a metric, an index, a timestamp and a data value;
dividing the time series data into a plurality of data files according to a preset time period for storage, wherein each data file comprises a plurality of data blocks, each data block correspondingly stores the data value of one combination of the metric and the index in one time period, each data file comprises all the time series data of the combination of the metric and the index in the time period, and the data files are numbered according to the time stamps;
establishing an index file, numbering the index file according to the timestamp, and establishing in each index file an index guide Guider corresponding to each data block, wherein the Guider is a fixed-length byte sequence, the Guider points to the storage address of the corresponding data block in the data file, every N Guiders form a guide set Muster, the N Guiders point to N different data areas for one combination of the metric and the index, each Muster corresponds to N data files, and N is a positive integer;
constructing a guide set positioning table Locates based on each metric and index of the time-series data acquired each time, wherein the guide set positioning table comprises keys and values, each key is a distinct combination of the numerical identifier of a metric and the numerical identifier of an index, and each value is the storage address, in the index file, of the guide set Muster corresponding to that combination;
and calculating, for each piece of time-series data to be stored, the corresponding index file number and data file number according to the timestamp, locating the Guider by means of the Locates and the Muster of the index file with that number in combination with the timestamp, so as to find the data block pointed to by the Guider in the data file with the corresponding number, and storing the time-series data into that data block.
2. The method as claimed in claim 1, wherein locating the Guider by means of the Locates and the Muster of the index file with that number in combination with the timestamp, so as to find the data block pointed to by the Guider in the data file with the corresponding number, specifically comprises:
finding, in the Locates of the numbered index file, the storage position of the corresponding Muster based on the numerical identifiers of the metric and the index of the time-series data, wherein the numerical identifier of a metric is an integer identifier generated when the metric is created, and the numerical identifier of an index is an integer identifier generated when the index is created;
and obtaining the Guider by a modulo operation on the timestamp of the time-series data and the value of N, and obtaining the data block in which the time-series data is to be stored from the storage address pointed to by the Guider.
3. The method of claim 1, wherein said storing said time series data into the corresponding data block further comprises: calculating the storage offset of the time-series data within the data block according to the timestamp of the time-series data and the time period, and writing the time-series data at the storage position specified by the offset.
4. The method of claim 1 for storing time series data of an oil and gas field, the method further comprising: when time-series data needs to be stored in batches, dividing the time-series data into a plurality of segments according to the time period before writing; if a segment occupies only part of a data block, locating the data block pointed to by the Guider and storing the data points of the segment in that data block; and if a segment spans a whole block, appending the data of the segment to a different data file and recording the storage position in the corresponding Guider.
5. The oil and gas field time series data storage method according to any one of claims 1 to 4, further comprising presetting a time point t0 and sequentially compressing all the data files before each t0 into compressed files, specifically comprising the following steps:
acquiring the data files to be compressed before the preset time point t0;
sequentially writing the identifier BlockKey of each data block in each data file into a pre-generated task list file, wherein the BlockKey of each data block is the identifier ID generated when the data block was created, the task list file comprises a file header and a file body, the header holds the task pointer and records the current execution position of the compression task, and the body records the BlockKey of each data block;
compressing the data blocks sequentially according to the task list using a compression algorithm and writing them into the corresponding compressed files, and, after each block is compressed, updating the task pointer to point to the next BlockKey and updating the start and end positions of the block in the compressed file into the corresponding index guide Guider;
and when all blocks in a data file have been compressed, deleting the data file, and then proceeding in sequence to compress the data files before the next t0.
6. The method for storing time series data of oil and gas fields according to claim 5, wherein when a compression task is interrupted and is next started, the task list of the previous compression is found, the execution position of the compression task is located, and the compression task continues from that position.
7. An oil and gas field time series data storage device, characterized in that the storage device includes:
an acquisition unit, configured to acquire the time-series data to be stored from oil and gas field production in each time period, wherein the attributes of the time-series data comprise a metric, an index, a timestamp and a data value;
the dividing unit is used for dividing the time series data into a plurality of data files according to a preset time period for storage, each data file comprises a plurality of data blocks, each data block correspondingly stores the data value of one combination of the metric and the index in one time period, each data file comprises the time series data of all the combinations of the metric and the index in the time period, and the data files are numbered according to the time stamps;
an index establishing unit, configured to establish an index file, number the index file according to the timestamp, and establish in each index file an index guide Guider corresponding to each data block, where the Guider is a fixed-length byte sequence, the Guider points to the storage address of the corresponding data block in the data file, every N Guiders form a guide set Muster, the N Guiders point to N different data areas for one combination of the metric and the index, and each Muster corresponds to N data files, where N is a positive integer;
a guide set positioning table establishing unit, configured to establish a guide set positioning table Locates based on each metric and index of the time-series data acquired each time, where the guide set positioning table includes keys and values, each key is a distinct combination of the numerical identifier of a metric and the numerical identifier of an index, and each value is the storage address, in the index file, of the guide set Muster corresponding to that combination;
and the storage unit is used for calculating the corresponding index file number and the data file number according to the timestamp of each piece of time sequence data to be stored, positioning the time sequence data to a Guider according to the Locates and the Muster corresponding to the index file with the number and the timestamp, so as to find the data block in the data with the corresponding number and pointed by the Guider, and storing the time sequence data into the corresponding data block.
8. The oil and gas field time series data storage device according to claim 7, wherein locating the Guider by means of the Locates and the Muster of the index file with that number in combination with the timestamp, so as to find the data block pointed to by the Guider in the data file with the corresponding number, specifically comprises:
finding, in the Locates of the numbered index file, the storage position of the corresponding Muster based on the numerical identifiers of the metric and the index of the time-series data, wherein the numerical identifier of a metric is an integer identifier generated when the metric is created, and the numerical identifier of an index is an integer identifier generated when the index is created;
and obtaining the Guider by a modulo operation on the timestamp of the time-series data and the value of N, and obtaining the data block in which the time-series data is to be stored from the storage address pointed to by the Guider.
9. An oil and gas field time series data storage device as claimed in claim 7, wherein said storing said time series data into corresponding said data blocks comprises: and calculating the storage offset of the time sequence data in the data block according to the time stamp of the time sequence data and the time period, and writing the time sequence data into the storage position specified by the offset.
10. The oil and gas field time series data storage device according to any one of claims 7 to 9, further comprising a compression unit for compressing all the data files before a preset time point t0 into compressed files, the compression unit comprising:
an obtaining unit, configured to obtain the data files to be compressed before the preset time point t0;
a task list file establishing unit, configured to sequentially write the identifier BlockKey of each data block in each data file into a pre-generated task list file, wherein the BlockKey is the identifier ID of each data block generated when the data block was created; the task list file comprises a file header and a file body, wherein the header holds the task pointer and records the current execution position of the compression task, and the body records the BlockKey of each data block;
a compression algorithm execution unit, configured to compress the data blocks sequentially according to the task list using a compression algorithm and write them into the corresponding compressed files, and, after each block is compressed, update the task pointer to point to the next BlockKey and update the start and end positions of the block in the compressed file into the corresponding index guide Guider;
and a deleting unit, configured to delete each data file once all blocks in it have been compressed, and then proceed in sequence to compress the data files before the next t0.
11. The oil and gas field time series data storage device according to claim 10, further comprising an interruption execution unit, configured to, when the compression task is interrupted and is next started, find the task list of the previous compression, locate the execution position of the compression task, and continue executing the compression task.
12. A method for querying time series data of an oil and gas field, characterized by comprising the following steps:
receiving a query request for time-series data, wherein the query parameters in the query request comprise the metric, the index, a start timestamp and an end timestamp of the time-series data;
finding the corresponding data block, based on the query parameters, in accordance with the oil and gas field time series data storage method according to any one of claims 1-3, and extracting the corresponding data value.
13. The method according to claim 12, wherein when the time series data is queried in batches, the storage position of the data block pointed to by the Guider is found in accordance with the oil and gas field time series data storage method according to claim 4, and the corresponding data values are extracted in batches.
14. An oil and gas field time series data query device, characterized in that the device includes:
a receiving unit, configured to receive a query request for time-series data, wherein the query parameters in the query request comprise the metric, the index, a start timestamp and an end timestamp of the time-series data;
a query unit, configured to find the corresponding data block, based on the query parameters, in accordance with the oil and gas field time series data storage method according to any one of claims 1 to 4, and to extract the corresponding data value.
15. A computer readable storage medium having stored therein at least one instruction, at least one program, set of codes, or set of instructions; the at least one instruction, the at least one program, the set of codes or the set of instructions being loaded and executed by a processor to implement a method of storing time series data according to any one of claims 1 to 6 or a method of querying field time series data according to claims 12-13.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011160744.1A CN112286867B (en) | 2020-10-27 | 2020-10-27 | Oil-gas field time sequence data storage method, oil-gas field time sequence data query device and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011160744.1A CN112286867B (en) | 2020-10-27 | 2020-10-27 | Oil-gas field time sequence data storage method, oil-gas field time sequence data query device and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112286867A (en) | 2021-01-29 |
CN112286867B (en) | 2022-03-01 |
Family
ID=74372693
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011160744.1A Active CN112286867B (en) | 2020-10-27 | 2020-10-27 | Oil-gas field time sequence data storage method, oil-gas field time sequence data query device and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112286867B (en) |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112463803A (en) * | 2021-02-01 | 2021-03-09 | 山东柏源技术有限公司 | Time sequence data storage method, device and equipment for petroleum production |
CN112579834A (en) * | 2021-02-22 | 2021-03-30 | 北京工业大数据创新中心有限公司 | Industrial equipment data storage method and system |
CN113032453A (en) * | 2021-02-25 | 2021-06-25 | 广州虎牙科技有限公司 | Data storage and decompression method and device, electronic equipment and storage medium |
CN113297135A (en) * | 2021-02-10 | 2021-08-24 | 阿里巴巴集团控股有限公司 | Data processing method and device |
CN113297151A (en) * | 2021-02-10 | 2021-08-24 | 阿里巴巴集团控股有限公司 | Data processing method and device |
CN113360551A (en) * | 2021-08-11 | 2021-09-07 | 南京赛宁信息技术有限公司 | Method and system for storing and rapidly counting time sequence data in shooting range |
CN113515576A (en) * | 2021-07-13 | 2021-10-19 | 北京字节跳动网络技术有限公司 | Data processing method and device, electronic equipment and computer readable medium |
CN114615306A (en) * | 2022-05-10 | 2022-06-10 | 中南林业科技大学 | Efficient file system of sink node in Internet of things and processing method thereof |
CN114630030A (en) * | 2022-03-10 | 2022-06-14 | 陕西安控科技有限公司 | Oil and gas field mobile measure operation equipment monitoring system and method thereof |
CN114817679A (en) * | 2022-04-14 | 2022-07-29 | 中南林业科技大学 | Method for storing out-of-order time sequence data of sink nodes of Internet of things |
CN116304390A (en) * | 2023-04-13 | 2023-06-23 | 北京基调网络股份有限公司 | Time sequence data processing method and device, storage medium and electronic equipment |
CN117573703B (en) * | 2024-01-16 | 2024-04-09 | 科来网络技术股份有限公司 | Universal retrieval method, system, equipment and storage medium for time sequence data |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106776967A (en) * | 2016-12-05 | 2017-05-31 | 哈尔滨工业大学(威海) | Mass small documents real-time storage method and device based on sequential aggregating algorithm |
US20180034760A1 (en) * | 2016-07-27 | 2018-02-01 | Sap Se | Time series messaging persistence and publication |
CN108256088A (en) * | 2018-01-23 | 2018-07-06 | 清华大学 | A kind of storage method and system of the time series data based on key value database |
CN109164980A (en) * | 2018-08-03 | 2019-01-08 | 北京涛思数据科技有限公司 | A kind of optimizing polymerization processing method of time series data |
CN111309720A (en) * | 2018-12-11 | 2020-06-19 | 北京京东尚科信息技术有限公司 | Time sequence data storage method, time sequence data reading method, time sequence data storage device, time sequence data reading device, electronic equipment and storage medium |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180034760A1 (en) * | 2016-07-27 | 2018-02-01 | Sap Se | Time series messaging persistence and publication |
CN106776967A (en) * | 2016-12-05 | 2017-05-31 | 哈尔滨工业大学(威海) | Mass small documents real-time storage method and device based on sequential aggregating algorithm |
CN108256088A (en) * | 2018-01-23 | 2018-07-06 | 清华大学 | A kind of storage method and system of the time series data based on key value database |
CN109164980A (en) * | 2018-08-03 | 2019-01-08 | 北京涛思数据科技有限公司 | A kind of optimizing polymerization processing method of time series data |
CN111309720A (en) * | 2018-12-11 | 2020-06-19 | 北京京东尚科信息技术有限公司 | Time sequence data storage method, time sequence data reading method, time sequence data storage device, time sequence data reading device, electronic equipment and storage medium |
Non-Patent Citations (2)
Title |
---|
林志达等: "基于时序数据库的监控数据存储方法研究", 《电子元器件与信息技术》 * |
王洪亮: "分散存储油气生产动态大数据的优化管理与快速查询", 《石油勘探与开发》 * |
Cited By (15)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN112463803A (en) * | 2021-02-01 | 2021-03-09 | 山东柏源技术有限公司 | Time sequence data storage method, device and equipment for petroleum production |
CN113297135A (en) * | 2021-02-10 | 2021-08-24 | 阿里巴巴集团控股有限公司 | Data processing method and device |
CN113297151A (en) * | 2021-02-10 | 2021-08-24 | 阿里巴巴集团控股有限公司 | Data processing method and device |
CN112579834A (en) * | 2021-02-22 | 2021-03-30 | 北京工业大数据创新中心有限公司 | Industrial equipment data storage method and system |
CN113032453A (en) * | 2021-02-25 | 2021-06-25 | 广州虎牙科技有限公司 | Data storage and decompression method and device, electronic equipment and storage medium |
CN113032453B (en) * | 2021-02-25 | 2024-03-01 | 广州虎牙科技有限公司 | Data storage and decompression method and device, electronic equipment and storage medium |
CN113515576A (en) * | 2021-07-13 | 2021-10-19 | 北京字节跳动网络技术有限公司 | Data processing method and device, electronic equipment and computer readable medium |
CN113360551A (en) * | 2021-08-11 | 2021-09-07 | 南京赛宁信息技术有限公司 | Method and system for storing and rapidly counting time sequence data in shooting range |
CN114630030A (en) * | 2022-03-10 | 2022-06-14 | 陕西安控科技有限公司 | Oil and gas field mobile measure operation equipment monitoring system and method thereof |
CN114817679A (en) * | 2022-04-14 | 2022-07-29 | 中南林业科技大学 | Method for storing out-of-order time sequence data of sink nodes of Internet of things |
CN114615306A (en) * | 2022-05-10 | 2022-06-10 | 中南林业科技大学 | Efficient file system of sink node in Internet of things and processing method thereof |
CN114615306B (en) * | 2022-05-10 | 2022-07-29 | 中南林业科技大学 | Efficient file system of sink node in Internet of things and processing method thereof |
CN116304390A (en) * | 2023-04-13 | 2023-06-23 | 北京基调网络股份有限公司 | Time sequence data processing method and device, storage medium and electronic equipment |
CN116304390B (en) * | 2023-04-13 | 2024-02-13 | 北京基调网络股份有限公司 | Time sequence data processing method and device, storage medium and electronic equipment |
CN117573703B (en) * | 2024-01-16 | 2024-04-09 | 科来网络技术股份有限公司 | Universal retrieval method, system, equipment and storage medium for time sequence data |
Also Published As
Publication number | Publication date |
---|---|
CN112286867B (en) | 2022-03-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112286867B (en) | Oil-gas field time sequence data storage method, oil-gas field time sequence data query device and storage medium | |
US10176208B2 (en) | Processing time series data from multiple sensors | |
CN108153784B (en) | Synchronous data processing method and device | |
CN101553813B (en) | Managing storage of individually accessible data units | |
KR101708261B1 (en) | Managing storage of individually accessible data units | |
CN108287668B (en) | Equipment data processing method and device, computer device and readable storage medium | |
US20060004840A1 (en) | Index adding program of relational database, index adding apparatus, and index adding method | |
CN110990402B (en) | Format conversion method from row storage to column storage, query method and device | |
WO2018095299A1 (en) | Time sequence data management method, device and apparatus | |
CN114911830A (en) | Index caching method, device, equipment and storage medium based on time sequence database | |
CN115827660B (en) | Data updating method and device, electronic equipment and nonvolatile storage medium | |
CN111125018B (en) | File exception tracing method, device, equipment and storage medium | |
CN109344163B (en) | Data verification method and device and computer readable medium | |
CN109739819A (en) | Snapshot lossless compression method, device, equipment and the readable storage medium storing program for executing that can be recalled | |
EP2568399A2 (en) | Data storage method and system | |
CN115729893A (en) | Data access method, data access device, nonvolatile storage medium and electronic device | |
CN111858767A (en) | Synchronous data processing method, device, equipment and storage medium | |
CN117278046A (en) | Time sequence data compression storage method and device, electronic equipment and storage medium | |
CN115599793B (en) | Method, device and storage medium for updating data | |
CN115454353A (en) | High-speed writing and query method for space application data | |
CN110688395A (en) | Information query method, device, information statistical method and related equipment | |
CN112463803A (en) | Time sequence data storage method, device and equipment for petroleum production | |
CN110543452B (en) | Data acquisition method and equipment | |
CN114064666A (en) | Data warehouse synchronization system and method | |
CN111831622A (en) | Data index generation method and device, electronic equipment and readable storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||