CN115408390A - Data processing method and device and electronic equipment - Google Patents

Data processing method and device and electronic equipment

Info

Publication number
CN115408390A
Authority
CN
China
Prior art keywords
target
data
file
hash
period
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202211021925.5A
Other languages
Chinese (zh)
Inventor
周京晖
程栋
马瀚征
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Electric Wind Power Group Co Ltd
Original Assignee
Shanghai Electric Wind Power Group Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Electric Wind Power Group Co Ltd
Priority to CN202211021925.5A
Publication of CN115408390A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G06F16/2255 Hash tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2458 Special types of queries, e.g. statistical queries, fuzzy queries or distributed queries
    • G06F16/2474 Sequence data queries, e.g. querying versioned data
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/28 Databases characterised by their database models, e.g. relational or object models
    • G06F16/284 Relational databases
    • G06F16/285 Clustering or classification

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a data processing method and device and electronic equipment. A hash digest algorithm is used to determine a hash identifier and a storage path from the variable identifier, the target period and the target time period of the target variable to which the target data belongs; a target file is then created under that path and the target data is written into it, thereby realizing data storage. Data of different variables, different time periods and different periods can thus be stored independently in separate files according to the hash algorithm. This makes full use of the fact that high-frequency data is collected at equal time intervals and queried for specific time periods, effectively improves the utilization of storage space, and allows subsequent queries by variable, period and time period to read a higher proportion of effective data. Compared with traditional data storage modes, the device data storage requirements in large-data-volume scenarios can be met effectively.

Description

Data processing method and device and electronic equipment
Technical Field
The present disclosure relates to data processing technologies, and in particular, to a data processing method and apparatus, and an electronic device.
Background
For various industrial equipment with real-time monitoring requirements, data generated and recorded in the operation process has very important values for state monitoring, online diagnosis and offline analysis.
Currently, commonly used data storage methods include relational databases (e.g., MySQL), non-relational databases (e.g., MongoDB), time-series databases (e.g., TSDB), text files (e.g., CSV files), and the like.
However, with the development of industrial control technology, both the number of variables monitored during the operation of industrial equipment and the data sampling frequency keep increasing, which poses a serious challenge for the storage of massive data. For example, for high-frequency data with a sampling period at the millisecond level or below, a single variable can generate tens of thousands to millions of data points per day, and hundreds of millions of data points per year.
The conventional data storage mode cannot meet the production practice requirements in a large data volume scene in the aspects of storage efficiency, access speed and the like.
Disclosure of Invention
The embodiments of the application provide a data processing method and device and electronic equipment, and aim to solve the problem that existing data storage schemes have difficulty storing the data of high-frequency variables.
In a first aspect, an embodiment of the present application provides a data processing method, including:
acquiring target data of a target variable in a target time period and a target period;
determining a hash identifier and a storage path of the target file through a target hash algorithm at least according to the target time interval, the target period and the variable identifier of the target variable;
and creating the target file according to the hash identifier under the storage path, and writing the target data into the target file.
In a possible implementation manner, the target data of the target variable in the target time period and the target period includes:
target sampling data of the target variable in a target time period and a target sampling period, and/or target statistical data of the target variable in the target time period and the target statistical period;
the target statistical period is greater than the target sampling period, and the target statistical data is data obtained by performing statistics on sampling data or other statistical data based on the target statistical period.
In a possible implementation manner, the determining, by using a target hash algorithm, a hash identifier and a storage path of a target file according to at least the target time period, the target period, and a variable identifier of the target variable includes:
and determining the hash identification and the storage path of the target file through a target hash algorithm according to the data type of the target data, the target time interval, the target period and the variable identification of the target variable.
In a possible implementation manner, the determining, by a target hash algorithm, a hash identifier and a storage path of a target file according to at least the target time period, the target period, and the variable identifier of the target variable includes:
determining a hash value matched with the target data through a target hash algorithm at least according to the target time interval, the target period and the variable identification of the target variable;
selecting a preset number of characters from the hash value as the hash identifier of the target file;
dividing the characters of the hash identifier, querying a preset multi-level file directory for the storage location that matches the division result, and determining the storage path of the target file.
In a possible implementation manner, the writing the target data into the target file includes:
sequentially writing data points in the target data into the target file in a binary format according to a time sequence, wherein the binary storage bit number of the data points in the target file is a fixed bit number;
and if there are faulty data points whose data is missing, filling each faulty data point to the fixed number of bits with a value obtained by an interpolation algorithm or with a preset special value.
In a possible implementation manner, the method further includes:
when a query instruction for performing data query on the target variable is received, determining hash identifications and storage paths of one or more files to be queried through the target hash algorithm according to a target query time interval, a target query period and variable identifications carried in the query instruction;
and querying the one or more files to be queried based on the hash identifiers and the storage paths to obtain the data to be queried of the target variable in the target query time period and the target query period.
In a possible implementation manner, the determining, according to the target query time period, the target query period, and the variable identifier carried in the query instruction, the hash identifier and the storage path of one or more files to be queried by using a target hash algorithm includes:
determining one or more actual query time periods according to the target query time period carried in the query instruction based on a time period division rule matched with the target query cycle;
determining hash identifiers and storage paths of one or more files to be queried through the target hash algorithm based on the one or more actual query time periods, the target query period and the variable identifier carried in the query instruction; wherein each actual query time period matches the target time period of one file to be queried.
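By way of illustration only, the division of a query time period into actual query time periods can be sketched as follows. The sketch assumes one file per hour, an hour key in the form YYYYMMDDHH, a "#" separator, MD5 as the target hash algorithm and a ten-character hash identifier; the function names hourly_periods and files_to_query are hypothetical, and none of these choices is prescribed by the application.

```python
import hashlib
from datetime import datetime, timedelta

def hourly_periods(start: datetime, end: datetime):
    """Yield the YYYYMMDDHH key of every hourly file overlapping [start, end)."""
    cursor = start.replace(minute=0, second=0, microsecond=0)
    while cursor < end:
        yield cursor.strftime("%Y%m%d%H")
        cursor += timedelta(hours=1)

def files_to_query(variable_id: str, period: str,
                   start: datetime, end: datetime, length: int = 10):
    """Return one hash identifier per actual query time period (assumed hourly)."""
    identifiers = []
    for key in hourly_periods(start, end):
        text = f"{variable_id}#{key}#{period}"          # assumed key format
        digest = hashlib.md5(text.encode("utf-8")).hexdigest().upper()
        identifiers.append(digest[:length])
    return identifiers

# A query from 17:30 to 19:10 touches the files for hours 17, 18 and 19.
keys = list(hourly_periods(datetime(2020, 8, 14, 17, 30),
                           datetime(2020, 8, 14, 19, 10)))
ids = files_to_query("1008", "50ms",
                     datetime(2020, 8, 14, 17, 30),
                     datetime(2020, 8, 14, 19, 10))
```

Because each identifier is recomputed from the query parameters alone, no index lookup is needed to find the files that cover the requested range.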
In a possible implementation manner, the target file and the file to be queried further include file header information for data self-description, where the file header information includes at least one of a variable identifier, a target time period, a target period, an encryption algorithm type, and a compression algorithm type; an external device stores check information used for verifying the data of the target file;
after the querying the one or more files to be queried based on the hash identifier and the storage path, the method further includes:
acquiring the check information from the external device, and judging whether the check information matches the file header information in the file to be queried;
if so, determining that the file to be queried has no data abnormality, and acquiring the data to be queried of the target variable in the target query time period and the target query period;
if not, determining that the file to be queried has a data abnormality, and returning an error prompt.
In a second aspect, an embodiment of the present application provides a data processing apparatus, including:
the data acquisition unit is used for acquiring target data of the target variable in a target time period and a target period;
a hash determining unit, configured to determine, according to at least the target time interval, the target period, and the variable identifier of the target variable, a hash identifier and a storage path of a target file through a target hash algorithm;
and the file creating unit is used for creating the target file according to the hash identifier under the storage path and writing the target data into the target file.
In a third aspect, an embodiment of the present application further provides an electronic device, where the electronic device includes: a processor and a machine-readable storage medium;
the machine-readable storage medium stores machine-executable instructions executable by the processor;
the processor is configured to execute machine-executable instructions to perform the method steps disclosed above.
According to the above technical solution, a hash digest algorithm is used to determine a hash identifier and a storage path from the variable identifier, the target period and the target time period of the target variable to which the target data belongs, so that a target file is created under that path and the target data is written into it, thereby realizing data storage. In the embodiments, data of different variables, different time periods and different periods can be stored independently in separate files according to the hash algorithm. This makes full use of the fact that high-frequency data is collected at equal time intervals and queried for specific time periods, effectively improves the utilization of storage space, and allows subsequent queries by variable, period and time period to read a higher proportion of effective data. Compared with traditional data storage modes, the device data storage requirements in large-data-volume scenarios can be met effectively.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a flow chart of a method provided by an embodiment of the present application;
FIG. 2 is a schematic diagram of a directory storage structure provided in an embodiment of the present application;
FIG. 3 is a schematic diagram of a data file structure provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of a data storage performance comparison;
FIG. 5 is a schematic diagram of a comparison of data query performance;
FIG. 6 is a diagram illustrating an exemplary embodiment of an apparatus;
FIG. 7 is a block diagram of an electronic device according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
In order to make the technical solutions provided in the embodiments of the present application better understood and make the above objects, features, and advantages of the embodiments of the present application more obvious and understandable by those skilled in the art, the technical solutions in the embodiments of the present application are further described in detail below with reference to the accompanying drawings.
In the field of industrial control, it is necessary to collect data of various variables of industrial equipment during operation of the industrial equipment, so as to perform various applications such as real-time state monitoring, online diagnosis and offline analysis. The industrial equipment comprises any equipment needing monitoring or operation data acquisition, such as a wind driven generator, a chemical reaction kettle, a transformer and the like, and the variables needing to be acquired can be the variables needed in any industrial production control or equipment detection process, such as wind speed, rotating speed, oil temperature, hydraulic pressure, power and the like.
Data with a sampling period at the millisecond level or below is generally referred to as high-frequency data, and variables that require high-frequency data acquisition are referred to as high-frequency variables. Taking a wind turbine as an example: a single turbine may have more than 1,000 high-frequency variables to be collected at 20 ms resolution, so the amount of data to be stored per day is 50 points/s × 86,400 s/day × 1,000 variables = 4.32 billion data points per turbine per day. If high-frequency data is retained for 180 days, the amount of data a single turbine must store in one retention period reaches 4.32 billion/day × 180 days = 777.6 billion data points; at 4 bytes per data point (a single-precision floating-point number), the total storage reaches 777.6 billion × 4 bytes ≈ 3.11 terabytes (TB).
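The estimate above can be reproduced with a few lines of arithmetic. The following is only a back-of-the-envelope check using the figures quoted in the paragraph; the constant names are chosen here for illustration.

```python
# Storage estimate for one wind turbine, using the figures quoted above.
SAMPLES_PER_SECOND = 1000 // 20      # 20 ms resolution gives 50 samples per second
SECONDS_PER_DAY = 86_400
VARIABLES_PER_TURBINE = 1_000
RETENTION_DAYS = 180
BYTES_PER_SAMPLE = 4                 # single-precision floating-point number

points_per_day = SAMPLES_PER_SECOND * SECONDS_PER_DAY * VARIABLES_PER_TURBINE
points_per_retention = points_per_day * RETENTION_DAYS
total_bytes = points_per_retention * BYTES_PER_SAMPLE

print(f"{points_per_day:,} data points per day")
print(f"{points_per_retention:,} data points per retention period")
print(f"{total_bytes / 1e12:.2f} TB")
```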
However, the data processing methods currently in use have several defects in such large-volume storage scenarios. The storage structure of a relational database can hardly support effective storage of and access to trillion-scale data, and extra fields such as an ID and a timestamp must be stored for each record, so that the total storage volume after indexes are added far exceeds the volume of the data itself. A non-relational database needs to use memory as its main cache medium, so its performance drops sharply once the data volume (e.g., TB level) exceeds the available memory (e.g., GB level), and the extra fields it introduces likewise lead to a low proportion of effective data. A time-series database greatly improves performance through general optimizations for time-series data, but it cannot be customized to the specific characteristics of high-frequency data and the service requirements, so its storage efficiency and access performance still fail to meet the service requirements, and its demands on hardware are high. Text files have the advantage of being human-readable, but their storage efficiency and access performance are far lower than those of the three types of database above, so they are suitable only for data exchange with small data volumes.
In view of the above, embodiments of the present application provide a data processing method to solve the above problems. Referring to fig. 1, fig. 1 is a flowchart of a method provided in an embodiment of the present application. The sampling period of the target variable involved in the flow may be in the order of milliseconds, or may be at any higher or lower sampling level, which is not limited in this embodiment.
In this embodiment, the data related to the data processing method flow may be from any industrial production equipment requiring data acquisition and storage, including a wind turbine, a chemical reaction kettle, or other equipment having data storage requirements; the obtained data may be stored in any electronic device such as a personal computer, a server, an embedded industrial control device, a mobile phone, and the like, which is not limited in this embodiment.
As shown in fig. 1, the process may include the following steps:
step 101, acquiring target data of a target variable in a target time period and a target period.
In this embodiment, high-frequency data is usually queried independently by variable, time period and interval (i.e., sampling period), and is usually collected at equal time intervals. To store data in a way that exploits these characteristics, the variable, time period and period information of the acquired target data must be determined first; that is, the target data of the target variable in the target time period and the target period is acquired so that a target file for data storage can be generated subsequently.
The target time period may be the time range covered by the target data, i.e., from the time of the first data point in the target data to the time of the last data point, such as 10:00 to 12:00; the target period may be the time interval between two adjacent data points in the target data, and this interval may be a fixed value within the same set of target data, such as 20 milliseconds, 1 minute, 1 hour, and the like.
As a preferred embodiment, the target data of the target variable in the target time period and the target period may specifically be: target sampling data of the target variable in a target time period and a target sampling period, and/or target statistical data of the target variable in the target time period and the target statistical period.
The sampling data can be original data obtained by directly sampling data according to a certain sampling period; the statistical data may be data obtained by performing statistics on the sampled data or other statistical data based on a statistical period, where the target statistical period is greater than the target sampling period.
For example, if data is sampled in a wind turbine at a target sampling period of 20 ms, target sampling data with a target sampling period of 20 ms is obtained. On this basis, statistics may be computed to obtain statistical data: for example, 3000 consecutive data points may be selected from the target sampling data and the maximum value of every 50 adjacent data points determined, giving 60 data points, that is, target statistical data with a target statistical period of 20 ms × 50 = 1 s. The maximum value statistics can then be applied to these 60 data points in the same manner to obtain target statistical data with a target statistical period of 1 s × 60 = 1 min, and so on. The maximum value may be replaced by any other statistic, such as a minimum value, an average value, a peak-to-valley difference, a standard deviation, a final value, and the like, which is not limited in this embodiment.
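The cascaded statistics described above can be sketched as follows. This is an illustrative sketch only: the function name aggregate and the synthetic input samples are assumptions made here, and dropping an incomplete trailing block is one possible policy that the application does not specify.

```python
from typing import Callable, List, Sequence

def aggregate(points: Sequence[float], factor: int,
              stat: Callable[[Sequence[float]], float] = max) -> List[float]:
    """Collapse every `factor` consecutive points into one statistic.

    Trailing points that do not fill a whole block are dropped.
    """
    usable = len(points) - len(points) % factor
    return [stat(points[i:i + factor]) for i in range(0, usable, factor)]

# 3000 synthetic samples at a 20 ms period, i.e. 60 s of data
samples_20ms = [float(i % 700) for i in range(3000)]

stats_1s = aggregate(samples_20ms, 50)    # 20 ms x 50 = 1 s, giving 60 points
stats_1min = aggregate(stats_1s, 60)      # 1 s x 60 = 1 min, giving 1 point
```

Passing a different callable as `stat` (for example `min`, or `statistics.mean`) yields the other statistic types mentioned above with the same block structure.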
Queries on high-frequency data are usually made at a specified time granularity: for example, millisecond-level raw sampling data need not be accessed when coarser minute-level statistical data is queried, and minute-level statistical data need not be accessed when hour-level statistical data is queried. Storing statistical data of different periods alongside the sampled data therefore helps to improve subsequent query efficiency, reduce hardware requirements and optimize the user experience.
Step 102, determining a hash identifier and a storage path of the target file through a target hash algorithm at least according to the target time period, the target period and the variable identifier of the target variable.
In this embodiment, after the target data is obtained, a preset hash algorithm is used to generate a hash identifier for the target data based on at least the time period, the period and the variable identifier of the target data, so that the storage path, the name of the storage file and the like can subsequently be determined from the hash identifier.
Specifically, a hash value matching the target data may be determined through the target hash algorithm, and a preset number of characters may then be selected from the hash value as the hash identifier of the target file. The characters of the hash identifier are subsequently divided, the storage location matching the division result is queried in a preset multi-level file directory, and the storage path of the target file is determined. The multi-level file directory is described below with reference to FIG. 2 and is not detailed here.
For example, when the target time period of certain target data is 2020-08-14 17:00 to 2020-08-14 18:00, the target period is 50 ms, the variable identifier is 1008, and the preset hash algorithm is the MD5 algorithm, the hash identifier may be calculated as MD5("1008#2020081417#50ms") = CBC0F1300ECA444B8C30764463E368D9.
Here "1008", "2020081417" and "50ms" correspond to the variable identifier, the target time period and the target period of the target variable, respectively; the format, separator, ordering and other details can be adjusted freely, which is not limited in this embodiment. The hash calculation result may be used directly as the hash identifier, or a specified part of it (for example, the first ten characters: CBC0F1300E) may be selected as the hash identifier, which is not limited in this embodiment. The preset hash algorithm can be any hash digest algorithm such as MD5, SHA-1, SHA-2, RIPEMD-160 and the like, provided that the hash algorithm used when the hash identifier is generated and the hash algorithm used when the stored data file is subsequently queried are the same target hash algorithm.
In this embodiment, after the hash identifier is determined, the storage path of the target file to be generated may be determined based on the hash identifier. As an optional embodiment, a file directory storage structure of several levels may be generated in advance based on combinations of a certain number of hash characters (0, 1, 2, ..., F), and the directory position matching the hash identifier is then determined in the multi-level file directory as the storage path.
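A minimal sketch of the hash identifier calculation is given below, using Python's standard hashlib module. The function name hash_identifier, the "#" separator without surrounding spaces and the default ten-character truncation are assumptions made for illustration; since the exact key string used in the example above is not specified, the digest produced here is not guaranteed to match the value quoted there.

```python
import hashlib

def hash_identifier(variable_id: str, time_period: str, period: str,
                    length: int = 10, algorithm: str = "md5") -> str:
    """Build the key string, hash it, and keep the first `length` hex characters."""
    key = f"{variable_id}#{time_period}#{period}"       # assumed key format
    digest = hashlib.new(algorithm, key.encode("utf-8")).hexdigest().upper()
    return digest[:length]

ident = hash_identifier("1008", "2020081417", "50ms")
print(ident)
```

The same function must be used with the same arguments at write time and at query time, since the identifier is the only link between the query parameters and the stored file.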
As an alternative embodiment, the directory storage structure of the multi-level file directory may be implemented as shown in FIG. 2. The root directory shown in the figure may be a designated root directory in which all data files are uniformly stored, such as C:\data. Each level of subdirectory under the root directory may be named with several hash characters; for example, when 2 hash characters are used, the root directory contains 16 × 16 = 256 first-level subdirectories, named 00, 01, 02, ..., FF in sequence. On this basis, instead of storing files directly under the first-level subdirectories, several second-level subdirectories may be set up; for example, when the second-level subdirectories are also named with two hash characters, each first-level subdirectory likewise contains 256 second-level subdirectories, the root directory contains 256 × 256 = 65,536 second-level subdirectories in total, and all subsequently generated data files are stored under the second-level subdirectories. This embodiment limits neither the number of subdirectory levels nor the number of subdirectories at each level. Two levels of subdirectories, each named with two hash characters, usually meet the data capacity requirement in general cases; if they do not, third-level or deeper subdirectories can be added.
As a specific example based on the directory storage structure in FIG. 2, assume that for target data with a variable identifier of 0001, a target time period encoded as 2021091010 (2021-09-10, hour 10) and a target period of 20 ms, the hash identifier is HASH("0001#2021091010#20ms") = "006E8D78A940". The first-level subdirectory is then "00" (characters 1-2 of the hash identifier) and the second-level subdirectory under it is "6E" (characters 3-4 of the hash identifier), so the storage path of the target file to be generated for the target data may be C:\data\00\6E. Similarly, assume that for target statistical data with a variable identifier of 0002, a target time period of the whole year 2021 and a target period of 1 hour, the hash identifier is HASH("0002#2021#1h") = "009D1129E3BF"; its storage path may then be C:\data\00\9D.
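The mapping from hash identifier to storage path in this example can be sketched as follows. The function name storage_path and its parameters are hypothetical; PureWindowsPath is used only so that the C:\data root from the example is rendered the same way on any platform.

```python
from pathlib import PureWindowsPath

def storage_path(root: str, hash_id: str,
                 levels: int = 2, width: int = 2) -> PureWindowsPath:
    """Split the leading characters of the hash identifier into directory names."""
    if len(hash_id) < levels * width:
        raise ValueError("hash identifier too short for the directory depth")
    parts = [hash_id[i * width:(i + 1) * width] for i in range(levels)]
    return PureWindowsPath(root, *parts)

print(storage_path(r"C:\data", "006E8D78A940"))   # C:\data\00\6E
print(storage_path(r"C:\data", "009D1129E3BF"))   # C:\data\00\9D
```

Raising `levels` to 3 corresponds to adding the third-level subdirectories mentioned above, without changing how existing identifiers are computed.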
As a preferred embodiment, when the target data is target statistical data determined based on statistics, the basis for determining the hash identifier further includes the data type of the target data. The data type corresponds to the specific statistic type of the target statistical data; for example, the data type is set to 01 when the target statistical data is a maximum value, to 02 when it is a minimum value, and so on. Adding the data type to the inputs of the hash identifier allows statistical data of different statistic types under the same variable, the same time period and the same statistical period to be stored in separate data files.
As another preferred embodiment, when the target data is target statistical data determined based on statistics, statistical data of different statistic types under the same variable, the same time period and the same statistical period may also be stored in the same data file. For example, maximum, minimum and average data can be stored in the same data file, where the 1st, 4th, 7th, ..., (3n-2)th data points in the file correspond to the maximum value, the 2nd, 5th, 8th, ..., (3n-1)th data points correspond to the minimum value, and the 3rd, 6th, 9th, ..., 3n-th data points correspond to the average value. Data of different statistic types may be stored with different byte lengths, but all data of the same statistic type must use the same byte length, so that the data at a specified moment can be located by byte length in subsequent queries.
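By way of illustration, the interleaved layout can be sketched with Python's struct module. The byte widths chosen here (4-byte maximum, 4-byte minimum, 8-byte average, little-endian, no padding) and the function names are assumptions for this sketch only; the point is that a fixed width per statistic type gives a fixed record size, so the n-th time step is found by offset alone.

```python
import struct

# Hypothetical record layout: max (4 bytes), min (4 bytes), mean (8 bytes)
RECORD = struct.Struct("<ffd")

def pack_records(rows):
    """Serialize (max, min, mean) tuples back to back in time order."""
    return b"".join(RECORD.pack(mx, mn, avg) for mx, mn, avg in rows)

def read_record(blob: bytes, n: int):
    """Return (max, min, mean) of the n-th time step (0-based) by byte offset."""
    return RECORD.unpack_from(blob, n * RECORD.size)

rows = [(3.0, 1.0, 2.0), (9.0, 5.0, 7.5), (4.0, 4.0, 4.0)]
blob = pack_records(rows)
second = read_record(blob, 1)
```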
With the directory storage structure and the storage path determination described above, a large number of data files can be distributed statistically evenly over the subdirectories. Compared with the traditional approach of building storage directories by time period, such as year, month and day, this effectively avoids the loss of file access efficiency that occurs when a single directory holds too many data files.
It should be noted that a given root directory may hold only the data of a single variable of a single device, or may hold data of different variables from different devices, which is not limited in this embodiment. If each root directory holds only one specific variable and the data of different variables is divided among different root directories, the data of different variables is isolated and stored independently, which can effectively reduce the consumption of computing resources in subsequent data queries.
By storing data of the same variable for different target time periods in separate files, the total amount of data to be retrieved during a subsequent query can be greatly reduced, which improves query efficiency. Similarly, data of the same variable at different target periods is stored independently, that is, pre-processed for different time granularities such as second, minute, hour and day. Although the total storage space occupied is slightly larger than in a scheme that stores only the original data, the amount of data to be searched at query time can be reduced by orders of magnitude, and the locality of the retrieved data brings a further performance gain.
Step 103, creating the target file according to the hash identifier under the storage path, and writing the target data into the target file.
In this embodiment, the target data, the hash identifier and the storage path have been determined in the foregoing steps 101 and 102, so a target file for storing the target data can be created at the storage path based at least on the hash identifier. The file name of the target file may be the hash identifier itself, or the characters at preset positions of the hash identifier; in addition to the hash identifier, the file name may also include additional fields added according to a preset rule. This embodiment does not limit this.
As an alternative embodiment, writing the target data into the target file may specifically be writing data points in the target data into the target file sequentially in a binary format according to a time sequence.
The target file may be a binary file, such as a file in .bin (binary) format. Each data point in the target file occupies a fixed number of binary bits. If a data point is missing because of a fault, its slot is still filled to the fixed width, using either a substitute value computed by an interpolation algorithm from adjacent non-missing data points or a preset special value (for example, padding with 0), so that data points can be located by byte length in subsequent queries.
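A minimal sketch of fixed-width writing with a placeholder for missing points is shown below. It assumes 4-byte little-endian floats and a preset special value of -1.0; the names `encode_points`, `read_point` and `FILL_VALUE` are hypothetical, and the interpolation alternative is not shown.

```python
import struct

FILL_VALUE = -1.0  # assumed preset special value for missing data points


def encode_points(points, fill=FILL_VALUE):
    """Pack samples as 4-byte little-endian floats, one fixed slot per time step.

    A missing sample is passed as None and still occupies a full slot.
    """
    out = bytearray()
    for value in points:
        out += struct.pack("<f", fill if value is None else value)
    return bytes(out)


def read_point(blob, index, width=4):
    """Read the data point at a 0-based index using only the fixed width."""
    start = index * width
    return struct.unpack("<f", blob[start:start + width])[0]


blob = encode_points([1.5, None, 2.5])
```

Since the missing point keeps its slot, the third sample is still found at byte offset 8, which is what makes position-based lookup possible.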
As an alternative embodiment, the structure of the target file may be implemented by referring to a schematic diagram of the data file structure shown in fig. 3. As shown in fig. 3, each data file contains N data points arranged in time sequence, each data point is stored in binary, and the storage length of the data point can be configured from 1/8 byte (1 bit) to 8 bytes (double-precision floating point number) as required; the N data points in the same data file may be written once or written multiple times, which is not limited in this embodiment.
As a preferred embodiment, when receiving a query instruction for performing data query on the target variable, according to a target query time period, a target query period, and a variable identifier carried in the query instruction, determining a hash identifier and a storage path of one or more files to be queried by using the target hash algorithm, querying the one or more files to be queried based on the hash identifier and the storage path, and acquiring data to be queried of the target variable in the target query time period and the target query period.
In this embodiment, the stored data of the target variable can be queried based on the received query instruction. The target query time period is the time range of the data of the target variable to be queried, and the target query period is the sampling or statistical period of that data.
Further, as a preferred embodiment, in the data query process, specifically, one or more actual query time periods are determined according to the target query time period carried in the query instruction based on a time period division rule matched with the target query period, and further, based on the one or more actual query time periods, the target query period carried in the query instruction, and the variable identifier, the hash identifier and the storage path of one or more files to be queried are determined through a target hash algorithm. Wherein each actual query time interval is matched with a target time interval of a file to be queried.
In this embodiment, since there may be a case where the target query period carried in the query instruction is not exactly the same as the aforementioned target period, the target query period needs to be converted into one or more actual query periods based on a preset period division rule.
For example, assume that under the preset time period division rule, a data file with a target period of 20 ms contains 50 × 60 × 60 = 180000 data points, that is, each data file covers 1 hour. Suppose the target query time period starts between 10:00 and 11:00 and ends between 11:00 and 12:00, so that it crosses the hour boundary at 11:00. Based on the preset time period division rule (for a target period of 20 ms, time periods are divided in units of whole hours), the actual query time periods matching this target query time period are 10:00 to 11:00 and 11:00 to 12:00. The hash identifier and storage path of the file to be queried are then determined in turn for each of the two actual query time periods, that is, for the two data files of the target variable with a target period of 20 ms and target time periods of 10:00 to 11:00 and 11:00 to 12:00 respectively. Optionally, the actual query time periods may instead be the two parts of the target query time period that lie before and after 11:00.
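The conversion from a target query time period to whole-hour actual query time periods can be sketched as follows. The function name `split_into_hours`, the half-open interval convention and the illustrative query window of 10:55 to 11:05 are assumptions; the key format `%Y%m%d%H` mirrors the time strings that appear in the hash inputs of the specific examples.

```python
from datetime import datetime, timedelta


def split_into_hours(start, end):
    """Return the whole-hour periods [h, h + 1h) that overlap [start, end)."""
    cursor = start.replace(minute=0, second=0, microsecond=0)
    periods = []
    while cursor < end:
        periods.append((cursor, cursor + timedelta(hours=1)))
        cursor += timedelta(hours=1)
    return periods


# Illustrative query window that crosses the 11:00 boundary.
periods = split_into_hours(datetime(2021, 9, 10, 10, 55),
                           datetime(2021, 9, 10, 11, 5))
# One time key per file to be queried.
keys = [period_start.strftime("%Y%m%d%H") for period_start, _ in periods]
```

Each resulting key identifies one file, so a query that stays inside a single hour yields exactly one file to be queried.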
Similarly, the time period division rule may specify that, for a target period of 1 minute, time periods are divided in units of a whole day or a whole month; for a target period of 1 hour, in units of a whole month or a whole year; and so on. The target query time period may also be identical to a target time period, or be completely contained in one target time period, in which case there is a single file to be queried rather than several.
As an optional embodiment, the file may further include file header information for data self-description or verification. The file header information may include one or more of the variable identifier, target time period, target period, byte length, data length, encryption algorithm type and compression algorithm type. The file header information can also be stored as check information in at least one external device independent of the present device. During a subsequent data query, the self-describing file header information in the queried data file is compared with the check information stored in the external device to determine whether the queried data file has a data abnormality, which provides redundancy verification.
Specifically, after the one or more files to be queried are queried based on the hash identifier and the storage path, the check information in the external device may also be obtained, whether the check information matches the file header information in the file to be queried is determined, if so, it is determined that the file to be queried has no data exception, the data to be queried of the target variable in the target query time period and the query cycle is obtained, and if not, it is determined that the file to be queried has data exception, and an error prompt is returned.
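The check described above amounts to comparing two descriptions of the same file before returning data. The sketch below assumes the header and the check information are both available as dictionaries; the field names, `header_matches` and `query_with_check` are hypothetical, and how the check information is fetched from the external device is left out.

```python
# Assumed subset of header fields used for the comparison.
HEADER_FIELDS = ("variable_id", "target_time_period", "target_period", "byte_length")


def header_matches(file_header, check_info, fields=HEADER_FIELDS):
    """True when every compared field is identical in both descriptions."""
    return all(file_header.get(name) == check_info.get(name) for name in fields)


def query_with_check(file_header, check_info, read_data):
    """Return the queried data only if the file header passes verification."""
    if not header_matches(file_header, check_info):
        # Corresponds to returning an error prompt for an abnormal data file.
        raise ValueError("file header does not match external check information")
    return read_data()
```

A caller would pass a function that performs the actual read, so no data is touched when verification fails.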
The encryption algorithm type and the compression algorithm type are mainly used for encrypting and/or compressing the data file after data is written in, so that data security is further improved, and data privacy is protected. If encryption and/or compression operation is adopted in the data storage process, the file header information should contain the corresponding encryption algorithm type and/or compression algorithm type, so that in the subsequent data query process, after the file to be queried is determined, the file to be queried is decrypted and/or decompressed by adopting the same encryption algorithm type and/or compression algorithm type as in the storage process, and corresponding query data is read from the file.
Thus, the flow shown in fig. 1 is completed.
As can be seen from the flow shown in fig. 1, in this embodiment the hash identifier and the storage path are determined by a hash digest algorithm from the variable identifier, target period and target time period of the target variable corresponding to the target data; the target file is then created at that path and the target data is written into it, thereby implementing data storage. Data of different variables, time periods and periods is thus stored independently in different files according to the hash algorithm. This takes full advantage of the fact that high-frequency data is acquired at equal time intervals and queried by specific time period, effectively improves storage space utilization, and allows subsequent queries by variable, period and time period to retrieve a higher proportion of useful data. Compared with traditional data storage, it better meets the need to store device data in large-data-volume scenarios.
For data files stored by this method, the file name and storage address to be queried can be determined directly from the query condition (variable identifier + time period + period) using the same hash digest algorithm, giving essentially constant, O(1)-level access time. Compared with traditional full-table traversal or field indexing, this effectively improves retrieval speed in large-data-volume scenarios.
In addition, because the file name is derived from the hash identifier, it does not directly reveal information such as the variable name, sampling period or sampling time period. Even if data is leaked, whoever obtains it cannot infer the meaning of the data from the file name and similar information, and obtains only a string of meaningless binary codes, which effectively improves data security. Moreover, because the time interval between adjacent data points in the file is fixed and the data length is fixed, there is no need to store timestamps, separators or similar content, so the proportion of useful data is essentially 100%, which further improves storage space utilization and query efficiency.
In order to make those skilled in the art better understand the technical solutions provided in the embodiments of the present application, the data processing method described above will be described below with reference to specific embodiments. As an optional embodiment, assume that prototype A at a wind farm in Inner Mongolia needs to acquire and store 1020 high-frequency variables sampled every 20 ms. A two-level subdirectory structure is created under the specified root directory /home/agent/store in the storage manner described above, and each data point is stored with a fixed length of 4 bytes (single-precision floating point number). The target data to be stored includes target sampling data with a target sampling period of 20 ms and target statistical data with target statistical periods of 1 second, 1 minute, 1 hour and 1 day. The target statistical data includes five statistical values: maximum, minimum, average, peak-valley difference and standard deviation. The target hash algorithm is MD5.
Illustratively, the data storage system can run on an Intel J1900 low-power CPU with 2 GB of memory and a 1 TB mechanical hard disk. Practical tests show that, in this configuration, querying historical curves of the stored data through a front-end interface displays smoothly without noticeable lag, and the whole query chain, including data file decompression, completes within 3 seconds. Average memory usage is about 800 MB, average CPU utilization is within 10%, the uncompressed data volume is about 18 GB per day, and the compressed data volume is about 3 GB per day. Compared with traditional data storage, this achieves higher storage space utilization, higher access efficiency and lower hardware requirements.
As a specific example, assume that raw data of the average wind speed needs to be stored, where the variable identifier of the target variable is 1012, the target time period is 2021-09-10 10 (i.e., 10:00 to 11:00 on 2021-09-10) and the target period is 20 ms. The hash calculation is: MD5("1012 #2021091010# 20ms") = A3D35596E3A9B99D 1EF89D0493AC2. The complete calculation result is used as the hash identifier, and a ".bin" file named with the hash identifier in lowercase is created under the storage path /home/agent/store/A3/D3/ as the target file.
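The mapping from query key to file location in this example can be sketched with the standard `hashlib` module. The exact key string (separator, spacing, character encoding) fed to MD5 is an assumption here, so the digest produced by this sketch is not claimed to equal the value quoted above; `locate_file` is a hypothetical helper name.

```python
import hashlib
from pathlib import PurePosixPath


def locate_file(variable_id, time_key, period, root="/home/agent/store"):
    """Derive the hash identifier and two-level storage path for one data file."""
    key = f"{variable_id}#{time_key}#{period}"   # assumed key format
    digest = hashlib.md5(key.encode("utf-8")).hexdigest()  # 32 lowercase hex chars
    # First two character pairs of the digest select the two subdirectory levels.
    level1, level2 = digest[0:2].upper(), digest[2:4].upper()
    path = PurePosixPath(root) / level1 / level2 / f"{digest}.bin"
    return digest, path


hash_id, path = locate_file("1012", "2021091010", "20ms")
```

The same function serves both storage and query: calling it again with the same variable identifier, time key and period yields the same path without any index lookup.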
Every 10 seconds, 500 raw samples with a 20 ms period are written to the file in a batch. Each raw sample is a fixed 4-byte single-precision floating point number, so 2000 bytes are written each time. If data is missing, a special value such as -1.0 or 0 is written as a missing-data marker.
Caching real-time data for a period and then writing it in batches, instead of writing continuously, effectively avoids exceeding the limit on simultaneously open files, since the number of data files being written is proportional to the number of measurement points. For example, if all variables were written in real time, at least 1020 files would have to be handled simultaneously, which imposes a heavy hardware load. With batch writing after caching, the data of each variable can be processed in turn within one cache period, which reduces the number of files handled simultaneously and lowers the computational load. The cache duration before writing also allows a reasonable balance between write efficiency and loss on failure. Because statistical data can be quickly recomputed from the high-frequency data, only the tolerable loss time of the high-frequency data needs to be considered. In practice, the loss caused by batch writing can be eliminated by also storing the original data packet files, so that lost data can still be recovered after a failure.
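The cache-then-write idea can be sketched as a small buffer that appends each variable's cached samples to its file in turn, so only one file is open at a time. The class name `BatchWriter`, the in-memory dictionary cache and the little-endian float format are assumptions for illustration; triggering the flush every 10 seconds is left to the caller.

```python
import struct
from collections import defaultdict


class BatchWriter:
    """Buffers samples per target file and appends them in one write per file."""

    def __init__(self):
        self._cache = defaultdict(list)

    def add(self, path, value):
        """Cache one sample destined for the data file at `path`."""
        self._cache[path].append(value)

    def flush(self):
        """Write every cached batch, opening the files one after another."""
        for path, values in self._cache.items():
            with open(path, "ab") as handle:
                handle.write(struct.pack(f"<{len(values)}f", *values))
        self._cache.clear()
```

With 500 cached samples of 4 bytes each, one flush appends 2000 bytes to the file, matching the batch size in the example above.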
In this embodiment, the target file may be left uncompressed while data is being written and compressed as a whole after writing is complete. This avoids repeated compress-decompress-compress cycles during writing and improves write efficiency. When the target file is accessed, it can be read directly if it is uncompressed; if it is compressed, it is first decompressed to a preset temporary directory (for example, on a memory-based file system) and then read, and the decompressed file is deleted from the temporary directory after a preset timeout to release storage space.
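A minimal sketch of this write-then-compress and decompress-on-read flow is given below. The embodiment does not name a compression algorithm, so gzip is an assumption, as are the `.gz` suffix and the helper names; deleting the decompressed copy after a timeout is not shown.

```python
import gzip
import os
import shutil
import tempfile


def compress_finished_file(path):
    """Compress a fully written data file as a whole and remove the original."""
    gz_path = path + ".gz"
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    os.remove(path)
    return gz_path


def open_for_read(path, tmp_dir=None):
    """Return a readable uncompressed path, decompressing to tmp_dir if needed."""
    if os.path.exists(path):
        return path                      # still uncompressed: read directly
    tmp_dir = tmp_dir or tempfile.gettempdir()
    out_path = os.path.join(tmp_dir, os.path.basename(path))
    with gzip.open(path + ".gz", "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return out_path
```

Pointing `tmp_dir` at a memory-backed directory keeps the temporary decompressed copy off the mechanical disk.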
By reasonably controlling the time span of each type of data file, the total number of data points in a single data file can be controlled. This balances the conflict between batch storage, which improves storage efficiency, and fast decompression, which improves access efficiency, and ensures that after compression the time spent on query and decompression does not noticeably affect user experience. In practice, this time generally needs to be kept on the order of milliseconds, which corresponds to keeping the number of data points in a single data file on the order of 100,000.
As another specific example, assume that minute-level statistical data of the average wind speed needs to be stored, where the variable identifier of the target variable is 1012, the target time period is 2021-09 (i.e., the month of September 2021) and the target period is 1 min. The hash calculation is: MD5("1012 #202109# 1m") = 7C9BA434540CC928 A9F0F7D1DD4603. The complete calculation result is used as the hash identifier, and a ".bin" file named with the hash identifier in lowercase is created under the storage path /home/agent/store/7C/9B/ as the target file.
Every hour, 300 statistical values with a 1-minute statistical interval are written to the file in a batch. The values are written in the fixed order maximum, minimum, average, peak-valley difference, standard deviation, with 60 data points for each statistical value type. Assuming every statistical value is a fixed 4-byte single-precision floating point number, the statistical data for one hour takes 1200 bytes per write.
Based on the above data storage process, as a specific example, assume that the stored raw data of the average wind speed is to be queried, where the variable identifier is 1012 and the target query time period is the 10 minutes of high-frequency raw sampling data between 2021-09-10 10:55 and 11:05. Based on the preset time period division rule, the corresponding actual query time periods are 2021-09-10 10 (i.e., 10:00 to 11:00) and 2021-09-10 11 (i.e., 11:00 to 12:00), and the hash calculations are: MD5("1012 #2021091010# 20ms") = A3D35596E3A9B99D 1EF89D0493AC2 and MD5("1012 #2021091011# 20ms") = DD6A61096E6E118F2B73BE7491E4D89B.
Based on the hash identifiers, the ".bin" file named with the first hash identifier is obtained from /home/agent/store/A3/D3/, decompressed and opened. The data offset is calculated as 55 × 60 × 50 × 4 = 660000 bytes, so 5 × 60 × 50 × 4 = 60000 bytes are read continuously starting from the 660001st byte, yielding 15000 floating point numbers. The file "dd6a61096e6e118f2b73be7491e4d89b.bin" is then obtained from /home/agent/store/DD/6A/, decompressed and opened. There is no data offset for this file, so 5 × 60 × 50 × 4 = 60000 bytes (15000 floating point numbers) are read continuously starting from the 1st byte. Combining the two results gives 30000 floating point numbers, which are the 10 minutes of high-frequency raw sampling data between 2021-09-10 10:55 and 11:05.
The data offset is determined based on the byte length and the corresponding relation between the target query time interval and the actual query time interval, and is used for indicating the initial position of the to-be-queried data needing to be returned in the to-be-queried file; when the file contains header information, the data offset needs to be correspondingly increased by the number of bytes with the same length as the header information.
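The offset arithmetic for the raw-data query can be reproduced with a short calculation. This is a sketch under stated assumptions: the function name `byte_range` and the `header_len` parameter are hypothetical, and the returned offset is 0-based, so an offset of 660000 corresponds to starting at the 660001st byte.

```python
from datetime import datetime, timedelta


def byte_range(file_start, query_start, query_end,
               period_ms=20, width=4, header_len=0):
    """Return (0-based byte offset, byte count) of the queried span in one file.

    file_start is the start of the file's target time period; header_len is
    the size of any file header that precedes the data points.
    """
    step = timedelta(milliseconds=period_ms)
    points_before = (query_start - file_start) // step
    points_wanted = (query_end - query_start) // step
    return header_len + points_before * width, points_wanted * width


# First file (10:00 to 11:00): the last five minutes are needed.
first = byte_range(datetime(2021, 9, 10, 10), datetime(2021, 9, 10, 10, 55),
                   datetime(2021, 9, 10, 11, 0))
# Second file (11:00 to 12:00): the first five minutes are needed.
second = byte_range(datetime(2021, 9, 10, 11), datetime(2021, 9, 10, 11, 0),
                    datetime(2021, 9, 10, 11, 5))
```

When a file carries header information, passing its length as `header_len` shifts the offset by the same number of bytes, as described above.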
As another specific example, assume that the stored minute-level statistical data of the average wind speed is to be queried, specifically the average value among the five statistical values, where the variable identifier is 1012, the target query time period is 2021-09-10 10:00 to 12:00 (two hours in total) and the target query period is 1 min. Based on the preset time period division rule, the corresponding actual query time period is 2021-09 (i.e., the month of September 2021), and the hash calculation is: MD5("1012 #202109# 1m") = 7C9BA434540CC928 A9F0F7D1DD4603.
Based on the hash identifier, the ".bin" file named with the hash identifier is obtained from /home/agent/store/7C/9B/, decompressed and opened. The data offset is calculated as (1440 × 9 + 60 × 10) × 4 = 54240 bytes. Starting from the 54241st byte, the 3rd floating point number of every 5 floating point numbers is read, for a total of 120 floating point numbers. These 120 values are the average statistical data with a target query period of 1 min between 2021-09-10 10:00 and 12:00.
For the data storage performance advantage of the data processing method provided in this embodiment over a conventional database, refer to the data storage performance comparison shown in fig. 4; for the data query performance advantage, refer to the data query performance comparison shown in fig. 5. Practical tests show that, in a high-frequency data scenario, the method occupies about 1/8 to 1/6 of the storage space of a traditional database and queries about 100 to 1000 times faster. This effectively reduces the storage space needed for high-frequency data from industrial equipment, greatly improves query efficiency, lowers the requirements on hardware storage space and computing speed, and ultimately achieves the best economic benefit while meeting service requirements.
The introduction of the above method flow is thus completed with reference to the specific embodiments.
The method provided by the embodiment of the present application is described above, and the apparatus provided by the embodiment of the present application is described below:
Referring to fig. 6, fig. 6 is a structural diagram of an apparatus provided in the embodiment of the present application. As shown in fig. 6, the apparatus may include:
a data acquisition unit 601, configured to acquire target data of a target variable in a target time period and a target period;
a hash determining unit 602, configured to determine, according to at least the target time interval, the target period, and the variable identifier of the target variable, a hash identifier and a storage path of a target file through a target hash algorithm;
a file creating unit 603, configured to create the target file according to the hash identifier under the storage path, and write the target data into the target file.
In a possible implementation manner, when acquiring the target data of the target variable in the target time period and the target period, the data acquisition unit 601 is specifically configured to perform:
acquiring target sampling data of the target variable in a target time period and a target sampling period, and/or acquiring target statistical data of the target variable in the target time period and the target statistical period;
the target statistical period is greater than the target sampling period, and the target statistical data is data obtained by performing statistics on sampling data or other statistical data based on the target statistical period.
In a possible implementation manner, in the hash determining unit 602, when determining the hash identifier and the storage path of the target file by using a target hash algorithm according to at least the target time period, the target period, and the variable identifier of the target variable, specifically:
and determining the hash identification and the storage path of the target file through a target hash algorithm according to the data type of the target data, the target time interval, the target period and the variable identification of the target variable.
In a possible implementation manner, in the hash determining unit 602, when determining the hash identifier and the storage path of the target file by using a target hash algorithm according to at least the target time period, the target period, and the variable identifier of the target variable, specifically:
determining a hash value matched with the target data through a target hash algorithm at least according to the target time interval, the target period and the variable identification of the target variable;
selecting characters at preset positions of the hash value as the hash identifier of the target file;
dividing the characters of the hash identifier, looking up the storage location matching the division result in a preset multi-level file directory, and determining the storage path of the target file.
In a possible implementation manner, in the file creating unit 603, when writing the target data into the target file, specifically:
sequentially writing the data points in the target data into the target file in a binary format according to a time sequence, wherein the binary storage bit number of the data points in the target file is a fixed bit number;
and if a data point is missing because of a fault, filling the slot of that data point to the fixed number of bits using an interpolation algorithm or a preset special value.
In a possible implementation manner, the file creating unit 603 is further configured to:
when a query instruction for performing data query on the target variable is received, determining hash identifications and storage paths of one or more files to be queried through the target hash algorithm according to a target query time interval, a target query period and variable identifications carried in the query instruction;
and inquiring the one or more files to be inquired based on the hash identification and the storage path to obtain the data to be inquired of the target variable in the target inquiry time period and the target inquiry period.
In a possible implementation manner, in the file creating unit 603, when determining, according to the target query time period, the target query period, and the variable identifier carried in the query instruction, the hash identifier and the storage path of one or more files to be queried by using a target hash algorithm, specifically:
determining one or more actual query time periods according to the target query time period carried in the query instruction based on a time period division rule matched with the target query period;
determining hash marks and storage paths of one or more files to be inquired through a target hash algorithm based on the one or more actual inquiry time intervals, the target inquiry period and the variable marks carried in the inquiry instruction; wherein each actual query time interval is matched with a target time interval of a file to be queried.
In a possible implementation manner, the target file and the file to be queried further include file header information for performing data self-description, where the file header information includes at least one of a variable identifier, a target time period, a target period, a byte length, an encryption algorithm type, and a compression algorithm type; the external equipment stores verification information used for performing data verification on the target file;
in the file creating unit 603, after querying the one or more files to be queried based on the storage path and the hash identifier, the file creating unit is further specifically configured to:
acquiring the check information in the external equipment, and judging whether the check information is matched with the file header information in the file to be inquired;
if so, determining that the file to be queried has no data abnormality, and acquiring the data to be queried of the target variable in the target query time interval and the query cycle;
if not, determining that the file to be queried has data exception, and returning an error prompt.
Thus, the description of the structure of the device shown in fig. 6 is completed.
The embodiment of the application also provides a hardware structure of the device shown in fig. 6. Referring to fig. 7, fig. 7 is a structural diagram of an electronic device according to an embodiment of the present disclosure. As shown in fig. 7, the hardware structure may include: a processor and a machine-readable storage medium having stored thereon machine-executable instructions executable by the processor; the processor is configured to execute machine executable instructions to implement the methods disclosed in the examples above.
Based on the same application concept as the method, embodiments of the present application further provide a machine-readable storage medium, where several computer instructions are stored on the machine-readable storage medium, and when the computer instructions are executed by a processor, the method disclosed in the above example of the present application can be implemented.
The machine-readable storage medium may be, for example, any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions, data, and the like. For example, the machine-readable storage medium may be: a RAM (Random Access Memory), a volatile memory, a non-volatile memory, a flash memory, a storage drive (e.g., a hard drive), a solid state drive, any type of storage disk (e.g., an optical disk, a DVD, etc.), or a similar storage medium, or a combination thereof.
The systems, apparatuses, modules or units described in the above embodiments may be specifically implemented by a computer chip or an entity, or implemented by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functionality of the units may be implemented in one or more software and/or hardware when implementing the present application.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Furthermore, these computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A method of data processing, the method comprising:
acquiring target data of a target variable in a target time period and a target period;
determining a hash identifier and a storage path of a target file through a target hash algorithm according to at least the target time period, the target period, and a variable identifier of the target variable;
and creating the target file under the storage path according to the hash identifier, and writing the target data into the target file.
2. The method of claim 1, wherein the target data of the target variable in the target time period and the target period comprises:
target sampling data of the target variable in the target time period and a target sampling period, and/or target statistical data of the target variable in the target time period and a target statistical period;
wherein the target statistical period is greater than the target sampling period, and the target statistical data is data obtained by performing statistics, based on the target statistical period, on sampling data or other statistical data.
3. The method of claim 2, wherein determining the hash identifier and the storage path of the target file through the target hash algorithm according to at least the target time period, the target period, and the variable identifier of the target variable comprises:
determining the hash identifier and the storage path of the target file through the target hash algorithm according to a data type of the target data, the target time period, the target period, and the variable identifier of the target variable.
4. The method of claim 1, wherein determining the hash identifier and the storage path of the target file through the target hash algorithm according to at least the target time period, the target period, and the variable identifier of the target variable comprises:
determining a hash value matching the target data through the target hash algorithm according to at least the target time period, the target period, and the variable identifier of the target variable;
selecting a preset number of characters from the hash value as the hash identifier of the target file;
dividing the characters of the hash identifier, looking up, in a preset multi-level file directory, a storage location matching the division result, and determining the storage path of the target file.
5. The method of claim 1, wherein writing the target data to the target file comprises:
sequentially writing the data points of the target data into the target file in binary format in chronological order, wherein each data point occupies a fixed number of bits in the target file;
and if there is a faulty data point whose data is missing, filling the position of the faulty data point to the fixed number of bits based on an interpolation algorithm or a preset special value.
6. The method according to any one of claims 1-5, further comprising:
when a query instruction for querying data of the target variable is received, determining hash identifiers and storage paths of one or more files to be queried through the target hash algorithm according to a target query time period, a target query period, and a variable identifier carried in the query instruction;
and querying the one or more files to be queried based on the hash identifiers and the storage paths, and acquiring the data to be queried of the target variable in the target query time period and the target query period.
7. The method according to claim 6, wherein determining the hash identifiers and the storage paths of the one or more files to be queried through the target hash algorithm according to the target query time period, the target query period, and the variable identifier carried in the query instruction comprises:
determining one or more actual query time periods from the target query time period carried in the query instruction, based on a time period division rule matching the target query period;
determining the hash identifiers and the storage paths of the one or more files to be queried through the target hash algorithm based on the one or more actual query time periods, the target query period, and the variable identifier carried in the query instruction; wherein each actual query time period matches the target time period of one file to be queried.
8. The method according to claim 6, wherein the target file and the file to be queried each further include file header information for data self-description, the file header information including at least one of a variable identifier, a target time period, a target period, a byte length, an encryption algorithm type, and a compression algorithm type; and an external device stores verification information for performing data verification on the target file;
after the one or more files to be queried are queried based on the hash identifiers and the storage paths, the method further comprises:
acquiring the verification information from the external device, and determining whether the verification information matches the file header information in the file to be queried;
if so, determining that the file to be queried has no data abnormality, and acquiring the data to be queried of the target variable in the target query time period and the target query period;
if not, determining that the file to be queried has a data abnormality, and returning an error prompt.
9. A data processing apparatus, the apparatus comprising:
a data acquisition unit, configured to acquire target data of a target variable in a target time period and a target period;
a hash determining unit, configured to determine a hash identifier and a storage path of a target file through a target hash algorithm according to at least the target time period, the target period, and a variable identifier of the target variable;
and a file creating unit, configured to create the target file under the storage path according to the hash identifier and to write the target data into the target file.
10. An electronic device, comprising: a processor and a machine-readable storage medium;
the machine-readable storage medium stores machine-executable instructions executable by the processor;
the processor is configured to execute the machine-executable instructions to perform the method steps of any of claims 1-8.
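
For illustration only, the file naming, fixed-width writing, and lookup recited in claims 1 and 4 to 6 can be sketched as follows. This is a minimal sketch under stated assumptions, not the applicant's implementation: the choice of SHA-256 as the target hash algorithm, the `|`-separated key format, the 8-character identifier, the two-level directory with two characters per level, the `.bin` suffix, the 8-byte little-endian float encoding, and NaN as the preset special value are all hypothetical, as are the function and constant names.

```python
import hashlib
import struct
from pathlib import Path

# All constants below are illustrative assumptions, not values from the claims.
ID_LENGTH = 8            # preset number of hash characters kept as the identifier
DIR_LEVELS = 2           # depth of the multi-level file directory
CHARS_PER_LEVEL = 2      # identifier characters consumed per directory level
MISSING = float("nan")   # preset special value standing in for a missing point


def hash_id_and_path(root, variable_id, period_start, period_end, period_seconds):
    """Derive the hash identifier and storage directory of one file."""
    key = f"{variable_id}|{period_start}|{period_end}|{period_seconds}"
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    hash_id = digest[:ID_LENGTH]
    # Divide the identifier's leading characters into directory names.
    parts = [
        hash_id[i * CHARS_PER_LEVEL:(i + 1) * CHARS_PER_LEVEL]
        for i in range(DIR_LEVELS)
    ]
    return hash_id, Path(root).joinpath(*parts)


def write_points(root, variable_id, period_start, period_end, period_seconds, points):
    """Write points in time order, 8 bytes each; None marks a missing point."""
    hash_id, directory = hash_id_and_path(
        root, variable_id, period_start, period_end, period_seconds
    )
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / f"{hash_id}.bin"
    with open(target, "wb") as fh:
        for value in points:
            fh.write(struct.pack("<d", MISSING if value is None else value))
    return target


def read_points(root, variable_id, period_start, period_end, period_seconds):
    """Recompute the identifier and path from the query and read the file back."""
    hash_id, directory = hash_id_and_path(
        root, variable_id, period_start, period_end, period_seconds
    )
    data = (directory / f"{hash_id}.bin").read_bytes()
    return [item[0] for item in struct.iter_unpack("<d", data)]
```

Because every data point occupies the same number of bytes, the n-th point of a file sits at byte offset 8·n in this sketch, so a reader could seek directly to a timestamp without an index. The query side needs no lookup table either: it recomputes the same identifier and path from the variable identifier, time period, and period carried in the query.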
CN202211021925.5A 2022-08-24 2022-08-24 Data processing method and device and electronic equipment Pending CN115408390A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211021925.5A CN115408390A (en) 2022-08-24 2022-08-24 Data processing method and device and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211021925.5A CN115408390A (en) 2022-08-24 2022-08-24 Data processing method and device and electronic equipment

Publications (1)

Publication Number Publication Date
CN115408390A true CN115408390A (en) 2022-11-29

Family

ID=84160690

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211021925.5A Pending CN115408390A (en) 2022-08-24 2022-08-24 Data processing method and device and electronic equipment

Country Status (1)

Country Link
CN (1) CN115408390A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116932470A (en) * 2023-09-18 2023-10-24 江苏正泰泰杰赛智能科技有限公司 Method, system and storage medium capable of calculating and storing time sequence data of Internet of things

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116932470A (en) * 2023-09-18 2023-10-24 江苏正泰泰杰赛智能科技有限公司 Method, system and storage medium capable of calculating and storing time sequence data of Internet of things
CN116932470B (en) * 2023-09-18 2024-01-05 江苏正泰泰杰赛智能科技有限公司 Method, system and storage medium capable of calculating and storing time sequence data of Internet of things

Similar Documents

Publication Publication Date Title
CN111125089B (en) Time sequence data storage method, device, server and storage medium
CN104751055B (en) A kind of distributed malicious code detecting method, apparatus and system based on texture
CN108205577B (en) Array construction method, array query method, device and electronic equipment
WO2018132414A1 (en) Data deduplication using multi-chunk predictive encoding
CN107766529B (en) Mass data storage method for sewage treatment industry
JP2024009919A (en) Device for storing storage object data
US11675768B2 (en) Compression/decompression using index correlating uncompressed/compressed content
CN109960612B (en) Method, device and server for determining data storage ratio
CN115408390A (en) Data processing method and device and electronic equipment
CN111008183B (en) Storage method and system for business wind control log data
EP2779520A1 (en) A process for obtaining candidate data from a remote storage server for comparison to a data to be identified
WO2021226922A1 (en) Data compression method, apparatus and device, and readable storage medium
CN112632568B (en) Temperature data storage and acquisition method, system, electronic equipment and storage medium
CN115408350A (en) Log compression method, log recovery method, log compression device, log recovery device, computer equipment and storage medium
CN111274245B (en) Method and device for optimizing data storage
CN114268323B (en) Data compression coding method, device and time sequence database supporting line memory
CN105302915A (en) High-performance data processing system based on memory calculation
CN117526965A (en) Intelligent compression storage method for bank data, computer equipment and storage medium
CN104133883B (en) Telephone number ownership place data compression method
CN102693315A (en) Method and device for removing URL (uniform resource locator) duplicate on basis of shared memory mapping
CN108647243B (en) Industrial big data storage method based on time series
CN111787074A (en) File synchronization method and terminal
CN116842012A (en) Method, device, equipment and storage medium for storing Redis cluster in fragments
CN115061637A (en) Disk data indexing method and device, computer equipment and storage medium
JP7404734B2 (en) Data compression device, history information management system, data compression method and data compression program

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB03 Change of inventor or designer information

Inventor after: Zhou Jinghui

Inventor after: Cheng Dong

Inventor after: Ma Hanzheng

Inventor before: Zhou Jinghui

Inventor before: Cheng Dong

Inventor before: Ma Hanzheng
