CN115630065B - Storage and query method based on multi-compression mode sub-partition table - Google Patents


Info

Publication number
CN115630065B
Authority
CN
China
Prior art keywords
data
format
partition
string
compression mode
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211272183.3A
Other languages
Chinese (zh)
Other versions
CN115630065A (en)
Inventor
周勇亮
贾宗秀
赵冬伟
李晓鹏
关旭
蒋旭
姬涛涛
刘勇生
张昕尧
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
TIANJIN SHENZHOU GENERAL DATA TECHNOLOGY CO LTD
Original Assignee
TIANJIN SHENZHOU GENERAL DATA TECHNOLOGY CO LTD
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by TIANJIN SHENZHOU GENERAL DATA TECHNOLOGY CO LTD
Priority to CN202211272183.3A
Publication of CN115630065A
Application granted
Publication of CN115630065B
Legal status: Active
Anticipated expiration


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2282 Tablespace storage structures; Management thereof
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/22 Indexing; Data structures therefor; Storage structures
    • G06F16/2228 Indexing structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/245 Query processing
    • G06F16/2453 Query optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061 Improving I/O performance
    • G06F3/0611 Improving I/O performance in relation to response time
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/064 Management of blocks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671 In-line storage system
    • G06F3/0673 Single storage device
    • G06F3/0674 Disk device
    • G06F3/0676 Magnetic disk device
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Software Systems (AREA)
  • Computational Linguistics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a storage and query method based on a multi-compression mode sub-partition table, comprising the following steps: step S1, receiving a series of block data streamed in a predetermined format; step S2, parsing the block data according to the predetermined format to obtain its data composition; step S3, analyzing the different parts of the data composition and compressing the data blocks with correspondingly different compression modes; step S4, based on the compression mode adopted, matching the corresponding partition number segment in a first lookup table, setting a separate index partition type field in the compressed data, and using the matched index partition number segment as additional data; step S5, based on the partition number mark, storing the data in the corresponding sub-partition table and recording the index and compression mode field of the corresponding data; step S6, during data storage, allocating contiguous spaces of different sizes to the different sub-partition tables; step S7, the user searching the corresponding index storage table based on the data compression mode or the data format type.

Description

Storage and query method based on multi-compression mode sub-partition table
Technical Field
The invention relates to the technical field of computer databases, in particular to a storage and query method based on a multi-compression mode sub-partition table.
Background
With the growth of Internet big data, ever more massive data must be stored. The data comes from many sources and its formats differ widely. A database that stores all data through one fixed storage procedure can write quickly, but retrieval is very slow; with extremely large data volumes in particular, the disk is searched and read so frequently that its service life tends to be shortened. In addition, during the debugging and actual measurement of engineering instruments, the test instruments are accessed very frequently and a large amount of test data is generated every day. This data accumulates on the hard disk day after day and month after month, so the volume becomes very large, and because it consists of irregular, log-like records it cannot be organized and managed effectively, which makes later search and query inconvenient.
Disclosure of Invention
To solve the above technical problems, the invention provides a storage and query method based on a multi-compression mode sub-partition table. The method applies different compression modes to different types of data, sets up different index structures for storage, and stores data compressed in different modes at different disk partition locations, so that a search can exploit the characteristics of the data type to run quickly, improving both search and storage efficiency.
The technical scheme of the invention is as follows: a storage and query method based on a multi-compression mode sub-partition table comprises the following steps:
step S1, receiving a series of block data streamed in a predetermined format;
step S2, parsing the block data according to the predetermined format to obtain its data composition;
step S3, analyzing the different parts of the data composition and compressing the data blocks with correspondingly different compression modes according to preset rules;
step S4, based on the compression mode adopted, matching the corresponding partition number segment in a first lookup table, setting a separate index partition type field in the compressed data, and filling the matched index partition number segment into the compressed data as additional data, to obtain compressed data carrying an index partition number mark;
step S5, based on the partition number mark, storing the data in the corresponding sub-partition table, and recording the index and compression mode field of the corresponding data;
step S6, during data storage, allocating contiguous spaces of different sizes to the different sub-partition tables for storage;
and step S7, the user entering the data to be queried together with a data compression mode or data format type determined in advance, and searching the corresponding index storage table based on that data compression mode or data format type.
Further, in step S1, a series of block data streamed in a predetermined format is received, where the predetermined format is one of the following:
the simple short control string format, in which the characters form a control string with no data part and the string length is smaller than a first threshold;
the simple complex control string format, in which the characters form a control string with no data part and the string length is larger than the first threshold;
the simple string plus data content format, which comprises a control string and data content, with the control string positioned in front of the data content;
the short data content format, which comprises only data content and has a length smaller than a third threshold;
the long data content format, which comprises only data content and has a length greater than the third threshold.
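The five formats listed above can be told apart with a small classifier. The following is a minimal sketch, not the patented implementation: the threshold values of 10 characters, the regular expressions, the labels A1, A2, B, C1 and C2, and the function name `classify_block` are all assumptions made for illustration, and the treatment of strings whose length exactly equals a threshold is likewise an arbitrary choice.

```python
import re

# Assumed thresholds; the text only says strings are shorter or longer than them.
FIRST_THRESHOLD = 10   # control strings shorter than this count as "short"
THIRD_THRESHOLD = 10   # data content shorter than this counts as "short"

# One number, optionally in scientific notation, e.g. 300 or 1.12823E-8.
_NUMBER = r"[+-]?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?"
# Pure data content: one or more comma-separated numbers.
_DATA_ONLY = re.compile(rf"^{_NUMBER}(?:,{_NUMBER})*$")
# A control string ending in a colon, followed by data content.
_CONTROL_THEN_DATA = re.compile(
    rf"^(?P<control>.*[A-Za-z].*:)(?P<data>{_NUMBER}(?:,{_NUMBER})*)$"
)


def classify_block(block: str) -> str:
    """Return a hypothetical format label for one received block."""
    text = block.strip()
    if _DATA_ONLY.match(text):
        # Data only: short (C1) or long (C2) depending on the third threshold.
        return "C1" if len(text) < THIRD_THRESHOLD else "C2"
    if _CONTROL_THEN_DATA.match(text):
        # Control string in front of data content.
        return "B"
    # Control string only: short (A1) or long (A2) depending on the first threshold.
    return "A1" if len(text) < FIRST_THRESHOLD else "A2"


if __name__ == "__main__":
    for sample in ("CETC3572", "Sense:Frequency:Start:100000000", "300"):
        print(sample, "->", classify_block(sample))
```

A real instrument stream would need a classifier driven by the actual command grammar; the regular expressions here only recognize plain decimal and scientific-notation numbers.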
Further, in step S2, the block data is parsed according to the predetermined format to obtain its data composition:
for the simple short control string format, the control string is extracted directly;
for the simple complex control string format, the string is extracted, its length is calculated, and some of the keywords in the string are extracted;
for the simple string plus data content format, the position and length of the data content are determined from the control string, and the data content is extracted using that position and length;
for the short data content format, the data content is extracted directly;
for the long data content format, the data content is extracted directly and its character length is counted.
Further, in step S3, the different parts of the data composition are analyzed and the data blocks are compressed with correspondingly different compression modes according to preset rules, specifically:
for the simple short control string format, the extracted control string is stored directly in a first format, namely as the original characters, with the date and the command format type added in front of the original characters;
for the short data content format, the extracted data content is stored directly in a second format, namely as the original numerical value, with the date and the command format type added in front of the original characters;
the simple complex control string format is stored in a third format, with the date, command format type, keywords and string length added in front of the original characters, the keywords being those extracted earlier;
the long data content format is compressed and stored in a fourth format, with the date and command format type added in front;
for the simple string plus data content format, the first half keeps the original data, and the second half is either compressed according to a fifth format or stored as the original data, with the date, command format type and keywords added in front.
Further, in step S4, based on the compression mode adopted, the corresponding partition number segment is matched in the first lookup table, a separate index partition type field is set in the compressed data, and the matched index partition number segment is filled into the compressed data as additional data, to obtain compressed data carrying an index partition number mark;
different compression modes correspond to different partition number segments: the first to fifth compression modes correspond to the first to fifth partition number segments, each partition number segment is smaller than the one before it, and gap number segments are reserved between the partition number segments;
each partition number segment is obtained, and its value is added as a partition field at a preset position in the compressed data of the first to fifth formats.
Further, in step S5, based on the partition number mark, the data is stored in the corresponding sub-partition table, and the index and compression mode field of the corresponding data are recorded;
the disk is partitioned according to the number segments, and the width of each number segment is proportional to the disk space allocated to it; the data volume of the current number segment and its disk space usage are counted and the allocation is adjusted dynamically, with each partition corresponding to one compression mode.
Further, in step S6, during data storage, contiguous spaces of different sizes are allocated to the different sub-partition tables for storage.
Further, in step S7, the user enters the data to be queried together with a data compression mode or data format type determined in advance, and the corresponding index storage table is searched based on that data compression mode or data format type.
Further, in a query, the user enters the conditions at the input end: the date, the keyword, and the command format type, for example B. The number segment partition corresponding to that format is located in the database, and the storage table is searched for the entries matching the date and the keyword.
Advantageous effects
The invention can analyze and process large amounts of data in different command formats from different devices. It extracts features and applies compression modes suited to the characteristics of the data, assigns data of different format types to different storage areas, and adds the index of each storage area as a field to the processed, compressed data before storing it. For data that carries command information, keyword content is extracted to support quick queries, and number segment range information is added to every data item. Massive data in different formats can therefore be stored and queried quickly.
Drawings
Fig. 1: schematic diagram of a host connected to a plurality of test devices for data testing and storage;
Fig. 2: flow chart of the method of the invention;
Fig. 3: schematic diagram of several data formats of the test devices;
Fig. 4: schematic diagram of the different compression modes the invention adopts for the various data formats and the corresponding storage partitions.
Detailed Description
The technical solutions of the embodiments of the invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are evidently only some, not all, of the embodiments of the invention; all other embodiments that those skilled in the art obtain from them without inventive effort fall within the scope of protection of the invention.
As shown in fig. 1, a host 1 is connected to various instruments and equipment 2 through data cables. The host 1 is a desktop computer, a laptop computer, a pad, a mobile terminal or the like, and the data cable is a GPIB industrial control bus, a LAN network cable or the like. The host 1 is connected to several types of instruments and equipment 2 at the same time, for example a spectrum analyzer, a voltmeter, a frequency counter and a vector network analyzer. The host 1 controls the instruments and equipment 2 to perform tests, and the test data is stored in a memory 3 that is local or connected to the host. With a large number of test samples, a large amount of varied test data is generated; in addition, joint debugging of the instruments produces command read/write data, log records and the like. How this data is stored strongly affects how quickly it can later be read, searched and queried. Because so much varied test data is generated every day, data that is not organized and processed efficiently makes subsequent search and query very inefficient and slow.
According to an embodiment of the present invention, a storage and query method based on a multi-compression mode sub-partition table is provided, as shown in fig. 2, including the following steps:
step S1, receiving a series of block data streamed in a predetermined format;
step S2, parsing the block data according to the predetermined format to obtain its data composition;
step S3, analyzing the different parts of the data composition and compressing the data blocks with correspondingly different compression modes according to preset rules;
step S4, based on the compression mode adopted, matching the corresponding partition number segment in a first lookup table, setting a separate index partition type field in the compressed data, and filling the matched index partition number segment into the compressed data as additional data, to obtain compressed data carrying an index partition number mark;
step S5, based on the partition number mark, storing the data in the corresponding sub-partition table, and recording the index and compression mode field of the corresponding data;
step S6, during data storage, allocating contiguous spaces of different sizes to the different sub-partition tables for storage;
and step S7, the user entering the data to be queried together with a data compression mode or data format type determined in advance, and searching the corresponding index storage table based on that data compression mode or data format type.
Further, as shown in fig. 3, in step S1, a series of block data streamed in a predetermined format is received, where the predetermined format is one of the following:
the simple short control string format, namely the A1 format, in which the characters form a control string with no data part and the string length is smaller than a first threshold; for example, CETC3572 in the first row of fig. 3 belongs to this format, as do commands such as IDN?; the string is typically short, e.g., within 10 characters;
the simple complex control string format, namely the A2 format, in which the characters form a control string or response string with no data part and the string length is larger than the first threshold; for example, the second row of fig. 3, "Continuous;Sense:Window:Display;Calculate:Format:Device?", belongs to this class, but the string is somewhat longer, typically more than 10 characters;
the simple string plus data content format, namely the B format, which comprises a control string and data content, with the control string positioned in front of the data content; for example, the third row of fig. 3, "Sense:Frequency:Start:100000000", comprises the control string "Sense:Frequency:Start:" and the data content "100000000";
the short data content format, namely the C1 format, which comprises only data content and has a length smaller than a third threshold; for example, the fourth row of fig. 3, "300"; the data is typically short, e.g., fewer than 10 characters;
the long data content format, namely the C2 format, which comprises only data content and has a length greater than the third threshold. For example, the fifth row of fig. 3:
"1.1283433E-8,1.12823E-8,1.2283433E-8,1.34533E-8,1.5289433E-8,1.3383433E-8,1.4283433E-9". This piece of data represents the magnitudes of points on a curve; because a curve has many points, the data can be very long, e.g., several thousand bytes.
Further, in step S2, the block data is parsed according to the predetermined format to obtain its data composition:
for the simple short control string format, the control string is extracted directly; for example, for CETC3572 the string CETC3572 can be extracted directly, and optionally the string is also taken as a keyword and attached in front of the original characters during subsequent data processing;
for the simple complex control string format, the string is extracted, its length is calculated, and some of the keywords in the string are extracted; for example, for "Continuous;Sense:Window:Display;Calculate:Format:Device?" the calculated length value is 51, and the extracted keywords are the words between the delimiters, for example continuous, window display, format device and so on; generally the last word is extracted;
for the simple string plus data content format, the position and length of the data content are determined from the control string, and the data content is extracted using that position and length; for example, for "Sense:Frequency:Start:100000000", the extracted string part is "Sense:Frequency:Start", the data part is 100000000, and the keyword start is further extracted;
for the short data content format, the data content is extracted directly; for example, for the fourth row of data, 300 is extracted directly;
for the long data content format, the data content is extracted directly, its character length is counted, and the data is either grouped as an array or left ungrouped.
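The per-format extraction rules of step S2 can be sketched as follows. This is an illustrative sketch under stated assumptions, not the patented parser: the function name `parse_block`, the returned dictionary keys, the choice of `;`, `:`, `?` and whitespace as word delimiters, and the use of the last colon to separate control string from data are all hypothetical. The format label is assumed to have been determined beforehand.

```python
import re


def parse_block(fmt: str, text: str) -> dict:
    """Extract the data composition of one block, given its format label."""
    if fmt == "A1":
        # Short control string: taken as-is.
        return {"control": text}
    if fmt == "A2":
        # Long control string: keep it, record its length, and take the
        # last word between delimiters as the keyword.
        words = [w for w in re.split(r"[;:?\s]+", text) if w]
        return {"control": text, "length": len(text), "keywords": words[-1:]}
    if fmt == "B":
        # Control string plus data: the last colon is assumed to mark where
        # the data content starts, which yields its position and length.
        control, _, data = text.rpartition(":")
        position = len(control) + 1
        length = len(data)
        return {
            "control": control,
            "position": position,
            "length": length,
            "data": text[position:position + length],
            "keywords": control.split(":")[-1:],
        }
    if fmt == "C1":
        # Short data content: taken as-is.
        return {"data": text}
    if fmt == "C2":
        # Long data content: keep it, count characters, group as an array.
        return {"data": text, "length": len(text), "values": text.split(",")}
    raise ValueError(f"unknown format label: {fmt!r}")


if __name__ == "__main__":
    print(parse_block("B", "Sense:Frequency:Start:100000000"))
```

Returning the position and length alongside the data mirrors the wording of the step, even though `rpartition` already yields the data directly.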
Further, in step S3, the different parts of the data composition are analyzed and the data blocks are compressed with correspondingly different compression modes according to preset rules, specifically:
for the simple short control string format A1, the extracted control string is stored directly in a first format, namely as the original characters, with the date and the command format type added in front of the original characters, where the command format type is one of A1, A2, B, C1 and C2 described above;
for the short data content format, the extracted data content is stored directly in a second format, namely as the original numerical value, with the date and the command format type added in front of the original characters;
the simple complex control string format is stored in a third format, with the date, command format type, keywords and string length added in front of the original characters, the keywords being those extracted earlier;
the long data content format is compressed and stored in a fourth format, with the date and command format type added in front; the compression mode can be predictive coding, transform coding or the like;
for the simple string plus data content format, the first half keeps the original data, and the second half is either compressed according to a fifth format or stored as the original data, with the date, command format type and keywords added in front.
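A minimal sketch of how the five storage formats of step S3 might be assembled is given below. Several points are assumptions rather than details from the text: `zlib` is swapped in as a generic stand-in for the predictive or transform coding the description mentions, the header is a semicolon-separated string, the "first half" of a B-format block is taken to be the control string and the "second half" the data content, and the `Record` class and `build_record` function are hypothetical names.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Record:
    header: str          # fields added in front of the payload
    raw: str = ""        # part stored as original characters
    packed: bytes = b""  # part stored compressed (zlib is a stand-in codec)


def build_record(fmt, date, control="", data="", keywords=()):
    """Assemble the stored record for one parsed block."""
    key = ",".join(keywords)
    if fmt == "A1":   # first format: original characters
        return Record(f"{date};{fmt}", raw=control)
    if fmt == "C1":   # second format: original numerical value
        return Record(f"{date};{fmt}", raw=data)
    if fmt == "A2":   # third format: original characters plus keywords and length
        return Record(f"{date};{fmt};{key};{len(control)}", raw=control)
    if fmt == "C2":   # fourth format: compressed data content
        return Record(f"{date};{fmt}", packed=zlib.compress(data.encode("ascii")))
    if fmt == "B":    # fifth format: control string kept raw, data compressed
        return Record(
            f"{date};{fmt};{key}",
            raw=control,
            packed=zlib.compress(data.encode("ascii")),
        )
    raise ValueError(f"unknown format label: {fmt!r}")


if __name__ == "__main__":
    rec = build_record("B", "20220910", control="Sense:Frequency:Start",
                       data="100000000", keywords=["Start"])
    print(rec.header, rec.raw, zlib.decompress(rec.packed).decode("ascii"))
```

Keeping the raw and compressed parts in separate fields avoids having to mark where the compressed bytes begin; a byte-level layout would need an explicit length or offset field for that.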
Further, in step S4, based on the compression mode adopted (which is equivalent to the data format described above), the corresponding partition number segment is matched in the first lookup table, a separate index partition type field is set in the compressed data, and the matched index partition number segment is filled into the compressed data as additional data, to obtain compressed data carrying an index partition number segment mark; for example:
partition number segment corresponding to the B format: 0xHH 01108 011000 …… 0xHH014010
Date: 20220910; type: B; scale: 11000to14010; key: frequency, set; length: 201; data: …… XXXXX XXXX; here the added field is scale, 11000to14010;
different compression modes correspond to different partition number segments: the first to fifth compression modes correspond to the first to fifth partition number segments, the size of each partition number segment can be adjusted according to the data volume, and gap number segments are reserved between the partition number segments; for example, a predetermined gap is reserved between the A1 segment and the A2 segment to prevent range overflow caused by data growing too quickly;
each partition number segment is obtained, and its value is added as a partition field at a preset position in the compressed data of the first to fifth formats.
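The lookup of step S4 can be sketched as a small table mapping each format to a number segment. Only the B-format range 11000 to 14010 and the `scale` field name come from the example above; the other four ranges, the table name `SEGMENT_TABLE` and both function names are hypothetical values chosen so that a gap is left between neighbouring segments.

```python
# Hypothetical first lookup table: format label -> (start, end) of its
# partition number segment. Only the B range is taken from the example.
SEGMENT_TABLE = {
    "A1": (1000, 2000),
    "C1": (3000, 4000),
    "A2": (5000, 6500),
    "C2": (7500, 10000),
    "B": (11000, 14010),
}


def tag_with_segment(record: dict, fmt: str) -> dict:
    """Return a copy of the record with the matched segment added as 'scale'."""
    start, end = SEGMENT_TABLE[fmt]
    tagged = dict(record)
    tagged["scale"] = f"{start}to{end}"
    return tagged


def gaps_between_segments(table: dict) -> list:
    """Sizes of the reserved gaps between neighbouring number segments."""
    ordered = sorted(table.values())
    return [nxt[0] - cur[1] - 1 for cur, nxt in zip(ordered, ordered[1:])]


if __name__ == "__main__":
    rec = {"date": "20220910", "type": "B", "key": "frequency,set"}
    print(tag_with_segment(rec, "B"))
    print(gaps_between_segments(SEGMENT_TABLE))
```

Checking that every gap is positive is a cheap way to confirm that no two segments overlap after the table has been resized.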
Further, in step S5, based on the partition number mark, the data is stored in the corresponding sub-partition table, and the index and compression mode field of the corresponding data are recorded.
The disk is partitioned according to the number segments, and the width of each number segment is proportional to the disk space allocated to it; the data volume of the current number segment and its disk space usage are counted and the allocation is adjusted dynamically, with each partition corresponding to one compression mode.
Further, in step S6, during data storage, contiguous spaces of different sizes are allocated to the different sub-partition tables for storage. The sub-partition table of the invention builds on the traditional disk partition table: it is a sub-partition table set up on top of a conventional partition table, with separate sub-partitions divided out so that storage can follow the compression mode.
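The proportional relation between segment width and disk space can be sketched with integer arithmetic. This is an illustration only: the function names, the 80% usage trigger for dynamic adjustment, and the sample segment ranges are assumptions, since the text does not say how or when the adjustment is made.

```python
def allocate_space(segments: dict, total_bytes: int) -> dict:
    """Split total_bytes across partitions in proportion to segment width."""
    widths = {fmt: end - start + 1 for fmt, (start, end) in segments.items()}
    total_width = sum(widths.values())
    return {fmt: total_bytes * w // total_width for fmt, w in widths.items()}


def partitions_needing_adjustment(used: dict, allocated: dict,
                                  high_water: float = 0.8) -> list:
    """List partitions whose usage exceeds the assumed high-water fraction."""
    return [fmt for fmt in allocated
            if used.get(fmt, 0) > high_water * allocated[fmt]]


if __name__ == "__main__":
    segments = {"A1": (0, 99), "B": (200, 499)}   # hypothetical ranges
    allocated = allocate_space(segments, 4000)
    print(allocated)
    print(partitions_needing_adjustment({"A1": 900, "B": 1000}, allocated))
```

A partition flagged this way could be given part of the reserved gap next to its number segment, which is one plausible reading of the dynamic adjustment described above.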
Further, in step S7, the user enters the data to be queried together with a data compression mode or data format type determined in advance, and the corresponding index storage table is searched based on that data compression mode or data format type.
For example, suppose the user needs to look up a command related to the frequency range that was issued on September 5, 2021. The user enters the conditions at the input end: the date 2021.09.05, the keyword frequency, and the command format type B. The number segment partition corresponding to the B format can then be located quickly in the database, the dates in its storage table are searched, and the keyword frequency is looked up among all entries dated 2021.09.05. The query is fast because matching data does not have to be found by traversing all the data, and the amount of data accessed can be reduced by at least 80%.
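The query path of step S7 can be sketched with an in-memory store that keeps one table per format, so that a lookup scans only the table for the requested format. The class name `SubPartitionStore`, its methods, and the entry layout are hypothetical; a real system would read from the on-disk sub-partitions instead of Python lists.

```python
from collections import defaultdict


class SubPartitionStore:
    """Toy store with one entry list per command format type."""

    def __init__(self):
        self._tables = defaultdict(list)

    def insert(self, fmt, date, keywords, payload):
        # Keywords are lower-cased so that lookups are case-insensitive.
        self._tables[fmt].append({
            "date": date,
            "keywords": {k.lower() for k in keywords},
            "payload": payload,
        })

    def query(self, fmt, date, keyword):
        """Scan only the table for fmt, filtering by date and keyword."""
        return [
            entry["payload"]
            for entry in self._tables.get(fmt, [])
            if entry["date"] == date and keyword.lower() in entry["keywords"]
        ]


if __name__ == "__main__":
    store = SubPartitionStore()
    store.insert("B", "2021.09.05", ["Frequency", "Start"],
                 "Sense:Frequency:Start:100000000")
    store.insert("B", "2021.09.06", ["Frequency", "Stop"],
                 "Sense:Frequency:Stop:200000000")
    store.insert("A1", "2021.09.05", ["CETC3572"], "CETC3572")
    print(store.query("B", "2021.09.05", "frequency"))
```

Entries of other formats are never touched by the query, which is the source of the reduction in data access that the example describes.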
Although illustrative embodiments of the invention have been described above to help those skilled in the art understand it, the invention is not limited to the scope of those embodiments. To those skilled in the art, various changes are apparent, and all changes that fall within the spirit and scope of the invention as defined by the appended claims are protected.

Claims (4)

1. The storage and query method based on the multi-compression mode sub-partition table is characterized by comprising the following steps:
step S1, receiving a series of block data which are streamed according to a preset format; the step S1 is to receive a series of block data streamed according to a predetermined format, where the predetermined format refers to:
the simple short control character string format is characterized in that characters are control character strings without data format, and the length of the character strings is smaller than a first threshold value;
the simple complex control character string format is characterized in that characters are control character strings without data format, and the character string length is larger than a first threshold value;
the simple string is connected with the data content format, and comprises a control string format and the data content, wherein the control string format is positioned in front of the data content;
a short data content format comprising only data content and having a length less than a third threshold;
a long data content format including only data content and having a length greater than a third threshold;
s2, analyzing based on a preset format to obtain the data composition in the block data; step S2, analyzing based on a preset format to obtain the data composition in the block data;
for the simple short control string format, directly extracting the control string;
for a simple and complex control string format, extracting a string, and calculating the length value of the string; extracting part of keywords in the character string;
for a simple string data content format, determining the position and length of the data content based on the control string format, and extracting the data content based on the position and length data;
for short data content format, directly extracting data content;
for the long data content format, directly extracting the data content, and counting the data character length;
step S3, analyzing the different parts of the data and compressing the data blocks with different compression modes according to preset rules, specifically comprising:
for the simple short control string format, after the control string is extracted, storing it directly in a first format, namely the original characters, with the date and command format type added in front of the original characters;
for the short data content format, after the data content is extracted, storing it directly in a second format, namely the original numerical value, with the date and command format type added in front of the original characters;
for the simple complex control string format, storing it in a third format, with the date, command format type, keywords, and string length added in front of the original characters, the keywords being those extracted previously;
for the long data content format, compressing and storing it in a fourth format, with the date and command format type added in front;
for the simple string with data content format, retaining the original data for the first half and storing the second half either compressed in a fifth format or as original data, with the date, command format type, and keywords added in front;
step S4, based on the compression mode adopted, matching the corresponding partition number segment in a first lookup table, setting a separate index partition type field in the compressed data, and filling the matched index partition number segment into the compressed data as additional data, to obtain compressed data carrying an index partition number mark;
step S5, storing the data into the corresponding partition table based on the partition number mark, and recording the index and compression mode field of the corresponding data;
step S6, in the data storage process, allocating contiguous spaces of different sizes to the different partition tables for storage;
and step S7, receiving the data to be queried together with a data compression mode or data format type judged in advance by the user, and searching the corresponding index storage table based on that data compression mode or data format type.
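Steps S2 and S3 can be illustrated with a minimal Python sketch. It is not the patented implementation: the threshold values, the keyword rule, the use of `zlib` as the compressor, the dictionary header, and the `control_len` and `compressed` bookkeeping fields are all assumptions introduced for illustration, since the claims name the thresholds and formats but do not specify them.

```python
import zlib
from dataclasses import dataclass

# Hypothetical threshold values; the claims name the thresholds but give no numbers.
FIRST_THRESHOLD = 16   # control strings longer than this count as "complex"
THIRD_THRESHOLD = 64   # data content longer than this counts as "long"

# Storage formats, numbered in the order step S3 lists them.
SHORT_CONTROL, SHORT_DATA, COMPLEX_CONTROL, LONG_DATA, CONTROL_WITH_DATA = 1, 2, 3, 4, 5


@dataclass
class StoredRecord:
    mode: int       # storage format 1..5
    header: dict    # fields placed in front of the payload
    payload: bytes  # original or compressed body


def classify(control: str, data: str) -> int:
    """Pick a format from the parsed parts of a block (step S2)."""
    if control and data:
        return CONTROL_WITH_DATA
    if control:
        return COMPLEX_CONTROL if len(control) > FIRST_THRESHOLD else SHORT_CONTROL
    # Lengths exactly at the threshold are treated as short (an assumption).
    return LONG_DATA if len(data) > THIRD_THRESHOLD else SHORT_DATA


def extract_keywords(control: str, limit: int = 2) -> list:
    """Hypothetical keyword rule: the first few whitespace-separated tokens."""
    return control.split()[:limit]


def store(date: str, control: str, data: str) -> StoredRecord:
    """Build the stored form of one block (step S3)."""
    mode = classify(control, data)
    header = {"date": date, "type": mode}
    if mode == SHORT_CONTROL:
        payload = control.encode()
    elif mode == SHORT_DATA:
        payload = data.encode()
    elif mode == COMPLEX_CONTROL:
        header["keywords"] = extract_keywords(control)
        header["length"] = len(control)
        payload = control.encode()  # kept as original characters in this sketch
    elif mode == LONG_DATA:
        payload = zlib.compress(data.encode())
    else:
        header["keywords"] = extract_keywords(control)
        head, tail = control.encode(), data.encode()
        packed = zlib.compress(tail)
        compressed = len(packed) < len(tail)
        # Bookkeeping fields (assumed) so the two halves can be separated again.
        header["control_len"] = len(head)
        header["compressed"] = compressed
        payload = head + (packed if compressed else tail)
    return StoredRecord(mode, header, payload)
```

In this sketch, `store("2022-10-18", "RESET", "")` keeps the control string as original characters, while a long data block is compressed. For the fifth format the second half is compressed only when that actually makes it shorter, which is one way to read "compressed or according to the original data".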
2. The storage and query method based on a multi-compression mode sub-partition table according to claim 1, wherein in step S4, based on the compression mode adopted, the corresponding partition number segment is matched in the first lookup table, a separate index partition type field is set in the compressed data, and the matched index partition number segment is filled into the compressed data as additional data, to obtain compressed data carrying an index partition number mark;
wherein different compression modes correspond to different partition number segments, the first to fifth compression modes correspond to the first to fifth partition number segments, the partition number segments decrease in turn, and reserved gap number segments are left between the partition number segments;
and each partition number segment is acquired and attached, as a partition field, at a preset position in the resulting compressed data of the first to fifth formats.
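The first lookup table of claim 2 can be sketched as follows. This is an illustration under stated assumptions, not the claimed implementation: "decrease in turn" is read here as each segment being narrower than the one before it, the segment widths and gap size are invented, the "preset position" is taken to be the first four bytes, and the sequence-number rule for choosing a number inside a segment is hypothetical.

```python
import struct


def build_segments(widths, gap):
    """Return {mode: range} for modes 1..n with `gap` unused numbers between segments."""
    if any(a <= b for a, b in zip(widths, widths[1:])):
        raise ValueError("segment widths must strictly decrease")
    table, start = {}, 0
    for mode, width in enumerate(widths, start=1):
        table[mode] = range(start, start + width)
        start += width + gap  # skip the reserved gap segment
    return table


# Hypothetical widths and gap; the claim gives no numbers.
SEGMENTS = build_segments([5000, 4000, 3000, 2000, 1000], gap=500)


def tag(mode: int, payload: bytes, seq: int) -> bytes:
    """Prepend a partition number from the segment of `mode` as a 4-byte field."""
    segment = SEGMENTS[mode]
    number = segment.start + seq % len(segment)
    return struct.pack(">I", number) + payload


def mode_of(tagged: bytes) -> int:
    """Recover the compression mode from the partition field of tagged data."""
    (number,) = struct.unpack_from(">I", tagged)
    for mode, segment in SEGMENTS.items():
        if number in segment:
            return mode
    raise ValueError(f"partition number {number} falls in a reserved gap")
```

Because the partition number alone identifies the segment, `mode_of` can route a record to its partition table without inspecting the payload. The reserved gaps leave room to widen a segment later without renumbering its neighbours.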
3. The storage and query method based on a multi-compression mode sub-partition table according to claim 1, wherein in step S5, the data is stored into the corresponding partition table based on the partition number mark, and the index and compression mode field of the corresponding data are recorded;
the disk is partitioned according to the number segments, the width of each number segment is proportional to the disk space allocated to it, the data volume of the current number segment and its disk space occupancy are counted and dynamically adjusted, and each partition corresponds to one compression mode.
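The proportional allocation in claim 3 can be sketched as below. The claim does not say how the dynamic adjustment is performed, so the `rebalance` policy here (each partition keeps what it already uses, and the remaining free space is redistributed in proportion to segment width) is purely an assumed example, as are the function names and the numbers.

```python
def allocate(total: int, widths: dict) -> dict:
    """Split `total` bytes across partitions in proportion to segment width."""
    whole = sum(widths.values())
    return {mode: total * width // whole for mode, width in widths.items()}


def rebalance(total: int, widths: dict, used: dict) -> dict:
    """Assumed adjustment policy: keep used space, share free space by width."""
    free = total - sum(used.values())
    if free < 0:
        raise ValueError("partitions use more space than the disk provides")
    whole = sum(widths.values())
    return {mode: used[mode] + free * widths[mode] // whole for mode in widths}


# Hypothetical segment widths for the five compression modes.
WIDTHS = {1: 5000, 2: 4000, 3: 3000, 4: 2000, 5: 1000}
initial = allocate(15_000_000, WIDTHS)
adjusted = rebalance(15_000_000, WIDTHS, {1: 0, 2: 0, 3: 0, 4: 0, 5: 3_000_000})
```

With these numbers the fifth partition starts with 1,000,000 bytes and, once it holds 3,000,000 bytes of data, is granted more room at the expense of the partitions that are still empty.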
4. The storage and query method based on a multi-compression mode sub-partition table according to claim 3, wherein,
in a query, the user enters the conditions at the input terminal: a date, a keyword, and a command format type of B; the number segment partition corresponding to that format is located in the database to be queried, and the entries in the storage table corresponding to the date and keyword are searched.
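The query of claim 4 can be sketched as a lookup that touches only the partition matching the user-supplied format type. The `Entry` fields, the in-memory dictionary standing in for the partition tables, and the sample records are hypothetical; the claims do not describe the table layout.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    date: str
    keywords: tuple
    payload: bytes


def query(tables: dict, format_type: str, date: str, keyword: str) -> list:
    """Search only the partition table for `format_type`, then filter by date and keyword."""
    partition = tables.get(format_type, [])
    return [e for e in partition if e.date == date and keyword in e.keywords]


# Hypothetical partition tables keyed by command format type.
TABLES = {
    "A": [Entry("2022-10-18", ("RESET",), b"RESET")],
    "B": [
        Entry("2022-10-18", ("SET", "MODE"), b"SET MODE ..."),
        Entry("2022-10-19", ("SET", "MODE"), b"SET MODE ..."),
        Entry("2022-10-18", ("GET",), b"GET ..."),
    ],
}

hits = query(TABLES, "B", "2022-10-18", "SET")
```

Because the format type selects the partition first, entries stored under other compression modes are never scanned.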
CN202211272183.3A 2022-10-18 2022-10-18 Storage and query method based on multi-compression mode sub-partition table Active CN115630065B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211272183.3A CN115630065B (en) 2022-10-18 2022-10-18 Storage and query method based on multi-compression mode sub-partition table

Publications (2)

Publication Number Publication Date
CN115630065A CN115630065A (en) 2023-01-20
CN115630065B true CN115630065B (en) 2023-08-22

Family

ID=84906757

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211272183.3A Active CN115630065B (en) 2022-10-18 2022-10-18 Storage and query method based on multi-compression mode sub-partition table

Country Status (1)

Country Link
CN (1) CN115630065B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5229768A (en) * 1992-01-29 1993-07-20 Traveling Software, Inc. Adaptive data compression system
US5854597A (en) * 1996-03-19 1998-12-29 Fujitsu Limited Document managing apparatus, data compressing method, and data decompressing method
CN101800556A (en) * 2001-02-13 2010-08-11 莫塞德技术股份有限公司 Be fit to the method and the setting of data compression
CN109101504A (en) * 2017-06-20 2018-12-28 恒为科技(上海)股份有限公司 A kind of efficient log compression and indexing means
CN112118010A (en) * 2020-08-25 2020-12-22 中电信用服务有限公司 Compression processing method and device for character strings and storage medium
CN112632129A (en) * 2020-12-31 2021-04-09 联想未来通信科技(重庆)有限公司 Code stream data management method, device and storage medium
CN114374392A (en) * 2021-12-17 2022-04-19 深圳市优必选科技股份有限公司 Data compression storage method and device, terminal equipment and readable storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11163758B2 (en) * 2016-09-26 2021-11-02 Splunk Inc. External dataset capability compensation
US12118041B2 (en) * 2019-10-13 2024-10-15 Thoughtspot, Inc. Query execution on compressed in-memory data

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A data compression technique based on the embedded real-time operating system VxWorks; 王江泉 et al.; 《数字技术与应用》 (Issue 03); pp. 70-71 *

Also Published As

Publication number Publication date
CN115630065A (en) 2023-01-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant