CN108234552B - Data storage method and device - Google Patents

Publication number
CN108234552B
Authority
CN
China
Prior art keywords
data
data block
block
compressed
object memory
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611159962.7A
Other languages
Chinese (zh)
Other versions
CN108234552A (en)
Inventor
潘晓东
胡林红
莫衍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd
Priority to CN201611159962.7A
Publication of CN108234552A
Application granted
Publication of CN108234552B
Legal status: Active
Anticipated expiration

Classifications

    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00: Network arrangements or protocols for supporting network services or applications
    • H04L67/01: Protocols
    • H04L67/10: Protocols in which an application is distributed across nodes in the network
    • H04L67/1097: Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Compression, Expansion, Code Conversion, And Decoders (AREA)

Abstract

The embodiments of the invention disclose a data storage method and a data storage device. The method comprises: performing data blocking on the data that currently needs to be stored to obtain at least one data block; determining an object memory associated with each data block; compressing the data blocks associated with the object memory according to a compression algorithm matched with the object memory to obtain compressed data associated with the object memory; and storing the compressed data into the object memory associated with the compressed data. The embodiments of the invention can improve storage performance.

Description

Data storage method and device
Technical Field
The invention relates to the technical field of internet, in particular to a data storage method and device.
Background
In a cloud computing environment, computing, storage, and networking are the three major functions. Cloud storage is an emerging storage scheme that serves as the storage back end of a virtual machine and can guarantee the stability and high availability of the virtual machine's data. In a conventional data storage method, a Central Processing Unit (CPU) writes the data that currently needs to be stored to cloud storage, for example to a Ceph cluster (Ceph is a scalable distributed storage system). However, while CPU processing speed keeps increasing, input/output (I/O) performance has not improved significantly. The larger the amount of data to be stored, the higher the I/O load and the lower the storage performance.
Disclosure of Invention
The technical problem to be solved by the embodiments of the present invention is to provide a data storage method and apparatus, which can improve storage performance.
In order to solve the above technical problem, an embodiment of the present invention provides a data storage method, where the method includes:
performing data blocking on data needing to be stored currently to obtain at least one data block;
determining an object memory associated with each of the data blocks;
compressing the data block associated with the object memory according to a compression algorithm matched with the object memory to obtain compressed data associated with the object memory;
and storing the compressed data into an object memory associated with the compressed data.
Correspondingly, an embodiment of the present invention further provides a data storage device, where the data storage device includes:
the data blocking unit is used for carrying out data blocking on the data which needs to be stored currently to obtain at least one data block;
a determining unit, configured to determine an object memory associated with each of the data blocks;
the compression unit is used for compressing the data blocks associated with the object memory according to a compression algorithm matched with the object memory to obtain compressed data associated with the object memory;
and the storage unit is used for storing the compressed data into an object memory associated with the compressed data.
By implementing the embodiments of the invention, the data that currently needs to be stored is blocked to obtain at least one data block, the object memory associated with each data block is determined, the data block associated with the object memory is compressed according to the compression algorithm matched with the object memory to obtain the compressed data associated with the object memory, and the compressed data is stored in the object memory associated with the compressed data, so that storage performance can be improved.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic flow chart of a data storage method provided in an embodiment of the present invention;
FIG. 2 is a schematic diagram of an architecture of a data storage system provided in an embodiment of the present invention;
FIG. 3 is a schematic diagram of index information provided in an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a data storage device provided in an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of a client provided in an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
An embodiment of the invention provides a data storage method in which a client may, through a CPU module, perform data blocking on the data that currently needs to be stored to obtain at least one data block, determine an object memory associated with each data block, and compress the data block associated with the object memory according to a compression algorithm matched with the object memory to obtain compressed data associated with the object memory. The client then stores the compressed data into the associated object memory through an I/O module. Compared with a conventional method that stores the data directly, this embodiment splits the data into data blocks, compresses each data block, and stores the resulting compressed data, so that the I/O load is reduced and storage performance is improved.
The data storage method can be operated in a client, and the client can be installed in a cloud storage system. For example, the client may be a virtual machine, and the cloud storage system may include a Ceph system or a GlusterFS (open source distributed file system) system, and the like.
FIG. 1 is a schematic flow chart of a data storage method disclosed in an embodiment of the invention. As shown in FIG. 1, the data storage method may include the following steps:
S101, performing data blocking on data needing to be stored currently to obtain at least one data block.
The client can perform data blocking on the data which needs to be stored currently to obtain at least one data block. The data that needs to be stored currently may include text, images, audio, video, or animation, and the like, and is not limited by the embodiment of the present invention.
Optionally, the client may perform data blocking on the data according to a preset data amount to obtain at least one data block, where the data amount of the last data block is smaller than or equal to the preset data amount, and the data amount of every other data block is equal to the preset data amount. For example, if the data amount of the data that currently needs to be stored is 16MB and the preset data amount is 4MB, the client may block the data into 4 data blocks, each with a data amount of 4MB, that is, equal to the preset data amount. For another example, if the data amount of the data is 18MB and the preset data amount is 4MB, the client may block the data into 5 data blocks: the first, second, third, and fourth data blocks each have a data amount of 4MB, equal to the preset data amount, and the last data block (that is, the fifth data block) has a data amount of 2MB, which is smaller than the preset data amount.
It should be noted that the preset data amount may be a preset data block length, and a developer may perform corresponding modification in combination with different application scenarios, which is not specifically limited by the embodiment of the present invention.
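The blocking rule of S101 can be illustrated with a minimal Python sketch. The function name `split_into_blocks` and the use of in-memory `bytes` are assumptions made for illustration only; the embodiment does not prescribe an implementation.

```python
def split_into_blocks(data: bytes, block_size: int) -> list[bytes]:
    """Split data into blocks of block_size bytes; only the last block may be shorter.

    Hypothetical helper: block_size plays the role of the preset data amount.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


MB = 1024 * 1024
# 18MB of data with a 4MB preset data amount, as in the example above.
blocks = split_into_blocks(bytes(18 * MB), 4 * MB)
print([len(block) // MB for block in blocks])  # [4, 4, 4, 4, 2]
```

Note that this sketch returns an empty list for empty input, whereas the method always speaks of at least one data block; a real client would need to decide how to treat that case.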
S102, determining the object memory associated with each data block.
After the client performs data blocking on the data to obtain at least one data block, it may determine the object memory associated with each data block. Taking the architecture of the data storage system shown in FIG. 2 as an example, after the client acquires data (e.g., a File), it may perform data blocking on the data to obtain at least one object (e.g., Objects), and may then determine the Object-based Storage Device (OSD) associated with each data block according to a preset rule. The cloud storage system may include a plurality of object memories, for example a main OSD and a plurality of backup OSDs. The client may determine in advance the compression algorithm matched with each object memory, and the compression mode or compression ratio may differ between the algorithms matched with different object memories. For example, the compression algorithm matched with the first object memory uses fast compression with a compression ratio of 5:4, and the compression algorithm matched with the second object memory uses standard compression with a compression ratio of 2:1.
Optionally, the client may obtain the data identifier of each data block, process the data identifier of each data block according to a preset message digest algorithm to obtain the placement group to which each data block belongs, and process the group identifier of the placement group according to the preset message digest algorithm to obtain the object memory associated with the data blocks included in the placement group.
In a specific implementation, a Placement Group (PG) is a logical set of objects. The PG is the basic unit by which the cloud storage system distributes data to the OSDs, and objects included in the same PG are distributed to the same OSDs. The client can process the data identifier and a correction parameter of each object through the preset message digest algorithm to obtain the PG to which the object belongs. Illustratively, the data identifier of the object may be the identification number (ID) of the object. The preset message digest algorithm may include a Secure Hash Algorithm (SHA), Message Digest Algorithm 5 (MD5), or the like. Further, after the client determines the PG to which the object belongs, it may process the current running state of the cloud storage system and the group identifier of the PG through the preset message digest algorithm to obtain the object memory associated with the data blocks included in the PG. The group identifier of the PG may be the ID of the PG. The OSDs contained in the OSD cluster of the cloud storage system are divided according to the fault-tolerant areas (such as a rack or a machine room) of the physical nodes. Further, after the PG is mapped to the OSD, the client may compress the data block associated with the object memory according to the compression algorithm matched with the object memory, so as to obtain the compressed data associated with the object memory.
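The two-stage mapping described above (data identifier to placement group, then placement group to object memory) might be sketched as follows. This is a deliberately simplified illustration: it uses SHA-256 with a modulo step, whereas a production system such as Ceph uses its own placement algorithm. The names `placement_group_for` and `osd_for`, the `salt` argument (standing in for the correction parameter), and the `cluster_epoch` argument (standing in for the current running state) are all hypothetical.

```python
import hashlib


def _digest_to_int(text: str) -> int:
    """Hash a string with SHA-256 and interpret the digest as an integer."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")


def placement_group_for(object_id: str, pg_count: int, salt: str = "") -> int:
    """Stage 1: map an object's data identifier (plus a correction parameter) to a PG ID."""
    return _digest_to_int(salt + object_id) % pg_count


def osd_for(pg_id: int, osd_ids: list[str], cluster_epoch: int = 0) -> str:
    """Stage 2: map a PG ID (plus an assumed cluster-state value) to one OSD."""
    return osd_ids[_digest_to_int(f"{cluster_epoch}:{pg_id}") % len(osd_ids)]


osds = ["osd.0", "osd.1", "osd.2"]
pg = placement_group_for("file-42.block-0", pg_count=128)
print(pg, osd_for(pg, osds))
```

Because both stages are deterministic, every object that hashes to the same PG is sent to the same OSD, which matches the property stated for placement groups.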
S103, compressing the data blocks associated with the object memory according to the compression algorithm matched with the object memory to obtain the compressed data associated with the object memory.
The client may compress the data block associated with the object memory according to the compression algorithm matched with the object memory to obtain the compressed data associated with the object memory. For example, if the object memory associated with the first data block is a third object memory, and the compression algorithm matched with the third object memory is a third compression algorithm, the client may compress the first data block according to the third compression algorithm to obtain compressed data associated with the third object memory, that is, the compressed data corresponding to the first data block. For another example, if the object memory associated with the second data block is the first object memory, and the compression algorithm matched with the first object memory is the first compression algorithm, the client may compress the second data block according to the first compression algorithm to obtain the compressed data associated with the first object memory, that is, the compressed data corresponding to the second data block.
Optionally, before compressing the data blocks associated with the object memory according to the compression algorithm matched with the object memory, the client may determine the sum of the data amounts of all the data blocks included in the placement group. When the sum of the data amounts falls within a preset data amount range, the client may use the compression algorithm corresponding to that range as the compression algorithm matched with the object memory associated with the data blocks included in the placement group.
In a specific implementation, the client may establish in advance a correspondence between data amount ranges and compression algorithms. For example, the range (0MB, 5MB) corresponds to the first compression algorithm, the range (5MB, 10MB) corresponds to the second compression algorithm, and a further range above 10MB corresponds to the third compression algorithm. Suppose the client determines that the first placement group includes the first data block and the second data block, where the data amount of the first data block is 3MB and the data amount of the second data block is 5MB. The client may determine that the sum of the data amounts of all the data blocks included in the first placement group is 8MB. This sum falls within the range (5MB, 10MB), which corresponds to the second compression algorithm. The client may therefore use the second compression algorithm as the compression algorithm matched with the object memory associated with the first data block, and likewise as the compression algorithm matched with the object memory associated with the second data block, where the object memory associated with the first data block may be the same as the object memory associated with the second data block.
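The range lookup described above can be sketched in Python. Several details here are assumptions for illustration: the standard-library codecs `zlib`, `bz2`, and `lzma` merely stand in for the unnamed first, second, and third compression algorithms; the ranges are treated as exclusive at the lower bound and inclusive at the upper bound; and the bounds of the third range (10MB to 15MB) are hypothetical, since the text does not state them unambiguously.

```python
import bz2
import lzma
import zlib

MB = 1024 * 1024

# (lower bound exclusive, upper bound inclusive, label, compress function)
RANGE_TABLE = [
    (0, 5 * MB, "first", zlib.compress),
    (5 * MB, 10 * MB, "second", bz2.compress),
    (10 * MB, 15 * MB, "third", lzma.compress),  # assumed bounds
]


def algorithm_for_group(block_sizes: list[int]):
    """Pick the compression algorithm from the total data amount of one placement group."""
    total = sum(block_sizes)
    for low, high, label, compress in RANGE_TABLE:
        if low < total <= high:
            return label, compress
    raise ValueError(f"no compression algorithm configured for {total} bytes")


# The first placement group holds a 3MB block and a 5MB block: 8MB in total.
label, compress = algorithm_for_group([3 * MB, 5 * MB])
print(label)  # second
```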
Optionally, after compressing the data block associated with the object memory according to the compression algorithm matched with the object memory and obtaining the compressed data associated with the object memory, the client may generate index information of the data. The index information may include the offset of each data block and the compression algorithm used by the compressed data corresponding to each data block.
In a specific implementation, since the compression algorithms used by the data blocks are not all the same, the data amounts of the resulting compressed data are not all the same either. The client may therefore generate index information of the data, which records the data amount and offset of each data block obtained by blocking the data. Taking the index information shown in FIG. 3 as an example, the client may reserve a data space of 1KB at the head of the data for storing the index information. The index information may include an index header and an index area for the data blocks, where the index header stores the data amount of the data or the number of data blocks, and the index area for a data block stores the compression algorithm used to compress the data or the data amount of the compressed data.
Illustratively, the client initiates a data write request. When the data amount of the data is greater than 4MB, the client may perform data blocking on the data to obtain at least one data block and number the data blocks in ascending order. For example, if the client blocks the data into i+2 data blocks, it may number the first data block as block 0, the i-th data block as block i-1, the (i+1)-th data block as block i, and the (i+2)-th data block as block i+1. Further, the client may write the total data amount of the data into the origi_size position at the very front of the index header, stored in 64 bits for compatibility with large files, and write the number of data blocks into the block_num position of the index header. Further, the client may compress each data block to obtain compressed data whose data amount is block_len, and may write the data amount of the compressed data corresponding to the (i+1)-th data block into the block_len position of block i. Further, the client may obtain the offset of each data block and write the offset of the (i+1)-th data block into the offset position of block i. If the data size of the index header is 1KB, the offset of the first data block is 1KB. If the data size recorded in the index of the first data block is 2KB, the offset of the second data block is 1KB + 2KB = 3KB. Similarly, the offset of the i-th data block is the sum of the offset of the (i-1)-th data block and the data size recorded in the index of the (i-1)-th data block. Further, the client may write each piece of compressed data at the position given by the offset of its data block.
Illustratively, the index header in the index information may be aligned to 64 bits. The data structure of the index information may be as shown in Table 1:
TABLE 1
block_size   block_num   offset    block_len   offset    block_len
64 bits      64 bits     64 bits   64 bits     64 bits   64 bits
Here block_size may be used to indicate the data amount of the data, block_num may be used to indicate the number of data blocks obtained by data blocking, offset may be used to indicate the offset of a data block, and block_len may be used to indicate the data amount of the compressed data corresponding to that data block.
It should be noted that, to support very large files and data alignment, each parameter in the index information is stored with a length of 64 bits.
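The layout in Table 1 and the offset rule from the write example can be sketched with Python's `struct` module. The sketch follows Table 1 (two 64-bit header fields followed by an offset and block_len pair per data block) inside a reserved 1KB region. The little-endian byte order, the zero padding, and the function names `build_index` and `parse_index` are assumptions, not details given in the text.

```python
import struct

INDEX_REGION = 1024  # bytes reserved at the head of the data for index information


def build_index(total_size: int, compressed_lens: list[int]) -> bytes:
    """Pack the header and one (offset, block_len) pair per block, each field 64 bits."""
    packed = struct.pack("<QQ", total_size, len(compressed_lens))
    offset = INDEX_REGION  # the first data block starts right after the index region
    for block_len in compressed_lens:
        packed += struct.pack("<QQ", offset, block_len)
        offset += block_len  # next offset = previous offset + previous block_len
    if len(packed) > INDEX_REGION:
        raise ValueError("too many blocks for the reserved index region")
    return packed.ljust(INDEX_REGION, b"\x00")


def parse_index(raw: bytes):
    """Read back the total data amount and the list of (offset, block_len) pairs."""
    total_size, block_num = struct.unpack_from("<QQ", raw, 0)
    entries = [struct.unpack_from("<QQ", raw, 16 + 16 * i) for i in range(block_num)]
    return total_size, entries


# Two blocks compressed to 2KB and 1KB: offsets are 1KB and 1KB + 2KB = 3KB.
raw = build_index(total_size=8 * 1024, compressed_lens=[2048, 1024])
print(parse_index(raw))  # (8192, [(1024, 2048), (3072, 1024)])
```

With 16 bytes of header and 16 bytes per block, a 1KB region holds at most 63 block entries under this layout; a per-block compression algorithm field, which the description also mentions, would need additional space.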
S104, storing the compressed data into an object memory associated with the compressed data.
The client may store the compressed data in an object store associated with the compressed data. For example, after the client compresses the first data block according to the third compression algorithm to obtain compressed data associated with the third object memory, the compressed data may be stored in the third object memory. For another example, after the client compresses the second data block according to the first compression algorithm to obtain the compressed data associated with the first object memory, the compressed data may be stored in the first object memory.
Optionally, the client may store the compressed data in a streaming compression and streaming storage manner. For example, the client may use a plurality of threads to compress the data into compressed data and a plurality of threads to store the compressed data. For example, while compressing the second data block, the client may store the compressed data corresponding to the already compressed first data block, which can improve data storage efficiency.
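The overlap between compression and storage can be sketched as a small producer and consumer pipeline. For brevity this sketch uses one compression thread and one storage thread rather than several of each, uses `zlib` as a stand-in codec, and takes a hypothetical `store` callback in place of a real write to an object memory.

```python
import queue
import threading
import zlib


def pipeline_store(blocks, store, max_pending: int = 4) -> None:
    """Compress blocks in one thread while another thread stores finished ones."""
    pending: queue.Queue = queue.Queue(maxsize=max_pending)

    def compressor() -> None:
        for number, block in enumerate(blocks):
            pending.put((number, zlib.compress(block)))
        pending.put(None)  # sentinel: no more compressed data

    def writer() -> None:
        while (item := pending.get()) is not None:
            number, compressed = item
            store(number, compressed)  # hypothetical storage callback

    threads = [threading.Thread(target=compressor), threading.Thread(target=writer)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()


stored = {}
pipeline_store([b"a" * 4096, b"b" * 4096], stored.__setitem__)
print(sorted(stored))  # [0, 1]
```

The bounded queue lets the writer store block 0 while the compressor is still working on block 1, and it limits how much compressed data waits in memory.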
Optionally, the client may further receive a data acquisition request, where the data acquisition request carries a data identifier, acquire index information of data corresponding to the data identifier, determine a decompression algorithm required by each compressed data according to a compression algorithm adopted by compressed data corresponding to each data block included in the index information, decompress the compressed data according to the decompression algorithm required by each compressed data, obtain a data block corresponding to each compressed data, and assemble each data block according to the offset of each data block to obtain data. For example, the client determines that data to be read is subjected to data partitioning through the index information to obtain two data blocks, the compression algorithm of the compressed data corresponding to the first data block is a third compression algorithm, and the client can decompress the compressed data corresponding to the first data block through a third decompression algorithm corresponding to the third compression algorithm to obtain the first data block; the compression algorithm of the compressed data corresponding to the second data block is a first compression algorithm, the client can decompress the compressed data corresponding to the second data block through a first decompression algorithm corresponding to the first compression algorithm to obtain the second data block, and then assemble the first data block and the second data block according to the offset of each data block to obtain the data to be read.
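The read path described above might look like the following sketch: look up each data block's compression algorithm in the index information, decompress with the matching decompression algorithm, and assemble the data blocks in offset order. The codec names, the shape of `index_entries` (pairs of offset and algorithm name), and the `fetch` callback that returns the compressed data stored at an offset are all assumptions for illustration.

```python
import bz2
import lzma
import zlib

# Decompression functions keyed by the algorithm name recorded in the index.
DECOMPRESSORS = {
    "zlib": zlib.decompress,
    "bz2": bz2.decompress,
    "lzma": lzma.decompress,
}


def read_data(index_entries, fetch) -> bytes:
    """Decompress every block with its own algorithm and join them in offset order."""
    parts = []
    for offset, algorithm in sorted(index_entries):
        compressed = fetch(offset)  # hypothetical read from the object memory
        parts.append(DECOMPRESSORS[algorithm](compressed))
    return b"".join(parts)


# Two blocks compressed with different algorithms, as in the example above.
storage = {1024: lzma.compress(b"hello "), 3072: zlib.compress(b"world")}
entries = [(3072, "zlib"), (1024, "lzma")]
print(read_data(entries, storage.__getitem__))  # b'hello world'
```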
In this embodiment of the invention, the client performs data blocking on the data that currently needs to be stored to obtain at least one data block, determines the object memory associated with each data block, compresses the data block associated with the object memory according to the compression algorithm matched with the object memory to obtain the compressed data associated with the object memory, and stores the compressed data into the object memory associated with the compressed data, so that storage performance can be improved.
Referring to FIG. 4, FIG. 4 is a schematic structural diagram of a data storage device provided in an embodiment of the present invention. The data storage device may be used to implement part or all of the steps in the method embodiment shown in FIG. 1. As shown in the figure, the data storage device in this embodiment may include at least a data blocking unit 401, a determining unit 402, a compressing unit 403, and a storage unit 404, where:
the data blocking unit 401 is configured to perform data blocking on data that needs to be currently stored to obtain at least one data block.
A determining unit 402, configured to determine an object memory associated with each of the data blocks.
A compressing unit 403, configured to compress the data block associated with the object memory according to a compression algorithm matched with the object memory, to obtain compressed data associated with the object memory.
A storage unit 404, configured to store the compressed data in an object memory associated with the compressed data.
Optionally, the determining unit 402 is specifically configured to:
acquire the data identifier of each data block;
process the data identifier of each data block according to a preset message digest algorithm to obtain the placement group to which each data block belongs; and
process the group identifier of the placement group according to the preset message digest algorithm to obtain the object memory associated with the data blocks contained in the placement group.
Optionally, the determining unit 402 is further configured to determine the sum of the data amounts of all the data blocks included in the placement group before the compressing unit 403 compresses the data blocks associated with the object memory according to the compression algorithm matched with the object memory to obtain the compressed data associated with the object memory.
The determining unit 402 is further configured to, when the sum of the data amounts is within a preset data amount range, use the compression algorithm corresponding to the preset data amount range as the compression algorithm matched with the object memory associated with the data blocks included in the placement group.
Optionally, the data blocking unit 401 is specifically configured to:
perform data blocking on the data according to a preset data amount to obtain at least one data block, where the data amount of the last data block in the at least one data block is less than or equal to the preset data amount, and the data amounts of the other data blocks are equal to the preset data amount.
Optionally, the data storage device in the embodiment of the present invention may further include:
an index information generating unit 405, configured to, after the compressing unit 403 compresses the data blocks associated with the object memory according to the compression algorithm matched with the object memory to obtain the compressed data associated with the object memory, generate index information of the data, where the index information includes an offset of each data block and a compression algorithm used by the compressed data corresponding to each data block.
Optionally, the data storage device in the embodiment of the present invention may further include:
an obtaining request receiving unit 406, configured to receive a data obtaining request, where the data obtaining request carries a data identifier.
An index information obtaining unit 407, configured to obtain index information of data corresponding to the data identifier.
The determining unit 402 is further configured to determine, according to a compression algorithm adopted by compressed data corresponding to each data block included in the index information, a decompression algorithm required by each compressed data.
Further, the data storage device in the embodiment of the present invention may further include:
a decompressing unit 408, configured to decompress the compressed data according to a decompressing algorithm required by each piece of compressed data, so as to obtain a data block corresponding to each piece of compressed data.
An assembling unit 409, configured to assemble each data block according to the offset of each data block to obtain the data.
In the embodiment of the present invention, the data blocking unit 401 performs data blocking on data that needs to be currently stored to obtain at least one data block, the determining unit 402 determines an object memory associated with each data block, the compressing unit 403 compresses the data block associated with the object memory according to a compression algorithm matched with the object memory to obtain compressed data associated with the object memory, and the storing unit 404 stores the compressed data in the object memory associated with the compressed data, which can improve the storage performance.
Referring to FIG. 5, FIG. 5 is a schematic structural diagram of a client according to another embodiment of the present invention. The client may be used to implement the method shown in FIG. 1. For convenience of description, only the parts related to this embodiment are shown; for specific technical details that are not disclosed here, refer to the embodiment shown in FIG. 1.
As shown in FIG. 5, the client includes at least one processor 501 (such as a CPU), at least one communication bus 502, at least one input device 503, at least one output device 504, and a memory 505. The communication bus 502 enables connection and communication between these components. The input device 503 may be a network interface and is configured to receive a data acquisition request. The output device 504 may be a network interface or the like and is configured to store the compressed data into the object memory associated with the compressed data. The memory 505 may include a high-speed RAM and may further include a non-volatile memory, such as at least one disk memory; it is used for storing, among other things, the compression algorithms matched with the object memories. The memory 505 may optionally include at least one memory device located remotely from the processor 501. A set of program code is stored in the memory 505, and the processor 501, the input device 503, and the output device 504 call the program code stored in the memory 505 to perform the following operations:
the processor 501 performs data blocking on data that needs to be stored currently to obtain at least one data block.
The processor 501 determines the object memory associated with each of the data blocks.
The processor 501 compresses the data blocks associated with the object memory according to the compression algorithm matched with the object memory, so as to obtain the compressed data associated with the object memory.
The output device 504 stores the compressed data in an object store associated with the compressed data.
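The compress-then-store operations can be sketched as follows. Claim 1 recites that compression runs on a plurality of first threads and storage on a plurality of second threads; this minimal sketch models that with two thread pools. The function name `store_blocks`, the `memory_of` callback, the dict-based stand-ins for object memories, and the fixed use of `zlib` are all assumptions made for illustration and are not taken from the patent.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def store_blocks(blocks, memory_of, memories, workers=4):
    """Compress blocks on one thread pool, then store them on another.

    blocks:    list of bytes objects (the data blocks)
    memory_of: hypothetical callback mapping a block index to an object-memory id
    memories:  dict of object-memory id -> dict, standing in for real object memories
    """
    # "First threads": compress every block in parallel.
    with ThreadPoolExecutor(max_workers=workers) as compress_pool:
        compressed = list(compress_pool.map(zlib.compress, blocks))

    def put(item):
        index, payload = item
        # Write the compressed block into the object memory associated with it.
        memories[memory_of(index)][index] = payload

    # "Second threads": store every compressed block in parallel.
    with ThreadPoolExecutor(max_workers=workers) as store_pool:
        list(store_pool.map(put, enumerate(compressed)))
    return compressed
```

Separating the two pools lets compression (CPU-bound) and storage (I/O-bound in a real system) be sized independently, which is one plausible reading of why the claim distinguishes the two groups of threads.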
Optionally, the processor 501 determining the object memory associated with each data block includes:
Acquiring the data identifier of each data block.
Processing the data identifier of each data block according to a preset message digest algorithm to obtain the placement group to which each data block belongs.
Processing the group identifier of the placement group according to the preset message digest algorithm to obtain the object memory associated with the data blocks contained in the placement group.
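The two-level mapping above (data identifier to placement group, then group identifier to object memory) can be sketched as follows. The source only says "a preset message digest algorithm"; the use of SHA-256, the modulo reduction, the `pg-` prefix for group identifiers, and the default counts are assumptions chosen for illustration.

```python
import hashlib


def digest_to_int(text: str) -> int:
    """Apply a message digest (SHA-256 here, as an assumption) and return it as an integer."""
    return int.from_bytes(hashlib.sha256(text.encode("utf-8")).digest(), "big")


def placement_group_for(block_id: str, pg_count: int) -> int:
    """Map a data block identifier to a placement group number."""
    return digest_to_int(block_id) % pg_count


def object_memory_for(pg_id: int, memory_count: int) -> int:
    """Map a placement group identifier to an object memory number."""
    return digest_to_int(f"pg-{pg_id}") % memory_count


def locate(block_id: str, pg_count: int = 128, memory_count: int = 8):
    """Return (placement group, object memory) for one data block."""
    pg = placement_group_for(block_id, pg_count)
    return pg, object_memory_for(pg, memory_count)
```

Because both steps are deterministic functions of the identifier, any client can recompute where a block lives without consulting a central lookup table, and all blocks in one placement group land in the same object memory.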
Optionally, before the processor 501 compresses the data blocks associated with the object memory according to the compression algorithm matched with the object memory to obtain the compressed data associated with the object memory, the operations further include:
Determining the sum of the data volumes of all data blocks contained in the placement group.
When the data volume sum is within a preset data volume range, taking the compression algorithm corresponding to that preset data volume range as the compression algorithm matched with the object memory associated with the data blocks contained in the placement group.
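A minimal sketch of this range-based selection is shown below. The patent does not state which ranges or algorithms are used; the thresholds in `RANGES` and the choice of `zlib`, `bz2`, and `lzma` are hypothetical values picked only so the example runs with the Python standard library.

```python
import bz2
import lzma
import zlib

# Hypothetical pre-established correspondence between data volume ranges
# (lower bound inclusive, upper bound exclusive, in bytes) and algorithms.
RANGES = [
    (0, 64 * 1024, "zlib"),
    (64 * 1024, 4 * 1024 * 1024, "bz2"),
    (4 * 1024 * 1024, float("inf"), "lzma"),
]

COMPRESSORS = {"zlib": zlib.compress, "bz2": bz2.compress, "lzma": lzma.compress}


def select_algorithm(block_sizes):
    """Pick the algorithm whose range contains the total size of a placement group."""
    total = sum(block_sizes)
    for lower, upper, name in RANGES:
        if lower <= total < upper:
            return name
    raise ValueError(f"no range configured for total size {total}")
```

For example, `COMPRESSORS[select_algorithm(sizes)](block)` would compress one block of the group with the selected algorithm.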
Optionally, the processor 501 performing data blocking on the data that currently needs to be stored to obtain at least one data block includes:
Performing data blocking on the data according to a preset data volume to obtain at least one data block, where the data volume of the last data block is less than or equal to the preset data volume, and the data volume of every other data block is equal to the preset data volume.
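The fixed-size blocking rule can be illustrated with a short sketch. The function name `split_into_blocks` and the decision to reject empty input (so that at least one block is always produced) are assumptions for this example.

```python
from typing import List


def split_into_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Split data into blocks of block_size bytes; only the last block may be shorter."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    if not data:
        raise ValueError("data must be non-empty so that at least one block results")
    return [data[start:start + block_size] for start in range(0, len(data), block_size)]
```

With a 10-byte input and a preset data volume of 4 bytes, this yields two 4-byte blocks followed by a final 2-byte block.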
Optionally, after the processor 501 compresses the data blocks associated with the object memory according to the compression algorithm matched with the object memory to obtain the compressed data associated with the object memory, the operations further include:
Generating index information of the data, where the index information includes the offset of each data block and the compression algorithm adopted by the compressed data corresponding to each data block.
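One possible shape for this index information is sketched below. The source specifies only that the index holds each block's offset and compression algorithm; interpreting the offset as the block's byte position in the original data, and the names `BlockIndexEntry` and `build_index`, are assumptions.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass(frozen=True)
class BlockIndexEntry:
    offset: int      # assumed: byte position of the block within the original data
    algorithm: str   # name of the compression algorithm used for this block


def build_index(blocks: Sequence[bytes], algorithms: Sequence[str]) -> List[BlockIndexEntry]:
    """Build one index entry per data block, in original order."""
    if len(blocks) != len(algorithms):
        raise ValueError("need exactly one algorithm per block")
    index = []
    offset = 0
    for block, algorithm in zip(blocks, algorithms):
        index.append(BlockIndexEntry(offset=offset, algorithm=algorithm))
        offset += len(block)  # offsets are measured on the uncompressed blocks
    return index
```

Recording the algorithm per block is what allows different placement groups to use different algorithms while the data remains readable later.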
Optionally, the operations further include:
Receiving a data acquisition request, where the data acquisition request carries a data identifier.
Acquiring the index information of the data corresponding to the data identifier.
Determining the decompression algorithm required by each piece of compressed data according to the compression algorithm, recorded in the index information, that was adopted by the compressed data corresponding to each data block.
Decompressing each piece of compressed data according to its required decompression algorithm to obtain the data block corresponding to that piece of compressed data.
Assembling the data blocks according to the offset of each data block to obtain the data.
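The read path described above can be sketched as follows. The dict-based `index_store` and `block_store`, the `(data_id, offset)` key layout, and the algorithm names are hypothetical stand-ins; the patent does not describe how the index or the compressed blocks are physically looked up.

```python
import bz2
import lzma
import zlib

# Each compression algorithm name maps to its matching decompression function.
DECOMPRESSORS = {"zlib": zlib.decompress, "bz2": bz2.decompress, "lzma": lzma.decompress}


def read_data(data_id, index_store, block_store):
    """Rebuild the original data for data_id.

    index_store: dict of data_id -> list of (offset, algorithm) pairs
    block_store: dict of (data_id, offset) -> compressed bytes
    """
    pieces = []
    for offset, algorithm in index_store[data_id]:
        compressed = block_store[(data_id, offset)]
        # Choose the decompression algorithm from the one recorded in the index.
        pieces.append((offset, DECOMPRESSORS[algorithm](compressed)))
    # Assemble the blocks in offset order, regardless of index order.
    pieces.sort(key=lambda piece: piece[0])
    return b"".join(block for _, block in pieces)
```

Sorting by offset means the blocks can be fetched and decompressed in any order and still be reassembled correctly.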
Specifically, the client described in the embodiment of the present invention may be used to implement part or all of the flow in the embodiment of the method described in conjunction with fig. 1 of the present invention.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
The above disclosure describes only preferred embodiments of the present invention and cannot be used to limit the scope of the claims of the present invention.

Claims (10)

1. A method of data storage, the method comprising:
performing data blocking on data needing to be stored currently to obtain at least one data block;
acquiring a data identifier of each data block;
processing the data identifier of each data block according to a preset message digest algorithm to obtain a placement group to which each data block belongs;
processing the group identifier of the placement group according to the preset message digest algorithm to obtain an object memory associated with the data blocks contained in the placement group;
determining the sum of the data amount of all the data blocks contained in the placement group;
when the data volume sum is within a preset data volume range, taking a compression algorithm corresponding to the preset data volume range as a compression algorithm matched with an object memory associated with the data blocks contained in the placement group, wherein the corresponding relation between the preset data volume range and the compression algorithm is pre-established;
compressing the data blocks associated with the object memory according to a compression algorithm matched with the object memory through a plurality of first threads to obtain compressed data associated with the object memory;
storing, by a plurality of second threads, the compressed data into the object memory associated with the compressed data.
2. The method of claim 1, wherein performing data blocking on the data needing to be stored currently to obtain at least one data block comprises:
performing data blocking on the data according to a preset data volume to obtain at least one data block, wherein the data volume of the last data block in the at least one data block is less than or equal to the preset data volume, and the data volumes of the data blocks other than the last data block are equal to the preset data volume.
3. The method of claim 1, wherein after compressing, by the plurality of first threads, the data blocks associated with the object memory according to the compression algorithm matched with the object memory to obtain the compressed data associated with the object memory, the method further comprises:
generating index information of the data, wherein the index information comprises the offset of each data block and the compression algorithm adopted by the compressed data corresponding to each data block.
4. The method of claim 3, wherein the method further comprises:
receiving a data acquisition request, wherein the data acquisition request carries a data identifier;
acquiring index information of data corresponding to the data identification;
determining a decompression algorithm required by each compressed data according to a compression algorithm adopted by the compressed data corresponding to each data block contained in the index information;
decompressing the compressed data according to a decompression algorithm required by each compressed data to obtain a data block corresponding to each compressed data;
and assembling each data block according to the offset of each data block to obtain the data.
5. A data storage device, characterized in that the device comprises:
a data blocking unit, configured to perform data blocking on data needing to be stored currently to obtain at least one data block;
a determining unit, configured to: acquire a data identifier of each data block; process the data identifier of each data block according to a preset message digest algorithm to obtain a placement group to which each data block belongs; process the group identifier of the placement group according to the preset message digest algorithm to obtain an object memory associated with the data blocks contained in the placement group; determine the sum of the data volumes of all the data blocks contained in the placement group; and, when the data volume sum is within a preset data volume range, take the compression algorithm corresponding to the preset data volume range as the compression algorithm matched with the object memory associated with the data blocks contained in the placement group, wherein the correspondence between the preset data volume range and the compression algorithm is pre-established;
a compression unit, configured to compress, through a plurality of first threads, the data blocks associated with the object memory according to the compression algorithm matched with the object memory to obtain compressed data associated with the object memory;
and a storage unit, configured to store, through a plurality of second threads, the compressed data into the object memory associated with the compressed data.
6. The device of claim 5, wherein the data blocking unit is specifically configured to:
perform data blocking on the data according to a preset data volume to obtain at least one data block, wherein the data volume of the last data block in the at least one data block is less than or equal to the preset data volume, and the data volumes of the data blocks other than the last data block are equal to the preset data volume.
7. The device of claim 5, wherein the device further comprises:
an index information generating unit, configured to generate index information of the data after the compression unit compresses, through the plurality of first threads, the data blocks associated with the object memory according to the compression algorithm matched with the object memory to obtain the compressed data associated with the object memory, wherein the index information comprises the offset of each data block and the compression algorithm adopted by the compressed data corresponding to each data block.
8. The device of claim 7, wherein the device further comprises:
an acquisition request receiving unit, configured to receive a data acquisition request, wherein the data acquisition request carries a data identifier;
an index information acquisition unit, configured to acquire the index information of the data corresponding to the data identifier;
the determining unit is further configured to determine, according to the compression algorithm adopted by the compressed data corresponding to each data block contained in the index information, the decompression algorithm required by each piece of compressed data;
the device further comprises:
a decompression unit, configured to decompress the compressed data according to the decompression algorithm required by each piece of compressed data to obtain the data block corresponding to each piece of compressed data;
and an assembling unit, configured to assemble the data blocks according to the offset of each data block to obtain the data.
9. A client, the client comprising:
a memory for storing program code;
a processor for calling the program code stored in the memory to execute the data storage method of any one of claims 1 to 4.
10. A computer-readable storage medium, characterized in that it stores a computer program which, when executed by a client, causes the client to execute a data storage method according to any one of claims 1 to 4.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611159962.7A CN108234552B (en) 2016-12-15 2016-12-15 Data storage method and device

Publications (2)

Publication Number Publication Date
CN108234552A CN108234552A (en) 2018-06-29
CN108234552B true CN108234552B (en) 2021-11-05

Family

ID=62650473

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611159962.7A Active CN108234552B (en) 2016-12-15 2016-12-15 Data storage method and device

Country Status (1)

Country Link
CN (1) CN108234552B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109756230B (en) * 2019-01-03 2024-02-27 Oppo广东移动通信有限公司 Data compression storage method, data compression method, device, equipment and medium
WO2020154975A1 (en) * 2019-01-30 2020-08-06 深圳市大疆创新科技有限公司 Storage method and playback method for point cloud data, and computer-readable medium

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103067488A (en) * 2012-12-25 2013-04-24 中国科学院深圳先进技术研究院 Implement method of unified storage
CN106027615A (en) * 2016-05-10 2016-10-12 乐视控股(北京)有限公司 Object storage method and system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104754055A (en) * 2015-04-03 2015-07-01 易云捷讯科技(北京)有限公司 Safety cloud storage method for use in multi-cloud environment
CN105302495B (en) * 2015-11-20 2019-05-28 华为技术有限公司 Data storage method and device

Similar Documents

Publication Publication Date Title
US9477682B1 (en) Parallel compression of data chunks of a shared data object using a log-structured file system
JP6316974B2 (en) Flash memory compression
US9172771B1 (en) System and methods for compressing data based on data link characteristics
EP3376393B1 (en) Data storage method and apparatus
US10509582B2 (en) System and method for data storage, transfer, synchronization, and security
US10649905B2 (en) Method and apparatus for storing data
US9977598B2 (en) Electronic device and a method for managing memory space thereof
US10318165B2 (en) Data operating method, device, and system
US10474385B2 (en) Managing memory fragmentation in hardware-assisted data compression
CN111966631A (en) Mirror image file generation method, system, equipment and medium capable of being rapidly distributed
CN112785408A (en) Account checking method and device based on Hash
CN108234552B (en) Data storage method and device
CN115225094A (en) Data compression method, electronic device and computer program product
CN112506950A (en) Data aggregation processing method, computing node, computing cluster and storage medium
CN110222046B (en) List data processing method, device, server and storage medium
CN104123102B (en) A kind of IP hard disks and its data processing method
TW201435585A (en) Electronic apparatus for data access and data access method therefor
US20120324560A1 (en) Token data operations
CN109302449B (en) Data writing method, data reading device and server
US11194498B1 (en) Inline compression with small-write compression avoidance
CN110851433A (en) Key optimization method for key value storage system, storage medium, electronic device and system
CN117099109A (en) Compression technique for deep neural network weights
CN115033544A (en) Data compression method, device, equipment and medium based on relation numerical value
CN114003573A (en) Compression method, device, equipment, storage medium and program product of file system
CN111639055B (en) Differential packet calculation method, differential packet calculation device, differential packet calculation equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant