CN115220644A - Data processing method, data processing device, solid state drive and storage medium - Google Patents

Publication number
CN115220644A
Authority
CN
China
Prior art keywords
data
compression algorithm
migrated
compression
ssd
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110426638.1A
Other languages
Chinese (zh)
Inventor
王卫新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tencent Technology Shenzhen Co Ltd
Original Assignee
Tencent Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tencent Technology Shenzhen Co Ltd filed Critical Tencent Technology Shenzhen Co Ltd
Priority to CN202110426638.1A priority Critical patent/CN115220644A/en
Publication of CN115220644A publication Critical patent/CN115220644A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608Saving storage space on storage systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0602Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061Improving I/O performance
    • G06F3/0611Improving I/O performance in relation to response time
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0628Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647Migration mechanisms
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601Interfaces specially adapted for storage systems
    • G06F3/0668Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/0671In-line storage system
    • G06F3/0673Single storage device
    • G06F3/0679Non-volatile semiconductor memory device, e.g. flash memory, one time programmable memory [OTP]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiment of the application discloses a data processing method, a data processing device, a solid-state drive and a storage medium, which are applicable to the fields of data storage, data reading and the like of computer technology. The method comprises the following steps: determining data to be migrated in compressed data stored by a Solid State Drive (SSD); determining a first compression algorithm corresponding to the data to be migrated, and decompressing the data to be migrated based on the first compression algorithm to obtain first decompressed data; compressing the first decompressed data based on a second compression algorithm to obtain target compressed data, and storing the target compressed data to the SSD; the compression ratio of the second compression algorithm is less than the compression ratio of the first compression algorithm. By adopting the embodiment of the application, the storage space of the SSD can be saved, and the applicability is improved.

Description

Data processing method, data processing device, solid-state drive and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a data processing method and apparatus, a solid state drive, and a storage medium.
Background
Computational storage has been on the rise in recent years, and the storage capacity of storage devices keeps growing. A Solid State Disk (SSD) has some additional hardware computing capability, implemented for example through an Application Specific Integrated Circuit (ASIC) or a field-programmable gate array (FPGA), and compression and decompression are among the important computing capabilities of an SSD.
For example, the SSD may compress data when writing it, so that more data can be stored in a storage space of limited capacity. However, for current compression algorithms, compression speed and compression ratio are to some extent mutually exclusive. When the SSD must write data in a timely manner, the compression algorithm used for the written data has a large compression ratio (that is, the compressed size remains a large fraction of the original size), so the compressed data is only slightly smaller than the data before compression and the saving in SSD storage space is not significant.
Therefore, how to further save the storage space of the SSD becomes an urgent problem to be solved.
Disclosure of Invention
Embodiments of the present application provide a data processing method, an apparatus, a solid state drive, and a storage medium, which can save a storage space of an SSD and have high applicability.
In one aspect, an embodiment of the present application provides a data processing method, where the method includes:
determining data to be migrated in compressed data stored by a Solid State Drive (SSD);
determining a first compression algorithm corresponding to the data to be migrated, and decompressing the data to be migrated based on the first compression algorithm to obtain first decompressed data;
compressing the first decompressed data based on a second compression algorithm to obtain target compressed data, and storing the target compressed data to the SSD;
the compression ratio of the second compression algorithm is less than the compression ratio of the first compression algorithm.
In another aspect, an embodiment of the present application provides a data processing apparatus, including:
the data determining module is used for determining data to be migrated in the compressed data stored in the SSD;
the data compression module is used for determining a first compression algorithm corresponding to the data to be migrated, and decompressing the data to be migrated based on the first compression algorithm to obtain first decompressed data;
a data decompression module, configured to compress the first decompressed data based on a second compression algorithm to obtain target compressed data, and store the target compressed data in the SSD;
the compression ratio of the second compression algorithm is less than the compression ratio of the first compression algorithm.
In another aspect, embodiments of the present application provide a solid state drive including a processor and a memory, the processor and the memory being connected to each other;
the memory is used for storing computer programs;
the processor is configured to execute the data processing method provided by the embodiment of the application when the computer program is called.
In another aspect, an embodiment of the present application provides a computer-readable storage medium, where a computer program is stored, and the computer program is executed by a processor to implement the data processing method provided by the embodiment of the present application.
In another aspect, embodiments of the present application provide a computer program product or a computer program, which includes computer instructions stored in a computer-readable storage medium. The processor of the solid state drive reads the computer instructions from the computer-readable storage medium and executes them, so that the solid state drive executes the data processing method provided by the embodiment of the application.
In the embodiment of the application, the data to be migrated in the solid state drive can be decompressed to obtain the first decompressed data, and the first decompressed data can be compressed based on a compression algorithm with a lower compression ratio to obtain the target compressed data. The data size of the target compressed data is therefore smaller than that of the data to be migrated, so the storage space of the solid state drive is saved without changing the content of the data, and the applicability is high.
Drawings
In order to illustrate the technical solutions in the embodiments of the present application more clearly, the drawings used in the embodiments are briefly described below. The drawings in the following description show only some embodiments of the present application, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic view of a scenario of a data processing method according to an embodiment of the present application;
FIG. 2 is a flow chart of a data processing method according to an embodiment of the present disclosure;
FIG. 3 is a flow chart illustrating a process of writing data by the solid state drive according to an embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a scenario of marking a compression algorithm provided in an embodiment of the present application;
FIG. 5 is a schematic diagram of a scenario for storing target compressed data according to an embodiment of the present application;
FIG. 6 is a schematic flowchart of an application scenario of the data processing method according to the embodiment of the present application;
FIG. 7 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application;
FIG. 8 is a schematic structural diagram of a solid state drive according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
Referring to FIG. 1, FIG. 1 is a schematic view of a scenario of a data processing method according to an embodiment of the present application. As shown in FIG. 1, the solid state drive, upon receiving host data, compresses and stores the host data. After determining the data to be migrated in the stored compressed data, the solid state drive may determine a first compression algorithm corresponding to the data to be migrated, and then decompress the data to be migrated based on the first compression algorithm to obtain first decompressed data.
Further, the solid state drive compresses the first decompressed data again based on a second compression algorithm whose compression ratio is smaller than that of the first compression algorithm to obtain target compressed data, and then stores the target compressed data.
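The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: zlib at level 1 stands in for the fast first compression algorithm and LZMA stands in for the slower second compression algorithm, simply because both ship with the Python standard library. The function names are hypothetical and do not come from the application.

```python
import lzma
import zlib


def first_compress(data: bytes) -> bytes:
    # Stand-in for the fast write-path algorithm (assumption: zlib level 1).
    return zlib.compress(data, 1)


def first_decompress(blob: bytes) -> bytes:
    # Decompression counterpart of the first algorithm.
    return zlib.decompress(blob)


def second_compress(data: bytes) -> bytes:
    # Stand-in for the slower, stronger algorithm (assumption: LZMA).
    return lzma.compress(data)


def recompress_on_migration(to_migrate: bytes) -> bytes:
    """Decompress with the first algorithm, then recompress with the second."""
    first_decompressed = first_decompress(to_migrate)
    return second_compress(first_decompressed)


host_data = b"sensor=42;status=ok;" * 5000
stored = first_compress(host_data)          # what the drive wrote originally
target = recompress_on_migration(stored)    # what it stores after migration
print(len(host_data), len(stored), len(target))
```

The content of the data is unchanged by the round trip; only its stored representation differs. On repetitive input such as this, the second pass is typically smaller than the first, but the actual sizes depend on the data and on the codecs used.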
In the embodiments of the present application, the solid state drive is also called a solid state hard disk, which is a hard disk made of an array of solid state electronic memory chips.
Referring to fig. 2, fig. 2 is a schematic flow chart of the data processing method provided in the embodiment of the present application, and as shown in fig. 2, the data processing method provided in the embodiment of the present application is applicable to a solid state drive, and specifically includes the following steps:
S21: Determine the data to be migrated in the compressed data stored by the solid state drive (SSD).
In some possible embodiments, after receiving a data write command, the solid state drive receives the data to be written from the host device. When writing the data, the solid state drive compresses it with a compression algorithm and stores the compressed data in the solid state drive in the form of data blocks.
The data received and initially written by the solid state drive may be system data of the host device, related data acquired by the host device from a data block or a blockchain, or data acquired based on big data, a network, cloud computing, and the like. This may be determined based on the requirements of the actual application scenario and is not limited herein.
To improve data writing efficiency, reduce write latency, and achieve high data throughput and real-time performance, a solid state drive usually uses a compression algorithm with a fast compression speed when compressing data to be written.
The compression algorithm adopted by the solid state drive when compressing the data to be written may be specifically determined based on the requirements of the actual application scenario, which is not limited herein.
As an example, when host (host) data is written to the solid state drive, the solid state drive internal computing unit compresses the host data using the LZ4 algorithm and stores the compressed data to the solid state drive.
The solid state drive stores data in non-volatile NAND flash memory chips, so the data is retained after power is removed.
As an example, referring to FIG. 3, FIG. 3 is a schematic flow chart of writing data by a solid state drive according to an embodiment of the present application. As shown in FIG. 3, after receiving host data, the solid state drive compresses the host data through its internal computing unit, that is, it compresses the host data by using the first compression algorithm. The solid state drive then stores the compressed data in the NAND flash.
In some possible embodiments, the data to be migrated in the compressed data stored in the solid state drive is compressed data that needs to be moved from its current data block to another data block. It may be determined based on the requirements of the actual application scenario, which is not limited herein.
Optionally, the data to be migrated includes at least one of the following:
compressed data involved when the solid state drive performs garbage collection (GC) processing;
compressed data involved when the solid state drive performs wear leveling (WL) processing;
compressed data whose storage state is unstable in at least one data block of the solid state drive.
Garbage collection is an operation in which the solid state drive reorganizes the compressed data it stores internally in order to produce blank data blocks. The solid state drive stores the remaining valid compressed data of at least one data block into a new data block and erases the original data block to obtain a blank data block. If the currently available storage space of the solid state drive is not enough to store new compressed data, the solid state drive moves part of the stored compressed data through garbage collection to obtain a blank data block, and then stores the new compressed data in that blank data block. The compressed data moved in this process is data to be migrated in the compressed data stored by the solid state drive.
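A minimal sketch of this garbage collection step follows, assuming a simplified in-memory model in which a data block is a list of pages and an invalidated page is represented by `None`. The `Block` class and the `garbage_collect` function are hypothetical names introduced only for illustration; real flash translation layers are considerably more involved.

```python
from dataclasses import dataclass, field


@dataclass
class Block:
    # Each page holds a compressed payload, or None once it has been invalidated.
    pages: list = field(default_factory=list)


def garbage_collect(victims, new_block):
    """Move still-valid pages out of the victim blocks, then erase the victims.

    The returned pages are the "data to be migrated" in the sense of the text.
    """
    migrated = []
    for block in victims:
        for page in block.pages:
            if page is not None:
                new_block.pages.append(page)
                migrated.append(page)
        block.pages.clear()  # erase: the block is blank again
    return migrated


victims = [Block([b"a", None, b"b"]), Block([None, b"c"])]
new_block = Block()
migrated = garbage_collect(victims, new_block)
```

In the scheme described here, the pages returned by such a step are the natural point at which to decompress and recompress, since they are being rewritten anyway.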
Each data block in the solid state drive has a limited lifetime, measured by its program/erase count (P/E value): the number of times each data block can be erased is limited, and the lower the P/E value, the longer the remaining lifetime and the more erase operations the data block can still undergo. The lifetime of a solid state drive depends on the minimum lifetime among all its data blocks, so frequently erasing the same data block shortens the service life of the solid state drive. Based on this, the solid state drive preferentially selects data blocks with low P/E values for data erasing, or preferentially moves compressed data that sits in data blocks with low P/E values and has a low access rate into data blocks with high P/E values. In the above process, the compressed data that needs to be moved from a data block with a low P/E value to a data block with a high P/E value is data to be migrated in the compressed data stored by the solid state drive.
Compressed data that the solid state drive stores in NAND can become unstable over time, in which case data retention processing is required to migrate it to a new storage space. In this process, the compressed data in at least one data block whose storage state is unstable, and which therefore needs to be migrated to a new storage space, is data to be migrated in the compressed data stored by the solid state drive.
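The wear leveling selection described above can be illustrated as follows. This is a sketch under assumptions: the `BlockInfo` fields, the `access_rate` metric, and the `cold_threshold` parameter are hypothetical and are not specified by the application. The function picks rarely accessed data in the least-worn occupied block and proposes moving it to the most-worn free block.

```python
from dataclasses import dataclass


@dataclass
class BlockInfo:
    block_id: int
    pe_count: int        # program/erase cycles consumed so far
    access_rate: float   # hypothetical accesses-per-hour metric
    is_free: bool


def pick_wear_leveling_move(blocks, cold_threshold=1.0):
    """Return (source_id, destination_id), or None if no move applies."""
    cold = [b for b in blocks if not b.is_free and b.access_rate < cold_threshold]
    free = [b for b in blocks if b.is_free]
    if not cold or not free:
        return None
    source = min(cold, key=lambda b: b.pe_count)       # least-worn cold block
    destination = max(free, key=lambda b: b.pe_count)  # most-worn free block
    if destination.pe_count <= source.pe_count:
        return None  # moving would not even out wear
    return source.block_id, destination.block_id


blocks = [
    BlockInfo(0, 10, 0.1, False),
    BlockInfo(1, 500, 0.2, False),
    BlockInfo(2, 900, 0.0, True),
    BlockInfo(3, 50, 0.0, True),
]
move = pick_wear_leveling_move(blocks)
```

The compressed data held in the source block of such a move is the data to be migrated.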
In some possible implementations, the solid state drive may determine the data to be migrated in the stored compressed data when it is idle. For example, when idle, the solid state drive may determine the compressed data involved in garbage collection processing as the data to be migrated, or determine the compressed data involved in wear leveling processing as the data to be migrated. If, when the solid state drive is idle, the storage state of the compressed data in a data block is found to be unstable, that compressed data is determined as the data to be migrated.
In some possible embodiments, the data to be migrated in the compressed data stored by the solid state drive may be determined based on a preset period. For example, if garbage collection processing is performed based on the preset period, the compressed data involved in the garbage collection processing is determined as the data to be migrated. If wear leveling processing is performed based on the preset period, the compressed data involved in the wear leveling processing is determined as the data to be migrated. If the storage state of the compressed data in a data block is found to be unstable, that compressed data is determined as the data to be migrated.
S22: Determine a first compression algorithm corresponding to the data to be migrated, and decompress the data to be migrated based on the first compression algorithm to obtain first decompressed data.
In some possible embodiments, the compression algorithm that the solid state drive uses when compressing written data is an algorithm with a fast compression speed, and the compression speed of a compression algorithm is positively correlated with its compression ratio. The compression ratio indicates the ratio of the data size after compression to the data size before compression. For example, if compressing 100M of data yields 90M, the compression ratio is 90/100 × 100% = 90%. The smaller the compression ratio, the better the compression effect, but also the longer the compression time and the slower the compression speed. Therefore, the data that the solid state drive compresses at write time still occupies considerable storage space. To improve the utilization of the storage space of the solid state drive, after the data to be migrated is determined, the data to be migrated may be decompressed and the decompressed data may be further processed.
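The definition of compression ratio used here (size after compression divided by size before compression) can be written down directly. The sketch below reproduces the 100M to 90M example and then measures the ratio of zlib at two effort levels. zlib is only a stand-in from the Python standard library and is not named in the application; the sample data is made up.

```python
import zlib


def compression_ratio(compressed_size: int, original_size: int) -> float:
    """Compressed size as a percentage of the original size (smaller is better)."""
    return compressed_size * 100 / original_size


# Worked example from the text: 100M of data compressed to 90M gives 90%.
assert compression_ratio(90, 100) == 90.0

# Illustrative measurement: a fast setting (level 1) and a slow setting (level 9).
data = b"timestamp=1618900000;level=INFO;msg=write ok\n" * 2000
for level in (1, 9):
    compressed = zlib.compress(data, level)
    print(level, round(compression_ratio(len(compressed), len(data)), 2))
```

Under this definition a lower percentage means stronger compression, which is why the second compression algorithm is described as having the smaller compression ratio.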
For convenience of description, a compression algorithm corresponding to compressed data stored by the solid state drive is referred to as a first compression algorithm hereinafter.
Specifically, if the solid state drive uniformly compresses the written data by using a preset compression algorithm when compressing the written data, after determining the data to be migrated, if the data to be migrated has not been subjected to data migration before, that is, the data to be migrated is stored in the same data block all the time after being written in the solid state drive, the first compression algorithm corresponding to the data to be migrated may be directly determined to be the preset compression algorithm.
For example, if the solid state drive uniformly uses the LZ4 algorithm when compressing written data, then after the data to be migrated in the compressed data stored in the solid state drive is determined, if the data to be migrated has never been moved to a different data block, the first compression algorithm corresponding to the data to be migrated can be directly determined to be the LZ4 algorithm.
Optionally, when the solid state drive compresses the written data to obtain compressed data, it marks the first compression algorithm that was used with a compression algorithm identifier. When the solid state drive compresses the written data, different compression algorithms may be used for different data, that is, the first compression algorithms corresponding to different compressed data may be the same or different. This may be determined based on the requirements of the actual application scenario and is not limited herein.
For convenience of description, the compression algorithm identifier corresponding to the first compression algorithm is referred to as a first compression algorithm identifier hereinafter. For example, after the solid state drive compresses the written data to obtain compressed data, the first compression algorithm identifier corresponding to the first compression algorithm may be added to the meta area corresponding to the compressed data. The data in the meta area is data that describes other data; in the embodiment of the present application, the data in the meta area may be the compression algorithm identifier corresponding to the compression algorithm. Based on this, after the data to be migrated is determined, the first compression algorithm corresponding to the data to be migrated may be determined based on the first compression algorithm identifier in the meta area corresponding to the data to be migrated.
The first compression algorithm identifier corresponding to the first compression algorithm may be represented by one or a combination of multiple of specific characters, numbers, and symbols, and the specific representation mode may be specifically determined based on the requirements of the actual application scenario, which is not limited herein.
Referring to FIG. 4, FIG. 4 is a schematic view of a scenario of marking a compression algorithm provided in the embodiment of the application. As shown in FIG. 4, the determined data to be migrated is essentially compressed data. If its corresponding first compression algorithm is the LZ4 algorithm, the first compression algorithm may be marked by the characters "LZ4" in the meta area corresponding to the data to be migrated, that is, the first compression algorithm identifier corresponding to the first compression algorithm is "LZ4".
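The idea of recording a compression algorithm identifier in a meta area and using it later to pick the matching decompressor can be sketched as follows. The dictionary layout, the `CODECS` registry, and the identifiers "ZLIB1" and "LZMA" are assumptions made for illustration. The text uses "LZ4" as its example identifier, but LZ4 is not in the Python standard library, so standard-library codecs stand in for it.

```python
import lzma
import zlib

# Hypothetical registry: identifier -> (compress function, decompress function).
CODECS = {
    "ZLIB1": (lambda data: zlib.compress(data, 1), zlib.decompress),
    "LZMA": (lzma.compress, lzma.decompress),
}


def write_block(data: bytes, algorithm_id: str) -> dict:
    """Compress data and record which algorithm was used in the meta area."""
    compress, _ = CODECS[algorithm_id]
    return {"meta": {"algorithm_id": algorithm_id}, "payload": compress(data)}


def read_block(block: dict) -> bytes:
    """Look up the decompressor from the identifier stored in the meta area."""
    _, decompress = CODECS[block["meta"]["algorithm_id"]]
    return decompress(block["payload"])


block = write_block(b"host data " * 100, "ZLIB1")
restored = read_block(block)
```

Because the identifier travels with the compressed data, blocks written with different first compression algorithms can coexist on the same drive.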
Specifically, when the data to be migrated is decompressed based on the first compression algorithm, the data to be migrated may be decompressed based on a decompression algorithm corresponding to the first compression algorithm, so as to obtain decompressed data corresponding to the data to be migrated. For convenience of description, decompressed data obtained by decompressing the data to be migrated based on the first compression algorithm is referred to as first decompressed data hereinafter.
S23: Compress the first decompressed data based on a second compression algorithm to obtain target compressed data, and store the target compressed data in the solid state drive.
In some feasible embodiments, to improve data writing efficiency, the solid state drive often uses a compression algorithm with a higher compression ratio when compressing the data to be written. Therefore, after the first decompressed data corresponding to the data to be migrated is obtained, the first decompressed data can be compressed with a compression algorithm that has a lower compression ratio, and the resulting compressed data is then stored in the solid state drive.
For convenience of description, the compression algorithm with a lower compression ratio used when compressing the first decompressed data will be referred to as a second compression algorithm hereinafter. That is, the compression ratio of the second compression algorithm is smaller than the compression ratio of the first compression algorithm.
For convenience of description, compressed data obtained by compressing the first decompressed data by the second compression algorithm is hereinafter referred to as target compressed data.
Referring to FIG. 5, FIG. 5 is a schematic view of a scenario for storing target compressed data according to an embodiment of the present application. As shown in FIG. 5, the solid state drive determines the data to be migrated from the NAND that stores the compressed data, and then decompresses the data to be migrated by using the first compression algorithm. Further, the solid state drive compresses the resulting first decompressed data again by using a second compression algorithm, and stores the compressed data (i.e., the target compressed data in the embodiment of the present application). The second compression algorithm has a lower compression ratio, that is, it favors stronger compression over speed, so storing the target compressed data in a new data block in the NAND saves further storage space. The compression and decompression are carried out in the background of the solid state drive and do not directly affect the real-time performance of the host when it reads and writes data stored in the solid state drive.
In some possible embodiments, after decompressing the data to be migrated based on the first compression algorithm to obtain the first decompressed data, the second compression algorithm for compressing the first decompressed data may be determined first.
Specifically, the first compression algorithm corresponding to the data to be migrated can be determined based on the first compression algorithm identifier corresponding to the data to be migrated. It can then be determined whether the first compression algorithm is the compression algorithm with the lowest compression ratio currently available, and if it is not, any compression algorithm with a compression ratio smaller than that of the first compression algorithm is determined as the second compression algorithm. For example, if the first compression algorithm corresponding to the data to be migrated is determined to be the LZ4 algorithm based on the first compression algorithm identifier, the gzip algorithm can be determined to be the second compression algorithm because the compression ratio of the gzip algorithm is smaller than that of the LZ4 algorithm.
Optionally, the solid state drive stores a first compression algorithm list and a second compression algorithm list, where a compression ratio of any compression algorithm in the first compression algorithm list is greater than a compression ratio of any compression algorithm in the second compression algorithm list, and the second compression algorithm list includes a compression algorithm with a lowest current compression ratio. After determining the first compression algorithm corresponding to the data to be migrated, it may be determined whether the first compression algorithm belongs to the first compression algorithm list. If the first compression algorithm belongs to the first compression algorithm list, any compression algorithm in the second compression algorithm list may be determined as the second compression algorithm. If the first compression algorithm belongs to the second compression algorithm list, any compression algorithm in the second compression algorithm list, which has a compression ratio lower than that of the first compression algorithm, may be determined as the second compression algorithm.
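The two-list selection rule can be sketched as below. The algorithm names and ratio values are placeholder numbers chosen only so that every entry in the first list has a larger ratio than every entry in the second list; they are not measurements. The text allows any qualifying algorithm to be chosen, and this sketch picks the one with the lowest ratio purely to make the result deterministic.

```python
# Placeholder ratios (compressed size / original size, in percent), not measurements.
FIRST_LIST = {"lz4": 60.0, "snappy": 62.0}
SECOND_LIST = {"gzip": 40.0, "zstd": 38.0, "xz": 30.0}


def choose_second_algorithm(first_algorithm: str):
    """Return the name of a second algorithm, or None if none is stronger."""
    if first_algorithm in FIRST_LIST:
        # Any entry of the second list qualifies; pick the lowest ratio.
        return min(SECOND_LIST, key=SECOND_LIST.get)
    if first_algorithm in SECOND_LIST:
        current = SECOND_LIST[first_algorithm]
        better = {name: r for name, r in SECOND_LIST.items() if r < current}
        return min(better, key=better.get) if better else None
    raise ValueError(f"unknown compression algorithm: {first_algorithm}")


choice_for_lz4 = choose_second_algorithm("lz4")
choice_for_gzip = choose_second_algorithm("gzip")
choice_for_xz = choose_second_algorithm("xz")
```

When the first compression algorithm already has the lowest ratio in the second list, the sketch returns `None`, which corresponds to the case where recompression cannot improve on the stored form.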
Optionally, the second compression algorithm may be a preset compression algorithm with the lowest compression ratio, so that whatever first compression algorithm is determined for the data to be migrated, the compression ratio of the second compression algorithm is never greater than that of the first compression algorithm.
The compression ratio of each compression algorithm may be calculated by means such as computer equipment or cloud computing, and the compression algorithm lists are determined based on the calculation results.
Cloud computing is a computing mode that distributes computing tasks over a resource pool formed by a large number of computers, so that various application systems can acquire computing power, storage space and information services as needed. The compression ratio of each compression algorithm can be determined rapidly based on the computing power of cloud computing.
Optionally, if the solid state drive uniformly compresses written data by using a preset first compression algorithm, then when the first decompressed data is obtained by decompressing the data to be migrated based on the preset first compression algorithm, the solid state drive may compress the first decompressed data by using a preset second compression algorithm to obtain the target compressed data. The compression ratio of the preset first compression algorithm is greater than the compression ratio of the preset second compression algorithm.
If the first decompressed data was not obtained by decompressing the data to be migrated based on the preset first compression algorithm, this indicates that the data to be migrated has already undergone an earlier data migration, that is, it has already been decompressed and compressed a second time after being written into the solid state drive. In this case, it may be determined whether the compression rate of the first compression algorithm actually used to decompress the data to be migrated is greater than the compression rate of the preset second compression algorithm. If so, the preset second compression algorithm is determined as the second compression algorithm for compressing the first decompressed data. Otherwise, any compression algorithm whose compression rate is smaller than that of the first compression algorithm may be newly determined as the second compression algorithm for compressing the first decompressed data.
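The decision between the preset second compression algorithm and a newly determined one can be sketched as below. The function name `choose_second` and the rate values in the usage note are hypothetical; the compression rate is again taken as compressed size divided by original size.

```python
from typing import Dict, Optional


def choose_second(
    current_algo: str,
    preset_first: str,
    preset_second: str,
    rate: Dict[str, float],
) -> Optional[str]:
    """Pick the algorithm used to recompress the first decompressed data."""
    if current_algo == preset_first:
        # Data still carries the uniform write-time compression.
        return preset_second
    # Otherwise the data was already migrated (recompressed) at least once.
    if rate[current_algo] > rate[preset_second]:
        return preset_second
    # Fall back to any algorithm with a strictly lower rate; this sketch
    # takes the lowest one so the choice is deterministic.
    lower = [a for a in rate if rate[a] < rate[current_algo]]
    return min(lower, key=rate.get) if lower else None
```

For example, with the assumed rates `{"lz4": 0.6, "gzip": 0.4, "lzma": 0.3}`, preset first algorithm `"lz4"`, and preset second algorithm `"gzip"`, data currently compressed with `"gzip"` would be recompressed with `"lzma"`.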
Optionally, since the compression algorithm implements data compression by re-encoding data, the encoding efficiency may also be different when a uniform encoding manner is used to encode data of different data types. Therefore, for a compression algorithm, when data of the same data size and different data types are compressed, there may be a difference in compression rate corresponding to each data type. Therefore, for the first decompressed data corresponding to the data to be migrated, in order to further make the storage space occupied by the target compressed data after compressing the first decompressed data smaller, when determining the second compression algorithm for compressing the first decompressed data, the second compression algorithm may be determined based on the data type of the first decompressed data.
When the second compression algorithm is determined based on the data type of the first decompressed data, the compression rate of each compression algorithm corresponding to the data type can be determined, so that the compression algorithm with the compression rate corresponding to the data type smaller than that of the first compression algorithm corresponding to the data type can be determined, and any one of the compression algorithms can be determined as the second compression algorithm. After the second compression algorithm determined based on the mode compresses the first decompressed data to obtain the target compressed data, the data size of the target compressed data can be smaller than the data to be migrated corresponding to the first decompressed data, and therefore the storage space of the solid state drive is saved.
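A data-type-aware selection of this kind might be sketched as follows. The table `RATE_BY_TYPE`, its values, the data type labels, and the function name `select_by_type` are all hypothetical and serve only to illustrate the lookup.

```python
from typing import Dict, Optional

# Hypothetical per-data-type compression rates (compressed size / original size).
RATE_BY_TYPE: Dict[str, Dict[str, float]] = {
    "text": {"lz4": 0.55, "gzip": 0.35, "lzma": 0.28},
    "image": {"lz4": 0.98, "gzip": 0.95, "lzma": 0.93},
}


def select_by_type(data_type: str, first_algo: str) -> Optional[str]:
    """Pick a second algorithm that beats `first_algo` for this data type."""
    table = RATE_BY_TYPE[data_type]
    better = {a: r for a, r in table.items() if r < table[first_algo]}
    # Any entry of `better` would satisfy the description; the lowest rate
    # is chosen here to make the result deterministic.
    return min(better, key=better.get) if better else None
```

Because the comparison is made within one data type, an algorithm that is a poor choice for text can still be the best available choice for image data, and vice versa.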
In some possible implementations, after the second compression algorithm is determined, the first decompressed data may be compressed based on the second compression algorithm, and the resulting target compressed data may be stored to the solid state drive. The target compressed data may be stored to a new empty block in the solid state drive.
The data processing method provided in the embodiment of the present application is further described below with reference to fig. 6. Referring to fig. 6, fig. 6 is a schematic flowchart of an application scenario of the data processing method according to the embodiment of the present application. In fig. 6, after receiving the host data, the computing unit inside the SSD compresses the host data using the LZ4 algorithm, writes the compressed data into the NAND Flash, and marks the LZ4 algorithm in the meta area corresponding to the compressed data. The LZ4 algorithm is the first compression algorithm in this embodiment.
Further, when the SSD determines valid data during garbage collection, that is, reads the valid data from the NAND Flash to perform recovery storage on the valid data, it may be determined that the compression algorithm corresponding to the valid data is the LZ4 algorithm based on the meta region of the valid data, and decompress the valid data based on the LZ4 algorithm. Meanwhile, under the condition that the compression rate of the LZ4 algorithm is determined to be high, the decompressed data is compressed again based on the gzip algorithm with the lower compression rate, and then the compressed data is written into the NAND Flash.
The LZ4 algorithm is a first compression algorithm in the embodiment of the present application, the gzip algorithm is a second compression algorithm in the embodiment of the present application, valid data read by the SSD during garbage collection is to-be-migrated data in the embodiment of the present application, data obtained by decompressing the valid data based on the LZ4 algorithm is first decompressed data in the embodiment of the present application, and compressed data obtained based on the gzip algorithm is target compressed data in the embodiment of the present application.
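The flow of fig. 6 can be approximated in a few lines. This is a sketch under assumptions, not the drive firmware: LZ4 is not part of the Python standard library, so `zlib` at level 1 stands in for it under the identifier `"fast"`, while the second algorithm uses the standard `gzip` module. The `Page` class and the function names are hypothetical.

```python
import gzip
import zlib
from dataclasses import dataclass

# identifier -> (compress, decompress). "fast" is a stand-in for LZ4 (assumption).
CODECS = {
    "fast": (lambda d: zlib.compress(d, 1), zlib.decompress),
    "gzip": (lambda d: gzip.compress(d, compresslevel=9), gzip.decompress),
}


@dataclass
class Page:
    meta_algo: str  # algorithm identifier kept in the meta area
    payload: bytes  # compressed data as stored in NAND Flash


def host_write(data: bytes) -> Page:
    """Compress host data with the first algorithm and mark it in the meta area."""
    compress, _ = CODECS["fast"]
    return Page("fast", compress(data))


def migrate_during_gc(page: Page, second_algo: str = "gzip") -> Page:
    """Decompress valid data per its meta mark, then recompress with the second algorithm."""
    _, decompress = CODECS[page.meta_algo]
    first_decompressed = decompress(page.payload)       # first decompressed data
    compress, _ = CODECS[second_algo]
    return Page(second_algo, compress(first_decompressed))  # target compressed data
```

The sketch only demonstrates the order of operations and the role of the meta mark; whether the recompressed payload is actually smaller depends on the data and on the two algorithms chosen.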
In some possible embodiments, when storing target compressed data obtained by compressing the first decompressed data based on the second compression algorithm to the solid state drive, a compression algorithm identification of the second compression algorithm may be determined. For convenience of description, the compression algorithm identification of the second compression algorithm is hereinafter referred to as a second compression algorithm identification.
Further, the target compressed data is marked based on the second compression algorithm identification, and then the marked target compressed data is stored in the solid-state drive.
The second compression algorithm identifier may be represented by one of, or a combination of, characters, letters, and digits.
The marking modes of the second compression algorithm identifiers corresponding to different target compressed data may be the same or different, and may be specifically determined based on the requirements of the actual application scenario, which is not limited herein.
A second compression algorithm identifier may be added to the meta area corresponding to the target compressed data to mark the second compression algorithm corresponding to the target compressed data.
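Marking compressed data with an algorithm identifier can be illustrated as below. The meta layout (a 1-byte identifier followed by a 4-byte payload length), the identifier values, and the function names `mark` and `read_mark` are assumptions made for this sketch; the application does not specify the layout of the meta area.

```python
import struct
from typing import Tuple

# Hypothetical meta layout: 1-byte algorithm identifier + 4-byte payload length,
# little-endian, no padding.
META_FORMAT = "<BI"
META_SIZE = struct.calcsize(META_FORMAT)

# Hypothetical identifier values.
ALGO_IDS = {"lz4": 0x01, "gzip": 0x02}
ID_TO_ALGO = {v: k for k, v in ALGO_IDS.items()}


def mark(payload: bytes, algo: str) -> bytes:
    """Prefix compressed data with a meta record naming its compression algorithm."""
    return struct.pack(META_FORMAT, ALGO_IDS[algo], len(payload)) + payload


def read_mark(record: bytes) -> Tuple[str, bytes]:
    """Recover the algorithm name and the compressed payload from a marked record."""
    algo_id, length = struct.unpack(META_FORMAT, record[:META_SIZE])
    return ID_TO_ALGO[algo_id], record[META_SIZE:META_SIZE + length]
```

A numeric identifier is used here for compactness; a character or letter identifier would work the same way with a different `struct` format.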
In some possible embodiments, the decompressed data corresponding to the data read instruction may also be returned in response to the data read instruction. The data reading instruction may be an instruction issued by a device corresponding to the solid state drive for reading data, or may also be a data reading instruction issued when a relevant process in the device corresponding to the solid state drive needs to call relevant data, and a specific triggering manner of the data reading instruction may be specifically determined based on a requirement of an actual application scenario, which is not limited herein.
Specifically, in response to a data reading instruction, compressed data corresponding to the data reading instruction is determined from compressed data stored in the solid state drive. For convenience of description, the compressed data corresponding to the data reading instruction is referred to as data to be read hereinafter.
The compressed data corresponding to the data reading instruction may be compressed data obtained based on a first compression algorithm stored in the solid state drive, or may also be compressed data obtained based on a second compression algorithm stored in the solid state drive, and may be specifically determined based on requirements of an actual application scenario, which is not limited herein.
Further, a compression algorithm corresponding to the data to be read is determined, and then the data to be read is decompressed based on the compression algorithm corresponding to the data to be read, so that decompressed data corresponding to the data to be read is obtained. For convenience of description, the decompressed data corresponding to the data to be read is referred to as second decompressed data hereinafter.
When the compression algorithm corresponding to the data to be read is determined, the compression algorithm identification corresponding to the data to be read can be determined, and then the compression algorithm corresponding to the data to be read is determined based on the compression algorithm identification corresponding to the data to be read.
When the data to be read is decompressed based on the compression algorithm corresponding to the data to be read, the data to be read can be decompressed based on the decompression algorithm corresponding to the compression algorithm.
Further, after the data to be read is decompressed to obtain the second decompressed data, the second decompressed data can be returned, for example, to the device corresponding to the solid state drive.
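The read path described above can be sketched with a small in-memory store. The class name `TinyStore`, the logical block address keys, and the use of `zlib` and `lzma` as the two algorithms are assumptions for illustration only.

```python
import lzma
import zlib
from typing import Dict, Tuple

# identifier -> (compress, decompress); standard-library stand-ins (assumption).
CODECS = {
    "zlib": (zlib.compress, zlib.decompress),
    "lzma": (lzma.compress, lzma.decompress),
}


class TinyStore:
    """In-memory stand-in for compressed blocks held by the drive."""

    def __init__(self) -> None:
        # logical address -> (algorithm identifier, compressed payload)
        self._blocks: Dict[int, Tuple[str, bytes]] = {}

    def write(self, lba: int, data: bytes, algo: str) -> None:
        compress, _ = CODECS[algo]
        self._blocks[lba] = (algo, compress(data))

    def read(self, lba: int) -> bytes:
        """Serve a data reading instruction: look up the identifier, then decompress."""
        algo, payload = self._blocks[lba]   # data to be read and its identifier
        _, decompress = CODECS[algo]        # matching decompression algorithm
        return decompress(payload)          # second decompressed data
```

The caller does not need to know which algorithm a block was stored with, because the identifier saved alongside each block selects the matching decompression routine.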
It should be particularly noted that, in the embodiment of the present application, the first compression algorithm and the second compression algorithm include, but are not limited to, an LZ77 algorithm, an LZR algorithm, an LZSS algorithm, a DEFLATE algorithm, an LZMA algorithm, an LZ4 algorithm, a gzip algorithm, and the like. The specific algorithms are not limited herein, provided that in practical applications the compression ratio of the second compression algorithm is smaller than that of the first compression algorithm.
In the embodiment of the application, the data to be migrated in the solid state drive can be decompressed to obtain the first decompressed data, and the first decompressed data is compressed based on the compression algorithm with the lower compression ratio to obtain the target compressed data, so that the data size of the target compressed data is smaller than that of the data to be migrated, and the storage space of the solid state drive is saved under the condition that the data to be migrated is not changed.
Meanwhile, the processing can be applied to the data to be migrated during garbage collection, wear leveling, and similar processes of the solid state drive, so it does not directly affect the real-time performance of the host device reading and writing data on the solid state drive, and the applicability is high.
Referring to fig. 7, fig. 7 is a schematic structural diagram of a data processing apparatus according to an embodiment of the present application. The data processing apparatus 1 provided in the embodiment of the present application includes:
the data determining module 11 is configured to determine data to be migrated in compressed data stored in the solid state drive SSD;
the data compression module 12 is configured to determine a first compression algorithm corresponding to the data to be migrated, and decompress the data to be migrated based on the first compression algorithm to obtain first decompressed data;
a data decompression module 13, configured to compress the first decompressed data based on a second compression algorithm to obtain target compressed data, and store the target compressed data in the SSD;
the compression ratio of the second compression algorithm is less than the compression ratio of the first compression algorithm.
In some possible embodiments, the data to be migrated includes at least one of:
compressed data corresponding to garbage collection processing performed by the SSD;
compressed data corresponding to wear leveling performed by the SSD;
compressed data stored in at least one data block of the SSD that is in an unstable state.
In some possible embodiments, the data compression module 12 is configured to:
determining a first compression algorithm identifier corresponding to the data to be migrated;
and determining a first compression algorithm corresponding to the data to be migrated based on the compression algorithm identification.
In some possible embodiments, the data decompression module 13 is further configured to:
determining a data type of the first decompressed data;
a second compression algorithm is determined based on the data type.
In some possible embodiments, the data decompression module 13 is further configured to:
determining a second compression algorithm identifier of the second compression algorithm;
and marking the target compressed data based on the second compression algorithm identification, and storing the marked target compressed data to the SSD.
In some possible embodiments, the data decompression module 13 is further configured to:
responding to a data reading instruction, and determining data to be read corresponding to the data reading instruction from compressed data stored in the SSD;
and determining a compression algorithm corresponding to the data to be read, decompressing the data to be read based on the compression algorithm corresponding to the data to be read to obtain second decompressed data corresponding to the data to be read, and returning the second decompressed data.
In some possible embodiments, the data determining module 11 is further configured to:
and determining data to be migrated in the compressed data stored in the SSD based on a preset period.
The data processing apparatus may be a computer program (including program codes) running in the solid state drive, for example, the data processing apparatus is an application software that can be used to execute the implementation manners provided by the steps in fig. 2, which may specifically refer to the implementation manners provided by the steps, and is not described herein again.
In some possible implementations, the data processing apparatus provided in this embodiment may be implemented by a combination of hardware and software. As an example, the data processing apparatus provided in this embodiment may be a processor in the form of a hardware decoding processor that is programmed to perform the data processing method provided in this embodiment. For example, the processor in the form of a hardware decoding processor may be implemented by one or more Application Specific Integrated Circuits (ASICs), Digital Signal Processors (DSPs), Programmable Logic Devices (PLDs), Complex Programmable Logic Devices (CPLDs), Field Programmable Gate Arrays (FPGAs), or other electronic components.
In some possible implementations, the data processing apparatus provided in the embodiments of the present application may be implemented in software, which may be software in the form of programs and plug-ins, and includes a series of modules, including a data determination module 11, a data compression module 12, and a data decompression module 13. The data determining module 11, the data compression module 12, and the data decompression module 13 are used to implement the data processing method provided by the embodiment of the present application.
In the embodiment of the application, the data to be migrated in the SSD can be decompressed to obtain the first decompressed data, and the first decompressed data is compressed based on the compression algorithm with the lower compression ratio to obtain the target compressed data, so that the data size of the target compressed data is smaller than that of the data to be migrated, and further, the storage space of the SSD is saved without changing the data to be migrated, and the applicability is high.
Referring to fig. 8, fig. 8 is a schematic structural diagram of a solid state drive provided in an embodiment of the present application. As shown in fig. 8, the solid state drive 1000 in the present embodiment may include a processor 1001, a network interface 1004, and a memory 1005, and may further include a user interface 1003 and at least one communication bus 1002. The communication bus 1002 is used to implement connection communication among these components. The user interface 1003 may also include a standard wired interface or a wireless interface. The network interface 1004 may optionally include a standard wired interface or a wireless interface (e.g., a Wi-Fi interface). The memory 1005 may be a high-speed RAM memory or a non-volatile memory, such as at least one disk memory. The memory 1005 may optionally be at least one memory device located remotely from the processor 1001. As shown in fig. 8, the memory 1005, which is a kind of computer-readable storage medium, may include an operating system, a network communication module, a user interface module, and a device control application program.
In the solid state drive 1000 shown in fig. 8, the network interface 1004 may provide network communication functions; the user interface 1003 is an interface for providing a user with input; and the processor 1001 may be used to invoke a device control application stored in the memory 1005 to implement:
determining data to be migrated in compressed data stored in a Solid State Drive (SSD);
determining a first compression algorithm corresponding to the data to be migrated, and decompressing the data to be migrated based on the first compression algorithm to obtain first decompressed data;
compressing the first decompressed data based on a second compression algorithm to obtain target compressed data, and storing the target compressed data to the SSD;
the compression ratio of the second compression algorithm is less than the compression ratio of the first compression algorithm.
In some possible embodiments, the data to be migrated includes at least one of:
compressed data corresponding to garbage collection processing performed by the SSD;
compressed data corresponding to wear leveling performed by the SSD;
compressed data stored in at least one data block of the SSD that is in an unstable state.
In some possible embodiments, the processor 1001 is configured to:
determining a first compression algorithm identifier corresponding to the data to be migrated;
and determining a first compression algorithm corresponding to the data to be migrated based on the compression algorithm identification.
In some possible embodiments, the processor 1001 is further configured to:
determining a data type of the first decompressed data;
a second compression algorithm is determined based on the data type.
In some possible embodiments, the processor 1001 is configured to:
determining a second compression algorithm identifier of the second compression algorithm;
and marking the target compressed data based on the second compression algorithm identification, and storing the marked target compressed data to the SSD.
In some possible embodiments, the processor 1001 is further configured to:
responding to a data reading instruction, and determining data to be read corresponding to the data reading instruction from compressed data stored in the SSD;
and determining a compression algorithm corresponding to the data to be read, decompressing the data to be read based on the compression algorithm corresponding to the data to be read, obtaining second decompressed data corresponding to the data to be read, and returning the second decompressed data.
In some possible embodiments, the processor 1001 is configured to:
and determining data to be migrated in the compressed data stored in the SSD based on a preset period.
It should be understood that in some possible embodiments, the processor 1001 may be a Central Processing Unit (CPU), or may be another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and the like. A general purpose processor may be a microprocessor or any conventional processor. The memory may include both read-only memory and random access memory, and provides instructions and data to the processor. A portion of the memory may also include non-volatile random access memory. For example, the memory may also be composed of non-volatile memory particles (NAND).
In a specific implementation, the solid state drive 1000 may execute the implementation manners provided in the steps in fig. 2 through the built-in functional modules, which may specifically refer to the implementation manners provided in the steps, and are not described herein again.
In the embodiment of the application, the data to be migrated in the SSD can be decompressed to obtain the first decompressed data, and the first decompressed data is compressed based on the compression algorithm with the lower compression ratio to obtain the target compressed data, so that the data size of the target compressed data is smaller than that of the data to be migrated, and further, under the condition that the data to be migrated is not changed, the storage space of the SSD is saved, and the applicability is high.
An embodiment of the present application further provides a computer-readable storage medium, where a computer program is stored in the computer-readable storage medium, and is executed by a processor to implement the method provided in each step in fig. 2, which may specifically refer to the implementation manner provided in each step, and is not described herein again.
The computer readable storage medium may be an internal storage unit of any of the foregoing data processing apparatuses or of the solid state drive, such as the non-volatile flash memory particles of the solid state drive. The computer readable storage medium may also be an external storage device of an electronic device corresponding to the solid state drive, for example, a plug-in hard disk, a Smart Memory Card (SMC), a Secure Digital (SD) card, a flash memory card (flash card), or the like provided on the electronic device. The computer readable storage medium may further include a magnetic disk, an optical disk, a read-only memory (ROM), a Random Access Memory (RAM), and the like. Further, the computer readable storage medium may also include both the internal storage unit of the solid state drive and a corresponding external storage device. The computer readable storage medium is used for storing the computer program and other programs and data required by the electronic device. The computer readable storage medium may also be used to temporarily store data that has been output or is to be output.
Embodiments of the present application provide a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The processor of the solid state drive reads the computer instructions from the computer readable storage medium and executes them, causing the solid state drive to perform the method provided by the steps of fig. 2.
The terms "first", "second", and the like in the claims and in the description and drawings of the present application are used for distinguishing between different objects and not for describing a particular order. Furthermore, the terms "include" and "have," as well as any variations thereof, are intended to cover non-exclusive inclusions. For example, a process, method, system, article, or electronic device that comprises a list of steps or elements is not limited to only those steps or elements recited, but may alternatively include other steps or elements not expressly listed or inherent to such process, method, article, or electronic device. Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment can be included in at least one embodiment of the application. The appearances of the phrase in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. It is explicitly and implicitly understood by one skilled in the art that the embodiments described herein can be combined with other embodiments. The term "and/or" as used in this specification and the appended claims refers to and includes any and all possible combinations of one or more of the associated listed items.
Those of ordinary skill in the art will appreciate that the various illustrative components and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the components and steps of the various examples have been described above generally in terms of their functionality in order to clearly illustrate this interchangeability of hardware and software. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The above disclosure is only for the purpose of illustrating the preferred embodiments of the present application and is not intended to limit the scope of the present application, which is defined by the appended claims.

Claims (10)

1. A method of data processing, the method comprising:
determining data to be migrated in compressed data stored by a Solid State Drive (SSD);
determining a first compression algorithm corresponding to the data to be migrated, and decompressing the data to be migrated based on the first compression algorithm to obtain first decompressed data;
compressing the first decompressed data based on a second compression algorithm to obtain target compressed data, and storing the target compressed data to the SSD;
the compression ratio of the second compression algorithm is less than the compression ratio of the first compression algorithm.
2. The method of claim 1, wherein the data to be migrated comprises at least one of:
compressed data corresponding to garbage collection processing performed by the SSD;
compressed data corresponding to wear leveling performed by the SSD;
compressed data stored in at least one data block of the SSD that is in an unstable state.
3. The method according to claim 1, wherein the determining the first compression algorithm corresponding to the data to be migrated includes:
determining a first compression algorithm identifier corresponding to the data to be migrated;
and determining a first compression algorithm corresponding to the data to be migrated based on the compression algorithm identification.
4. The method of claim 1, further comprising:
determining a data type of the first decompressed data;
a second compression algorithm is determined based on the data type.
5. The method of claim 1, wherein storing the target compressed data to the SSD comprises:
determining a second compression algorithm identification for the second compression algorithm;
and marking the target compressed data based on the second compression algorithm identification, and storing the marked target compressed data to the SSD.
6. The method of claim 1, further comprising:
responding to a data reading instruction, and determining data to be read corresponding to the data reading instruction from compressed data stored in the SSD;
and determining a compression algorithm corresponding to the data to be read, decompressing the data to be read based on the compression algorithm corresponding to the data to be read to obtain second decompressed data corresponding to the data to be read, and returning the second decompressed data.
7. The method according to claim 1, wherein the determining the data to be migrated in the compressed data stored in the SSD comprises:
and determining data to be migrated in the compressed data stored in the SSD based on a preset period.
8. A data processing apparatus, characterized in that the apparatus comprises:
the data determination module is used for determining data to be migrated in compressed data stored in the SSD;
the data compression module is used for determining a first compression algorithm corresponding to the data to be migrated, and decompressing the data to be migrated based on the first compression algorithm to obtain first decompressed data;
the data decompression module is used for compressing the first decompressed data based on a second compression algorithm to obtain target compressed data and storing the target compressed data to the SSD;
the compression ratio of the second compression algorithm is less than the compression ratio of the first compression algorithm.
9. A solid state drive comprising a processor and a memory, the processor and memory being interconnected;
the memory is used for storing a computer program;
the processor is configured to perform the method of any of claims 1 to 7 when the computer program is invoked.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which is executed by a processor to implement the method of any one of claims 1 to 7.
CN202110426638.1A 2021-04-20 2021-04-20 Data processing method, data processing device, solid state drive and storage medium Pending CN115220644A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110426638.1A CN115220644A (en) 2021-04-20 2021-04-20 Data processing method, data processing device, solid state drive and storage medium

Publications (1)

Publication Number Publication Date
CN115220644A true CN115220644A (en) 2022-10-21

Family

ID=83605100

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110426638.1A Pending CN115220644A (en) 2021-04-20 2021-04-20 Data processing method, data processing device, solid state drive and storage medium

Country Status (1)

Country Link
CN (1) CN115220644A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination