WO2023231571A1 - Data compression method and device - Google Patents

Data compression method and device

Info

Publication number
WO2023231571A1
Authority
WO
WIPO (PCT)
Prior art keywords
data
compression
hardware
source
compression algorithm
Prior art date
Application number
PCT/CN2023/087178
Other languages
English (en)
French (fr)
Inventor
张剑
张希舟
曹文龙
全绍晖
Original Assignee
华为技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from CN202211033099.6A (external priority: CN117220685A)
Application filed by 华为技术有限公司
Publication of WO2023231571A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10File systems; File servers
    • G06F16/17Details of further file system functions
    • G06F16/174Redundancy elimination performed by the file system
    • HELECTRICITY
    • H03ELECTRONIC CIRCUITRY
    • H03MCODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00Conversion of a code where information is represented by a given sequence or number of digits to a code where the same, similar or subset of information is represented by a different sequence or number of digits
    • H03M7/30Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction

Definitions

  • the present application relates to the field of data storage, and in particular, to a data compression method and device.
  • the processor can use a data compression algorithm to compress the original data, that is, reconstruct the original data according to a specific method, thereby reducing data redundancy and reducing the storage capacity occupied by the original data.
  • for different types of data, different compression algorithms achieve different compression efficiencies.
  • the compression system is usually provided with a hardware accelerator card in which one compression algorithm is deployed, so as to compress the data types that this algorithm suits.
  • for other, unsuited data types, however, the compression efficiency of the hardware accelerator card is lower, and the data compression efficiency of the compression system is affected. Therefore, how to provide a more effective data compression method has become an urgent problem to be solved.
  • This application provides a data compression method and device, which addresses the problem that a hardware compression algorithm compresses unsuited data types with low efficiency, thereby degrading the data compression efficiency of the compression system.
  • a data compression method is provided, the data compression method being performed by a compression device.
  • the compression device may include one or more processors, wherein a set hardware compression algorithm is deployed in one processor.
  • the data compression method includes: the compression device preprocesses source data to obtain first data whose type matches the hardware compression algorithm; further, the compression device compresses the first data according to the hardware compression algorithm.
  • when the data type of the source data does not match the hardware compression algorithm, the compression device preprocesses the source data to obtain the first data and then compresses the first data, whose data type matches the set hardware compression algorithm. The set hardware compression algorithm can thus be applied to more types of data, and the preprocessing operation performed by the compression device on the source data improves the applicability of the hardware compression algorithm.
  • the compression algorithm or model does not exist in the processor in the form of a software unit, but is offloaded into the compression device in the form of a hardware compression algorithm.
  • the first data is compressed by the compression device based on the hardware compression algorithm, which improves the efficiency with which the compression device compresses the source data.
  • the hardware compression algorithm includes: the Lempel-Ziv 77 (LZ77) compression algorithm.
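As background for the sliding-window behavior discussed below, a minimal LZ77-style codec can be sketched as follows. This is an illustrative sketch only: the function names, the `(offset, length, next_byte)` token format, and the no-overlap simplification are assumptions of this example, not the patent's hardware implementation.

```python
def lz77_compress(data: bytes, window: int = 4096, min_match: int = 3):
    """Emit (offset, length, next_byte) triples, searching for the
    longest earlier match within the sliding window."""
    out = []
    i = 0
    while i < len(data):
        best_len, best_off = 0, 0
        max_len = len(data) - i - 1  # keep one byte for the literal
        for j in range(max(0, i - window), i):
            length = 0
            while (length < max_len
                   and j + length < i          # no overlapping copies, for simplicity
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, i - j
        if best_len >= min_match:
            out.append((best_off, best_len, data[i + best_len]))
            i += best_len + 1
        else:
            out.append((0, 0, data[i]))  # literal byte
            i += 1
    return out

def lz77_decompress(tokens) -> bytes:
    """Invert lz77_compress: replay each back-reference, then the literal."""
    out = bytearray()
    for off, length, nxt in tokens:
        if length:
            start = len(out) - off
            out.extend(out[start:start + length])
        out.append(nxt)
    return bytes(out)
```

The `window` parameter is the "data sliding window" of the following paragraphs: matches are only sought at most `window` bytes back, so data whose repetitions lie farther apart than the window yields no back-references at all.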
  • the compression device preprocesses the source data to obtain the first data, including: the compression device determines preprocessed data from the source data according to the data sliding window of the hardware compression algorithm. The data sliding window indicates the data range within which redundant data is queried during the compression process, and the preprocessed data contains redundant data within the data sliding window. Furthermore, the compression device selects the aforementioned first data from the preprocessed data.
  • the data range indicated by the data sliding window may also be called: query data segment, query data width, etc.
  • since the hardware compression algorithm has a fixed data sliding window once it is deployed in the compression device, it compresses second data, which has no redundant data within the data range indicated by the sliding window, with low efficiency. The compression device therefore filters out, from the source data, the preprocessed data that does have redundant data within the data sliding window, and selects the first data from that preprocessed data, so that the compression device compresses the first data based on the hardware compression algorithm. This prevents compression of the second data from occupying the computing power of the hardware compression algorithm, and improves the compression efficiency of the hardware compression algorithm for the source data.
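The window-based filtering step described above amounts to a simple check: does a block contain any repeated substring whose earlier occurrence still falls inside the sliding window? A minimal illustration follows; the hash-based scan, the thresholds, and the helper names are assumptions of this sketch, not the patent's method.

```python
def has_window_redundancy(block: bytes, window: int = 4096,
                          min_match: int = 4) -> bool:
    """Return True if `block` contains at least one repeated substring of
    length `min_match` whose earlier occurrence lies within `window` bytes."""
    seen = {}  # substring -> last position it was seen at
    for i in range(len(block) - min_match + 1):
        key = block[i:i + min_match]
        j = seen.get(key)
        if j is not None and i - j <= window:
            return True
        seen[key] = i
    return False

def partition_source(blocks):
    """Split blocks into hardware-path candidates (first data, with
    in-window redundancy) and software-path candidates (second data)."""
    first, second = [], []
    for b in blocks:
        (first if has_window_redundancy(b) else second).append(b)
    return first, second
```

Blocks in `first` would be handed to the hardware LZ77 engine; blocks in `second` would be routed to a software compression model, as the next paragraphs describe.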
  • the compression device preprocesses the source data to obtain the first data, which further includes: first, the compression device determines second data from the source data according to the data sliding window of the hardware compression algorithm; the second data has no redundant data within the data sliding window. Secondly, the compression device selects a first data compression model according to the data type of the second data. Finally, the compression device compresses the second data according to the selected first data compression model.
  • the compression device performs software compression on the second data.
  • the compression algorithm used for software compression may be one or more of the multiple set data compression models.
  • the first data compression model is used as an example for explanation.
  • one part of the data (the first data) can be compressed based on the hardware compression algorithm with relatively high compression efficiency, while another part (the second data) can be compressed with the first data compression model selected according to its data type. This avoids the low compression efficiency caused by the mismatch between the second data and the hardware compression algorithm (no redundant data within the data sliding window), further improving the compression efficiency of the compression device for the source data.
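Selecting a software compression model by data type, as described above, can be illustrated with a small dispatch table. Here standard-library compressors (bz2, zlib, lzma) stand in for the patent's unspecified "data compression models", and the coarse type detector is an assumption made purely for illustration.

```python
import bz2
import lzma
import zlib

# Hypothetical mapping from coarse data type to a software compressor.
_MODELS = {
    "text":    bz2.compress,
    "generic": zlib.compress,
    "binary":  lzma.compress,
}

def classify(block: bytes) -> str:
    """Very rough data-type detector (an assumption for illustration)."""
    if not block:
        return "generic"
    printable = sum(32 <= b < 127 or b in (9, 10, 13) for b in block)
    if printable / len(block) > 0.95:
        return "text"
    if len(set(block)) > 200:   # high byte diversity -> opaque binary
        return "binary"
    return "generic"

def software_compress(second_data: bytes):
    """Compress second data with the model selected by its data type;
    returns (model_name, compressed_bytes)."""
    model = classify(second_data)
    return model, _MODELS[model](second_data)
```

The returned model name plays the role of metadata the decompressor would need to pick the matching decompression routine.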
  • the source data includes multiple data blocks.
  • the compression device preprocesses the source data to obtain the first data, including: the compression device determines a third data block that meets a set condition from the plurality of data blocks.
  • the set condition is: the predicted data amount obtained by compressing the data block with the hardware compression algorithm is greater than or equal to the data amount of the data block; and the compression device determines the first data from the data blocks, among the plurality of data blocks, other than the third data block. It can be understood that a data compression process in which the predicted data amount of the data block is greater than or equal to its data amount before compression can be called an invalid compression process.
  • compressing the third data block with the hardware compression algorithm would occupy the processing bandwidth of the hardware compression algorithm (the amount of data compressed per unit time) and reduce its compression efficiency. The compression device therefore determines the first data from the source data other than the third data block, and compresses the first data based on the hardware compression algorithm, which prevents the compression process of the third data block from occupying the processing bandwidth of the hardware compression algorithm and improves the compression efficiency of the compression device for the source data.
  • the data compression method provided in this embodiment further includes: the compression device outputs the third data block.
  • since the predicted data amount of the third data block is greater than or equal to its data amount before compression, the compression device does not compress the third data block but outputs it directly. This prevents the compression process of the third data block from occupying the processing bandwidth of the hardware compression algorithm or of the data compression model, increases the speed at which the compression device compresses the source data, and reduces the data compression delay.
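The invalid-compression check above amounts to: predict the compressed size of a block, and pass the block through uncompressed when the prediction is not smaller than the block itself. One possible predictor, sampling a prefix with zlib and scaling, is sketched below; the patent does not specify how the prediction is performed, so the sampling approach, names, and thresholds here are assumptions.

```python
import math
import zlib

def predicted_compressed_size(block: bytes, sample: int = 1024) -> int:
    """Estimate compressed size by compressing a prefix sample and
    scaling the observed ratio to the full block."""
    if not block:
        return 0
    probe = block[:sample]
    ratio = len(zlib.compress(probe)) / len(probe)
    return math.ceil(ratio * len(block))

def route_block(block: bytes):
    """Return ('raw', block) when compression is predicted to be invalid
    (no size reduction), otherwise ('compress', block)."""
    if predicted_compressed_size(block) >= len(block):
        return ("raw", block)
    return ("compress", block)
```

Blocks tagged `'raw'` correspond to the third data block of the text: they are output directly instead of occupying the hardware engine's processing bandwidth.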
  • the compression device preprocesses the source data to obtain the first data, including: the compression device performs a spatial transformation (position transform) on the source data to obtain the first data and transformation information; the transformation information indicates the data mapping relationship between the source data and the first data.
  • in some cases, the data type of the source data does not match the hardware compression algorithm; however, after the compression device performs a data mapping operation such as a position transformation or spatial transformation on the source data, the data type of the mapped data matches the hardware compression algorithm.
  • the hardware compression algorithm therefore compresses the mapped data (such as the first data) more efficiently, which improves the applicability of the hardware compression algorithm and solves the problem of reduced compression efficiency caused by the mismatch between the data type of the source data and the hardware compression algorithm.
  • the data compression method provided in this embodiment further includes: the compression device obtains the first compressed data corresponding to the first data, and outputs the first compressed data and the aforementioned transformation information. It is worth noting that the compression device outputs the first compressed data together with the transformation information, and the decompression device decompresses the first compressed data according to the transformation information, thereby recovering the aforementioned source data. This avoids the reduced compression efficiency caused by the mismatch between the data type of the source data and the hardware compression algorithm, and improves the compression efficiency of the hardware compression algorithm for the source data.
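A concrete example of a position/spatial transform with invertible transformation information is a fixed-stride byte reordering: bytes at the same field offset of consecutive records are grouped together, which can move redundancy closer together so that it falls inside the compressor's window. This particular transform and its `info` format are assumptions chosen for illustration, not the patent's transform.

```python
def position_transform(src: bytes, stride: int):
    """Reorder fixed-width records column by column so that bytes at the
    same field offset become adjacent. Returns (first_data, info), where
    `info` is the transformation information needed to invert the mapping."""
    pad = (-len(src)) % stride
    padded = src + b"\x00" * pad
    rows = len(padded) // stride
    out = bytearray(len(padded))
    for r in range(rows):
        for c in range(stride):
            out[c * rows + r] = padded[r * stride + c]
    info = {"stride": stride, "pad": pad}
    return bytes(out), info

def inverse_transform(first_data: bytes, info) -> bytes:
    """Undo position_transform using the transformation information."""
    stride, pad = info["stride"], info["pad"]
    rows = len(first_data) // stride
    out = bytearray(len(first_data))
    for r in range(rows):
        for c in range(stride):
            out[r * stride + c] = first_data[c * rows + r]
    return bytes(out[:len(out) - pad] if pad else out)
```

As in the text, the compressor would output the compressed transformed data together with `info`, and the decompression side would decompress first and then apply `inverse_transform` to recover the source data.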
  • the compression device preprocesses the source data to obtain the first data, including: the compression device identifies fourth data in the source data that conforms to a set data pattern, and determines the first data from the data in the source data other than the fourth data.
  • the set data pattern includes at least one of: the data is a string of all 0s, the data is a string of all 1s, or the change pattern between strings conforms to a set pattern. For example, since the hardware compression algorithm compresses fourth data that satisfies the set data pattern with low efficiency, the compression device extracts the data that satisfies the set data pattern from the source data and compresses the fourth data with a software compression algorithm. This reduces the impact on compression efficiency of the mismatch between the fourth data and the hardware compression algorithm, and improves the compression efficiency of the compression device for the source data.
  • the data compression method provided by this embodiment further includes: the compression device selects a second data compression model according to the data pattern of the fourth data, and compresses the fourth data according to the second data compression model corresponding to it. For example, if the fourth data is all 0s or all 1s, the compression device may select a dictionary compression algorithm to compress the fourth data, thereby completing the compression of the fourth data quickly and preventing compression of the fourth data by the hardware compression algorithm from occupying the processing bandwidth of the hardware compression algorithm, which improves the compression efficiency of the hardware compression algorithm for the first data.
  • a data compression device is provided, which can be applied to a compression device.
  • the data compression device includes: a preprocessing unit and a hardware compression unit.
  • the preprocessing unit is used to preprocess the source data to obtain first data, the type of the first data is adapted to the hardware compression algorithm; the hardware compression unit is used to compress the first data according to the hardware compression algorithm.
  • the preprocessing unit is specifically used to: determine the preprocessed data from the source data according to the data sliding window of the hardware compression algorithm; the data sliding window indicates the data range within which redundant data is queried during the data compression process, and the preprocessed data has redundant data within the data sliding window.
  • the preprocessing unit is also specifically used to: select the first data from the preprocessed data.
  • the preprocessing unit is also specifically configured to: determine the second data from the source data according to the data sliding window of the hardware compression algorithm; the second data does not have redundant data within the data sliding window.
  • the preprocessing unit is also specifically configured to: select a first data compression model according to the data type of the second data; and compress the second data according to the first data compression model.
  • the source data includes multiple data blocks; the preprocessing unit is specifically used to: determine the third data block that meets the set condition from the multiple data blocks; the set condition is: the predicted data size of the data block compressed with the hardware compression algorithm is greater than or equal to the data size of the data block.
  • the preprocessing unit is also specifically configured to determine the first data from other data blocks except the third data block among the plurality of data blocks.
  • the data compression device provided in this embodiment further includes: a communication unit.
  • the communication unit is used to: output the third data block.
  • the preprocessing unit is specifically configured to perform spatial transformation on the source data to obtain the first data and transformation information; the transformation information is used to indicate the data mapping relationship between the source data and the first data.
  • the communication unit is further configured to: obtain the first compressed data corresponding to the first data; and output the transformation information and the first compressed data.
  • the preprocessing unit is specifically used to: identify the fourth data in the source data that conforms to the set data pattern; the set data pattern includes at least one of: the data is a string of all 0s, the data is a string of all 1s, or the change pattern between strings conforms to a set pattern.
  • the preprocessing unit is also specifically configured to determine the first data from other data included in the source data except the fourth data.
  • the preprocessing unit is further specifically configured to: select a second data compression model according to the data pattern of the fourth data; and compress the fourth data according to the second data compression model.
  • the hardware compression algorithm includes: the LZ77 compression algorithm.
  • a chip including: a processor and a power supply circuit.
  • the power supply circuit is used to supply power to the processor; the processor is used to execute the operation steps of the method described in any implementation manner of the first aspect.
  • a fourth aspect provides an interface card, including: the chip provided in the third aspect and an interface; the interface is used to receive signals from devices other than the interface card and send them to the chip, or to send signals from the chip to devices other than the interface card.
  • the interface card refers to a smart network card, etc.
  • a fifth aspect provides a compression device, including: an interface card as provided in the fourth aspect.
  • the compression device includes at least one processor and a memory, and the memory is used to store a set of computer instructions; when the processor executes the set of computer instructions, it performs the operation steps of the data compression method in the first aspect or any possible implementation of the first aspect.
  • a sixth aspect provides a compression system, including: a first processor and a second processor. Wherein, a hardware compression algorithm is deployed in the second processor.
  • the first processor is used to obtain source data to be compressed; and preprocess the source data to obtain first data whose data type matches the hardware compression algorithm.
  • the second processor is used to compress the first data according to the hardware compression algorithm.
  • the compression system can be used to perform the operation steps of the method described in any implementation manner of the first aspect.
  • a computer-readable storage medium is provided, in which computer programs or instructions are stored; when the computer programs or instructions are executed, the operation steps of the method described in any implementation manner of the first aspect are performed.
  • a computer program product is provided; when the computer program product runs on a computer, it causes the computer to execute the operation steps of the method described in any implementation manner of the first aspect.
  • the computer may refer to a compression device, a compression accelerator card, a chip, etc.
  • Figure 1 is a schematic architectural diagram of a compression system provided by this application.
  • Figure 2 is a schematic structural diagram of a chip provided by this application.
  • Figure 3 is a schematic flow chart 1 of the data compression method provided by this application.
  • Figure 4 is a schematic flow chart 2 of the data compression method provided by this application.
  • Figure 5 is a schematic flow chart 3 of the data compression method provided by this application.
  • Figure 6 is a schematic flow chart 4 of the data compression method provided by this application.
  • Figure 7 is a schematic structural diagram of a data compression device provided by this application.
  • This application provides a data compression method, which includes: first, a compression device obtains source data to be compressed. Secondly, the compression device preprocesses the source data to obtain first data whose data type matches a set hardware compression algorithm. Finally, the compression device compresses the first data according to the hardware compression algorithm. When the data type of the source data does not match the hardware compression algorithm, the compression device preprocesses the source data to obtain the first data and then compresses the first data, whose data type matches the set hardware compression algorithm, so that the set hardware compression algorithm can be adapted to more types of data; the preprocessing operation performed by the compression device on the source data improves the applicability of the hardware compression algorithm.
  • the compression algorithm or model does not exist in the processor in the form of a software unit, but is offloaded into the compression device in the form of a hardware compression algorithm.
  • the first data is compressed by the compression device based on the hardware compression algorithm, which improves the efficiency with which the compression device compresses the source data.
  • Figure 1 is a schematic architectural diagram of a compression system provided by this application.
  • the compression system includes a compression device 110, an acceleration device 115, and a client device 120.
  • the compression device 110 is a general-purpose computer device.
  • the user can input source data to the compression device 110 through the client device 120, and the compression device 110 compresses the source data.
  • the compression device 110 also outputs the target data obtained by compressing the source data to the client device 120.
  • the client device 120 is a terminal device, including but not limited to a personal computer, a server, a mobile phone, a tablet computer or a smart car.
  • the compression device 110 includes an input/output (I/O) interface 114, a processor 111, and a memory 112.
  • I/O interface 114 is used to communicate with devices external to compression device 110 .
  • the client device 120 inputs data and sends compression tasks to the compression device 110 through the I/O interface 114; after the compression device 110 processes the input data (for example, compresses or decompresses it), it sends the output result to the client device 120 through the I/O interface 114.
  • the processor 111 is the computing core and control core of the compression device 110. It may include a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. In practical applications, the compression device 110 may also include multiple processors.
  • the processor 111 may include one or more processor cores. An operating system and other software programs are installed in the processor 111, so that the processor 111 can access the memory 112 and various peripheral component interconnect express (PCIe) devices.
  • the processor 111 is connected to the memory 112 through a double data rate (DDR) bus or other types of buses.
  • Memory 112 is the main memory of compression device 110.
  • the memory 112 is generally used to store the various software running in the operating system, input data received from the client device 120, and output results to be sent to the client device 120. To improve the access speed of the processor 111, the memory 112 must provide fast access.
  • dynamic random access memory (DRAM) is usually used as the memory 112.
  • the memory 112 can also be other random access memories, such as static random access memory (static random access memory, SRAM), etc.
  • the memory 112 may also be a read only memory (ROM).
  • for example, the ROM can be a programmable read-only memory (PROM), an erasable programmable read-only memory (EPROM), etc.
  • This embodiment does not limit the number and type of memories 112 .
  • the compression system is also provided with a data storage system 113.
  • the data storage system 113 can be located outside the compression device 110 (as shown in Figure 1) and exchange data with the compression device 110 through the network.
  • the data storage system 113 may also be located inside the host.
  • the data storage system 113 exchanges data with the processor 111 through the PCIe bus 116 .
  • the data storage system 113 is typically embodied as a hard disk.
  • the acceleration device 115 is used to perform compression tasks or decompression tasks.
  • the processor 111 sends the received task (for example, a compression task) and the input data to the acceleration device 115.
  • the acceleration device 115 completes the task according to the input data and then sends the processing result to the processor 111.
  • the acceleration device 115 can be directly inserted into the card slot on the motherboard of the compression device 110 , and exchanges data with the processor 111 through the PCIe bus 116 .
  • the PCIe bus 116 in Figure 1 can also be replaced by a bus using the Compute Express Link (CXL) protocol, the Universal Serial Bus (USB) protocol, or another protocol for data transmission.
  • the above-mentioned acceleration device 115 may also not be inserted directly into the card slot on the mainboard of the compression device 110, but instead be located in an accelerating apparatus.
  • the accelerating apparatus is a device independent of the compression device 110, such as an accelerator card.
  • the compression device 110 may be connected to the acceleration device 115 through a wired network such as a network cable, or may be connected to the acceleration device 115 through a wireless network such as a wireless hotspot or Bluetooth. If the acceleration device 115 is used to process compression tasks, such as compressing source data to be compressed, the acceleration device may be implemented by one or more chips.
  • the chip includes any of a CPU, a graphics processing unit (GPU), a neural-network processing unit (NPU), a tensor processing unit (TPU), an FPGA, or an ASIC.
  • GPU also known as display core, visual processor, and display chip, is a microprocessor that specializes in image computing on personal computers, workstations, game consoles, and some mobile devices (such as tablets, smartphones, etc.).
  • NPU simulates human neurons and synapses at the circuit level, and uses deep learning instruction sets to directly process large-scale neurons and synapses.
  • One instruction completes the processing of a group of neurons.
  • an ASIC is an integrated circuit product designed for a single purpose.
  • the above-mentioned compression task may refer to compressing the source data obtained by the client device 120
  • the decompression task may refer to decompressing the compressed data sent by the client device 120.
  • the processor 111 in Figure 1 can be implemented by a chip, as shown in Figure 2.
  • Figure 2 is a schematic structural diagram of a chip provided by the present application.
  • the chip 200 includes a core 201, CPU 202, system buffer 203 and DDR 206.
  • the CPU 202 is used to accept AI tasks (such as compression tasks, decompression tasks, neural network computing tasks, etc.) and call the core 201 to perform the task.
  • the CPU 202 is also used to undertake scheduling tasks.
  • the CPU 202 can be implemented by an ARM processor, which is small in size, low in power consumption, uses a 32-bit reduced instruction set, and has simple and flexible addressing.
  • the CPU 202 can also be implemented by other processors.
  • Core 201 is used to provide the computing power required for compression and decompression tasks.
  • the core 201 includes a load/store unit (LSU), a cube computing unit, a scalar computing unit, a vector computing unit, and a buffer.
  • the LSU is used to load data to be processed and store processed data. It can also be used to manage the reading and writing of data inside the core between different buffers, and to complete some format conversion operations.
  • the cube computing unit is used to provide the core computing power of matrix multiplication.
  • the scalar computing unit is a single instruction stream single data (SISD) processor. This type of processor processes only one piece of data (usually an integer or a floating-point number) at a time.
  • the vector computing unit, also known as an array processor, is a processor that can directly operate on a set of arrays or vectors.
  • the number of buffers may be one or more.
  • the buffer mainly refers to the level 1 buffer (level 1 buffer, L1 buffer).
  • the buffer is used to temporarily store some data that needs to be used repeatedly by the core 201 to reduce reading and writing from the bus.
  • the implementation of certain data format conversion functions also requires that the source data be located in the buffer.
  • the distance between the cube computing unit in the core and the storage area where the data is located is shortened, which reduces the cube computing unit's accesses to the DDR 206, thereby reducing the data access delay and the core's data processing delay.
  • the system buffer 203 mainly refers to the level 2 buffer (L2 buffer or L2 cache), which is used to temporarily store input data, intermediate results, or final results passing through the chip.
  • the DDR 206 is an off-chip memory, which can also be replaced by high bandwidth memory (HBM) or other off-chip memory. The DDR 206 sits between the chip and external storage, overcoming the access speed limit when computing resources share memory reads and writes.
  • the input/output (I/O) device 205 included in the chip 200 refers to the hardware for data transmission, and can also be understood as a device connected to the I/O interface.
  • Common I/O devices include network cards, printers, keyboards, mice, etc. All external storage can also be used as I/O devices, such as hard disks, floppy disks, optical disks, etc.
  • the chip 200 may also include an encoder/decoder 204 and an I/O device 205.
  • the encoder/decoder 204 is used to encode or decode data. It should be understood that in some optional situations, the encoder/decoder 204 can also be designed as an encode/decode unit (software module) and then integrated into the core 201 .
  • the core 201, CPU 202, system buffer 203, encoder/decoder 204, I/O device 205, and DDR 206 are connected through a bus.
  • the bus may include a path for transmitting information between the above-mentioned components (eg, CPU 202, system buffer 203).
  • the bus may also include a power bus, a control bus, a status signal bus, etc.
  • the bus can be a PCIe bus, an extended industry standard architecture (EISA) bus, a unified bus (Ubus or UB), a compute express link (CXL), a cache coherent interconnect for accelerators (CCIX), etc.
  • core 201 can access these I/O devices 205 through the PCIe bus.
  • the core 201 is connected to the system buffer 203 through the DDR bus.
  • different system buffers 203 may use different data buses to communicate with the core 201. Therefore, the DDR bus may also be replaced by other types of data buses. This embodiment of the present application does not limit the bus type.
  • the LSU in the core 201 reads (loads) the data from the DDR 206 and compresses the data to be compressed.
  • the source data is compressed and the target data is obtained.
  • the LSU stores the processing result into the DDR 206, and the network interface card sends the inference result to the client device 120, or sends it to the data storage system 113 for persistent storage.
  • acceleration device 115 shown in Figure 1 can also be implemented by the chip 200 shown in Figure 2, and this application is not limited to this.
  • Figure 3 is a schematic flow chart 1 of the data compression method provided by this application.
  • the data compression method can be applied to the compression system shown in Figure 1.
  • the data compression method can be executed by the compression device 300.
  • the compression device 300 may be the compression device 110 or the acceleration device 115 shown in Figure 1, or the chip shown in Figure 2, etc.
  • the compression device 300 includes multiple processors, one of which is deployed with a hardware compression algorithm (the core 201 in Figure 2), and another of which is used to preprocess the source data to be compressed (the CPU 202 in Figure 2).
  • the processor deployed with the hardware compression algorithm may also be referred to as the hardware compression layer (hardware layer for short) of the compression device 300, and the processor used for preprocessing and for compressing data with a software compression algorithm may also be referred to as the software compression layer (software layer for short) of the compression device 300.
  • the core 201 is deployed with a hardware compression algorithm, such as any one or a combination of the following: Lz77 compression algorithm, Huffman (Huffman) coding algorithm, cyclic redundancy check (cyclic redundancy check, CRC) algorithm, finite state entropy coding (finite state entropy, FSE) algorithm, and compressed state maintenance.
  • the CRC algorithm may refer to the CRC32 algorithm.
  • the CRC32 algorithm generates a 4-byte (32-bit) check value and expresses it as an 8-digit hexadecimal number, such as FA12CD45, for data verification.
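The CRC32 check value described above can be sketched with the standard library; this is an illustrative helper (the function name `crc32_hex` is an assumption, not from the patent), showing how a 32-bit check value is rendered as an 8-digit hexadecimal number:

```python
import zlib

def crc32_hex(data: bytes) -> str:
    """Return the 4-byte (32-bit) CRC32 check value of `data`
    as an 8-digit hexadecimal string, as described above."""
    return format(zlib.crc32(data) & 0xFFFFFFFF, '08X')

print(crc32_hex(b"hello"))  # an 8-digit hexadecimal check value
```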
  • the advantages of the CRC algorithm are simplicity and speed. That is to say, the compression device may include one or more fine-grained compression algorithms that support lossless compression, so that one processor can combine multiple fine-grained compression algorithms to implement multiple lossless compression methods and compress the data.
  • the data compression method provided in this embodiment includes the following steps S310 to S340.
  • the compression device obtains the source data to be compressed.
  • the source data may be data to be stored obtained from the client device 120 by the compression device 110 .
  • the source data is data to be stored collected by the client device 120.
  • the compression device 110 compresses the data to be stored, thereby reducing the amount of storage space occupied in the data storage system 113.
  • the compression device preprocesses the source data and obtains first data whose data type matches the set hardware compression algorithm.
  • the data type identification model can also be trained on the sample data by the chip 200 .
  • the data type identification model is used to indicate the mapping relationship between the source data and the data type of the source data.
  • the CPU 202 can identify the data type of the source data using the data type identification model.
  • Data type identification models include but are not limited to: models such as Naive Bayes (NB), extreme gradient boosting tree (XGBoost) and multilayer perceptron (MLP), and combinations of the above models.
  • Sample data can originate from data blocks or file fragments.
  • Sample data includes but is not limited to: text data, picture data, video data, genetic data, executable programs, virtual hard disk data, database data, etc.
  • the data types of sample data include but are not limited to: text type, picture type, video type, gene type, executable program type, virtual hard disk type, database type, etc.
  • the compression device (or other processing device used to train the data type recognition model, such as the acceleration device 115) can obtain sample data from the database.
  • the sample data can be open source data or test data.
  • an identifier can be set for the sample data according to a known data type, and the sample data and the identifier of the sample data are input to an artificial neural network for training to obtain a data type recognition model. For example, if the sample data is text data, a label can be set for the text data based on the file suffix, and the text data and the text label are input into the artificial neural network for training to obtain the mapping relationship between the text data and the text type.
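The suffix-based labeling step above can be sketched as follows. This is a minimal sketch, assuming a hypothetical suffix-to-type table (`SUFFIX_TO_TYPE` and `label_sample` are illustrative names; the patent does not fix a concrete mapping), which produces (sample, label) pairs for training:

```python
import os

# Hypothetical suffix-to-type table; the actual mapping is an assumption.
SUFFIX_TO_TYPE = {
    ".txt": "text", ".log": "text",
    ".jpg": "picture", ".png": "picture",
    ".mp4": "video",
    ".fasta": "gene",
    ".exe": "executable",
    ".vhd": "virtual_disk",
    ".db": "database",
}

def label_sample(filename: str) -> str:
    """Assign a training label to a sample based on its file suffix,
    falling back to a generic type when the suffix is unknown."""
    _, ext = os.path.splitext(filename.lower())
    return SUFFIX_TO_TYPE.get(ext, "generic")

# (sample, label) pairs that would be fed to the artificial neural network
training_set = [(name, label_sample(name)) for name in ["a.txt", "b.png", "c.bin"]]
```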
  • the compression device inputs the source data into the data type recognition model, which can identify text types, picture types, video types, gene types, executable program types, virtual hard disk types, database types and other types. Other types may refer to generic types.
  • the data type recognition model can be pre-trained by other devices, and then the data type recognition model is imported into the processor included in the compression device, so that the processor uses the data type recognition model to identify the data type of the source data.
  • the compression device compresses the first data according to the hardware compression algorithm to obtain the first compressed data.
  • the compression device provides the preprocessing operations of the software layer (such as the aforementioned S320); after preprocessing the source data to be compressed to obtain the first data, the compression device uses the hardware compression algorithm to compress the first data, thereby obtaining the first compressed data.
  • the first compressed data may include description data, and the description data may refer to a metadata group (tuple) used to describe the compressed first data.
  • the metadata group includes one or more pieces of metadata. The metadata may be stored, in the form of data pointers, data identifiers, etc., at any position included in the first compressed data.
  • when the data type of the source data does not match the hardware compression algorithm, the compression device will preprocess the source data to obtain the first data, and then compress the first data whose data type matches the set hardware compression algorithm, enabling the set hardware compression algorithm to be adapted to more types of data.
  • the preprocessing operation of the compression device on the source data improves the applicability of the hardware compression algorithm.
  • the compression algorithm or model does not exist in the processor in the form of a software unit, but is offloaded in the compression device in the form of a hardware compression algorithm. The first data is compressed by the compression device based on the hardware compression algorithm, which improves the efficiency with which the compression device compresses the source data.
  • the data compression method provided in this embodiment also includes the following step S340.
  • S340 The compression device outputs target data corresponding to the source data.
  • the target data includes the first compressed data corresponding to S330, and possible other data.
  • the other data may be the remaining data in the source data except the first data, or may be other compressed data after compressing the remaining data.
  • the source data includes multiple data blocks.
  • the compression device can identify, from the multiple data blocks, a third data block that meets a set condition (for example, the set condition is that the predicted data amount of the data block compressed using the hardware compression algorithm is greater than or equal to the data amount of the data block), and determine the first data from the data, among the multiple data blocks included in the source data, other than the third data block.
  • the predicted data amount refers to the amount of storage space occupied by the compressed data corresponding to the third data block after the compression device compresses the third data block.
  • a data compression process in which the predicted data amount of a data block is greater than or equal to the uncompressed data amount of the data block (the data amount before compression) can be called an invalid compression process. Therefore, for the third data block, compression of the third data block by the hardware compression algorithm would occupy the processing bandwidth of the hardware compression algorithm (the amount of data compressed per unit time) and reduce its compression efficiency. The compression device therefore removes the third data block from the source data, determines the first data among the remaining data, and compresses the first data based on the hardware compression algorithm, which avoids the compression process of the third data block occupying the processing bandwidth of the hardware compression algorithm and improves the compression efficiency of the compression device on the source data.
  • the compression device compares the data volume before and after compression at data block (block) granularity, so that the third data block is not sent to the hardware compression algorithm for processing, which avoids the third data block occupying the processing bandwidth of the hardware compression algorithm and improves the compression efficiency of the source data.
  • the compression device can also use other data granularities to compare the predicted data volume and the pre-compression data volume of the source data, so as to identify data whose predicted data volume is greater than or equal to its pre-compression data volume.
  • during the output of the first compressed data, the compression device can output the third data block in the source data as-is. Since the predicted data amount of the third data block is greater than or equal to its pre-compression data amount, the compression device does not compress the third data block and directly outputs it, which avoids the compression process of the third data block occupying the processing bandwidth of the hardware compression algorithm or the data compression model, improves the speed at which the compression device compresses the source data, and reduces the delay of data compression.
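The block-granularity filtering above can be sketched as follows. This is an illustrative sketch in which `zlib` stands in for the hardware compression algorithm and the block size and helper name are assumptions; blocks whose compressed ("predicted") size would not shrink are passed through uncompressed:

```python
import zlib

BLOCK_SIZE = 4096  # illustrative block granularity, not fixed by the patent

def compress_blocks(source: bytes):
    """Compress block by block, emitting one (compressed_flag, payload)
    pair per block. A block whose compressed size is greater than or
    equal to its original size (a 'third data block') is output directly,
    so it does not occupy the compression algorithm's bandwidth."""
    out = []
    for i in range(0, len(source), BLOCK_SIZE):
        block = source[i:i + BLOCK_SIZE]
        packed = zlib.compress(block)  # stand-in for the hardware compression algorithm
        if len(packed) >= len(block):  # predicted amount >= pre-compression amount
            out.append((False, block))  # invalid compression: output block as-is
        else:
            out.append((True, packed))
    return out
```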
  • Figure 4 is a schematic flowchart 2 of the data compression method provided by this application.
  • the aforementioned S320 may include the following steps S320A to S320D.
  • the compression device determines the data sliding window of the hardware compression algorithm.
  • the data sliding window is used to indicate the data range for querying redundant data during the data compression process.
  • the data range may also be referred to as the data width, data segment or historical sliding window in which the compression device queries the redundant data corresponding to the current data from the current data forward, etc. This application is not limited to this.
  • the size of the data sliding window is 8 kilobytes (KB), 32KB, etc.
  • S320B The compression device determines whether redundant data exists within the data range indicated by the data sliding window among the multiple data included in the source data.
  • Redundant data refers to data that appears multiple times in the source data. If redundant data exists within the data range indicated by the data sliding window, the compression device executes S320C; if there is no redundant data within the data range indicated by the data sliding window, the compression device executes the following S331.
  • the compression device identifies the preprocessed data included in the source data based on the data sliding window of the hardware compression algorithm.
  • This preprocessed data has redundant data within the data range indicated by the data sliding window. For example, if the data interval between two consecutive occurrences of data 1 included in the source data is 4KB and the data sliding window is 32KB, then data 1 is used as preprocessed data.
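The sliding-window redundancy check described above can be sketched as follows; a minimal sketch, assuming a 32KB window and a hypothetical helper `has_redundancy_in_window` (the names and the minimum match length are illustrative, not from the patent):

```python
WINDOW = 32 * 1024  # e.g. a 32KB data sliding window

def has_redundancy_in_window(source: bytes, pos: int, match_len: int = 5) -> bool:
    """Check whether the match_len-byte string starting at `pos` already
    occurred within the data range indicated by the sliding window,
    i.e. starting at most WINDOW bytes before `pos`."""
    target = source[pos:pos + match_len]
    if len(target) < match_len:
        return False
    start = max(0, pos - WINDOW)
    # An earlier (possibly overlapping) occurrence must begin before `pos`.
    window_region = source[start:pos + match_len - 1]
    return window_region.find(target) != -1
```

In the example above, data separated by a 4KB interval falls inside a 32KB window, so it counts as preprocessed data; the same string separated by more than 32KB does not.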
  • the compression device obtains the first data whose data type matches the set hardware compression algorithm from the preprocessed data.
  • the compression device identifies the second data included in the source data according to the data sliding window.
  • the compression device selects the first data compression model according to the data type of the second data.
  • the correspondence between data types and data compression models can be predefined.
  • the memory 112 is used to pre-store the correspondence between the data type and the data compression model.
  • the processor 111 determines the data type, it can first retrieve the corresponding relationship between the data type and the data compression model from the memory 112, and then obtain the corresponding relationship between the data type and the data compression model according to the data type of the data to be compressed. Get one or more data compression models.
  • the correspondence between the data type and the data compression model includes the correspondence between the data type of the data to be compressed and one or more data compression models.
  • This embodiment at least provides one possible implementation: filter out commonly used data compression models from all data compression models in traditional technology, or use other data compression models generated by superimposing commonly used data compression models as the data compression models. The so-called other data compression models generated by superimposing commonly used data compression models mainly refer to high-order data compression models generated by superimposing low-order data compression models.
  • nested model is a model that predicts subsequent bytes based on the information of nested symbols (such as []) appearing in the bytes to be predicted.
  • the context model is a model that predicts subsequent bytes based on the context of consecutive bytes that appear before the byte to be predicted.
  • the indirect model is a model that uses the historical information and context of the bits 1-2 bytes before the byte to be predicted to predict subsequent bytes.
  • Text models are models that use information such as words, sentences, and paragraphs to predict subsequent bytes. Usually, it is used to predict text-based data.
  • the sparse model is a model that predicts subsequent bytes by finding discontinuous bytes before the byte to be predicted as context. For example, the 1 byte and 3 bytes before the byte to be predicted are used to predict the subsequent bytes.
  • the extensible markup language model is a model that predicts subsequent bytes through feature information such as labels contained in the bytes to be predicted.
  • the matching model is a model that searches for matching information in the context before the byte to be predicted, and predicts subsequent bytes based on the matching information.
  • the distance model is a model that uses the current byte to be predicted and some special bytes to predict subsequent bytes. For example, the special byte is the distance between space characters.
  • the executable program model is a model that predicts subsequent bytes using a specific set of instructions and opcodes.
  • the word model is a context model that predicts subsequent bytes based on the word information that appears.
  • the record model predicts subsequent bytes by looking up row and column information in the file as context. A row in a table is called a record; this model is mostly used for databases and tables.
  • the image model is a context model that uses the characteristics of the picture to predict subsequent bytes. For example, a context model that uses the grayscale or pixels of an image to predict subsequent bytes.
  • the partial match prediction model searches for a match based on the multiple bytes that appear consecutively before the byte to be predicted; if no match is found, the number of context bytes used for matching is reduced and the search is repeated, until a match is found or the byte is recorded as a new byte.
  • Dynamic Markov compression models are models that use a variable-length bit-level context history table to predict the next bit.
  • the byte model is a model that predicts subsequent bits based on the historical information of the bits.
  • a linear prediction model is a context model that predicts subsequent bytes based on linear regression analysis. Adaptive models are models that adjust calculated probabilities based on probabilities calculated by other models and known context.
  • the sound source model is a context model that predicts subsequent bytes through feature information in the audio file.
  • a general model is a model that makes probabilistic predictions for new data types or unidentified data types. This general model can be generated by superimposing multiple other models. Since the data compression model set is a filtered set of data compression models, its number is much smaller than the number of originally used data compression models, so it takes up less storage space.
  • the data compression model set can be stored on the processor 111, and the processor 111 can complete the operation of compressing the second data.
  • the correspondence between data types and data compression models may be presented in the form of a table.
  • illustrating the correspondence between data types and data compression models in the form of a table is only one of its storage forms in the storage device; the correspondence between data types and data compression models can also be stored in the storage device in other forms, and this embodiment of the present application does not limit this.
  • the data compression model set can be stored on the processor 111 through software.
  • the data compression models included in the data compression model set are stored in a memory built into or coupled to the processor 111 .
  • the data compression model set can also be stored on the processor 111 through hardware.
  • the data compression model set is burned on the processor 111 in the form of designing the circuit structure of the processor 111 .
  • New data types include the genetic data type and the big data type. It is also possible to eliminate the less frequently used models among the existing data compression models, and to revise one or more existing data compression models.
  • a combination of existing data compression models can be used to generate a new data compression model, and the set of data compression models stored on the processor 111 can be updated through software upgrade.
  • a high-order data compression model can be implemented by a low-order data compression model. Therefore, there is no need to re-change the hardware circuit, which greatly reduces the upgrade cost of the processor 111.
  • the data type of the second data to be compressed is a text type.
  • the processor 111 determines according to the text type that the data compression model corresponding to the text type includes a text model (TextModel), a word model (WordModel), and a nested model (NestModel).
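The text-type example above can be sketched as a lookup against the predefined correspondence table. This is a hedged sketch: the table contents beyond the text-type row and the function name `select_models` are assumptions for illustration:

```python
# Hypothetical correspondence table between data types and data compression
# models; only the "text" row is stated in the text, the rest is assumed.
TYPE_TO_MODELS = {
    "text":       ["TextModel", "WordModel", "NestModel"],
    "picture":    ["ImageModel"],
    "executable": ["ExecutableProgramModel"],
    "database":   ["RecordModel"],
    "generic":    ["GeneralModel"],
}

def select_models(data_type: str):
    """Retrieve the data compression model(s) for a data type, falling back
    to the general model for new or unidentified data types."""
    return TYPE_TO_MODELS.get(data_type, TYPE_TO_MODELS["generic"])
```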
  • the data compression method provided by this embodiment also includes the following step S333.
  • the compression device compresses the second data according to the selected first data compression model to obtain the second compressed data.
  • the compression device when the compression device outputs the target data corresponding to the source data, the compression device combines the first compressed data corresponding to the first data and the second compressed data corresponding to the second data and outputs them together.
  • the software layer in the compression device uses Lz77_out_win to preprocess the source data, and sends the second data outside the data sliding window (predetermined window) (for example, the second data is the matched result tuples_0) to the FSE module for entropy encoding; the first data within the data sliding window (for example, the first data is the unmatched string literals_0) is input into the hardware compression algorithm (such as Lz77_in_win) included in the hardware layer, and the hardware-layer Lz77 of the compression device performs a repeated-string matching search on the input data within the data range indicated by the data sliding window, obtaining the matching result tuples_1 and the non-matching string literals_1.
  • the matching result tuples_1 and the non-matching string literals_1 enter the HUF module for Huffman encoding.
  • the first compressed data is obtained, and the description data included in the first compressed data is determined based on the matching result tuples_1.
  • the compression device entropy encodes tuples_0 obtained by encoding the second data, summarizes the first compressed data to obtain target data, and outputs the target data.
  • the compression device performs software compression on the second data, and the compression algorithm used in the software compression may be one or more of a plurality of set data compression models. Therefore, for the source data to be compressed, a part of the data (the first data) can be compressed based on the hardware compression algorithm, and the compression efficiency of this part of the data is relatively high.
  • Figure 5 is a schematic flowchart 3 of the data compression method provided by this application.
  • the aforementioned preprocessing operation may include spatial transformation (position transform).
  • the aforementioned S320 may include the following step S320E.
  • S320E The compression device performs spatial transformation on the source data and obtains the first data and transformation information.
  • the transformation information is used to indicate the data mapping relationship between the source data and the first data.
  • the compression device aggregates the first compressed data and conversion information corresponding to the first data and outputs them.
  • the data type of the source data does not necessarily match the hardware compression algorithm.
  • the compression device performs data mapping operations such as position transformation or spatial transformation on the source data, so that the data type of the mapped data matches the hardware compression algorithm and the hardware compression algorithm compresses the mapped data (such as the first data) more efficiently, which improves the applicability of the hardware compression algorithm and solves the problem of reduced compression efficiency caused by the mismatch between the data type of the source data and the hardware compression algorithm.
  • the software layer in the compression device performs spatial transformation on the source data using the hash matching algorithm that supports the Snappy lossless compression method, to obtain the first data that suits the Lz77 compression algorithm.
  • the compression device performs a spatial transformation operation on the source data based on the preprocessing operation of the software layer, bringing similar data blocks in the source data closer together in space, so that the Lz77 compression algorithm of the hardware layer can find more data redundancy within the limited data sliding window.
  • the spatial transformation information (trans_info) is packaged into the final compression result (target data).
  • the Lz77 compression algorithm performs redundant string processing on the first data that has been transformed within the data sliding window.
  • the matching result tuples_1 and the non-matching string literals_1 enter the HUF module for Huffman encoding, obtaining the first compressed data.
  • the compression device summarizes the first compressed data and the aforementioned transformation information and outputs the target data.
  • the compression device outputs the first compressed data and the aforementioned transformation information, and the decompression device decompresses the first compressed data according to the transformation information, thereby recovering the aforementioned source data. This avoids the problem of reduced compression efficiency caused by the mismatch between the data type of the source data and the hardware compression algorithm, and improves the compression efficiency of the hardware compression algorithm on the source data.
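The spatial transformation and its packaged trans_info can be sketched as a reversible block permutation. A minimal sketch, assuming sorting by block content as a stand-in for a real similarity measure (the function names and the sort key are illustrative, not from the patent):

```python
def spatial_transform(blocks):
    """Reorder blocks so that similar ones sit closer together in space
    (here: sorted by content as a stand-in similarity key), and record
    the permutation as the transformation information (trans_info)."""
    order = sorted(range(len(blocks)), key=lambda i: blocks[i])
    transformed = [blocks[i] for i in order]
    return transformed, order  # trans_info is packed with the compression result

def inverse_transform(transformed, trans_info):
    """On decompression, use trans_info to restore the original order,
    recovering the source data."""
    restored = [b""] * len(transformed)
    for new_pos, old_pos in enumerate(trans_info):
        restored[old_pos] = transformed[new_pos]
    return restored
```

Because trans_info is output alongside the compressed data, the transformation is lossless: the decompression side can always undo the reordering.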
  • Figure 6 is a schematic flowchart 4 of the data compression algorithm provided by this application.
  • the aforementioned preprocessing operation may include processing data or strings that match a set data pattern (including but not limited to pattern elimination, data deduplication, etc.).
  • the aforementioned S320 can include S320F and S320G.
  • the compression device identifies the fourth data in the source data that matches the set data pattern.
  • the set data patterns include: the data is a string of all 0s, the data is a string of all 1s, or the change pattern between strings conforms to at least one of the set patterns.
  • the change pattern between strings can be an equally spaced distribution in ascending order or descending order, such as 1, 2, 3, 4, 5, or 5, 4, 3, 2, 1.
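The set data patterns listed above can be detected with a simple check. An illustrative sketch (the helper name is an assumption, and "all 1s" is read here as bytes of 0xFF; the equal-interval test matches the ascending/descending example):

```python
def matches_set_pattern(data: bytes) -> bool:
    """Detect the patterns named above: an all-0 string, an all-1 (0xFF)
    string, or bytes changing at equal intervals in ascending or
    descending order (e.g. 1,2,3,4,5 or 5,4,3,2,1)."""
    if len(data) < 2:
        return False
    if data.count(0) == len(data) or data.count(0xFF) == len(data):
        return True
    step = data[1] - data[0]
    return step != 0 and all(
        data[i + 1] - data[i] == step for i in range(len(data) - 1)
    )
```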
  • S320G The compression device determines the first data from other data except the fourth data included in the source data.
  • Since the fourth data that satisfies the set data pattern is compressed inefficiently by the hardware compression algorithm, the compression device extracts the data that satisfies the set data pattern from the source data and then determines the first data to be compressed among the remaining data. This avoids having the hardware compression algorithm compress the fourth data, reduces the impact on compression efficiency of the mismatch between the fourth data and the hardware compression algorithm, and improves the compression efficiency of the compression device on the source data.
  • the data compression method provided in this embodiment also includes the following step S334.
  • based on the data pattern of the fourth data, the compression device uses the second data compression model corresponding to the data pattern of the fourth data to compress the fourth data, obtaining the third compressed data.
  • the compression device can select a dictionary compression algorithm (such as the second data compression model of S334) to compress the fourth data, thereby quickly completing the compression of the fourth data by the compression device.
  • the compression device can also perform a pattern_removal operation on these data, thereby reducing the data size of the source data, so that the Lz77 compression algorithm at the hardware layer can find more data redundancy within the limited data sliding window. Pattern_removal includes but is not limited to removing all-0 strings, all-1 strings, or strings that match the preset pattern.
  • the Lz77 compression algorithm at the hardware layer in the compression device performs redundant string processing on the transformed data within the data range indicated by the data sliding window. The matching results tuples_1 and the non-matching string literals_1 enter the HUF module for Huffman encoding.
  • the compression device summarizes the first compressed data corresponding to the first data and the third compressed data corresponding to the fourth data and outputs the target data.
  • the compression device implements the preprocessing of the hardware-layer compression algorithm (such as the Lz77 compression algorithm) at the software layer, and the preprocessing at the software layer supports flexible algorithm configuration.
  • the compression device preprocesses the source data to obtain the first data, and then compresses the first data whose data type matches the set hardware compression algorithm, so that the set hardware compression algorithm can be adapted to more types of data.
  • the preprocessing operation of the compression device on the source data improves the applicability of the hardware compression algorithm.
  • the compression algorithm or model does not exist in the processor in the form of a software unit, but is offloaded in the compression device in the form of a hardware compression algorithm.
  • the first data is compressed by the compression device based on the hardware compression algorithm.
  • Offloading the Lz77 compression algorithm to the hardware layer can significantly improve compression performance and improve the efficiency of the compression device in compressing source data.
  • the compression device includes corresponding hardware structures and/or software modules that perform each function.
  • the units and method steps of each example described in conjunction with the embodiments disclosed in this application can be implemented in the form of hardware or a combination of hardware and computer software. Whether a certain function is executed by hardware or computer software driving the hardware depends on the specific application scenarios and design constraints of the technical solution.
  • the data compression method provided according to this embodiment is described in detail above with reference to FIGS. 1 to 6 .
  • the data compression device provided according to this embodiment will be described below with reference to FIG. 7 .
  • Figure 7 is a schematic structural diagram of a data compression device provided by this application.
  • the data compression device 700 can be used to implement the functions of the compression device in the above method embodiments, and therefore can also achieve the beneficial effects of the above method embodiments.
  • the data compression may be the compression device 110 shown in Figure 1, the chip 200 shown in Figure 2, or a module (such as a processor) applied to the compression device or chip.
  • the data compression device 700 includes: a communication unit 710 , a preprocessing unit 720 and a hardware compression unit 730 .
  • the data compression device 700 is used to implement the functions of the compression device in the above method embodiments shown in FIGS. 3 to 6 .
  • the communication unit 710 is used to obtain source data to be compressed.
  • the preprocessing unit 720 is used to preprocess the source data and obtain the first data whose data type matches the set hardware compression algorithm.
  • the hardware compression unit 730 is used to compress the first data according to a hardware compression algorithm.
  • the preprocessing unit 720 is specifically used to: determine the preprocessed data from the source data according to the data sliding window of the hardware compression algorithm; the data sliding window indicates the data range within which redundant data is searched during compression, and the preprocessed data has redundant data within the data sliding window.
  • the preprocessing unit 720 is also specifically configured to select the first data from the preprocessed data.
  • the preprocessing unit 720 is also specifically configured to: determine the second data from the source data according to the data sliding window of the hardware compression algorithm; the second data has no redundant data within the data sliding window.
  • the preprocessing unit 720 is also specifically configured to: select a first data compression model according to the data type of the second data; and compress the second data according to the first data compression model.
  • the source data includes multiple data blocks; the preprocessing unit 720 is specifically configured to: determine, from the multiple data blocks, a third data block that meets a set condition; the set condition is that the predicted data size of a data block compressed with the hardware compression algorithm is greater than or equal to the data size of that block.
  • the preprocessing unit 720 is also specifically configured to determine the first data from data blocks other than the third data block among the plurality of data blocks.
  • the communication unit 710 is also used to: output the third data block.
  • the preprocessing unit 720 is specifically configured to: perform spatial transformation on the source data, and obtain the first data and transformation information.
  • the transformation information is used to indicate the data mapping relationship between the source data and the first data.
  • the communication unit 710 is also used to obtain the first compressed data corresponding to the first data, and to output the transformation information and the first compressed data.
  • the preprocessing unit 720 is specifically configured to: identify the fourth data in the source data that conforms to the set data pattern; the set data pattern includes at least one of: the data is an all-0 string, the data is an all-1 string, or the variation between strings follows a set rule.
  • the preprocessing unit 720 is also specifically configured to determine the first data from other data except the fourth data included in the source data.
  • the preprocessing unit 720 is further specifically configured to select a second data compression model according to the data pattern of the fourth data, and to compress the fourth data according to that second data compression model.
  • the hardware compression algorithm includes: Lz77 compression algorithm.
  • the data compression device 700 in the embodiment of the present application can be implemented by a chip.
  • the data compression device 700 according to the embodiment of the present application can correspondingly execute the methods described in the embodiments of the present application.
  • the above and other operations and/or functions of the units in the data compression device 700 respectively implement the corresponding processes of the methods in Figures 3 to 6; for brevity, they are not described again here.
  • when the data compression device implements the data compression method shown in any of the preceding figures through software, the data compression device and its units may also be software modules.
  • the above-mentioned data compression method is implemented by calling the software module through the processor.
  • the processor may be a CPU, an ASIC, or a programmable logic device (PLD).
  • the PLD may be a complex programmable logic device (CPLD), an FPGA, generic array logic (GAL), or any combination thereof.
  • the hardware may be implemented by a processor or a chip.
  • the chip includes interface circuit and control circuit.
  • the interface circuit is used to receive data from other devices other than the processor and transmit it to the control circuit, or to send data from the control circuit to other devices other than the processor.
  • the control circuit is used to implement any of the possible implementations in the above embodiments through logic circuits or by executing code instructions.
  • the beneficial effects can be found in the description of any aspect in the above embodiments and will not be described again here.
  • the chip includes a processor and a power supply circuit
  • the power supply circuit is used to power the processor
  • the processor can be used to implement the data compression method in the foregoing embodiments.
  • the power supply circuit may be located in the same chip as the processor, or in another chip other than the chip where the processor is located.
  • the power supply circuit may include but is not limited to at least one of the following: a power supply subsystem, a power management chip, a power consumption management processor, or a power consumption management control circuit.
  • the processor in the embodiments of the present application may be a CPU, NPU, or GPU, or another general-purpose processor, a DSP, an ASIC, an FPGA or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof.
  • a general-purpose processor can be a microprocessor or any conventional processor.
  • the hardware can also be an interface card.
  • the interface card can include a chip and an interface.
  • the interface is used to receive signals from devices other than the interface card and send them to the chip, or to send signals from the chip to devices other than the interface card.
  • the chip can perform the operation steps of the data compression method in the aforementioned embodiment according to the signals sent and received by the interface card.
  • the interface card may refer to a smart network card or the like.
  • the method steps in this embodiment can be implemented by hardware or by a processor executing software instructions.
  • software instructions can be composed of corresponding software modules, and software modules can be stored in random access memory (RAM), flash memory, ROM, PROM, EPROM, EEPROM, registers, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium well known in the art.
  • An exemplary storage medium is coupled to the processor such that the processor can read information from the storage medium and write information to the storage medium.
  • the storage medium can also be an integral part of the processor.
  • the processor and storage media may be located in an ASIC. Additionally, the ASIC can be located in a computing device.
  • the processor and the storage medium can also exist as discrete components in computing devices and storage devices.
  • This application also provides a chip system, which includes a processor and is used to implement the functions of the compression device in the above method.
  • the chip system further includes a memory for storing program instructions and/or data.
  • the chip system may be composed of chips, or may include chips and other discrete devices.
  • when the data compression method provided by this application is integrated on computing-chip hardware (such as an accelerator card), the hardware can be installed into a storage system, such as a distributed storage system or a centralized storage system.
  • the distributed storage system may include: a distributed storage system integrating storage and computing, or a distributed storage system separating storage and computing.
  • the centralized storage system may include: a storage system in which disks and controller enclosures are integrated, or a storage system in which they are separated.
  • the computer program product includes one or more computer programs or instructions.
  • the computer may be a general purpose computer, a special purpose computer, a computer network, a network device, a user equipment, or other programmable device.
  • the computer program or instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, the computer program or instructions can be transmitted from a website, computer, server, or data center to another website, computer, server, or data center by wired or wireless means.
  • the computer-readable storage medium may be any available medium that can be accessed by a computer or a data storage device such as a server or data center that integrates one or more available media.
  • the available media may be magnetic media, such as floppy disks, hard disks, and magnetic tapes; optical media, such as digital video discs (DVDs); or semiconductor media, such as solid state drives (SSDs).
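The preprocess-then-hardware-compress flow described by the passages above (communication unit, preprocessing unit, hardware compression unit, as in Figure 7) can be sketched minimally as follows. The `DataCompressionDevice` class, the zero-stripping preprocessor, and the use of `zlib.compress` as a stand-in for the offloaded Lz77 engine are all illustrative assumptions, not the patented implementation.

```python
import zlib

class DataCompressionDevice:
    """Toy sketch of the Fig. 7 pipeline: obtain source data, preprocess it
    into 'first data' suited to the hardware algorithm, then compress."""

    def __init__(self, preprocess, hw_compress):
        self.preprocess = preprocess    # software-layer preprocessing step
        self.hw_compress = hw_compress  # stand-in for the offloaded Lz77 engine

    def compress(self, source: bytes) -> bytes:
        first_data = self.preprocess(source)
        return self.hw_compress(first_data)

# Example: strip zero padding in software, compress the residue "in hardware".
device = DataCompressionDevice(lambda d: d.replace(b"\x00", b""),
                               zlib.compress)
```

Splitting the pipeline this way mirrors the device's division of labor: the software layer only reshapes the data, while the (simulated) hardware layer does the actual compression.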

Abstract

Disclosed are a data compression method and apparatus, relating to the field of data storage, which solve the problem that a hardware compression algorithm compresses other, unmatched data types inefficiently, degrading the data compression efficiency of the compression system. When the data type of source data does not match the hardware compression algorithm, the compression device first preprocesses the source data to obtain first data, and only then compresses the first data, whose data type matches the set hardware compression algorithm, so that the set hardware compression algorithm can be adapted to more types of data; the preprocessing of the source data by the compression device improves the applicability of the hardware compression algorithm. Moreover, the compression algorithm or model does not reside in the processor as a software unit but is offloaded into the compression device as a hardware compression algorithm, and the first data is compressed by the compression device on the basis of the hardware compression algorithm, which improves the efficiency with which the compression device compresses the source data.

Description

Data compression method and apparatus
Technical field
The present application relates to the field of data storage, and in particular to a data compression method and apparatus.
Background
With the development of big data, artificial intelligence (AI) and cloud computing, massive amounts of data are continuously generated. To reduce data storage costs, a processor can use a data compression algorithm to compress raw data, i.e., restructure the raw data according to a specific method, thereby reducing data redundancy and shrinking the storage capacity the raw data occupies. Different compression algorithms achieve different compression efficiency for different types of data. A compression system is usually equipped with a hardware accelerator card in which one compression algorithm is deployed to compress the data types that algorithm is suited to; however, the accelerator card compresses other, unmatched data types inefficiently, degrading the compression efficiency of the system. How to provide a more effective data compression method has therefore become a problem that urgently needs to be solved.
Summary
The present application provides a data compression method and apparatus, which solve the problem that a hardware compression algorithm compresses other, unmatched data types inefficiently, degrading the data compression efficiency of the compression system.
In a first aspect, a data compression method is provided, executed by a compression device. Illustratively, the compression device may include one or more processors, one of which has a set hardware compression algorithm deployed in it. The data compression method includes: the compression device preprocesses source data to obtain first data whose type matches the hardware compression algorithm; the compression device then compresses the first data according to the hardware compression algorithm.
When the data type of the source data does not match the hardware compression algorithm, the compression device first preprocesses the source data to obtain the first data and only then compresses the first data, whose data type matches the set hardware compression algorithm, so that the set hardware compression algorithm can be adapted to more types of data; the preprocessing of the source data by the compression device improves the applicability of the hardware compression algorithm. Moreover, the compression algorithm or model does not reside in the processor as a software unit but is offloaded into the compression device as a hardware compression algorithm, and the first data is compressed by the compression device on the basis of that hardware compression algorithm, improving the efficiency with which the compression device compresses the source data.
In some examples, the hardware compression algorithm includes the Lempel-Ziv (Lz) 77 compression algorithm.
In an optional implementation, the compression device preprocessing the source data to obtain the first data includes: the compression device determines preprocessed data from the source data according to a data sliding window of the hardware compression algorithm; the data sliding window indicates the data range within which redundant data is searched during compression, and the preprocessed data has redundant data within the data sliding window. The compression device then selects the aforementioned first data from the preprocessed data.
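The sliding-window test described here, namely whether a byte sequence reoccurs within the preceding window so that an Lz77-style back-reference is possible, can be sketched as follows. The 3-byte minimum match length and the exclusion of matches that overlap the current position are simplifying assumptions of this sketch.

```python
def block_has_window_redundancy(block: bytes, window: int, min_len: int = 3) -> bool:
    """True if some min_len-byte sequence in `block` reoccurs entirely within
    the preceding `window` bytes (matches overlapping the current position
    are ignored in this sketch)."""
    for pos in range(len(block) - min_len + 1):
        start = max(0, pos - window)
        # search only inside the sliding window [start, pos)
        if block[start:pos].find(block[pos:pos + min_len]) != -1:
            return True
    return False
```

Note how the same input can count as "preprocessed data" under a large window yet not under a small one, which is exactly why the fixed hardware window needs a software screening pass.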
In some examples, the data range indicated by the data sliding window may also be called a query data segment, a query data width, and so on.
Because the hardware compression algorithm has a fixed data sliding window once deployed in the compression device, it compresses second data that has no redundant data within the range indicated by the sliding window inefficiently. The compression device therefore screens out of the source data the preprocessed data that does have redundant data within the sliding window, selects the first data from that preprocessed data, and compresses the first data with the hardware compression algorithm; this avoids spending the hardware compression algorithm's computing power on the second data and improves the hardware compression algorithm's efficiency in compressing the source data.
In an optional implementation, the compression device preprocessing the source data to obtain the first data further includes: first, the compression device determines second data from the source data according to the data sliding window of the hardware compression algorithm, the second data having no redundant data within the data sliding window; next, the compression device selects a first data compression model according to the data type of the second data; finally, the compression device compresses the second data with the selected first data compression model.
In this embodiment, the compression device compresses the second data in software; the compression algorithm used may be one or more of multiple set data compression models, the first data compression model being taken as an example here. For the source data to be compressed, one part (the first data) can be compressed by the hardware compression algorithm with high efficiency, while another part (the second data) can be compressed by a first data compression model selected according to the data type of the second data, avoiding the low compression efficiency caused by the mismatch between the second data and the hardware compression algorithm (no redundant data within the sliding window) and further improving the compression device's efficiency in compressing the source data.
In an optional implementation, the source data includes multiple data blocks. The compression device preprocessing the source data to obtain the first data includes: the compression device determines, from the multiple data blocks, a third data block meeting a set condition, the set condition being that the predicted data size of a data block compressed with the hardware compression algorithm is greater than or equal to the data size of the block; and the compression device determines the first data from the data blocks other than the third data block. It will be understood that a compression process in which a block's predicted data size is greater than or equal to its pre-compression data size can be called an ineffective compression process; compressing the third data block with the hardware compression algorithm would therefore occupy the algorithm's processing bandwidth (the amount of data compressed per unit time) and lower its compression efficiency. The compression device accordingly determines the first data from the data other than the third data block and compresses the first data with the hardware compression algorithm, preventing the compression of the third data block from occupying the hardware compression algorithm's processing bandwidth and improving the compression device's efficiency in compressing the source data.
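A minimal sketch of screening out "third data blocks" whose predicted compressed size is no smaller than their original size: here `zlib.compress` at a low level serves as a hypothetical size predictor, not the patent's actual estimation method.

```python
import zlib

def route_blocks(blocks):
    """Split blocks into (compressible, passthrough): a block whose predicted
    compressed size is >= its original size is passed through uncompressed."""
    compressible, passthrough = [], []
    for b in blocks:
        predicted = len(zlib.compress(b, level=1))  # cheap size prediction
        (passthrough if predicted >= len(b) else compressible).append(b)
    return compressible, passthrough
```

Only the `compressible` list is then handed to the hardware path, so ineffective compression never consumes its bandwidth.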
In an optional implementation, the data compression method provided in this embodiment further includes: the compression device outputs the third data block. In this embodiment, because the predicted data size of the third data block is greater than or equal to its pre-compression data size, the compression device does not compress the third data block but outputs it directly, preventing its compression from occupying the processing bandwidth of the hardware compression algorithm or of a data compression model, increasing the speed at which the compression device compresses the source data, and reducing compression latency.
In an optional implementation, the compression device preprocessing the source data to obtain the first data includes: the compression device performs a spatial transform (position transform) on the source data to obtain the first data and transform information, the transform information indicating the data mapping relationship between the source data and the first data. In some situations the data type of the source data does not necessarily match the hardware compression algorithm, but after the compression device performs a data mapping operation such as a position or spatial transform, the type of the mapped data matches the hardware compression algorithm, and the hardware compression algorithm compresses the mapped data (e.g., the first data) efficiently. This improves the applicability of the hardware compression algorithm and solves the reduced compression efficiency caused by the mismatch between the source data's type and the hardware compression algorithm.
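The spatial (position) transform can be illustrated by reordering blocks so that similar ones sit next to each other, with the permutation recorded as the transform information for the decompressor. Sorting by an 8-byte prefix is an illustrative similarity key assumed for this sketch, not the patent's actual grouping method.

```python
def spatial_transform(blocks):
    """Reorder blocks so similar ones become neighbours; trans_info[i] is the
    original index of reordered block i (the data mapping relationship)."""
    order = sorted(range(len(blocks)), key=lambda i: blocks[i][:8])
    return [blocks[i] for i in order], order

def inverse_transform(reordered, trans_info):
    """Undo the transform at decompression time using trans_info."""
    blocks = [b""] * len(reordered)
    for new_pos, orig_idx in enumerate(trans_info):
        blocks[orig_idx] = reordered[new_pos]
    return blocks
```

Because similar blocks are now adjacent, a fixed-size sliding window sees more redundancy; the transform information travels with the compressed result so the original order can always be restored.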
In an optional implementation, the data compression method provided in this embodiment further includes: the compression device obtains first compressed data corresponding to the first data and outputs the first compressed data together with the aforementioned transform information. Notably, the compression device outputs the compressed first compressed data along with the transform information, and the decompression device decompresses the first compressed data according to the transform information to recover the source data, avoiding the reduced compression efficiency caused by the mismatch between the source data's type and the hardware compression algorithm and improving the hardware compression algorithm's efficiency in compressing the source data.
In an optional implementation, the compression device preprocessing the source data to obtain the first data includes: the compression device identifies, in the source data, fourth data conforming to a set data pattern and determines the first data from the data in the source data other than the fourth data. The set data pattern includes at least one of: the data is an all-0 string, the data is an all-1 string, or the variation between strings follows a set rule. Illustratively, because fourth data satisfying the set data pattern is compressed inefficiently by the hardware compression algorithm, the compression device first extracts the data satisfying such a pattern and then determines the first data to be compressed from the remaining data, preventing the hardware compression algorithm from compressing the fourth data, mitigating the impact of the mismatch between the fourth data and the hardware compression algorithm on compression efficiency, and improving the compression device's efficiency in compressing the source data.
In an optional implementation, the data compression method provided in this embodiment further includes: the compression device selects a second data compression model according to the data pattern of the fourth data and compresses the fourth data with the second data compression model corresponding to the fourth data. Illustratively, if the fourth data is all 0s or all 1s, the compression device may select a dictionary compression algorithm to compress it, completing the compression of the fourth data quickly and preventing the fourth data from occupying the hardware compression algorithm's processing bandwidth, thereby improving that algorithm's efficiency in compressing the first data.
In a second aspect, a data compression apparatus is provided, applicable to a compression device, the apparatus including a preprocessing unit and a hardware compression unit. The preprocessing unit is configured to preprocess source data to obtain first data whose type is adapted to a hardware compression algorithm; the hardware compression unit is configured to compress the first data according to the hardware compression algorithm.
In an optional implementation, the preprocessing unit is specifically configured to: determine preprocessed data from the source data according to a data sliding window of the hardware compression algorithm, the data sliding window indicating the data range within which redundant data is searched during compression, the preprocessed data having redundant data within the data sliding window; and select the first data from the preprocessed data.
In an optional implementation, the preprocessing unit is further specifically configured to: determine second data from the source data according to the data sliding window of the hardware compression algorithm, the second data having no redundant data within the data sliding window; select a first data compression model according to the data type of the second data; and compress the second data according to the first data compression model.
In an optional implementation, the source data includes multiple data blocks; the preprocessing unit is specifically configured to: determine, from the multiple data blocks, a third data block meeting a set condition, the set condition being that the predicted data size of a data block compressed with the hardware compression algorithm is greater than or equal to the data size of the block; and determine the first data from the data blocks other than the third data block.
In an optional implementation, the data compression apparatus provided in this embodiment further includes a communication unit, configured, for example, to output the third data block.
In an optional implementation, the preprocessing unit is specifically configured to perform a spatial transform on the source data to obtain the first data and transform information, the transform information indicating the data mapping relationship between the source data and the first data.
In an optional implementation, the communication unit is further configured to: obtain first compressed data corresponding to the first data; and output the transform information and the first compressed data.
In an optional implementation, the preprocessing unit is specifically configured to: identify, in the source data, fourth data conforming to a set data pattern, the set data pattern including at least one of: the data is an all-0 string, the data is an all-1 string, or the variation between strings follows a set rule; and determine the first data from the data in the source data other than the fourth data.
In an optional implementation, the preprocessing unit is further specifically configured to: select a second data compression model according to the data pattern of the fourth data; and compress the fourth data according to the second data compression model.
In an optional implementation, the hardware compression algorithm includes the Lz77 compression algorithm.
In a third aspect, a chip is provided, including a processor and a power supply circuit; the power supply circuit is configured to supply power to the processor, and the processor is configured to perform the operational steps of the method of any implementation of the first aspect.
In a fourth aspect, an interface card is provided, including the chip of the third aspect and an interface; the interface is configured to receive signals from devices other than the interface card and send them to the chip, or to send signals from the chip to devices other than the interface card. For example, the interface card is a smart network interface card or the like.
In a fifth aspect, a compression device is provided, including the interface card of the fourth aspect.
In some situations, the compression device includes at least one processor and a memory storing a set of computer instructions; when the processor executes the set of computer instructions as the execution device of the first aspect or any possible implementation thereof, it performs the operational steps of the data compression method of the first aspect or any possible implementation thereof.
In a sixth aspect, a compression system is provided, including a first processor and a second processor, the second processor having a hardware compression algorithm deployed in it. The first processor is configured to obtain source data to be compressed and to preprocess the source data to obtain first data whose data type matches the hardware compression algorithm. The second processor is configured to compress the first data according to the hardware compression algorithm.
It should be understood that the compression system may be used to perform the operational steps of the method of any implementation of the first aspect.
In a seventh aspect, a computer-readable storage medium is provided, storing a computer program or instructions that, when executed by a compression device, perform the operational steps of the method of any implementation of the first aspect.
In an eighth aspect, a computer program product is provided which, when run on a computer, causes the computer to perform the operational steps of the method of any implementation of the first aspect. Illustratively, the computer may be a compression device, a compression accelerator card, a chip, or the like.
On the basis of the implementations provided in the above aspects, the present application may further combine them to provide more implementations.
Brief description of the drawings
Figure 1 is a schematic architecture diagram of a compression system provided by this application;
Figure 2 is a schematic structural diagram of a chip provided by this application;
Figure 3 is a first schematic flowchart of the data compression method provided by this application;
Figure 4 is a second schematic flowchart of the data compression method provided by this application;
Figure 5 is a third schematic flowchart of the data compression method provided by this application;
Figure 6 is a fourth schematic flowchart of the data compression method provided by this application;
Figure 7 is a schematic structural diagram of the data compression apparatus provided by this application.
Detailed description
The present application provides a data compression method, including: first, a compression device obtains source data to be compressed; next, the compression device preprocesses the source data to obtain first data whose data type matches a set hardware compression algorithm; finally, the compression device compresses the first data according to the hardware compression algorithm. When the data type of the source data does not match the hardware compression algorithm, the compression device first preprocesses the source data to obtain the first data and only then compresses the first data, whose data type matches the set hardware compression algorithm, so that the set hardware compression algorithm can be adapted to more types of data; the preprocessing of the source data by the compression device improves the applicability of the hardware compression algorithm. Moreover, the compression algorithm or model does not reside in the processor as a software unit but is offloaded into the compression device as a hardware compression algorithm, and the first data is compressed by the compression device on the basis of the hardware compression algorithm, improving the efficiency with which the compression device compresses the source data.
为了下述各实施例的描述清楚简洁,首先给出相关技术的简要介绍。
图1为本申请提供的一种压缩系统的架构示意图。压缩系统包括压缩设备110、加速装置115和客户设备120。压缩设备110是一个常见的计算机设备。用户可通过客户设备120向压缩设备110输入源数据,由压缩设备110对该源数据进行压缩,压缩设备110还将压缩源数据获取的目标数据输出至客户设备120。客户设备120是一种终端设备,包括但不限于个人电脑、服务器、手机、平板电脑或者智能车等。
压缩设备110包括输入输出(input/output,I/O)接口114、处理器111、存储器112。I/O接口114用于与位于压缩设备110外部的设备通信。例如,客户设备120通过I/O接口114向压缩设备110输入数据以及发送压缩任务,压缩设备110对输入的数据进行处理(如压缩或者解压缩)之后,再通过I/O接口114向客户设备120发送对该数据处理后的输出结果。
处理器111是压缩设备110的运算核心和控制核心,它可以包括:中央处理器(central processing unit,CPU)、特定的集成电路,其他通用处理器、数字信号处理器(digital signal processing,DSP)、专用集成电路(application specific integrated circuit,ASIC)、现场可编程门阵列(field programmable gate array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。实际应用中,压缩设备110也可以包括多个处理器。处理器111中可以包括一个或多个处理器核(core)。在处理器111中安装有操作系统和其他软件程序,从而处理器111能够实现对存储器112及各种外围器件互联(Peripheral Component Interconnect express,PCIe)设备的访问。
处理器111通过双倍速率(double data rate,DDR)总线或者其他类型的总线和存储器112相连。存储器112是压缩设备110的主存(main memory)。存储器112通常用来存放操作系统中各种正在运行的软件、从客户设备120接收的输入数据以及将来发送给客户设备120的输出结果等。为了提高处理器111的访问速度,存储器112需要具备访问速度快的优点。在传统的计算机设备中,通常采用动态随机存取存储器(dynamic random access memory,DRAM)作为存储器112。除了DRAM之外,存储器112还可以是其他随机存取存储器,例如静态随机存取存储器(static random access memory,SRAM)等。另外,存储器112也可以是只读存储器(read only memory,ROM)。而对于只读存储器,举例来说,可以是可编程只读存储器(programmable read only memory,PROM)、可抹除可编程只读存储器(erasable programmable read only memory,EPROM)等。本实施例不对存储器112的数量和类型进行限定。
可选的,为了对数据进行持久化存储,压缩系统中还设置有数据存储系统113,数据存储系统113可位于压缩设备110的外部(如图1所示),通过网络与压缩设备110交换数据。可选的,数据存储系统113也可以位于主机的内部,如数据存储系统113通过PCIe总线116与处理器111交换数据。此时,数据存储系统113表现为硬盘。
加速装置115用于执行压缩任务或者解压缩任务。处理器111将接收的AI任务以及输入数据发送给加速装置115,加速装置115根据输入数据完成所述AI任务之后将处理结果发送给处理器111。如图1所示,加速装置115可以直接插在压缩设备110的主板上的卡槽中,通过PCIe总线116与处理器111交换数据。需注意的是,图1中的PCIe总线116也可以替换成计算快速互联(compute express link,CXL)、通用串行总线(universal serial bus,USB)协议或其他协议的总线加速装置115进行数据传输。
另外,上述的加速装置115也可以不是直接插在压缩设备110的主板上的卡槽中,而是位于加速设备中的。如该加速设备是一个独立于压缩设备110的设备,如加速卡。此时,压缩设备110可以通过网线等有线网络与加速装置115进行连接,也可以通过无线热点或者蓝牙(bluetooth)等无线网络与加速装置115进行连接。如加速装置115用于处理压缩任务,例如对待压缩的源数据进行压缩等,加速装置可以由一个或多个芯片来实现。如该芯片包括CPU、图形处理器(graphics processing unit,GPU)、神经网络处理器(neural-network processing units,NPU)、张量处理单元(tensor processing unit,TPU)、FPGA、ASIC中 的任意一种。其中,GPU又称显示核心、视觉处理器、显示芯片,是一种专门在个人电脑、工作站、游戏机和一些移动设备(如平板电脑、智能手机等)上图像运算工作的微处理器。NPU在电路层模拟人类神经元和突触,并且用深度学习指令集直接处理大规模的神经元和突触,一条指令完成一组神经元的处理。ASIC适合于某一单一用途的集成电路产品。
上述的压缩任务可以是指对客户设备120获取的源数据进行压缩,解压缩任务可以是指将客户设备120发送的压缩数据进行解压。
示例性的,图1中的处理器111可通过芯片来实现,如图2所示,图2为本申请提供的一种芯片的结构示意图,示例的,该芯片200包括核心(core)201、CPU 202、系统缓冲区203和DDR 206。
其中,CPU 202用于接受AI任务(如压缩任务、解压缩任务、神经网络运算任务等),并调用核心201执行该任务。在芯片200有多个核心201的情况下,CPU 202还用于承担调度的任务。例如,CPU 202可由ARM处理器实现,体积小、低功耗,采用32位精简指令集,寻址简单灵活。当然,在一些实施方式中,CPU 202也可以由其他处理器实现。
核心201用于提供压缩和解压缩任务中所需的运算能力。在一种可选的情形中,核心201包括加载/存储单元(load/store unit,LSU)、立方体(cube)计算单元、标量(scalar)计算单元、向量(vector)计算单元以及缓冲区(buffer)。其中,LSU用于加载待处理的数据以及存储处理后的数据,还可以用于核心中内部数据在不同缓冲区之间的读写管理,以及完成一些格式转换的操作。立方体计算单元用于提供矩阵乘的核心算力。标量计算单元是一种单指令流单数据流(single instruction single data,SISD)的处理器,该类型处理器在同一时间内只处理一条数据(通常为整数或浮点数)。向量计算单元又称数组处理器,是可以实现直接操作一组数组或向量进行计算的处理器。缓冲区的数量可能是一个或多个,如该缓冲区主要指一级缓存(level 1 buffer,L1 buffer),缓冲区用来暂存核心201需要反复使用的一些数据从而减少从总线读写,另外,某些数据格式转换功能的实现,也要求源数据位于缓冲区中。在本实施例中,由于缓冲区位于核心,拉近了核心中的立方体计算单元和数据所在的存储区域之间的距离,减少立方体计算单元对DDR 206的访问,从而降低了数据的访问时延,以及核心的数据处理时延。
系统缓冲区203,主要指二级缓存(level 2 buffer,L1 buffer或L2 cache),它用于临时存储经过所述芯片的输入数据、中间结果或者最终结果。
DDR 206是一个片外存储器,它也可以替换为或者高带宽存储器(high bandwidth memory,HBM)或者其他片外存储器。DDR 206位于芯片与外部存储器之间,克服了计算资源共享存储器读写时的访问速度限制。
芯片200所包含的输入/输出(Input/Output,I/O)设备205是指进行数据传输的硬件,也可以理解为与I/O接口对接的设备。常见的I/O设备有网卡、打印机、键盘、鼠标等。所有的外存也可以作为I/O设备,如硬盘、软盘、光盘等。
在一些应用场景中需要对数据进行编码或解码处理,因此芯片200还可能包括编/解码器204和I/O设备205,编/解码器204用于对于数据进行编码或者解码。应理解,在一些可选的情形中,编/解码器204也可以被设计为编/解码单元(软件模组)后集成在核心201中。
核心201、CPU 202、系统缓冲区203、编/解码器204、I/O设备205、DDR 206通过总线相连。总线可以包括一通路,用于在上述组件(如CPU 202、系统缓冲区203)之间传送信息。总线除包括数据总线之外,还可以包括电源总线、控制总线和状态信号总线等。但是为了清楚说明起见,总线可以是PCIe总线,或扩展工业标准结构(extended industry standard  architecture,EISA)总线、统一总线(unified bus,Ubus或UB)、计算机快速链接(compute express link,CXL)、缓存一致互联协议(cache coherent interconnect for accelerators,CCIX)等。例如,核心201可以通过PCIe总线访问这些I/O设备205。核心201通过DDR总线和系统缓冲区203相连。这里,不同的系统缓冲区203可能采用不同的数据总线与核心201通信,因此,DDR总线也可以替换为其他类型的数据总线,本申请实施例不对总线类型进行限定。
举例来说,当CPU 202将AI任务所要处理的数据(如待压缩的源数据)加载到DDR 206中后,核心201中的LSU从DDR 206读取(load)该数据,并对该待压缩的源数据进行压缩后获取目标数据。当得出处理结果后,LSU再将该处理结果加载(store)到DDR 206中,由网络接口卡将推理结果发送给客户设备120,或者发送到数据存储系统113进行持久化存储。
值得注意的是,图1所示出的加速装置115也可以通过图2所示出的芯片200实现,本申请对此不予限定。
下面将结合附图对本实施例提供的数据压缩方法的具体实现方式进行详细描述。
如图3所示,图3为本申请提供的数据压缩方法的流程示意图一,该数据压缩方法可以应用于图1所示出的压缩系统,该数据压缩方法可由压缩设备300执行,该压缩设备300可以是图1所示出的压缩设备110或者加速装置115,或者图2所示出的芯片等。压缩设备300包括多个处理器,其中一个处理器中部署有硬件压缩算法(如图2中的核心201),另一个处理器用于对待压缩的源数据进行预处理(如图2中的CPU 202)。应理解,部署有硬件压缩算法的处理器也可以称为压缩设备300的硬件压缩层(简称:硬件层),用于预处理以及利用软件压缩算法对数据进行压缩的处理器也可以称为压缩设备300的软件压缩层(简称:软件层)。
示例性的,核心201中部署有硬件压缩算法,如以下任意一种或多种的组合:Lz77压缩算法,霍夫曼(Huffman)编码算法,循环冗余码校验(cyclic redundancy check,CRC)算法,有限状态熵编码(finite state entropy,FSE)算法,以及压缩状态维护。在一种可能的具体示例中,CRC算法可以是指CRC32算法,CRC32算法是指生成一个4字节(32位)的校验值,并以8位十六进制数,如FA 12 CD 45等来进行数据校验等。CRC算法的优点在于简便、速度快。也就是说,压缩设备可以包括支持实现无损压缩方式的一种或多种细粒度的压缩算法,使得一个处理器可以将多种细粒度的压缩算法进行组合,从而实现多种无损压缩方式,并对数据进行压缩。
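The CRC32 check value described in this paragraph, a 4-byte value rendered as 8 hexadecimal digits in the "FA 12 CD 45" style, can be computed in software with Python's standard `zlib.crc32` as a reference; the assumption here is that the hardware engine implements the same standard CRC-32 polynomial.

```python
import zlib

def crc32_hex(data: bytes) -> str:
    """Render the CRC32 of `data` as the 4-byte check value (8 hex digits)
    described above, e.g. in the 'FA12CD45' style."""
    return format(zlib.crc32(data) & 0xFFFFFFFF, "08X")
```

`zlib.crc32` matches the widely used CRC-32 variant whose check value for the ASCII string "123456789" is CBF43926, which makes it easy to validate against a hardware implementation.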
请参照图3,本实施例提供的数据压缩方法包括以下步骤S310至S340。
S310,压缩设备获取待压缩的源数据。
如图1所示,该源数据可以是由压缩设备110从客户设备120中获取的待存储数据。
又如,该源数据是客户设备120采集的待存储数据,在该待存储数据将被存储至数据存储系统113的过程中,由压缩设备110对该待存储数据进行压缩,从而减少数据存储系统113中存储空间的占用量。
S320,压缩设备预处理该源数据,获取数据类型与设定的硬件压缩算法相匹配的第一数据。
在识别源数据的数据类型前,可以利用机器学习或人工神经网络(Artificial Neural Networks,ANNs)对大量的样本数据进行训练,得到数据类型识别模型。或者,数据类型识别模型也可以由芯片200对样本数据进行训练。该数据类型识别模型用于指示源数据与源数据的数据类型的映射关系。进而,CPU 202可以利用数据类型识别模型来识别源数据的数据类型。
数据类型识别模型包括但不限于:朴素贝叶斯(Bayes,NB)、极限梯度提升决策树(Extreme Gradient Boosting Tree,XGBoost)、多层感知机(Multilayer Perceptron,MLP)和上述这些模型的组合等模型。
样本数据可以来源于数据块或文件片段。样本数据包括但不限于:文本数据、图片数据、视频数据、基因数据、可执行程序、虚拟硬盘数据和数据库数据等。对应的,样本数据的数据类型包括但不限于:文本类型、图片类型、视频类型、基因类型、可执行程序类型、虚拟硬盘类型和数据库类型等。
可选的,压缩设备(或其他用于训练数据类型识别模型的处理设备,如加速装置115)可以从数据库中获取样本数据。或者,样本数据可以是开源数据或测试数据。
在一些实施例中,可以根据已知的数据类型为样本数据设置标识,将样本数据和样本数据的标识输入至人工神经网络进行训练,得到数据类型识别模型。例如,样本数据为文本数据,可以依据文件后缀名来为文本数据设置标识,将文本数据和文本的标识输入至人工神经网络进行训练,得到文本数据与文本类型的映射关系。
示例的,压缩设备将源数据输入至数据类型识别模型,可以识别出文本类型、图片类型、视频类型、基因类型、可执行程序类型、虚拟硬盘类型、数据库类型和其他类型。其他类型可以是指通用类型。
在本申请实施例中,数据类型识别模型可以是由其他设备预先训练得到,再将数据类型识别模型导入到压缩设备包括的处理器,以便处理器利用数据类型识别模型识别源数据的数据类型。
S330,压缩设备根据硬件压缩算法对第一数据进行压缩,获取第一压缩数据。
示例性的,若硬件压缩算法是Lz77压缩算法,在该Lz77压缩算法被卸载到压缩设备包括的硬件(如图2所示出的核心201)后,压缩设备提供软件层的预处理操作(如前述的S320),对待压缩的源数据进行预处理获取到第一数据后,压缩设备利用硬件压缩算法对第一数据进行压缩,从而获得第一压缩数据。其中,该第一压缩数据可包括描述数据,该描述数据可以是指用于描述压缩第一数据后的元数据组(tuple),该元数据组包括一个或多个元数据(metadata),元数据可以是以数据指针、数据标识等形式存储在第一压缩数据包括的任意位置。
应理解,在源数据的数据类型与硬件压缩算法不匹配时,压缩设备在对源数据进行预处理获取第一数据后,才将数据类型和设定的硬件压缩算法相匹配的第一数据进行压缩,使得设定的硬件压缩算法可适配更多类型的数据,压缩设备对源数据的预处理操作提高了硬件压缩算法的适用性。而且,压缩算法或模型不是以软件单元的形式存在于处理器中,而以硬件压缩算法的形式卸载在压缩设备中,该第一数据由压缩设备基于硬件压缩算法来进行压缩,提高了压缩设备对源数据进行压缩的效率。
请继续参照图3,本实施例提供的数据压缩方法还包括以下步骤S340。
S340,压缩设备输出源数据对应的目标数据。
该目标数据包括S330对应的第一压缩数据,以及可能的其他数据。可选的,该其他数据可以是源数据中除第一数据外的剩余数据,也可以是将该剩余数据进行压缩后的其他压缩数据。
示例性的,假设源数据包括多个数据块(block)。在压缩设备获取数据类型与设定的硬件压缩算法相匹配的第一数据的过程中,压缩设备可从多个数据块中识别符合设定条件(如该设定条件为利用硬件压缩算法压缩数据块的预测数据量大于或等于数据块的数据量)的第 三数据块(剩余数据),并从源数据包括的多个数据块中除第三数据块外的其他数据中确定第一数据。其中,预测数据量是指压缩设备对第三数据块进行压缩后,该第三数据块所对应的压缩数据所占用的存储空间大小。
可以理解的,数据块的预测数据量大于或等于数据块未压缩的数据量(压缩前数据量)的数据压缩过程可以称为一个无效的压缩过程,因此,对于第三数据块而言,该第三数据块由硬件压缩算法进行压缩会占用硬件压缩算法的处理带宽(单位时间内压缩的数据量),会降低硬件压缩算法的压缩效率,因此,压缩设备从源数据中除该第三数据外的其他数据中确定第一数据,并基于硬件压缩算法对第一数据进行压缩,避免了第三数据块的压缩过程占用硬件压缩算法的处理带宽,提高了压缩设备对源数据的压缩效率。
应理解,在本实施例中,压缩设备以数据块(block)为考察粒度对数据压缩前后的数据量进行比对,从而不将该第三数据块发送给硬件压缩算法进行处理,避免了该第三数据块占用硬件压缩算法的处理带宽,提高了源数据的压缩效率。但在一些可选的实现方式中,压缩设备也可以采用其他的数据粒度对源数据进行预测数据量和压缩前数据量的比对,从而将预测数据量大于或等于压缩前数据量的数据进行筛选,降低硬件压缩算法被无效压缩过程所占用的带宽,提高源数据的压缩效率,如数据段(segment)、数据页面(page)、一段持久化内存(persistent memory,PMem或PM)所存储的数据、或者一段支持追加写的持久化存储空间(persistent layer LOG,Plog)等,本申请对此不予限定。
当压缩设备在源数据中确定预测数据量大于或等于压缩前数据量的第三数据块后,结合S340所示出的内容,压缩设备可在第一压缩数据的输出过程中,将该第三数据块进行输出。由于该第三数据块的预测数据量大于或等于压缩前数据量,因此,压缩设备不对该第三数据块进行压缩,直接输出该第三数据块,避免了第三数据块的压缩过程占用硬件压缩算法的处理带宽或者占用数据压缩模型的处理带宽,提高了压缩设备对源数据进行压缩的速度,减少了数据压缩的时延。
为了对待压缩的源数据进行预处理,使得预处理后的第一数据适配压缩设备中卸载的硬件压缩算法,本申请提供以下几种可能的实现方式,下面在图3所示出的数据压缩方法的基础上,结合图4至图6对压缩设备预处理待压缩的源数据的过程进行说明。
在第一种可能的实现方式中,请参见图4,图4为本申请提供的数据压缩方法的流程示意图二,在本实施例提供的数据压缩方法中,前述的S320可包括以下步骤S320A至S320D。
S320A,压缩设备确定硬件压缩算法的数据滑窗。
该数据滑窗用于指示数据压缩过程中查询冗余数据的数据范围。可选的,该数据范围也可称为压缩设备从当前数据向前查询当前数据所对应的冗余数据的数据宽度、数据段或者历史滑窗等等,本申请对此不予限定。
例如,该数据滑窗的大小为8千字节(kilo byte,KB)、32KB等。
S320B,压缩设备判断源数据包括的多个数据中在数据滑窗指示的数据范围内是否存在冗余数据。
冗余数据是指源数据中多次重复出现的数据。若数据滑窗指示的数据范围内存在冗余数据,则压缩设备执行S320C;若数据滑窗指示的数据范围内不存在冗余数据,则压缩设备执行下述的S331。
S320C,压缩设备根据硬件压缩算法的数据滑窗,识别源数据包括的预处理数据。
该预处理数据在数据滑窗指示的数据范围内存在冗余数据。例如,若源数据包括的数据1相邻两次出现的数据间隔为4KB,数据滑窗为32KB,则将该数据1作为预处理数据。
S320D,压缩设备从预处理数据中获取数据类型与设定的硬件压缩算法相匹配的第一数据。
结合图3所示出的数据压缩方法对图4所包括的S331至S333进行说明,如图4所示,在压缩设备识别到预处理数据后,针对于源数据中除预处理数据之外的其他数据,压缩设备还可执行以下步骤S331至S333。
S331,压缩设备根据数据滑窗识别源数据包括的第二数据。
该第二数据在数据滑窗指示的数据范围内不存在冗余数据。
S332,压缩设备根据第二数据的数据类型,选择第一数据压缩模型。
通常,针对不同的数据类型可以设计不同的数据压缩模型,因此,可以针对不同的数据类型预先对数据压缩模型分类。可以预先定义数据类型与数据压缩模型之间的对应关系。在一些实施例中,如图1所示,利用存储器112预先存储数据类型与数据压缩模型之间的对应关系。在处理器111确定数据类型后,可以先从存储器112中调取数据类型与数据压缩模型之间的对应关系,然后根据待压缩数据的数据类型从数据类型与数据压缩模型之间的对应关系中获取一个或多个数据压缩模型。数据类型与数据压缩模型之间的对应关系包含待压缩数据的数据类型与一个或多个数据压缩模型之间的对应关系。
关于所述数据压缩模型的来源。本实施例至少提供了一种可能的实施方式:从传统技术中所有的数据压缩模型中筛选出常用的数据压缩模型,或者,常用的数据压缩模型叠加生成的其他数据压缩模型作为所述数据压缩模型。所谓由常用的数据压缩模型叠加生成的其他数据压缩模型主要是指通过低阶数据压缩模型叠加生成高阶数据压缩模型。
这些常用的数据压缩模型包括但不限于:嵌套模型(nest model)、上下文模型(context model)、间接模型(indirect model)、文本模型(text model)、稀疏模型(sparse model)、可扩展标记语言模型(xml model)、匹配模型(match model)、距离模型(distance model)、可执行程序模型(exe model)、词模型(word model)、记录模型(record model)、图像模型(pic model)、部分匹配预测模型(prediction by partial matching model,PPMD model)、动态马尔可夫压缩模型(dynamic markov compression model,DMCM)、字节模型(byte model)、线性预测模型(linear predicition model)、自适应预测模型(adaptive predicition model)、声源模型(wav model)和通用模型(common model)。其中,嵌套模型是根据待预测字节中出现的嵌套符号(比如[])的信息预测后续字节的模型。上下文模型是根据待预测字节之前出现的连续字节上下文来预测后续字节的模型。间接模型是通过待预测字节前1-2个字节的比特的历史信息以及上下文来预测后续字节的模型。文本模型是通过词语、句子和段落等信息来预测后续字节的模型。通常,用于预测文本类数据。稀疏模型是通过查找待预测字节之前不连续的字节作为上下文预测后续字节的模型。比如,待预测字节之前的1个字节和3个字节预测后续字节的模型。可扩展标记语言模型是通过待预测字节包含的标签等特征信息预测后续字节的模型。匹配模型是通过查找待预测字节之前上下文中是否有匹配的信息,并根据匹配信息来预测后续字节的模型。距离模型是利用当前待预测字节和某些特殊字节预测后续字节的模型。比如,特殊字节为空格字符的距离。可执行程序模型是利用特定的指令集和操作码预测后续字节的模型。词模型就是根据出现的词语信息预测后续字节的上下文模型。记录模型是通过查找文件中的行列信息预测后续字节的上下文。在表格中一行称为一条记录,多用于数据库和表格中。图像模型是利用图片的特征预测后续字节的上下文模型。比如,利用图片的灰度或像素点预测后续字节的上下文模型。部分匹配预测模型是根据待预测字节之前连续出现的多个字节,在待预测字节中进行查找匹配,如果没有找到则减少多个字节中的部 分字节,再根据减少后的自己查找匹配,直到找到或者记录为新的字节,以此来预测的模型。动态马尔可夫压缩模型是使用可变长度的比特级上下文历史表预测下一个比特的模型。字节模型是根据比特的历史信息预测后续比特的模型。线性预测模型是根据线性回归分析来预测后续字节的上下文模型。自适应模型是根据其他模型计算出的概率和已知的上下文,来调整计算出的概率的模型。声源模型是通过音频文件中的特征信息预测后续字节的上下文模型。通用模型是对新的数据类型的数据或者未识别的数据类型的数据进行概率预测的模型。该通用模型可以由多个其他模型叠加生成。由于数据压缩模型集合是经过筛选的数据压缩模型,其数量远小于原来使用的数据压缩模型的数量,因此占用的存储空间较低,可以将数据压缩模型集合存储在处理器111上,由处理器111完成压缩第二数据的操作。
在一些实施例中,数据类型与数据压缩模型之间的对应关系可以以表格的形式呈现。但需要说明的是,以表格的形式示意数据类型与数据压缩模型之间的对应关系只是在存储设备中的存储形式之一,并不是对数据类型与数据压缩模型之间的对应关系在存储设备中的存储形式的限定,当然,数据类型与数据压缩模型之间的对应关系在存储设备中的存储形式还可以以其他的形式存储,本申请实施例对此不做限定。
可选的,可以通过软件方式将数据压缩模型集合存储在处理器111上。例如,将所述数据压缩模型集合所包含的数据压缩模型存储在与处理器111内置或者相耦合的存储器中。可选的,也可以通过硬件方式在处理器111上实现存储所述数据压缩模型集合。例如,以设计处理器111的电路结构的形式将所述数据压缩模型集合烧制在处理器111上。
随着科学技术的发展,如果产生了新的数据类型,进而对上述数据压缩模型集合进行升级。例如,将新的数据类型对应的数据压缩模型存储在处理器111上。新的数据类型包括基因数据的数据类型和大数据的数据类型,也可以淘汰已有的数据压缩模型中使用频率较低的模型,还可以对已有的某一种或多种数据压缩模型进行修改。
在一些实施例中,可以利用现有的数据压缩模型组合生成新的数据压缩模型,通过软件升级方式对存储在处理器111上的数据压缩模型集合进行更新。例如,高阶数据压缩模型可以通过低阶数据压缩模型实现。从而,无需重新改变硬件电路,极大地降低了处理器111的升级成本。
示例的,假设待压缩的第二数据的数据类型为文本类型。处理器111根据文本类型确定文本类型对应的数据压缩模型包括文本模型(TextModel)、词模型(WordModel)和嵌套模型(NestModel)。
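The type-to-model lookup can be sketched as a stored mapping. The text entry follows the example above (TextModel, WordModel, NestModel); the image entry and the `CommonModel` fallback for new or unrecognized types are assumptions consistent with the surrounding description, not an exhaustive table.

```python
MODEL_TABLE = {
    "text": ["TextModel", "WordModel", "NestModel"],  # per the example above
    "image": ["PicModel", "ContextModel"],            # assumed entry
}

def select_models(data_type: str):
    """Look up the pre-stored data-type -> data-compression-model mapping,
    falling back to a generic model for new or unrecognized types."""
    return MODEL_TABLE.get(data_type, ["CommonModel"])
```

Keeping the mapping as data rather than logic mirrors the described design: new data types can be supported by updating the stored table without changing hardware.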
请继续参照图4,本实施例提供的数据压缩方法还包括以下步骤S333。
S333,压缩设备根据选择出的第一数据压缩模型对第二数据进行压缩,获取第二压缩数据。
进而,在压缩设备输出源数据对应的目标数据时,压缩设备将第一数据对应的第一压缩数据、第二数据对应的第二压缩数据合并到一起进行输出。
作为一种可能的具体示例,压缩设备中软件层使用Lz77_out_win对源数据进行预处理,并将数据滑窗(既定窗口)之外的第二数据(如该第二数据为已匹配结果tuples_0)发送到FSE模块进行熵编码;将数据滑窗之内的第一数据(如该第一数据为未匹配字串literals_0)输入到硬件层包括的硬件压缩算法(如Lz77_in_win),压缩设备中硬件层的Lz77对输入数据在数据滑窗指示的数据范围内进行重复字串匹配查找,获得匹配结果tuples_1和非匹配字串literals_1,该匹配结果tuples_1和非匹配字串literals_1进入HUF模块进行霍夫曼编码,获得第一压缩数据,第一压缩数据所包括的描述数据是根据匹配结果tuples_1确定的。 最后,压缩设备将熵编码第二数据获得的tuples_0、将第一压缩数据进行汇总获得目标数据,并输出该目标数据。
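A toy Lz77 pass illustrating the matched tuples and unmatched literals referred to here (tuples_1 / literals_1): the (position, offset, length) tuple layout, the 3-byte minimum match, and the exclusion of matches that overlap the current position are simplifications assumed for this sketch, and no Huffman or FSE entropy stage is included.

```python
def lz77_compress(data: bytes, window: int = 32, min_len: int = 3):
    """Toy Lz77: scan data, emitting (position, offset, length) tuples for
    repeats found inside the sliding window, and raw literals otherwise."""
    tuples, literals, pos = [], bytearray(), 0
    while pos < len(data):
        best_len, best_off = 0, 0
        for cand in range(max(0, pos - window), pos):
            length = 0
            while (pos + length < len(data)
                   and cand + length < pos               # no overlapping match
                   and data[cand + length] == data[pos + length]):
                length += 1
            if length > best_len:
                best_len, best_off = length, pos - cand
        if best_len >= min_len:
            tuples.append((pos, best_off, best_len))
            pos += best_len
        else:
            literals.append(data[pos])
            pos += 1
    return tuples, bytes(literals)
```

In the described pipeline, the tuples and literals would each go on to a separate entropy-coding stage (HUF/FSE) before being assembled into the target data.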
在通常技术中,比如在数据库(data base,DB)场景,数据冗余大多处于8KB的页面范围内,使用8KB数据滑窗的Lz77压缩算法,可以达到预期压缩效果。但是在采用了相似分组的虚拟服务器接口/虚拟桌面基础架构(Virtual service Interface/Virtual Desktop Infrastructure,VSI/VDI)数据场景中,冗余数据大都处于64KB数据分组中,同样的Lz77压缩算法需要至少32KB的数据滑窗才能达到较好压缩效果。
相比之下,本实施例提供的数据压缩方法,压缩设备将第二数据进行软件压缩,软件压缩所用的压缩算法可以是设定的多种数据压缩模型包含的一个或多个。因此,针对于待压缩的源数据而言,一部分数据(第一数据)可基于硬件压缩算法进行压缩,该部分数据的压缩效率较高。还有一部分数据(第二数据)可基于第二数据的数据类型选择的第一数据压缩模型进行压缩,避免了第二数据与硬件压缩算法不适配(数据滑窗内不存在冗余数据)导致的压缩效率较低的问题,进一步提升了压缩设备对源数据的压缩效率。
在第二种可能的实现方式中,请参见图5,图5为本申请提供的数据压缩方法的流程示意图三,在本实施例提供的数据压缩方法中,前述的预处理操作可包括空间变换(position transform)前述的S320可包括以下步骤S320E。
S320E,压缩设备对源数据进行空间变换,获取第一数据和变换信息。
其中,该变换信息用于指示源数据和第一数据之间的数据映射关系。
进而,在S340中,压缩设备将该第一数据对应的第一压缩数据、变换信息进行汇总后输出。
应理解,在一些情形中,源数据的数据类型与硬件压缩算法不一定匹配,但是压缩设备对该源数据进行位置变换或空间变换等数据映射操作后,映射后数据的数据类型与硬件压缩算法相匹配,且由硬件压缩算法对映射后数据(如第一数据)进行压缩的效率较高,提升了硬件压缩算法的适用性,解决了源数据的数据类型与硬件压缩算法不适配导致的压缩效率降低的问题。
例如,压缩设备中软件层将支持实现Snappy无损压缩方式的哈希(Hash)匹配算法的源数据进行空间变换,获得支持Lz77压缩算法的第一数据。
作为一种可能的具体实施例,压缩设备基于软件层的预处理操作,对源数据执行空间变换操作,将源数据中相似的数据块在空间上靠近,让硬件层的Lz77压缩算法可以在有限的数据滑窗内发现更多的数据冗余。其中空间变换的信息(trans_info)打包到最终压缩结果(目标数据)中。在压缩设备的硬件层,Lz77压缩算法对完成变换的第一数据在数据滑窗内进行冗余字串处理,其中匹配结果tuples_1和非匹配字串literals_1进入HUF模块进行霍夫曼编码,获得第一压缩数据。最后,压缩设备将第一压缩数据和前述的变换信息进行汇总后输出目标数据。
值得注意的是,压缩设备将压缩后的第一压缩数据和前述的变换信息进行输出,解压缩设备(或称:解压设备)依据变换信息对第一压缩数据进行解压,从而获取前述的源数据,避免了源数据的数据类型与硬件压缩算法不适配导致的压缩效率降低的问题,提高了硬件压缩算法对源数据的压缩效率。
在第三种可能的实现方式中,如图6所示,图6为本申请提供的数据压缩算法的流程示意图四,在本实施例提供的数据压缩方法中,前述的预处理操作可包括对符合设定的数据模式的数据或字串进行处理(包括但不限于剔除、数据重删等),如前述的S320可包括S320F 和S320G。
S320F,压缩设备识别源数据中符合设定的数据模式的第四数据。
设定的数据模式包括:数据为全0字串,数据为全1字串,或者字串之间的变化规律符合设置的规律中至少一种。示例性的,字串之间的变化规律可以是按照升序、降序进行等间距分布,如1、2、3、4、5等,又如5、4、3、2、1等。
S320G,压缩设备从源数据包括的除第四数据外的其他数据中确定第一数据。
由于满足设定的数据模式的第四数据由硬件压缩算法进行压缩的效率较低,因此,压缩设备对源数据中满足一定的数据模式的数据进行提取后,再从源数据中除第四数据外的其他数据中确定待压缩的第一数据,避免了硬件压缩算法对第四数据进行压缩,降低了第四数据与硬件压缩算法不匹配导致压缩效率受到影响的问题,提高了压缩设备对源数据的压缩效率。
请继续参照图6,本实施例提供的数据压缩方法还包括以下步骤S334。
S334,压缩设备基于第四数据的数据模式,利用该第四数据的数据模式对应的第二数据压缩模型,对第四数据进行压缩,获取第三压缩数据。
示例性的,若第四数据均为0或者均为1,则压缩设备可选择字典压缩算法(如S334的第二数据压缩模型)对第四数据进行压缩,从而快速完成压缩设备对第四数据的压缩,避免第四数据由硬件压缩算法进行压缩占用硬件压缩算法的处理带宽,提高了硬件压缩算法对第一数据的压缩效率。
另外,若第四数据为全0字串或者全1字串,则压缩设备还可以将这些数据执行剔除(pattern_removal)操作,从而缩小源数据的数据规模,使得硬件层的Lz77压缩算法可以在有限的数据滑窗中发现更多的数据冗余。其中pattern_removal包括但不限于剔除全0字串、全1字串、或者符合预置pattern的字串等。进而,压缩设备中硬件层的Lz77压缩算法对完成变换的数据在数据滑窗指示的数据范围内进行冗余字串处理,匹配结果tuples_1和非匹配字串literals_1进入HUF模块进行霍夫曼编码。最后,压缩设备将第一数据对应的第一压缩数据和第四数据对应的第三压缩数据进行汇总后输出目标数据。
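The pattern_removal step described here can be sketched as stripping all-0 and all-1 chunks and recording where they were, so the residue that enters the Lz77 window is smaller; the 4-byte chunk granularity and the (offset, fill_byte) record format are assumptions of this sketch.

```python
def pattern_removal(data: bytes, chunk: int = 4):
    """Strip all-0x00 / all-0xFF chunks before the Lz77 stage, recording
    (offset, fill_byte) so a decompressor could reinsert them; the residue
    is what would enter the hardware sliding window."""
    removed, kept = [], bytearray()
    for i in range(0, len(data), chunk):
        c = data[i:i + chunk]
        if c in (b"\x00" * len(c), b"\xff" * len(c)):
            removed.append((i, c[0]))
        else:
            kept.extend(c)
    return bytes(kept), removed
```

Shrinking the input this way lets a fixed-size sliding window reach further into the remaining, genuinely redundant data.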
结合前述实施例对本申请提供的数据压缩方法的内容,本申请提供的数据压缩方法,压缩设备在在软件层实现对硬件层中硬件压缩算法(如Lz77压缩算法)的预处理操作,且软件层中的预处理支持灵活的算法配置,在源数据的数据类型与硬件压缩算法不匹配时,压缩设备在对源数据进行预处理获取第一数据后,才将数据类型和设定的硬件压缩算法相匹配的第一数据进行压缩,使得设定的硬件压缩算法可适配更多类型的数据,压缩设备对源数据的预处理操作提高了硬件压缩算法的适用性。而且,压缩算法或模型不是以软件单元的形式存在于处理器中,而以硬件压缩算法的形式卸载在压缩设备中,该第一数据由压缩设备基于硬件压缩算法来进行压缩,硬件层中的Lz77压缩算法卸载能显著提升压缩性能,提高了压缩设备对源数据进行压缩的效率。
可以理解的是,为了实现上述实施例中的功能,压缩设备包括了执行各个功能相应的硬件结构和/或软件模块。本领域技术人员应该很容易意识到,结合本申请中所公开的实施例描述的各示例的单元及方法步骤,本申请能够以硬件或硬件和计算机软件相结合的形式来实现。某个功能究竟以硬件还是计算机软件驱动硬件的方式来执行,取决于技术方案的特定应用场景和设计约束条件。
上文中结合图1至图6,详细描述了根据本实施例所提供的数据压缩方法,下面将结合图7,描述根据本实施例所提供的数据压缩装置。
图7为本申请提供的数据压缩装置的结构示意图,该数据压缩装置700可以用于实现上 述方法实施例中压缩设备的功能,因此也能实现上述方法实施例所具备的有益效果。在本实施例中,该数据压缩可以是如图1所示的压缩设备110,还可以是图2所示出的芯片200,或者应用于压缩设备或芯片的模块(如处理器)等。
如图7所示,数据压缩装置700包括:通信单元710、预处理单元720和硬件压缩单元730。数据压缩装置700用于实现上述图3至图6中所示的方法实施例中压缩设备的功能。
通信单元710,用于获取待压缩的源数据。
预处理单元720,用于预处理源数据,获取数据类型与设定的硬件压缩算法相匹配的第一数据。
硬件压缩单元730,用于根据硬件压缩算法对第一数据进行压缩。
一种可选的实现方式中,预处理单元720,具体用于:根据硬件压缩算法的数据滑窗,从源数据中确定预处理数据;数据滑窗用于指示数据压缩过程中查询冗余数据的数据范围,预处理数据在数据滑窗内存在冗余数据。预处理单元720,还具体用于:从预处理数据中选择第一数据。
一种可选的实现方式中,预处理单元720,还具体用于:根据硬件压缩算法的数据滑窗,从源数据中确定第二数据;第二数据在数据滑窗内不存在冗余数据。预处理单元720,还具体用于:根据第二数据的数据类型,选择第一数据压缩模型;以及,根据第一数据压缩模型对第二数据进行压缩。
一种可选的实现方式中,源数据包括多个数据块;预处理单元720,具体用于:从多个数据块中确定符合设定条件的第三数据块;设定条件为:利用硬件压缩算法压缩数据块的预测数据量大于或等于该数据块的数据量。预处理单元720,还具体用于:从多个数据块中除第三数据块外的其他数据块中确定第一数据。
一种可选的实现方式中,通信单元710,还用于:输出第三数据块。
一种可选的实现方式中,预处理单元720,具体用于:对源数据进行空间变换,获取第一数据和变换信息。该变换信息用于指示源数据和第一数据之间的数据映射关系。
一种可选的实现方式中,通信单元710,还用于:获取第一数据对应的第一压缩数据。以及,输出变换信息和第一压缩数据。
一种可选的实现方式中,预处理单元720,具体用于:识别源数据中符合设定的数据模式的第四数据;该设定的数据模式包括:数据为全0字串,数据为全1字串,或者字串之间的变化规律符合设置的规律中至少一种。预处理单元720,还具体用于:从源数据包括的除第四数据外的其他数据中确定第一数据。
一种可选的实现方式中,预处理单元720,还具体用于:根据第四数据的数据模式,选择第二数据压缩模型。以及,根据第四数据对应的第二数据压缩模型对第四数据进行压缩。
一种可选的实现方式中,硬件压缩算法包括:Lz77压缩算法。
It should be understood that the data compression apparatus 700 in this embodiment of this application may be implemented by a chip. The data compression apparatus 700 according to this embodiment of this application may correspondingly perform the methods described in the embodiments of this application, and the foregoing and other operations and/or functions of the units in the data compression apparatus 700 are respectively intended to implement the corresponding procedures of the methods in FIG. 3 to FIG. 6. For brevity, details are not repeated here.
When the data compression apparatus implements the data compression method shown in any of the foregoing figures by software, the data compression apparatus and its units may also be software modules, and the data compression method is implemented by a processor invoking the software modules. The processor may be a CPU or an ASIC, or may be implemented by a programmable logic device (PLD), where the PLD may be a complex programmable logic device (CPLD), an FPGA, generic array logic (GAL), or any combination thereof.
For a more detailed description of the data compression apparatus, refer to the related descriptions in the embodiments shown in the foregoing figures; details are not repeated here. It can be understood that the data compression apparatus shown in the foregoing figures is merely an example provided in this embodiment; depending on the data access process or service, the data compression apparatus may include more or fewer units, which is not limited in this application.
When the data compression apparatus is implemented by hardware, the hardware may be implemented by a processor or a chip. The chip includes an interface circuit and a control circuit. The interface circuit is configured to receive data from devices other than the processor and transmit the data to the control circuit, or to send data from the control circuit to devices other than the processor.
The control circuit is configured to implement, through a logic circuit or by executing code instructions, the method of any of the possible implementations of the foregoing embodiments. For the beneficial effects, refer to the descriptions of any of the foregoing aspects; details are not repeated here.
Alternatively, the chip includes a processor and a power supply circuit, where the power supply circuit is configured to supply power to the processor, and the processor may be configured to implement the data compression method in the foregoing embodiments. For example, the power supply circuit may be located in the same chip as the processor, or in a chip other than the chip in which the processor is located. The power supply circuit may include, but is not limited to, at least one of the following: a power supply subsystem, a power management chip, a power management processor, or a power management control circuit.
It can be understood that the processor in the embodiments of this application may be a CPU, an NPU, or a GPU, or may be another general-purpose processor, a DSP, an ASIC, an FPGA or another programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The general-purpose processor may be a microprocessor or any conventional processor.
When the data compression apparatus is implemented by hardware, the hardware may also be an interface card. The interface card may include a chip and an interface, where the interface is configured to receive signals from apparatuses other than the interface card and send them to the chip, or to send signals from the chip to apparatuses other than the interface card. The chip may perform the operation steps of the data compression method in the foregoing embodiments according to the signals received and sent by the interface card. For example, the interface card may be a smart network interface card or the like.
The method steps in this embodiment may be implemented by hardware, or by a processor executing software instructions. The software instructions may consist of corresponding software modules, and the software modules may be stored in a random access memory (RAM), a flash memory, a ROM, a PROM, an EPROM, an EEPROM, a register, a hard disk, a removable hard disk, a CD-ROM, or any other form of storage medium well known in the art. An exemplary storage medium is coupled to the processor, so that the processor can read information from, and write information to, the storage medium. Certainly, the storage medium may also be a component of the processor. The processor and the storage medium may be located in an ASIC. In addition, the ASIC may be located in a computing device. Certainly, the processor and the storage medium may also exist as discrete components in a computing device or a storage device.
This application further provides a chip system, where the chip system includes a processor configured to implement the functions of the compression device in the foregoing methods. In a possible design, the chip system further includes a memory configured to store program instructions and/or data. The chip system may consist of chips, or may include chips and other discrete components.
For example, when the data compression method provided in this application is integrated on compute-chip hardware (such as an accelerator card), the hardware may be installed in a storage system, such as a distributed storage system or a centralized storage system, to give the storage system the ability to adaptively compress source data. For example, the distributed storage system may include a distributed storage system with integrated storage and compute, or a distributed storage system with separated storage and compute; the centralized storage system may include a storage system with integrated disk and controller enclosures, or a storage system with separated disk and controller enclosures.
The foregoing embodiments may be implemented entirely or partially by software, hardware, firmware, or any combination thereof. When implemented by software, they may be implemented entirely or partially in the form of a computer program product. The computer program product includes one or more computer programs or instructions. When the computer programs or instructions are loaded and executed on a computer, the procedures or functions described in the embodiments of this application are executed entirely or partially. The computer may be a general-purpose computer, a special-purpose computer, a computer network, a network device, user equipment, or another programmable apparatus. The computer programs or instructions may be stored in a computer-readable storage medium, or transmitted from one computer-readable storage medium to another; for example, they may be transmitted from one website, computer, server, or data center to another website, computer, server, or data center in a wired or wireless manner. The computer-readable storage medium may be any usable medium accessible to a computer, or a data storage device such as a server or a data center integrating one or more usable media. The usable medium may be a magnetic medium, such as a floppy disk, a hard disk, or a magnetic tape; an optical medium, such as a digital video disc (DVD); or a semiconductor medium, such as a solid state drive (SSD).
The foregoing are merely specific implementations of this application, but the protection scope of this application is not limited thereto. Any person skilled in the art can readily conceive of various equivalent modifications or substitutions within the technical scope disclosed in this application, and such modifications or substitutions shall all fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (26)

  1. A data compression method, characterized in that the method comprises:
    preprocessing source data to obtain first data, wherein a type of the first data is adapted to a hardware compression algorithm;
    compressing the first data according to the hardware compression algorithm.
  2. The method according to claim 1, wherein the preprocessing source data to obtain first data comprises:
    determining preprocessing data from the source data according to a data sliding window of the hardware compression algorithm, wherein the data sliding window indicates a data range within which redundant data is searched during data compression, and the preprocessing data has redundant data within the data sliding window;
    selecting the first data from the preprocessing data.
  3. The method according to claim 2, wherein the preprocessing source data to obtain first data further comprises:
    determining second data from the source data according to the data sliding window of the hardware compression algorithm, wherein the second data has no redundant data within the data sliding window;
    selecting a first data compression model according to a data type of the second data;
    compressing the second data according to the first data compression model.
  4. The method according to claim 1, wherein the source data comprises a plurality of data blocks;
    the preprocessing source data to obtain first data comprises:
    determining, from the plurality of data blocks, a third data block that meets a set condition, wherein the set condition is: a predicted data volume of compressing a data block with the hardware compression algorithm is greater than or equal to a data volume of the data block;
    determining the first data from data blocks other than the third data block among the plurality of data blocks.
  5. The method according to claim 4, further comprising:
    outputting the third data block.
  6. The method according to claim 1, wherein the preprocessing source data to obtain first data comprises:
    performing a spatial transformation on the source data to obtain the first data and transformation information, wherein the transformation information indicates a data mapping relationship between the source data and the first data.
  7. The method according to claim 6, further comprising:
    obtaining first compressed data corresponding to the first data;
    outputting the transformation information and the first compressed data.
  8. The method according to claim 1, wherein the preprocessing source data to obtain first data comprises:
    identifying fourth data in the source data that matches a set data pattern, wherein the set data pattern comprises at least one of: the data is an all-0 string, the data is an all-1 string, or a variation between strings conforms to a set rule;
    determining the first data from data other than the fourth data comprised in the source data.
  9. The method according to claim 8, further comprising:
    selecting a second data compression model according to the data pattern of the fourth data;
    compressing the fourth data according to the second data compression model.
  10. The method according to any one of claims 1 to 9, wherein the hardware compression algorithm comprises the Lempel-Ziv Lz77 compression algorithm.
  11. A data compression apparatus, characterized in that the apparatus is applied to a compression device and comprises:
    a preprocessing unit, configured to preprocess source data to obtain first data, wherein a type of the first data is adapted to a hardware compression algorithm;
    a hardware compression unit, configured to compress the first data according to the hardware compression algorithm.
  12. The apparatus according to claim 11, wherein the preprocessing unit is specifically configured to: determine preprocessing data from the source data according to a data sliding window of the hardware compression algorithm, wherein the data sliding window indicates a data range within which redundant data is searched during data compression, and the preprocessing data has redundant data within the data sliding window;
    the preprocessing unit is further specifically configured to: select the first data from the preprocessing data.
  13. The apparatus according to claim 12, wherein the preprocessing unit is further specifically configured to: determine second data from the source data according to the data sliding window of the hardware compression algorithm, wherein the second data has no redundant data within the data sliding window;
    the preprocessing unit is further specifically configured to: select a first data compression model according to a data type of the second data; and compress the second data according to the first data compression model.
  14. The apparatus according to claim 11, wherein the source data comprises a plurality of data blocks;
    the preprocessing unit is specifically configured to: determine, from the plurality of data blocks, a third data block that meets a set condition, wherein the set condition is: a predicted data volume of compressing a data block with the hardware compression algorithm is greater than or equal to a data volume of the data block;
    the preprocessing unit is further specifically configured to: determine the first data from data blocks other than the third data block among the plurality of data blocks.
  15. The apparatus according to claim 14, wherein the apparatus further comprises: a communication unit, configured to output the third data block.
  16. The apparatus according to claim 11, wherein the preprocessing unit is specifically configured to: perform a spatial transformation on the source data to obtain the first data and transformation information, wherein the transformation information indicates a data mapping relationship between the source data and the first data.
  17. The apparatus according to claim 16, wherein the apparatus further comprises: a communication unit, configured to: obtain first compressed data corresponding to the first data; and output the transformation information and the first compressed data.
  18. The apparatus according to claim 11, wherein the preprocessing unit is specifically configured to: identify fourth data in the source data that matches a set data pattern, wherein the set data pattern comprises at least one of: the data is an all-0 string, the data is an all-1 string, or a variation between strings conforms to a set rule;
    the preprocessing unit is further specifically configured to: determine the first data from data other than the fourth data comprised in the source data.
  19. The apparatus according to claim 18, wherein the preprocessing unit is further specifically configured to: select a second data compression model according to the data pattern of the fourth data; and compress the fourth data according to the second data compression model.
  20. The apparatus according to any one of claims 11 to 19, wherein the hardware compression algorithm comprises the Lempel-Ziv Lz77 compression algorithm.
  21. A chip, characterized by comprising: a processor and a power supply circuit;
    wherein the power supply circuit is configured to supply power to the processor;
    the processor is configured to perform the method according to any one of claims 1 to 10.
  22. An interface card, characterized by comprising: the chip according to claim 21 and an interface;
    wherein the interface is configured to receive signals from apparatuses other than the interface card and send the signals to the chip, or to send signals from the chip to apparatuses other than the interface card.
  23. A compression device, characterized by comprising: the interface card according to claim 22.
  24. A compression system, characterized by comprising: a first processor and a second processor;
    wherein a hardware compression algorithm is deployed in the second processor;
    the first processor is configured to preprocess source data to obtain first data, wherein a type of the first data is adapted to the hardware compression algorithm;
    the second processor is configured to compress the first data according to the hardware compression algorithm.
  25. A computer-readable storage medium, wherein the storage medium stores a computer program or instructions which, when executed by a compression device, implement the method according to any one of claims 1 to 10.
  26. A computer program product which, when run on a computer, causes the computer to perform the method according to any one of claims 1 to 10.
PCT/CN2023/087178 2022-06-02 2023-04-08 Data compression method and apparatus WO2023231571A1 (zh)

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
CN202210624708 2022-06-02
CN202210624708.9 2022-06-02
CN202211033099.6 2022-08-26
CN202211033099.6A CN117220685A (zh) 2022-06-02 2022-08-26 Data compression method and apparatus

Publications (1)

Publication Number Publication Date
WO2023231571A1 true WO2023231571A1 (zh) 2023-12-07

Family

ID=89026857

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2023/087178 WO2023231571A1 (zh) 2022-06-02 2023-04-08 数据压缩方法及装置

Country Status (1)

Country Link
WO (1) WO2023231571A1 (zh)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5467087A (en) * 1992-12-18 1995-11-14 Apple Computer, Inc. High speed lossless data compression system
US6466999B1 (en) * 1999-03-31 2002-10-15 Microsoft Corporation Preprocessing a reference data stream for patch generation and compression
CN103746701A (zh) * 2014-01-16 2014-04-23 重庆大学 Fast coding option selection method for Rice lossless data compression
CN106936439A (zh) * 2016-09-20 2017-07-07 南开大学 General compression preprocessing method based on block sorting and application thereof
CN109787638A (zh) * 2019-01-10 2019-05-21 杭州幻方科技有限公司 Data compression storage processing apparatus and method
CN110943744A (zh) * 2019-12-03 2020-03-31 杭州嘉楠耘智信息科技有限公司 Data compression, decompression, and processing method and apparatus based on data compression and decompression
CN112380196A (zh) * 2020-10-28 2021-02-19 安擎(天津)计算机有限公司 Server for compressed data transmission
CN113746665A (zh) * 2021-07-29 2021-12-03 深圳市明源云科技有限公司 Log data processing method and apparatus, computer program product, and storage medium


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 23814767

Country of ref document: EP

Kind code of ref document: A1