CN115118716A - Object data online compression method and device, electronic equipment and storage medium - Google Patents


Publication number
CN115118716A
Authority
CN
China
Prior art keywords
object data
data
storage
compressible
compressed
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210743760.6A
Other languages
Chinese (zh)
Inventor
陈仲涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Topsec Technology Co Ltd
Beijing Topsec Network Security Technology Co Ltd
Beijing Topsec Software Co Ltd
Original Assignee
Beijing Topsec Technology Co Ltd
Beijing Topsec Network Security Technology Co Ltd
Beijing Topsec Software Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Topsec Technology Co Ltd, Beijing Topsec Network Security Technology Co Ltd, Beijing Topsec Software Co Ltd
Priority to CN202210743760.6A
Publication of CN115118716A
Legal status: Pending

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/06 Protocols specially adapted for file transfer, e.g. file transfer protocol [FTP]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0608 Saving storage space on storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0625 Power saving in storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0643 Management of files
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/04 Protocols for data compression, e.g. ROHC
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L69/00 Network arrangements, protocols or services independent of the application payload and not provided for in the other groups of this subclass
    • H04L69/22 Parsing or analysis of headers

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides an online compression method and apparatus for object data, an electronic device, and a storage medium. The method is executed by a storage gateway that is connected to a plurality of storage devices, and comprises: receiving, by the storage gateway, an object data storage request; determining, from the object data carried by the request, whether the object type of the object data can be obtained; if the object type is obtained, determining from the object type whether the object data is compressible; and determining, according to whether the object data is compressible, whether to compress the object data before transmitting it to the storage devices. The scheme reduces both the network resources consumed by transmission and the CPU resources consumed by compression.

Description

Object data online compression method and device, electronic equipment and storage medium
Technical Field
The present application relates to the field of data storage technologies, and in particular, to an online compression method and apparatus for object data, an electronic device, and a computer-readable storage medium.
Background
With the arrival of the information age, the global volume of data is growing explosively, and enterprises are investing more and more in data storage. To store more data on limited storage devices and reduce customers' storage costs, distributed storage vendors use storage compression. With compression, the storage space occupied by most objects or files can be reduced by more than half, which effectively lowers storage cost. However, data compression consumes a large amount of CPU (Central Processing Unit) resources, and compressing all data can exhaust them. Moreover, some data shrinks very little when compressed; compressing such data only adds CPU overhead without reducing storage usage. How to decide whether to enable compression based on the data that users upload, and how to reduce the number of ineffective compressions across the whole cluster, has become a key research topic for data compression in distributed storage.
Disclosure of Invention
The embodiment of the application provides an online compression method of object data, which is used for avoiding CPU resource waste.
The embodiment of the application provides an online compression method of object data, which is executed by a storage gateway connected to a plurality of storage devices, and the method comprises the following steps:
the storage gateway receives an object data storage request;
determining, according to the object data carried by the object data storage request, whether the object type of the object data can be obtained;
if the object type of the object data is obtained, determining whether the object data is compressible according to the object type of the object data;
and determining, according to whether the object data is compressible, whether to compress the object data before transmitting it to the storage devices.
In the method provided by the above embodiment, data compression is performed at the storage gateway rather than at the storage devices. The storage gateway determines the object type of the object data and, from the object type, whether the object data is compressible. Compressible object data is compressed at the storage gateway and then transmitted to each storage device, so that compressed data travels between the storage gateway and the storage devices and the resource consumption of network transmission is reduced.
In an embodiment, the method provided in the embodiment of the present application further includes: and if the object type of the object data cannot be acquired, detecting the compressibility of the object data.
When it cannot be determined from the object type whether the object data is compressible, compressibility detection is performed on the object data instead of simply choosing to compress or not to compress, which avoids wasting network and CPU resources as far as possible.
In an embodiment, the detecting compressibility of the object data includes:
if the data volume of the object data is less than or equal to a preset value, compressing the object data directly and determining from the result whether the object data is compressible;
and if the data volume of the object data is larger than the preset value, selecting a plurality of data segments from the object data, compressing them, comparing the data volume of the segments before and after compression, and determining whether the object data is compressible based on the degree of reduction.
This embodiment distinguishes object data by data volume. If large object data were compressed in full and its compressibility judged from the resulting compression ratio, every large object would have to be compressed once just to make the decision, which wastes CPU. Compressing only a few data segments of a large object avoids this overhead.
In an embodiment, if the data volume of the object data is larger than the preset value, the selecting of a plurality of data segments from the object data, compressing them, and determining whether the object data is compressible based on the degree of reduction includes:
if the data volume of the object data is larger than the preset value, skipping a first preset amount of data at the beginning of the object data and randomly selecting a plurality of data segments from the remaining data, wherein the length of each data segment is a second preset amount;
compressing the selected data segments;
and if the ratio of the data volume of the data segments after compression to their data volume before compression is smaller than a preset ratio, determining that the object data is compressible.
In this embodiment, to exclude the influence of the data at the beginning of the object, several data segments are randomly selected from the remaining data and compressed, and whether the object data is compressible is evaluated from the compression ratio of the selected segments, which improves the accuracy of compressibility detection.
In an embodiment, the determining, according to object data carried by the object data storage request, whether an object type of the object data can be acquired includes:
and judging the object type of the object data according to the extension or the file header of the object data.
The embodiment judges the object type of the object data according to the extension name and the file header, and is convenient, quick and high in efficiency.
In an embodiment, the determining whether the object data is compressible according to an object type of the object data includes:
if the object type is a video file format, a picture format or a compressed archive format, marking the object data as incompressible;
and if the object type is a text type, marking the object data as compressible.
Video files, pictures, compressed archives and the like are already compressed. Compressing data stored in these formats again does not reduce the data volume; it only increases CPU consumption during compression and decompression, so such data need not be compressed again when it is stored in object storage. By contrast, the data volume of text-type object data can be greatly reduced by compression, so text-type object data is marked as compressible. In this way both unnecessary CPU consumption and wasted storage space are avoided.
In an embodiment, before the determining whether to compress the object data before transmitting the object data to the storage device according to whether the object data is compressible, the method further includes:
and recording whether the object data is compressed or not in the metadata of the object data according to whether the object data is compressible or not.
Therefore, when the object data is read, the storage gateway can quickly determine from the metadata record whether the data needs to be decompressed, which improves data reading efficiency.
In an embodiment, the determining whether to compress the object data before transmitting the object data to the storage device according to whether the object data is compressible includes:
if the object data is compressible, compressing the object data and transmitting the compressed data to the storage equipment;
and if the object data is not compressible, directly transmitting the object data to the storage equipment.
In this embodiment, the storage gateway decides, according to whether the object data is compressible, whether to compress the object data before transmitting it to the storage devices for storage. This reduces the network resources consumed by data transmission, avoids compressing the same data several times when multiple copies are required, and saves CPU resources.
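The write path described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: `StorageDevice`, `gateway_put`, and the `"compressed"` metadata key are hypothetical names that do not come from the application, and zlib stands in for whatever compression algorithm a real gateway would use.

```python
import zlib
from dataclasses import dataclass, field


@dataclass
class StorageDevice:
    """Hypothetical stand-in for one storage device; keeps objects in memory."""
    objects: dict = field(default_factory=dict)

    def put(self, key: str, payload: bytes, metadata: dict) -> None:
        # The device stores exactly what it receives; it never compresses.
        self.objects[key] = (payload, dict(metadata))


def gateway_put(key: str, data: bytes, compressible: bool, devices: list) -> dict:
    """Compress at most once at the gateway, then send the same payload to every copy."""
    # Record the decision in the object's metadata before transmission.
    metadata = {"compressed": compressible}
    payload = zlib.compress(data) if compressible else data
    for device in devices:
        device.put(key, payload, metadata)
    return metadata
```

With three devices, a compressible object is compressed one time and the same compressed payload is sent three times; an incompressible object is sent unchanged.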
In an embodiment, the method further comprises:
if an object data reading request sent by a client is received, the storage gateway directly reads the object data from the storage equipment;
determining whether the object data is a compressed storage object according to the record of the metadata of the object data;
and if the object data is a compressed storage object, decompressing the object data and then returning the object data to the client.
In this embodiment, when the storage gateway receives an object data reading request, it obtains the object data directly from the storage device, checks the metadata record to see whether the object was stored compressed, and decompresses it only in that case before returning it to the client, which improves the efficiency of data decompression.
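A minimal sketch of this read path follows. It assumes the metadata is a dictionary with a hypothetical boolean key `"compressed"` and that zlib was the compressor on the write side; neither detail is specified in the application.

```python
import zlib


def gateway_get(payload: bytes, metadata: dict) -> bytes:
    """Return the original bytes, decompressing only when the metadata says so."""
    # A missing flag is treated as "stored uncompressed" (an assumption).
    if metadata.get("compressed", False):
        return zlib.decompress(payload)
    return payload
```

Because the flag is read from metadata, the gateway does not need to inspect or probe the payload to decide whether decompression is required.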
The embodiment of the present application further provides an apparatus for online compressing object data, where the apparatus is applied to a storage gateway, the storage gateway is connected to a plurality of storage devices, and the apparatus includes:
the storage request receiving module is used for receiving an object data storage request;
the object type acquisition module is used for judging whether the object type of the object data can be acquired or not according to the object data carried by the object data storage request;
the compressible judging module is used for determining whether the object data is compressible according to the object type of the object data if the object type of the object data is obtained;
and the compression storage module is used for determining, according to whether the object data is compressible, whether to compress the object data before transmitting it to the storage devices.
An embodiment of the present application further provides an electronic device, where the electronic device includes:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the above-mentioned online compression method of the object data.
The embodiment of the application also provides a computer readable storage medium, wherein the storage medium stores a computer program, and the computer program can be executed by a processor to complete the online compression method of the object data.
With the apparatus, the electronic device and the storage medium provided by the embodiments of the application, data compression is performed at the storage gateway rather than at the storage devices. The storage gateway determines the object type of the object data and, from the object type, whether the object data is compressible; compressible object data is compressed at the storage gateway and then transmitted to each storage device. Compressed data therefore travels between the storage gateway and the storage devices, which reduces the resource consumption of network transmission. Even if the object data has several copies, it is compressed only once at the storage gateway and then transmitted to the several storage devices, so no compression is needed on the individual storage devices, which reduces CPU resource consumption.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present application, the drawings required to be used in the embodiments of the present application will be briefly described below.
Fig. 1 is a schematic structural diagram of an electronic device according to an embodiment of the present application;
Fig. 2 is a schematic diagram of an architecture of an object store provided by an embodiment of the present application;
Fig. 3 is a flowchart illustrating an online compression method for object data according to an embodiment of the present application;
Fig. 4 is a schematic diagram of object data in jpg format according to an embodiment of the present application;
Fig. 5 is a detailed flowchart of an online compression method for object data according to an embodiment of the present application;
Fig. 6 is a block diagram of an online compression apparatus for object data according to an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be described below with reference to the drawings in the embodiments of the present application.
Like reference numbers and letters refer to like items in the following figures, and thus, once an item is defined in one figure, it need not be further defined and explained in subsequent figures. Meanwhile, in the description of the present application, the terms "first", "second", and the like are used only for distinguishing the description, and are not to be construed as indicating or implying relative importance.
Fig. 1 is a schematic structural diagram of an electronic device provided in an embodiment of the present application. The electronic device 100 may be configured to perform the online compression method for object data provided in the embodiment of the present application. As shown in fig. 1, the electronic device 100 includes: one or more processors 102, and one or more memories 104 storing processor-executable instructions. Wherein the processor 102 is configured to execute an online compression method of object data provided by the following embodiments of the present application.
The processor 102 may be a gateway, or may be an intelligent terminal, or may be a device including a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), or other form of processing unit having data processing capability and/or instruction execution capability, and may process data of other components in the electronic device 100, and may control other components in the electronic device 100 to perform desired functions.
The memory 104 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, Random Access Memory (RAM) and/or cache memory. The non-volatile memory may include, for example, Read Only Memory (ROM), a hard disk, flash memory, and the like. One or more computer program instructions may be stored on the computer-readable storage medium and executed by the processor 102 to implement the online compression method of object data described below. Various applications and various data, such as data used and/or generated by the applications, may also be stored in the computer-readable storage medium.
In one embodiment, the electronic device 100 shown in FIG. 1 may further include an input device 106, an output device 108, and a data acquisition device 110, which may be interconnected via a bus system 112 and/or other form of connection mechanism (not shown). It should be noted that the components and structure of the electronic device 100 shown in fig. 1 are exemplary only, and not limiting, and the electronic device 100 may have other components and structures as desired.
The input device 106 may be a device used by a user to input instructions and may include one or more of a keyboard, a mouse, a microphone, a touch screen, and the like. The output device 108 may output various information (e.g., images or sounds) to the outside (e.g., a user), and may include one or more of a display, a speaker, and the like. The data acquisition device 110 may acquire an image of a subject and store the acquired image in the memory 104 for use by other components. Illustratively, the data acquisition device 110 may be a camera.
In an embodiment, the devices in the example electronic device 100 for implementing the online compression method of the object data of the embodiment of the present application may be integrally disposed, or may be disposed in a decentralized manner, such as integrally disposing the processor 102, the memory 104, the input device 106 and the output device 108, and disposing the data acquisition device 110 separately.
In an embodiment, the example electronic device 100 for implementing the online compression method of object data of the embodiment of the present application may be implemented as an intelligent terminal such as a server, a desktop computer, a notebook computer, and the like.
The online compression method for object data provided by the embodiment of the present application may be executed by a storage gateway. The storage gateway may be a service program deployed on an electronic device such as a server and, as shown in Fig. 2, is connected to a plurality of storage devices. The storage gateway may expose a highly available S3 (Simple Storage Service) access interface, and may also provide a Swift interface and an HTTP (Hypertext Transfer Protocol) interface.
The storage device is where data is finally stored. It is not on the same server as the storage gateway, and one storage device may include one or more disks. An object is generally stored as multiple copies, each on a different storage device. The usual approach is to compress data in the storage engine on the storage device, but this means that uncompressed data is transmitted between the storage gateway and the storage devices, which consumes too much network resource, and compressing each copy separately consumes too much CPU across the whole cluster.
Based on this, the embodiment of the present application provides an online compression method for object data in which the object data is compressed at the storage gateway. The object data transmitted to a storage device is already compressed, so the storage device only needs to write it to disk. The object data is compressed only once no matter how many copies it has, and because compressed data is what is transmitted to the several storage devices, bandwidth consumption can be reduced substantially. The online compression method for object data according to an embodiment of the present application is described in detail below.
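The saving can be illustrated with rough arithmetic. The sketch below is not from the application: `transfer_cost` is a hypothetical helper, and the numbers (a 100 MB object, a compression ratio of 0.5, 3 copies) are made up for illustration only.

```python
def transfer_cost(object_size: int, ratio: float, copies: int, at_gateway: bool):
    """Return (units sent over the network, number of compression runs).

    ratio is compressed size divided by original size (assumed known here).
    """
    if at_gateway:
        # Compress once, then send the compressed payload to every copy.
        return int(object_size * ratio) * copies, 1
    # Send raw data to every device; each device compresses its own copy.
    return object_size * copies, copies


# Illustrative numbers only: 100 MB object, ratio 0.5, 3 copies.
gateway_side = transfer_cost(100, 0.5, 3, at_gateway=True)    # (150, 1)
device_side = transfer_cost(100, 0.5, 3, at_gateway=False)    # (300, 3)
```

Under these assumed numbers, gateway-side compression sends 150 MB and compresses once, while device-side compression sends 300 MB and compresses three times.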
Fig. 3 is a schematic flowchart of an online compression method for object data according to an embodiment of the present disclosure. As shown in fig. 3, the method includes the following steps S310 to S340.
Step S310: the storage gateway receives an object data storage request.
The storage gateway may receive an object data storage request sent by a client. The object data storage request may carry the object data that needs to be stored. Object storage, also called object-based storage, is a general term for methods of addressing and handling discrete units called objects. Like a file, an object contains data; unlike a file, an object is not organized in a hierarchy. Every object sits at the same level of a flat address space called a storage pool, and no object is nested under another object.
Step S320: determining whether the object type of the object data can be obtained according to the object data carried by the object data storage request.
In practice, a good deal of object data is not compressible, that is, its size after compression is close to its original size, so compressing it does not reduce the data volume and only increases CPU consumption during compression and decompression. Video files, pictures and compressed archives are examples: data stored in these formats is already compressed and need not be compressed again when it is stored in object storage. If such objects cannot be identified quickly as incompressible, the only way to decide whether to enable compressed storage is to compress them and look at the compression ratio, so every incompressible object has to be compressed once and CPU is wasted. To detect incompressible objects more efficiently and reduce CPU overhead, this embodiment obtains the object type before compressing the object data.
The object type may be, for example, a video file format, a picture format, a compressed archive format, or a text type. In one embodiment, the object type of the object data may be determined according to an extension or a file header of the object data.
For example, the extension may include jpg (Joint Photographic Experts Group), png (Portable Network Graphics), xml (eXtensible Markup Language), txt (text document), html (HyperText Markup Language), and the like. The object type of object data with the extension jpg or png is a picture format. The object type of object data with the extension xml, txt or html is a text type.
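An extension lookup of this kind might look as follows. This is a sketch under assumptions: the table contains only the five extensions named above, and the function name and the type labels `"picture"` and `"text"` are hypothetical.

```python
import os

# Hypothetical lookup table; only the extensions named in the text are listed.
EXTENSION_TYPES = {
    "jpg": "picture",
    "png": "picture",
    "xml": "text",
    "txt": "text",
    "html": "text",
}


def type_from_extension(object_name: str):
    """Return the object type for a known extension, or None if it is unknown."""
    extension = os.path.splitext(object_name)[1].lstrip(".").lower()
    return EXTENSION_TYPES.get(extension)
```

A return value of `None` corresponds to the case where the object type cannot be obtained from the extension.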
Because most file formats carry identification characters in the file header, the file header of the object data can be read and the object type of the object data determined from the identification characters it contains.
The header may be the first 512 bytes of the object data, and the identification characters characterize the object type of the object data. For example, as shown in Fig. 4, object data in jpg format begins with the SOI (Start of Image) marker, which is 0xFFD8. Object data in png format begins with the header signature field, whose first four bytes are 0x89504E47. Reading only the header, that is, the first 512 bytes of the object data, is enough to determine the object type of most object data.
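A header check along these lines could be sketched as below. Only the two signatures given in the text are included; the function name, the table layout, and the type label `"picture"` are assumptions for illustration.

```python
HEADER_SIZE = 512  # number of leading bytes inspected, as in the text

# (magic prefix, object type); both signatures are the ones quoted in the text.
MAGIC_PREFIXES = [
    (bytes.fromhex("FFD8"), "picture"),      # jpg: SOI marker
    (bytes.fromhex("89504E47"), "picture"),  # png: start of the signature field
]


def type_from_header(data: bytes):
    """Return the object type if the header matches a known signature, else None."""
    header = data[:HEADER_SIZE]
    for magic, object_type in MAGIC_PREFIXES:
        if header.startswith(magic):
            return object_type
    return None
```

Only the first 512 bytes are sliced, so the check costs the same for a small object and a very large one.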
Step S330: if the object type of the object data is obtained, determining whether the object data is compressible according to the object type of the object data.
In one embodiment, assume that the object type of the object data has been determined from the extension or the file header of the object data, that is, the object type has been obtained. Whether the object data is compressible may then be determined according to the object type.
Specifically, if the object type is a video file format, a picture format, or a compressed archive format, the object data may be marked as incompressible; if the object type is a text type, e.g. XML, HTML or Word, the object data may be marked as compressible.
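The marking rule can be written as a small lookup. The type labels and the function name below are hypothetical; returning `None` for an unrecognized type is an assumption that leaves the decision to compressibility detection.

```python
# Hypothetical type labels for the categories named in the text.
INCOMPRESSIBLE_TYPES = {"video", "picture", "archive"}
COMPRESSIBLE_TYPES = {"text"}


def compressible_by_type(object_type):
    """True or False when the type decides; None when the type gives no answer."""
    if object_type in INCOMPRESSIBLE_TYPES:
        return False
    if object_type in COMPRESSIBLE_TYPES:
        return True
    return None
```

Keeping the three outcomes distinct avoids silently treating an unknown type as either compressible or incompressible.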
In another embodiment, assume that the object type of the object data cannot be determined from the extension or the file header of the object data, that is, the object type cannot be obtained. In that case step S330' is executed: the compressibility of the object data is detected.
Compressibility characterizes whether object data is compressible or incompressible. Specifically, if the data volume of the object data is less than or equal to a preset value, the object data is compressed directly and the result is used to determine whether it is compressible; if the data volume of the object data is larger than the preset value, a plurality of data segments are selected from the object data and compressed, the data volume of the segments before and after compression is compared, and whether the object data is compressible is determined based on the degree of reduction.
For example, the preset value may be 32KB, and assuming that the data size of one object data is less than or equal to 32KB, the object data may be directly compressed. Assuming that the amount of data after compression is reduced by 20% or more, it means that the object data is compressible, whereas it is considered incompressible.
Assuming that the data size of an object data is greater than 32KB, for example, 3 data segments can be selected from the object data for compression, and assuming that the data size of the 3 data segments is reduced by 20% or more after compression, the object data is compressible, otherwise, the object data is considered to be incompressible.
In one embodiment, if the data size of the target data is larger than the predetermined value, excluding the first predetermined size (e.g. 4KB) of data, randomly selecting a plurality of data segments (e.g. 3 data segments) from the remaining data, wherein the length of each data segment is the second predetermined size (e.g. 4 KB); compressing the selected data segments; and if the ratio of the data quantity of the plurality of compressed data segments to the data quantity of the plurality of data segments before compression is smaller than a preset proportion (for example, 80%), determining that the object data is compressible.
The first predetermined amount and the second predetermined amount may be the same or different, for example, except for the first 4KB of data, 3 data segments are randomly selected from the remaining data for compression, and if the data amount of the 3 data segments after compression is less than 80% of that before compression, it means that the object data is compressible, otherwise, it is not compressible. In the prior art, the object data is compressed integrally, and the compressibility of the object data with large data volume is judged by extracting a plurality of small data sections according to the data volume of the object data, so that the compressibility of the whole object data can be represented, and the CPU overhead caused by compressibility judgment can be reduced.
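The compressibility detection can be sketched with the example values from the text (32 KB preset value, 4 KB skipped prefix, 3 segments of 4 KB, 80% ratio). The application does not name a compression algorithm; `zlib` is used here purely as an assumption, the function and constant names are hypothetical, and a single "ratio below 80%" test is applied to both the small-object and the sampled case as a simplification.

```python
import random
import zlib
from typing import Optional

SMALL_OBJECT_LIMIT = 32 * 1024  # preset value: compress the whole object at or below this
SKIP_PREFIX = 4 * 1024          # first predetermined amount, excluded from sampling
SEGMENT_SIZE = 4 * 1024         # second predetermined amount, length of each segment
SEGMENT_COUNT = 3               # number of randomly chosen segments
MAX_RATIO = 0.8                 # compressed/original ratio must be below this


def is_compressible(data: bytes, rng: Optional[random.Random] = None) -> bool:
    """Estimate compressibility without compressing a large object as a whole."""
    if not data:
        return False
    if len(data) <= SMALL_OBJECT_LIMIT:
        samples = [data]  # small object: compress it directly
    else:
        rng = rng or random.Random()
        last_start = len(data) - SEGMENT_SIZE
        # Segment start offsets are drawn from the data after the skipped prefix.
        starts = [rng.randint(SKIP_PREFIX, last_start) for _ in range(SEGMENT_COUNT)]
        samples = [data[start:start + SEGMENT_SIZE] for start in starts]
    size_before = sum(len(sample) for sample in samples)
    size_after = sum(len(zlib.compress(sample)) for sample in samples)
    return size_after / size_before < MAX_RATIO
```

For a large object only 12 KB is compressed regardless of the object size, which is where the reduction in CPU overhead comes from.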
Step S340: determine, according to whether the object data is compressible, whether to compress the object data before transmitting it to the storage device.
Specifically, after whether the object data is compressible has been determined from the object type or through compressibility detection, as shown in fig. 5, if the object data is compressible, the object data is compressed and the compressed data is transmitted to the storage device; if the object data is not compressible, the object data is transmitted to the storage device directly.
Therefore, compressed data is transmitted between the storage gateway and the storage device, which reduces the resource consumption of network transmission. Even if the object data has a plurality of copies, it only needs to be compressed once at the storage gateway and then transmitted to the plurality of storage devices; compression does not need to be performed at each storage device, which reduces CPU resource consumption.
In an embodiment, before step S340, the method provided in the embodiment of the present application further includes: recording, in the metadata of the object data, whether the object data is stored compressed, according to whether the object data is compressible.
The metadata is mainly attribute information describing the object data. The result of whether the object data is compressible can be recorded in the metadata of the object data, so that when the object data is read from the storage device, whether to decompress the object data can be determined according to the record of the metadata.
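The write path, with its "compress once, send to every copy" property and the metadata record, might look like the sketch below. Storage devices are modelled as plain dictionaries, `zlib` stands in for the unspecified compression algorithm, and the metadata keys `compressed` and `original_size` are hypothetical names chosen for illustration.

```python
import zlib
from typing import Dict, List, Tuple

StoredObject = Tuple[dict, bytes]  # (metadata, payload) as kept on one device


def store_object(key: str, data: bytes, compressible: bool,
                 devices: List[Dict[str, StoredObject]]) -> dict:
    """Compress at most once at the gateway, then send the same payload to every device."""
    payload = zlib.compress(data) if compressible else data
    # Record the decision so that the read path knows whether to decompress.
    metadata = {"compressed": compressible, "original_size": len(data)}
    for device in devices:
        device[key] = (metadata, payload)
    return metadata
```

With three devices, `zlib.compress` runs once rather than three times, and each transfer carries the smaller payload.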
In an embodiment, if the storage gateway receives an object data reading request sent by a client, the storage gateway reads the object data directly from the storage device; determines whether the object data is a compressed storage object according to the record in the metadata of the object data; and, if the object data is a compressed storage object, decompresses the object data and then returns it to the client.
A compressed storage object is object data that has been compressed. Therefore, the gateway device can first decompress the compressed object data and return the decompressed data to the client, so that the client is not required to decompress it; compressed data is still transmitted between the storage gateway and the storage device, which reduces the occupancy of network resources.
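The read path can be sketched in the same style. As before, the storage device is modelled as a dictionary of `(metadata, payload)` pairs, the `compressed` metadata key is a hypothetical name, and `zlib` is an assumed stand-in for the compression algorithm.

```python
import zlib
from typing import Dict, Tuple


def read_object(key: str, device: Dict[str, Tuple[dict, bytes]]) -> bytes:
    """Read an object from one storage device and decompress it only if the metadata says so."""
    metadata, payload = device[key]
    if metadata.get("compressed"):
        return zlib.decompress(payload)  # gateway decompresses; the client sees original data
    return payload
```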
This embodiment mainly provides an online compression method for object data, which can be applied to distributed storage and hyper-converged products, and which can reduce the usage of storage space and the CPU usage during object uploading and downloading.
For example, suppose the client uses the object storage to store picture data. When the object data is uploaded, the storage gateway can recognize that it is in a common picture format and does not compress it: lossless compression of picture data saves less than 1% of the space, while compression consumes a large amount of CPU resources, and further CPU resources are consumed by decompression when the object data is downloaded. Suppose instead that the client stores data in a custom format, for example log data uploaded to the object storage system in small blocks. In this case the file format of the object data cannot be acquired, so whether the object data can be compressed cannot be determined from its type. The compressibility of the object data can then be detected rapidly by randomly reading 3 data segments. Log data is text data, which is compressible, so the object data is compressed and then sent to the storage device for storage. If the compressed data length is only 10% of the original data length, 90% of the storage space and network bandwidth consumption can be saved. The data is compressed at the storage gateway and then transmitted to the storage devices. The method can effectively avoid repeated compression of a plurality of copies, reduce the data volume transmitted over the network, and avoid the CPU waste caused by ineffective compression.
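The savings in this example reduce to simple arithmetic, sketched below. The function name, the dictionary keys, and the sizes and copy count used with it are hypothetical illustrations, not figures from the application beyond the 10% ratio.

```python
def transfer_summary(original_size: int, compressed_size: int, copies: int) -> dict:
    """Compare sending raw data with sending data compressed once at the gateway."""
    return {
        # Fraction of storage space and bandwidth saved per copy.
        "saved_fraction": 1 - compressed_size / original_size,
        "bytes_sent_uncompressed": original_size * copies,
        "bytes_sent_compressed": compressed_size * copies,
        # Compression runs once at the gateway instead of once per storage device.
        "compress_runs_at_gateway": 1,
        "compress_runs_at_devices": copies,
    }
```

For a 1000-byte object that compresses to 100 bytes and is stored as 3 copies, the saved fraction is 0.9 and the bytes sent drop from 3000 to 300.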
The following is an embodiment of the apparatus of the present application, which can be used to implement the embodiment of the online compression method for the object data described above in the present application. For details not disclosed in the embodiments of the apparatus of the present application, please refer to the embodiments of the online compression method of the object data of the present application.
Fig. 6 is a block diagram illustrating an apparatus for online compression of object data according to an embodiment of the present application. As shown in fig. 6, the apparatus is applied to a storage gateway, and the storage gateway is connected to a plurality of storage devices, and the apparatus includes: a storage request receiving module 610, an object type obtaining module 620, a compressible judging module 630 and a compressed storage module 640.
a storage request receiving module 610, configured to receive an object data storage request;
an object type obtaining module 620, configured to determine whether an object type of the object data can be obtained according to the object data carried in the object data storage request;
a compressible judging module 630, configured to determine whether the object data is compressible according to the object type of the object data if the object type of the object data is obtained;
and a compressed storage module 640, configured to determine, according to whether the object data is compressible, whether to compress the object data before transmitting it to the storage device.
In an embodiment, the apparatus further includes:
and the compressibility detection module is used for detecting the compressibility of the object data if the object type of the object data cannot be acquired.
In an embodiment, the compressibility detection module includes:
the direct compression unit is used for directly compressing the object data and determining whether the object data is compressible or not if the data volume of the object data is less than or equal to a preset value;
and the data segment compression unit is used for selecting a plurality of data segments from the object data to be compressed if the data volume of the object data is larger than a preset value, comparing the reduction degree of the data volume of the data segments before and after compression, and determining whether the object data is compressible based on the reduction degree.
In an embodiment, the data segment compressing unit is specifically configured to:
if the data volume of the object data is larger than a preset value, randomly selecting a plurality of data segments from the rest data except the data with the first preset volume at the beginning, wherein the length of each data segment is a second preset volume;
compressing the selected data segments;
and if the ratio of the data quantity of the plurality of compressed data segments to the data quantity of the plurality of data segments before compression is smaller than a preset ratio, determining that the object data is compressible.
In an embodiment, the compressible judging module 630 is specifically configured to: judge the object type of the object data according to the extension or the file header of the object data.
In an embodiment, the compressible judging module 630 is specifically configured to: if the object type is a video file format, a picture format, or a compressed archive format, mark the object data as incompressible;
and if the object type is a text type, mark the object data as compressible.
In an embodiment, the apparatus further includes:
a data recording module, configured to record, in the metadata of the object data, whether the object data is stored compressed, according to whether the object data is compressible.
In an embodiment, the compressed storage module 640 is specifically configured to: if the object data is compressible, compressing the object data and transmitting the compressed data to the storage equipment; and if the object data is not compressible, directly transmitting the object data to the storage device.
In an embodiment, the apparatus further includes:
a data reading module, configured to read the object data directly from the storage device if the storage gateway receives an object data reading request sent by a client;
a decompression judging module, configured to determine whether the object data is a compressed storage object according to the record in the metadata of the object data;
and a data decompression module, configured to decompress the object data and then return it to the client if the object data is a compressed storage object.
The implementation process of the functions and actions of each module in the device is specifically detailed in the implementation process of the corresponding step in the object data online compression method, and is not described herein again.
In the embodiments provided in the present application, the disclosed apparatus and method can also be implemented in other ways. The apparatus embodiments described above are merely illustrative, and for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, functional modules in the embodiments of the present application may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application, or the portion thereof that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. The aforementioned storage medium includes: a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, and other media capable of storing program code.

Claims (12)

1. A method for online compression of object data, the method being performed by a storage gateway, the storage gateway being connected to a plurality of storage devices, the method comprising:
the storage gateway receives an object data storage request;
judging whether the object type of the object data can be acquired or not according to the object data carried by the object data storage request;
if the object type of the object data is obtained, determining whether the object data is compressible according to the object type of the object data;
and determining whether the object data is compressed and then transmitted to the storage equipment according to whether the object data is compressible.
2. The method of claim 1, further comprising:
and if the object type of the object data cannot be acquired, detecting the compressibility of the object data.
3. The method of claim 2, wherein the detecting compressibility of the object data comprises:
if the data volume of the object data is less than or equal to a preset value, directly compressing the object data, and determining whether the object data is compressible;
and if the data volume of the object data is larger than a preset value, selecting a plurality of data segments from the object data to be compressed, comparing the reduction degree of the data volume of the data segments before and after compression, and determining whether the object data is compressible or not based on the reduction degree.
4. The method of claim 3, wherein if the data size of the object data is larger than a predetermined value, selecting a plurality of data segments from the object data to compress, comparing the reduction degree of the data size of the data segments before and after compression, and determining whether the object data is compressible based on the reduction degree comprises:
if the data volume of the object data is larger than a preset value, randomly selecting a plurality of data segments from the rest data except the data with the first preset volume at the beginning, wherein the length of each data segment is a second preset volume;
compressing the selected data segments;
and if the ratio of the data quantity of the plurality of compressed data segments to the data quantity of the plurality of data segments before compression is smaller than a preset ratio, determining that the object data is compressible.
5. The method according to claim 1, wherein the determining whether the object type of the object data can be obtained according to the object data carried by the object data storage request includes:
and judging the object type of the object data according to the extension or the file header of the object data.
6. The method of claim 1, wherein determining whether the object data is compressible according to an object type of the object data comprises:
if the object type is in a video file format, a picture format or a compression packet format, marking the object data as incompressible;
and if the object type is a text type, marking the object data as compressible.
7. The method of claim 1, wherein before the determining whether to compress the object data before transmitting the object data to the storage device according to whether the object data is compressible, the method further comprises:
and recording whether the object data is compressed or not in the metadata of the object data according to whether the object data is compressible or not.
8. The method of claim 1, wherein determining whether to compress the object data before transmitting the object data to the storage device according to whether the object data is compressible comprises:
if the object data is compressible, compressing the object data and transmitting the compressed data to the storage equipment;
and if the object data is not compressible, directly transmitting the object data to the storage device.
9. The method of claim 1, further comprising:
if an object data reading request sent by a client is received, the storage gateway directly reads the object data from the storage equipment;
determining whether the object data is a compressed storage object according to the record of the metadata of the object data;
and if the object data is a compressed storage object, decompressing the object data and then returning the object data to the client.
10. An apparatus for online compression of object data, the apparatus being applied to a storage gateway, the storage gateway being connected to a plurality of storage devices, the apparatus comprising:
the storage request receiving module is used for receiving an object data storage request;
the object type acquisition module is used for judging whether the object type of the object data can be acquired or not according to the object data carried by the object data storage request;
the compressible judging module is used for determining whether the object data is compressible according to the object type of the object data if the object type of the object data is obtained;
and the compression storage module is used for determining whether the object data is compressed and then transmitted to the storage equipment according to whether the object data is compressible.
11. An electronic device, characterized in that the electronic device comprises:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to perform the method of online compression of object data of any of claims 1-9.
12. A computer-readable storage medium, characterized in that the storage medium stores a computer program executable by a processor to perform the method of online compression of object data according to any one of claims 1 to 9.
CN202210743760.6A 2022-06-27 2022-06-27 Object data online compression method and device, electronic equipment and storage medium Pending CN115118716A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210743760.6A CN115118716A (en) 2022-06-27 2022-06-27 Object data online compression method and device, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN115118716A true CN115118716A (en) 2022-09-27

Family

ID=83329948

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210743760.6A Pending CN115118716A (en) 2022-06-27 2022-06-27 Object data online compression method and device, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN115118716A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014117716A1 (en) * 2013-01-31 2014-08-07 Huawei Technologies Co., Ltd. Block compression in a key/value store
CN110147201A (en) * 2019-04-25 2019-08-20 平安科技(深圳)有限公司 Line compression method, apparatus, computer equipment and storage medium
CN113608692A (en) * 2021-07-25 2021-11-05 济南浪潮数据技术有限公司 Method, system, equipment and medium for verifying data consistency of storage system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination