WO2021238408A1 - Object storage platform, and object aggregation method, device, and server - Google Patents

Object storage platform, and object aggregation method, device, and server

Info

Publication number
WO2021238408A1
Authority
WO
WIPO (PCT)
Prior art keywords
type
objects
aggregated
storage pool
storage
Prior art date
Application number
PCT/CN2021/085236
Inventor
郭军
李金阳
Original Assignee
百果园技术(新加坡)有限公司
郭军
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 百果园技术(新加坡)有限公司 and 郭军
Publication of WO2021238408A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0629 Configuration or reconfiguration of storage systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/13 File access structures, e.g. distributed indices
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/16 File or folder operations, e.g. details of user interfaces specifically adapted to file systems
    • G06F16/162 Delete operations
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/1734 Details of monitoring file system events, e.g. by the use of hooks, filter drivers, logs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0638 Organizing or formatting or addressing of data
    • G06F3/0644 Management of space entities, e.g. partitions, extents, pools

Definitions

  • This application relates to the field of data processing technology, for example, to an object storage platform and an object aggregation method, device, and server.
  • With the gradual increase of multimedia resources (such as pictures, audio, and video), the open-source Ceph file system faces storage and operation requirements for file objects of different sizes. The Ceph file system allocates a minimum space unit for data storage: even if the data volume of a small object is less than the minimum space unit, the object occupies the whole unit, which wastes a great deal of storage space. In addition, when the Ceph file system is expanded or fails, it usually transfers the file objects stored on it in units of the minimum supported operation granularity.
  • The Ceph file system may risk data loss when performing conversion operations, and transferring a large number of small objects greatly increases the data read and write load of the Ceph file system.
  • When the Ceph file system merges multiple small objects into one large object for storage, it uses a third-party storage system to additionally store the mapping between the custom object name of each small object and the large object.
  • Relying on the functions of a third-party storage system increases the complexity and maintenance difficulty of object storage and degrades the operation performance of the merged objects.
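  • The space waste described above can be illustrated with a small calculation. This is only a sketch: the 64 KiB minimum space unit and the 2 KiB object size are hypothetical values chosen for the example, not figures from this application or from Ceph.

```python
def allocated_bytes(object_size: int, min_unit: int) -> int:
    """Space occupied when storage is handed out in whole minimum units."""
    if object_size <= 0:
        return 0
    units = (object_size + min_unit - 1) // min_unit  # round up to whole units
    return units * min_unit


MIN_UNIT = 64 * 1024               # hypothetical minimum space unit (64 KiB)
small_objects = [2 * 1024] * 1000  # 1000 hypothetical 2 KiB pictures

# Each small object stored on its own occupies one full unit.
separate = sum(allocated_bytes(size, MIN_UNIT) for size in small_objects)

# The same objects merged into one large object share units.
merged = allocated_bytes(sum(small_objects), MIN_UNIT)
```

  • Under these assumed sizes, the thousand objects occupy 65,536,000 bytes when stored separately and 2,097,152 bytes when merged into one large object.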
  • This application provides an object storage platform, an object aggregation method, device, and server, which adjust the storage structure of the first type of object and reduce the complexity of object storage.
  • An object storage platform includes an object storage gateway, a first storage pool, and a second storage pool, wherein:
  • the first storage pool is a storage space constructed using solid-state disk technology on the internal storage space of the object storage platform, and is set to store first-type objects that have not been aggregated into second-type objects, as well as the aggregation mapping relationship between first-type objects that have been aggregated into second-type objects and the second-type objects into which they were aggregated;
  • the second storage pool is a storage space constructed using set disk technology on the internal storage space of the object storage platform, and is set to store second-type objects aggregated from a plurality of first-type objects;
  • the data read and write performance supported by the first storage pool is higher than the data read and write performance supported by the second storage pool;
  • the object storage gateway is set to periodically aggregate the first-type objects in the first storage pool that have not been aggregated into second-type objects into second-type objects in the second storage pool, and to replace the first-type objects aggregated this time in the first storage pool with the aggregation mapping relationship between those first-type objects and the second-type objects into which they were aggregated.
  • An object aggregation method is also provided, which is applied to the above-mentioned object storage platform, including:
  • the first storage pool is a storage space constructed using solid-state disk technology on the internal storage space of the object storage platform, and is set to store first-type objects that have not been aggregated into second-type objects, as well as the aggregation mapping relationship between first-type objects that have been aggregated into second-type objects and the second-type objects into which they were aggregated; the second storage pool is a storage space constructed using set disk technology on the internal storage space of the object storage platform, and is set to store second-type objects aggregated from a plurality of first-type objects; and the data read and write performance supported by the first storage pool is higher than the data read and write performance supported by the second storage pool.
  • An object aggregation device is also provided, which is set in the above-mentioned object storage platform, and includes:
  • the object search module is set to periodically search for objects of the first type that are not aggregated into objects of the second type in the first storage pool;
  • the object aggregation module is configured to aggregate the first-type objects that have not been aggregated into second-type objects into second-type objects in the second storage pool, and to replace the first-type objects aggregated this time in the first storage pool with the aggregation mapping relationship between those first-type objects and the second-type objects into which they were aggregated;
  • the first storage pool is a storage space constructed using solid-state disk technology on the internal storage space of the object storage platform, and is set to store first-type objects that have not been aggregated into second-type objects, as well as the aggregation mapping relationship between first-type objects that have been aggregated into second-type objects and the second-type objects into which they were aggregated; the second storage pool is a storage space constructed using set disk technology on the internal storage space of the object storage platform, and is set to store second-type objects aggregated from a plurality of first-type objects; and the data read and write performance supported by the first storage pool is higher than the data read and write performance supported by the second storage pool.
  • A server is also provided, including:
  • one or more processors;
  • a storage device set to store one or more programs;
  • when the one or more programs are executed by the one or more processors, the one or more processors implement the foregoing object aggregation method.
  • A computer-readable storage medium is also provided, on which a computer program is stored; when the program is executed by a processor, the above-mentioned object aggregation method is implemented.
  • FIG. 1 is a schematic architecture diagram of an object storage platform provided in Embodiment 1 of the application;
  • FIG. 2 is a schematic structural diagram of an object storage platform provided in Embodiment 2 of the application.
  • FIG. 3 is a flowchart of an object aggregation method provided in Embodiment 3 of this application;
  • FIG. 4 is a flowchart of an object aggregation method provided in Embodiment 4 of this application.
  • FIG. 5 is a schematic structural diagram of an object aggregation device provided in Embodiment 5 of this application.
  • FIG. 6 is a schematic structural diagram of a server provided in Embodiment 6 of this application.
  • FIG. 1 is a schematic structural diagram of an object storage platform provided in Embodiment 1 of this application. This embodiment is applicable to the case of storing any object under the open source Ceph file system.
  • the object storage platform 10 may include: an object storage gateway 110, a first storage pool 120 and a second storage pool 130.
  • The first storage pool 120 is a storage space constructed using solid-state disk technology on the internal storage space of the object storage platform 10, and stores first-type objects that have not been aggregated into second-type objects, as well as the aggregation mapping relationship between first-type objects that have been aggregated into second-type objects and the second-type objects into which they were aggregated.
  • The second storage pool 130 is a storage space constructed using conventional disk technology on the internal storage space of the object storage platform 10, and stores the second-type objects formed by aggregating first-type objects.
  • The object storage gateway 110 periodically aggregates the first-type objects in the first storage pool 120 that have not been aggregated into second-type objects into second-type objects in the second storage pool 130, and replaces the first-type objects aggregated this time in the first storage pool 120 with the aggregation mapping relationship between those first-type objects and the second-type objects into which they were aggregated.
  • Since the minimum operation granularity supported by the Ceph file system is an object of a specific size, the Ceph file system can accurately perform migration or conversion operations on objects whose data volume is higher than the minimum operation granularity. For objects whose data volume is lower than the minimum operation granularity, however, the Ceph file system may risk data loss during operations such as migration or conversion and cannot support complete operation on the object.
  • This embodiment therefore proposes to aggregate multiple objects whose data volume is lower than the minimum operation granularity into one object whose data volume is higher than the minimum operation granularity for storage, so that subsequent operations such as migration or conversion are performed on the aggregate as a whole.
  • The first-type object in this embodiment is an object for which complete operation is not supported at the operation granularity, that is, an object whose data volume in the Ceph file system is lower than the minimum supported operation granularity.
  • The second-type object is a container that supports complete operation at the operation granularity and is used to aggregate first-type objects, that is, a preset object container in the Ceph file system whose data volume is higher than the minimum supported operation granularity and which can aggregate a large number of first-type objects. Taking media resources as an example, the first-type objects in this embodiment can be pictures of multiple types stored under an Internet system, and a second-type object can be a video aggregated from a large number of pictures under the Internet system.
  • Because additionally configuring a third-party storage system to store the mapping relationship between the object names of multiple small objects and their merged positions in the large objects increases the complexity and maintenance difficulty of object storage, this embodiment uses disk technologies with different read and write performance directly in the Ceph file system. The internal storage space of the Ceph file system is divided into two different storage spaces: solid-state disk technology is used to construct the first storage pool 120 on the internal storage space, and conventional disk technology is used to construct the second storage pool 130.
  • The data read and write performance supported by the first storage pool 120 is much higher than the data read and write performance supported by the second storage pool 130. On this basis, the read and write performance of some first-type objects is guaranteed.
  • A first-type object can first be regarded as a first-type object that has not been aggregated into a second-type object and be stored directly in the first storage pool 120. The object storage gateway 110 subsequently aggregates the first-type objects in the first storage pool 120 that have not been aggregated into second-type objects into second-type objects in the second storage pool 130. No function of a third-party storage system is needed, and the read and write performance of first-type objects that have not been aggregated into second-type objects is ensured.
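  • The two-pool arrangement can be sketched in a few lines. This is a minimal in-memory illustration, not the platform's implementation: the `Platform` and `Mapping` classes, their method names, and the use of dictionaries in place of storage pools are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Mapping:
    """Aggregation mapping left in the first pool in place of the data."""
    container: str  # name of the second-type object
    offset: int     # where the content starts inside the container
    length: int     # content length in bytes


@dataclass
class Platform:
    first_pool: dict = field(default_factory=dict)   # name -> bytes or Mapping
    second_pool: dict = field(default_factory=dict)  # container -> bytearray

    def write(self, name: str, data: bytes) -> None:
        # New first-type objects land in the fast pool, unaggregated.
        self.first_pool[name] = data

    def aggregate(self, container: str) -> None:
        # Append every unaggregated object to the container and
        # replace it in the first pool with its mapping.
        buf = self.second_pool.setdefault(container, bytearray())
        for name, value in list(self.first_pool.items()):
            if isinstance(value, bytes):
                self.first_pool[name] = Mapping(container, len(buf), len(value))
                buf += value

    def read(self, name: str) -> bytes:
        value = self.first_pool[name]
        if isinstance(value, Mapping):
            buf = self.second_pool[value.container]
            return bytes(buf[value.offset:value.offset + value.length])
        return value
```

  • In this sketch a read always starts in the first pool: it either returns the unaggregated content directly or follows the mapping into the second pool, so no separate store for the name-to-position mapping is involved.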
  • The object storage gateway 110 can be a Reliable, Autonomic Distributed Object Store (RADOS) gateway (RADOS Gateway, RGW) configured under the Ceph file system.
  • The second-type objects in the second storage pool 130 may be composed of the mapping tag, attribute tag, content data, and verification tag of each aggregated first-type object.
  • The mapping tag indicates the aggregation mapping relationship between a first-type object aggregated in the second-type object and that second-type object, for example, the aggregation offset position of the first-type object within the second-type object and the data length of the first-type object.
  • The attribute tag indicates metadata such as the object identifier and index classification of the first-type objects aggregated in the second-type object.
  • The object storage platform 10 of this embodiment can also include an index record pool 140, which is set to record the index classification of first-type objects.
  • When a first-type object is written, its content classification is first analyzed, and its object identifier is filled in under the corresponding index classification label of the index record pool 140 to indicate that the first-type object written this time belongs to the content under that label. The content data is the actual content of each first-type object aggregated in the second-type object. The verification tag is used to determine whether each first-type object aggregated in the second-type object was aggregated incorrectly.
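  • One possible byte layout for such a record is sketched below. The field widths, the little-endian encoding, and the use of CRC-32 as the verification tag are assumptions chosen for illustration; the application does not specify a concrete format.

```python
import struct
import zlib

# Hypothetical fixed header: offset, content length, id length, class length.
HEADER = struct.Struct("<QIHH")


def pack_record(offset: int, object_id: str, index_class: str, content: bytes) -> bytes:
    """Serialize mapping tag, attribute tag, content data, verification tag."""
    oid = object_id.encode()
    cls = index_class.encode()
    body = HEADER.pack(offset, len(content), len(oid), len(cls)) + oid + cls + content
    # Verification tag: a checksum over everything that precedes it.
    return body + struct.pack("<I", zlib.crc32(body))


def unpack_record(record: bytes):
    """Check the verification tag, then return (offset, id, class, content)."""
    body = record[:-4]
    (crc,) = struct.unpack("<I", record[-4:])
    if zlib.crc32(body) != crc:
        raise ValueError("verification tag mismatch")
    offset, content_len, oid_len, cls_len = HEADER.unpack_from(body)
    pos = HEADER.size
    object_id = body[pos:pos + oid_len].decode()
    pos += oid_len
    index_class = body[pos:pos + cls_len].decode()
    pos += cls_len
    return offset, object_id, index_class, body[pos:pos + content_len]
```

  • A record whose bytes were damaged during aggregation fails the checksum comparison in `unpack_record`, which is the role the verification tag plays in the description above.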
  • The second storage pool 130 is mainly responsible for storing the content data of second-type objects. Because a second-type object is aggregated from multiple first-type objects, its content data is the content data of those first-type objects.
  • To ensure that first-type objects are aggregated into second-type objects successfully, and to prevent too many first-type objects from being aggregated under one second-type object, the aggregation upper limit of the second-type objects is stored in the second storage pool 130 in advance. During aggregation, if the space occupied by the first-type objects already aggregated in a second-type object has reached the aggregation upper limit, no further first-type objects are aggregated into that second-type object, and aggregation switches to the next new second-type object.
  • When the object storage gateway 110 aggregates the first-type objects in the first storage pool 120 that have not been aggregated into second-type objects into second-type objects in the second storage pool 130, then before aggregating each first-type object it checks whether the aggregate capacity of the first-type objects already aggregated under the current second-type object used for this aggregation in the second storage pool 130 is greater than or equal to the aggregation upper limit of the second-type object. If the aggregate capacity is less than the aggregation upper limit, the gateway continues to append the first-type objects of this aggregation to the current second-type object in the data format of mapping tag, attribute tag, content data, and verification tag. If the aggregate capacity is greater than or equal to the aggregation upper limit, the next second-type object in the second storage pool 130 is used as the new current second-type object, and the remaining first-type objects of this aggregation continue to be aggregated into it.
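  • The check against the aggregation upper limit can be sketched as follows. The function name and the representation of objects as (name, size) pairs are assumptions for the example; only the rule itself (check before each object, switch once the limit is reached) comes from the description.

```python
def aggregate_with_limit(objects, upper_limit):
    """Assign (name, size) pairs to containers in order.

    Before each object is appended, the capacity already aggregated in
    the current container is compared with the upper limit; once the
    limit has been reached, aggregation moves to the next container.
    """
    containers = [[]]  # names held by each second-type object
    used = 0           # aggregate capacity of the current container
    for name, size in objects:
        if used >= upper_limit:
            containers.append([])
            used = 0
        containers[-1].append(name)
        used += size
    return containers
```

  • Because the check happens before the append, a container in this sketch can end up slightly above the limit: with a limit of 100 and sizes 40, 40, 40, 10, the first three objects share one container (120 in total) and the fourth starts the next one.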
  • The object storage gateway 110 receives users' operation requests for the first-type objects stored on the platform according to their read and write needs. According to an operation request, the object storage gateway 110 reads, writes, modifies, or deletes the object information that the request points to in the first storage pool 120 and the second storage pool 130. Therefore, if the object storage gateway 110 needs to delete a first-type object at the user's request, and that first-type object has already been aggregated into a second-type object in the second storage pool 130, the content data of the first-type object must be deleted from that second-type object, which leaves corresponding storage vacancies in the second-type object.
  • A recycling process 111 is configured on the object storage gateway 110, and the reclaimable capacity of each second-type object is stored in the second storage pool 130.
  • When a first-type object is deleted, its content data volume is used to modify the reclaimable capacity, on the second storage pool 130, of the second-type object into which the deleted first-type object was aggregated. The recycling process 111 then detects the reclaimable capacity of each second-type object in the second storage pool 130 in real time, so that the storage space of deleted first-type objects among the first-type objects aggregated in a second-type object can be reclaimed in time.
  • The recycling process 111 detects the reclaimable capacity of each second-type object in the second storage pool 130 in real time and finds each target second-type object whose reclaimable capacity exceeds the preset recycling upper limit.
  • In such a target second-type object, most of the aggregated first-type objects have been deleted. Because the aggregation mapping relationship between a deleted first-type object and the second-type object into which it was aggregated has already been deleted from the first storage pool 120, the aggregation mapping records that remain in the first storage pool 120 for the target second-type object identify the valid first-type objects, that is, those that have not been deleted among the first-type objects aggregated in the target second-type object.
  • These valid first-type objects are rewritten into the first storage pool 120, and the aggregation mapping relationship between them and the target second-type object in the first storage pool 120, together with the target second-type object in the second storage pool 130, is deleted. In other words, the valid first-type objects aggregated in the target second-type object are migrated back to the first storage pool 120 and the entire target second-type object is deleted from the second storage pool 130, so that the storage space of deleted first-type objects is reclaimed in time and is not wasted inside the second-type objects.
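  • The deletion bookkeeping and the recycling pass can be sketched together. This is an in-memory illustration under stated assumptions: the pools are plain dictionaries, a mapping is a hypothetical `(container, offset, length)` tuple, and `delete` and `recycle` are names invented for the example.

```python
def delete(first_pool, reclaimable, name):
    """Delete a first-type object; if it was aggregated, credit its size
    to the reclaimable capacity of the container that held it."""
    value = first_pool.pop(name)
    if isinstance(value, tuple):
        container, _offset, length = value
        reclaimable[container] = reclaimable.get(container, 0) + length


def recycle(first_pool, second_pool, reclaimable, recycle_limit):
    """Migrate surviving objects back to the first pool and drop every
    container whose reclaimable capacity exceeds the limit."""
    for container, freed in list(reclaimable.items()):
        if freed <= recycle_limit:
            continue
        data = second_pool[container]
        for name, value in list(first_pool.items()):
            # Mappings still present identify the valid objects.
            if isinstance(value, tuple) and value[0] == container:
                _container, offset, length = value
                first_pool[name] = data[offset:offset + length]
        del second_pool[container]
        del reclaimable[container]
```

  • After `recycle` runs, the migrated objects are ordinary unaggregated entries in the first pool again, so the next periodic aggregation can pack them into a fresh container.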
  • The technical solution provided in this embodiment directly uses solid-state disk technology to construct the first storage pool on the internal storage space of the object storage platform, and uses conventional disk technology to construct the second storage pool, so that the data read performance of the first storage pool is far greater than that of the second storage pool.
  • The first storage pool stores the first-type objects that have not been aggregated into second-type objects, together with the aggregation mapping relationship between first-type objects that have been aggregated into second-type objects and the second-type objects into which they were aggregated; the second storage pool stores the second-type objects formed by aggregating first-type objects. No third-party storage system is needed to store the aggregation mapping relationship between first-type objects and second-type objects.
  • This adjusts the storage structure of first-type objects and improves the read performance of first-type objects that have not been aggregated into second-type objects. At the same time, the object storage gateway periodically aggregates the first-type objects in the first storage pool that have not been aggregated into second-type objects into second-type objects in the second storage pool, which realizes dynamic aggregation from first-type objects to second-type objects, prevents too many unaggregated first-type objects from accumulating in the first storage pool, and improves the storage performance of the first storage pool.
  • FIG. 2 is a schematic structural diagram of an object storage platform provided in Embodiment 2 of this application. This embodiment is described on the basis of the technical solutions provided in the above embodiments. As shown in FIG. 2, the object storage platform 20 may include an object storage gateway 210, a first storage pool 220, a second storage pool 230, an index record pool 240, and a log record pool 250.
  • The first storage pool 220 is a storage space constructed using solid-state disk technology on the internal storage space of the object storage platform 20, and stores first-type objects that have not been aggregated into second-type objects, as well as the aggregation mapping relationship between first-type objects that have been aggregated into second-type objects and the second-type objects into which they were aggregated.
  • The second storage pool 230 is a storage space constructed using conventional disk technology on the internal storage space of the object storage platform 20, and stores the second-type objects formed by aggregating first-type objects. The second storage pool 230 also stores the aggregation upper limit of second-type objects and the reclaimable capacity of each second-type object; a recycling process 211 is configured on the object storage gateway 210.
  • The index record pool 240 is set to record the index classification of first-type objects.
  • The log record pool 250 is set to record the write logs and modification logs of first-type objects and to mark the corresponding aggregation checkpoint log.
  • The object storage gateway 210 periodically aggregates the first-type objects in the first storage pool 220 that have not been aggregated into second-type objects into second-type objects in the second storage pool 230, and replaces the first-type objects aggregated this time in the first storage pool 220 with the aggregation mapping relationship between those first-type objects and the second-type objects into which they were aggregated.
  • When the object storage gateway 210 aggregates the first-type objects in the first storage pool 220 that have not been aggregated into second-type objects into second-type objects in the second storage pool 230, if the aggregate capacity of the first-type objects in the current second-type object in the second storage pool 230 is greater than or equal to the aggregation upper limit, the next second-type object in the second storage pool 230 is taken as the new current second-type object, and the remaining first-type objects of this aggregation continue to be aggregated into it.
  • The recycling process 211 configured on the object storage gateway 210 searches the second storage pool 230 in real time for a target second-type object whose reclaimable capacity exceeds the preset recycling upper limit, rewrites into the first storage pool 220 the valid first-type objects that have not been deleted among the first-type objects aggregated in the target second-type object, and deletes both the aggregation mapping relationship between those valid first-type objects and the target second-type object in the first storage pool 220 and the target second-type object in the second storage pool 230.
  • A search process 212 and at least one aggregation process 213 are additionally configured on the object storage gateway 210. The search process 212 and each aggregation process 213 cooperate to search the first storage pool 220 for first-type objects that have not been aggregated into second-type objects and to aggregate the found first-type objects into second-type objects in the second storage pool 230.
  • The search process 212 configured on the object storage gateway 210 periodically searches the first storage pool 220 for first-type objects that have not been aggregated into second-type objects and records the found objects one by one in the corresponding aggregation shards of the object storage gateway 210. Multiple aggregation shards are preset on the object storage gateway 210, and the aggregation shards correspond one-to-one with the aggregation processes 213 configured for them.
  • After the search process 212 has recorded the unaggregated first-type objects found in the first storage pool 220 into the aggregation shards, the aggregation processes 213 corresponding to the shards concurrently read, each from its own aggregation shard, the first-type objects recorded there and continuously aggregate them into second-type objects in the second storage pool 230. At the same time, each aggregated first-type object in the first storage pool 220 is replaced with the aggregation mapping relationship between it and the second-type object into which it was aggregated, so that the first storage pool 220 no longer stores the content data of first-type objects that have been aggregated into second-type objects, but only their aggregation mapping relationship. This ensures the accuracy of aggregating the unaggregated first-type objects in the first storage pool 220 into second-type objects in the second storage pool 230.
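  • The cooperation between the search process and the aggregation processes can be sketched with threads standing in for processes. Several details here are assumptions made for the example and are not stated in the description: objects are assigned to shards by a CRC-32 hash of their names, each worker writes to its own container named after its shard, and the pools are plain dictionaries.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def search(first_pool, shard_count):
    """Record the name of each unaggregated object in one aggregation shard."""
    shards = [[] for _ in range(shard_count)]
    for name, value in first_pool.items():
        if isinstance(value, bytes):  # content still present: not aggregated
            shards[zlib.crc32(name.encode()) % shard_count].append(name)
    return shards


def aggregate_shard(shard_id, names, first_pool, second_pool):
    """One aggregation worker: drains its own shard into its own container."""
    container = f"container-{shard_id}"
    buf = bytearray()
    for name in names:
        data = first_pool[name]
        # The mapping (container, offset, length) replaces the content.
        first_pool[name] = (container, len(buf), len(data))
        buf += data
    if buf:
        second_pool[container] = bytes(buf)


def run_cycle(first_pool, second_pool, shard_count=4):
    """One periodic cycle: search once, then aggregate all shards concurrently."""
    shards = search(first_pool, shard_count)
    with ThreadPoolExecutor(max_workers=shard_count) as pool:
        futures = [
            pool.submit(aggregate_shard, i, names, first_pool, second_pool)
            for i, names in enumerate(shards)
        ]
        for future in futures:
            future.result()  # surface any worker error
```

  • Because each shard has exactly one worker and each object name appears in exactly one shard, no two workers in this sketch ever touch the same object or the same container.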
  • this embodiment will pre-set a log record pool 250, because for newly written or modified first objects Class objects, this embodiment will treat them as objects of the first type that are not aggregated into objects of the second type, first store them in the first storage pool 220, and then periodically aggregate them into the second type of objects in the second storage pool 230.
  • the log record pool 250 records the write log or modification log of the first type of object, so that the object storage gateway 210 regularly replays the multiple logs recorded in the log record pool 250 and finds out that it is not aggregated to the second type.
  • the playback log of each first-type object in the object, and then the first-type object to which the found playback log faces are regarded as the first-type object that is not aggregated into the second-type object, and aggregated into the second storage pool 230 In the second category of objects.
  • this embodiment periodically replays multiple logs recorded in the log recording pool 250 At the time, the last replayed log will be marked accordingly.
  • the object storage gateway 210 replays the logs in the log record pool 250 in the order in which they were recorded, so it can be determined that the logs in the log record pool 250 before the marked aggregation checkpoint log have all been replayed, and that the first-type objects they target were aggregated into second-type objects in the second storage pool 230 during replay.
  • that is, the first-type objects targeted by the write logs and modification logs before the marked aggregation checkpoint log have been aggregated into second-type objects in the second storage pool 230, while the first-type objects targeted by the write logs and modification logs after the marked aggregation checkpoint log have not yet been aggregated into second-type objects in the second storage pool 230.
  • when the object storage gateway 210 looks for first-type objects not aggregated into second-type objects by periodically replaying the logs in the log record pool 250, it only needs to replay the write logs and modification logs located after the aggregation checkpoint log. The first-type object targeted by each replayed log is a first-type object not yet aggregated into a second-type object, and is then aggregated into a second-type object in the second storage pool 230.
  • the corresponding aggregation checkpoint log is re-marked in the log record pool 250, so that the next replay continues from the aggregation checkpoint log. For example, if all the logs scheduled for this periodic replay have been replayed, the last log replayed this time is taken as the new aggregation checkpoint log and marked in the log record pool 250, and the next replay continues from the new aggregation checkpoint log.
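  • The checkpoint-based replay described above can be sketched as follows. This is a minimal in-memory illustration, not the patent's implementation: the class `LogRecordPool`, its `replay` method, and the optional `limit` parameter are hypothetical names introduced here, and the checkpoint is assumed to be kept as a list index.

```python
from dataclasses import dataclass, field


@dataclass
class LogRecordPool:
    """Hypothetical in-memory stand-in for the log record pool."""
    logs: list = field(default_factory=list)  # (kind, object_name), in record order
    checkpoint: int = -1                      # index of the aggregation checkpoint log

    def append(self, kind, name):
        """Record a write log or modification log for a first-type object."""
        self.logs.append((kind, name))

    def replay(self, limit=None):
        """Replay only the logs after the checkpoint, then re-mark the checkpoint.

        Returns the names of the first-type objects targeted by the replayed
        logs; these are the objects treated as not yet aggregated.
        """
        start = self.checkpoint + 1
        end = len(self.logs) if limit is None else min(len(self.logs), start + limit)
        pending = [name for _kind, name in self.logs[start:end]]
        if end > start:
            self.checkpoint = end - 1         # last replayed log becomes the checkpoint
        return pending
```

  • Because logs are replayed in record order, everything at or before `checkpoint` is known to have been handled, so a second call to `replay` with no new logs returns an empty list.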
  • the search process 212 and at least one aggregation process 213 configured on the object storage gateway 210 cooperate to determine the first-type objects not aggregated into second-type objects. The process of searching the first storage pool 220 for these first-type objects is described as follows:
  • the search process 212 configured on the object storage gateway 210 periodically replays the write logs and modification logs located after the aggregation checkpoint log in the log record pool 250 and writes the replayed logs into multiple aggregation shards.
  • Each aggregation process 213 concurrently reads the replayed logs from its corresponding aggregation shard, treats the first-type object targeted by each replayed log as a first-type object not yet aggregated into a second-type object, and aggregates it into a second-type object.
  • the aggregation process 213 first checks whether the aggregate capacity of the first-type objects already aggregated into the second-type object currently used for aggregation in the second storage pool 230 is greater than or equal to the aggregation upper limit of second-type objects. If the aggregate capacity is less than the aggregation upper limit, the first-type object aggregated this time is added to the current second-type object in the data format of mapping label, attribute label, content data and check label of the first-type object. If the aggregate capacity is greater than or equal to the aggregation upper limit, the next second-type object in the second storage pool 230 is taken as the new current second-type object, and the remaining first-type objects of this aggregation are aggregated into the new current second-type object.
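  • The roll-over rule above can be sketched as follows. This is a simplified illustration under stated assumptions: the function name `aggregate` is hypothetical, each first-type object is reduced to a `(name, size)` pair, and the per-object record layout (mapping label, attribute label, content data, check label) is omitted, so offsets count content bytes only.

```python
def aggregate(objects, upper_limit):
    """Pack (name, size) pairs into second-type objects.

    Returns the second-type objects (each a list of names) and a mapping
    name -> (second_object_index, offset, length).
    """
    containers = [[]]   # containers[-1] is the current second-type object
    used = 0            # aggregate capacity already used in the current one
    mapping = {}
    for name, size in objects:
        if used >= upper_limit:
            # Aggregation upper limit reached: switch to the next
            # second-type object before adding this first-type object.
            containers.append([])
            used = 0
        mapping[name] = (len(containers) - 1, used, size)
        containers[-1].append(name)
        used += size
    return containers, mapping
```

  • Note that the check happens before each addition, as in the text, so the last object added to a second-type object may push its capacity past the upper limit; a stricter policy that never overshoots would be a design change, not something the description states.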
  • this embodiment performs a preliminary write for each first-type object.
  • when the object storage gateway 210 receives a write request for a first-type object, it first treats the object written this time as a first-type object not aggregated into a second-type object, stores it directly in the first storage pool 220, and records the corresponding write log in the log record pool 250. When the write logs and modification logs after the aggregation checkpoint log in the log record pool 250 are later periodically replayed, the write log of this object is replayed and the object is aggregated into a second-type object in the second storage pool 230.
  • this embodiment first reads the content data of the requested first-type object in the first storage pool 220. If the content data read in the first storage pool 220 is empty, it then looks up, in the first storage pool 220, the aggregation mapping relationship between the requested first-type object and the second-type object it was aggregated into, uses that relationship to find the second-type object in the second storage pool 230, and reads the content data of the requested first-type object from that second-type object.
  • the first storage pool 220 stores the first-type objects not aggregated into second-type objects, as well as the aggregation mapping relationships between first-type objects that have been aggregated and the second-type objects they were aggregated into. Therefore, the content data of the requested first-type object is first read in the first storage pool 220. If the content data can be read, the requested first-type object has not been aggregated into a second-type object in the second storage pool 230, and the content data read is returned directly to the user;
  • if the content data read in the first storage pool 220 is empty, the requested first-type object has been aggregated into a second-type object in the second storage pool 230. It is then necessary to look up, in the first storage pool 220, the aggregation mapping relationship between the requested first-type object and its second-type object, such as an offset position.
  • the second storage pool 230 is searched for the second-type object into which the requested first-type object was aggregated, and content data of the corresponding data length is read at the corresponding offset position of that second-type object as the content data of the requested first-type object.
  • this embodiment also reads the check label of the first-type object and uses it to determine whether the content data read this time contains errors, thereby improving the accuracy of reading first-type objects.
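  • The read path described above can be sketched as follows. This is an illustrative model only: the two storage pools are plain dictionaries, the function name `read_object` is hypothetical, the mapping tuple `(second_id, offset, length, check)` is an assumed layout, and CRC-32 from Python's `zlib` stands in for the check label, whose actual algorithm the description does not specify.

```python
import zlib


def read_object(name, first_content, first_mapping, second_pool):
    """Read a first-type object, falling back to its aggregation mapping."""
    data = first_content.get(name, b"")
    if data:
        # Content data present in the first storage pool: not yet aggregated.
        return data
    # Empty read: the object was aggregated, so follow the mapping relationship.
    second_id, offset, length, check = first_mapping[name]
    data = second_pool[second_id][offset:offset + length]
    if zlib.crc32(data) != check:
        # Check label mismatch: the content data read is wrong.
        raise ValueError(f"check label mismatch for {name!r}")
    return data
```

  • Reading the first storage pool first means a recently written or modified object is served without consulting the mapping at all, which is the behaviour the modification flow later relies on.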
  • it is first determined whether the first-type object to be deleted is one that has not been aggregated into a second-type object or one that has, so that the appropriate deletion operation can then be performed, ensuring the accuracy of the deletion.
  • when the object storage gateway 210 receives a delete request for a first-type object and the object to be deleted has not been aggregated into a second-type object, the object is stored in the first storage pool 220, and the log record pool 250 holds its write log or modification log so that it can be aggregated into a second-type object when logs are later periodically replayed. This embodiment therefore directly deletes the content data of the object in the first storage pool 220. At the same time, to avoid aggregating the deleted object when logs are later replayed, the write log or modification log of the deleted object is also deleted from the log record pool 250, so that the logs of a deleted first-type object are never replayed, which ensures the accuracy of first-type object aggregation.
  • when the object storage gateway 210 receives a delete request for a first-type object and the object to be deleted has already been aggregated into a second-type object, the object is stored in a second-type object in the second storage pool 230, and its write log or modification log in the log record pool 250 has already been replayed and will not be replayed again. It is therefore sufficient to delete, in the first storage pool 220, the aggregation mapping relationship between the deleted first-type object and the second-type object it was aggregated into; there is no need to delete the first-type object in the second storage pool 230.
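  • The two deletion cases above can be sketched as follows. This is a minimal illustration using dictionaries and a list; `delete_object`, `first_content`, `first_mapping` and `pending_logs` are hypothetical names, and `pending_logs` is assumed to hold only the logs that have not yet been replayed.

```python
def delete_object(name, first_content, first_mapping, pending_logs):
    """Delete a first-type object according to whether it was aggregated."""
    if name in first_content:
        # Not yet aggregated: drop the content data and its pending logs so
        # that a later replay does not aggregate a deleted object.
        del first_content[name]
        pending_logs[:] = [(kind, n) for kind, n in pending_logs if n != name]
        return "deleted-unaggregated"
    if name in first_mapping:
        # Already aggregated: removing the mapping relationship is enough; the
        # bytes inside the second-type object stay until they are reclaimed.
        del first_mapping[name]
        return "deleted-aggregated"
    raise KeyError(name)
```

  • In the aggregated case the second storage pool is never touched, which keeps deletion cheap and defers the space recovery to the reclaim process.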
  • the reclaim process 211 configured on the object storage gateway 210 detects the reclaimable capacity of each second-type object in the second storage pool 230.
  • reclamation is performed when the reclaimable capacity of a second-type object exceeds the preset reclaim upper limit: the valid first-type objects aggregated into the target second-type object whose reclaimable capacity exceeds the preset reclaim upper limit are rewritten into the first storage pool 220, and the aggregation mapping relationships between those valid first-type objects and the target second-type object in the first storage pool 220 are deleted
  • the modification operation in this embodiment mainly modifies the content data of a first-type object that has already been written. Because a first-type object may be stored either in the first storage pool 220 or in a second-type object in the second storage pool 230, it is first necessary to determine whether the first-type object to be modified is one that has not been aggregated into a second-type object or one that has been aggregated into a second-type object.
  • the modification operation can be regarded as a combination of a delete operation and a write operation: the object storage gateway 210 performs the corresponding delete operation on the original first-type object that has been written and, after the deletion succeeds, performs the corresponding write operation for the new first-type object to be written, thereby ensuring that the modification is carried out accurately.
  • when the object storage gateway 210 receives a modification request for a first-type object and the object to be modified has not been aggregated into a second-type object, the object is stored in the first storage pool 220 and the log record pool 250 holds its write log, so the content data of the object in the first storage pool 220 can be updated directly.
  • the modification log of the modified first-type object is also recorded in the log record pool 250, and the write log of that object is deleted, so that only the modification log is replayed later and the write log is not. This ensures the accuracy of the content data of the modified first-type object when it is subsequently aggregated.
  • when the object storage gateway 210 receives a modification request for a first-type object and the object to be modified has already been aggregated into a second-type object, the object is stored in a second-type object in the second storage pool 230, and its write log in the log record pool 250 has already been replayed and will not be replayed again. It is therefore only necessary to write the modified first-type object directly into the first storage pool 220. When a first-type object is read, the corresponding content data is first read in the first storage pool 220, and if content data can be read, the aggregation mapping relationship between the first-type object and the second-type object is not consulted.
  • consequently, even if the aggregation mapping relationship between the first-type object before modification and its second-type object, already stored in the first storage pool 220, is not deleted, the modified first-type object is still read correctly. When the modified first-type object is written into the first storage pool 220, there is therefore no need to delete that existing aggregation mapping relationship. The modification log of the modified first-type object must, however, be recorded in the log record pool 250, so that when logs are later periodically replayed the modified first-type object is re-aggregated into a second-type object in the second storage pool 230 according to the modification log.
  • only when the modified first-type object has been aggregated into a second-type object is the aggregation mapping relationship between the first-type object before modification and its second-type object deleted from the first storage pool 220; the aggregation mapping relationship between the modified first-type object and the second-type object it was aggregated into is then stored in the first storage pool 220, ensuring that the modified first-type object is read accurately.
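  • The two modification cases above can be sketched as follows. This is a simplified model, not the patent's implementation: `modify_object` and its dictionary and list arguments are hypothetical, and `pending_logs` is assumed to contain only logs that have not yet been replayed.

```python
def modify_object(name, new_data, first_content, first_mapping, pending_logs):
    """Apply a modification; return True if the object was already aggregated."""
    if name in first_content:
        aggregated = False
        # Not yet aggregated: the stale write log must not be replayed later,
        # only the modification log recorded below.
        pending_logs[:] = [
            (kind, n) for kind, n in pending_logs
            if not (kind == "write" and n == name)
        ]
    elif name in first_mapping:
        aggregated = True
        # Already aggregated: the old mapping relationship is deliberately
        # kept; reads find the new content data in the first pool first.
    else:
        raise KeyError(name)
    first_content[name] = new_data
    pending_logs.append(("modify", name))
    return aggregated
```

  • In both cases the new content lands in the first storage pool together with a modification log, so the next periodic replay re-aggregates the object; replacing the old mapping relationship is left to that later aggregation step.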
  • first-type objects that have not been aggregated into second-type objects are accurately distinguished from first-type objects that have. The write logs and modification logs located after the aggregation checkpoint log in the log record pool are then periodically replayed, and the first-type object targeted by each replayed log is directly treated as a first-type object not aggregated into a second-type object and is aggregated into a second-type object in the second storage pool. This ensures the accuracy of first-type object aggregation and prevents excessive storage of unaggregated first-type objects in the first storage pool, improving the storage performance of the first storage pool.
  • FIG. 3 is a flowchart of an object aggregation method provided in the third embodiment of the application.
  • This embodiment can be applied to the storage of any object under the open source Ceph file system, and is applied in the object storage platform provided in the above embodiment.
  • An object aggregation method provided in this embodiment may be executed by the object aggregation apparatus provided in the embodiment of the present application, and the apparatus may be implemented in software and/or hardware, and integrated in a server that executes the method.
  • the method may include the following steps:
  • S310 Periodically search for objects of the first type that are not aggregated into objects of the second type in the first storage pool.
  • the first storage pool is a storage space constructed using solid-state disk technology on the internal storage space of the object storage platform.
  • it stores the first-type objects not aggregated into second-type objects, and the aggregation mapping relationships between first-type objects that have been aggregated and the second-type objects they were aggregated into.
  • the second storage pool is a storage space constructed using conventional disk technology on the internal storage space of the object storage platform. It stores the second-type objects obtained by aggregating first-type objects.
  • the object storage gateway analyzes the storage status of the first-type objects in the first storage pool and periodically searches the first storage pool for first-type objects that were newly written to the object storage platform during the period and have not been aggregated into second-type objects, so that they can be accurately aggregated into second-type objects in the second storage pool.
  • this embodiment regards them as first-type objects not aggregated into second-type objects: they are first stored in the first storage pool and then periodically aggregated into second-type objects in the second storage pool, which preserves the read and write performance of some first-type objects. A log record pool is therefore set up in advance. When a new first-type object is written into the first storage pool, or an existing first-type object is modified according to user needs, the write log or modification log of that first-type object is recorded in the log record pool. The object storage gateway can then periodically replay the logs recorded in the log record pool, find the replayed logs of first-type objects that have not been aggregated into second-type objects, treat the first-type object targeted by each such log as not yet aggregated, and aggregate these objects in turn into second-type objects in the second storage pool.
  • periodically searching the first storage pool for first-type objects not aggregated into second-type objects may include: periodically replaying the write logs and modification logs located after the aggregation checkpoint log in the log record pool; and treating the first-type object targeted by each replayed log as a first-type object not aggregated into a second-type object.
  • the replayed logs make it possible to determine accurately whether the corresponding first-type object has been aggregated into a second-type object in the second storage pool.
  • when the logs recorded in the log record pool are periodically replayed, the last replayed log is marked accordingly.
  • since the object storage gateway replays the logs in the log record pool in the order in which they were recorded, it can be determined that the logs located before the marked aggregation checkpoint log have all been replayed, and that the first-type objects they target were aggregated into second-type objects in the second storage pool during replay. That is, in the log record pool, the first-type objects targeted by the write logs and modification logs before the marked aggregation checkpoint log have been aggregated into second-type objects in the second storage pool, while the first-type objects targeted by the write logs and modification logs after the marked aggregation checkpoint log have not yet been aggregated into second-type objects in the second storage pool.
  • when the object storage gateway periodically replays the logs in the log record pool to find first-type objects not aggregated into second-type objects, the first-type object targeted by each replayed log is a first-type object not aggregated into a second-type object, and is then aggregated into a second-type object in the second storage pool.
  • at the same time, after the first-type objects not aggregated into second-type objects have been aggregated, the method may also include: re-marking the aggregation checkpoint log among the write logs and modification logs in the log record pool according to the current aggregation state, so that the next replay continues from the aggregation checkpoint log. For example, if all the logs scheduled for this periodic replay have been replayed, the last log replayed this time is taken as the new aggregation checkpoint log and marked in the log record pool, and the next replay continues from the new aggregation checkpoint log. This improves the efficiency and accuracy of finding first-type objects not aggregated into second-type objects.
  • S320 Aggregate the first-type objects not aggregated into second-type objects into second-type objects in the second storage pool, and in the first storage pool replace each first-type object aggregated this time with the aggregation mapping relationship between that first-type object and the second-type object it was aggregated into.
  • each first-type object found is aggregated in turn into second-type objects in the second storage pool.
  • the order of the first-type objects aggregated into a second-type object is thereby preserved, which avoids aggregation errors caused by repeatedly writing first-type objects at the same position of the second-type object.
  • the second storage pool is mainly responsible for storing the content data of second-type objects. A second-type object is formed by aggregating multiple first-type objects, so the content data of a second-type object is the content data of the first-type objects aggregated into it.
  • an aggregation upper limit for second-type objects is set in the second storage pool in advance. Aggregating the first-type objects not aggregated into second-type objects into second-type objects in the second storage pool may include: for each such first-type object in the first storage pool, if the aggregate capacity of the first-type objects in the current second-type object in the second storage pool is less than the aggregation upper limit, adding the first-type object directly to the current second-type object; if that aggregate capacity is not less than the aggregation upper limit, taking the next second-type object in the second storage pool as the new current second-type object and adding the first-type object to it. In this way, once the space occupied by the first-type objects aggregated into a second-type object reaches the aggregation upper limit, no further first-type objects are aggregated into that second-type object, and aggregation switches to the next new second-type object.
  • when the object storage gateway aggregates the first-type objects in the first storage pool that are not aggregated into second-type objects into second-type objects in the second storage pool, before aggregating each first-type object it must check whether the aggregate capacity of the first-type objects already aggregated into the current second-type object used for this aggregation is greater than or equal to the aggregation upper limit of second-type objects. If the aggregate capacity is less than the aggregation upper limit, the first-type object aggregated this time is added to the current second-type object in the data format of mapping label, attribute label, content data and check label of the first-type object. If the aggregate capacity is not less than the aggregation upper limit, the next second-type object in the second storage pool is taken as the new current second-type object, and the remaining first-type objects of this aggregation are aggregated into the new current second-type object.
  • the search process configured on the object storage gateway periodically searches the first storage pool for first-type objects not aggregated into second-type objects and records the objects it finds, one by one, into the corresponding aggregation shards. Multiple aggregation shards are preset on the object storage gateway and correspond one-to-one with the multiple aggregation processes configured on it. After the search process has recorded the not-yet-aggregated first-type objects found in the first storage pool into the aggregation shards, the aggregation processes corresponding to those shards concurrently read them from their respective aggregation shards.
  • this guarantees the accuracy of aggregating the first-type objects in the first storage pool that are not aggregated into second-type objects into second-type objects in the second storage pool.
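  • The cooperation between one search process and several aggregation workers can be sketched as follows. The sketch makes several assumptions not stated in the description: objects are assigned to shards by a CRC-32 hash of their name, threads from Python's `concurrent.futures` stand in for the aggregation processes, each worker fills a single second-type object named `agg-<shard>`, and all function names are hypothetical.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def shard_of(name, shard_count):
    """Assumed shard assignment: stable hash of the object name."""
    return zlib.crc32(name.encode()) % shard_count


def search(pending_names, shard_count):
    """Search step: record each unaggregated object in its aggregation shard."""
    shards = [[] for _ in range(shard_count)]
    for name in pending_names:
        shards[shard_of(name, shard_count)].append(name)
    return shards


def aggregate_shard(shard_id, names, first_content):
    """One aggregation worker: build a second-type blob from its own shard."""
    blob = bytearray()
    mapping = {}
    for name in names:
        data = first_content[name]
        mapping[name] = (f"agg-{shard_id}", len(blob), len(data))
        blob += data
    return bytes(blob), mapping


def run(pending_names, first_content, shard_count=2):
    """Run the search step, then one worker per shard concurrently."""
    shards = search(pending_names, shard_count)
    with ThreadPoolExecutor(max_workers=shard_count) as pool:
        return list(pool.map(aggregate_shard, range(shard_count), shards,
                             [first_content] * shard_count))
```

  • Because each worker reads only its own shard and writes only its own second-type object, no two workers write to the same position, which is one way to obtain the ordering guarantee the description asks for.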
  • the search process and at least one aggregation process configured on the object storage gateway cooperate to determine the first-type objects not aggregated into second-type objects.
  • the search process configured on the object storage gateway periodically replays the write logs and modification logs located after the aggregation checkpoint log in the log record pool and writes the replayed logs into multiple aggregation shards.
  • each aggregation process concurrently reads the replayed logs from its corresponding aggregation shard, treats the first-type object targeted by each replayed log as a first-type object not aggregated into a second-type object, and aggregates it.
  • when aggregating each first-type object, the aggregation process first checks whether the aggregate capacity of the first-type objects already aggregated into the current second-type object in the second storage pool is greater than or equal to the aggregation upper limit of second-type objects. If the aggregate capacity is less than the aggregation upper limit, the first-type object aggregated this time is added to the current second-type object in the data format of mapping label, attribute label, content data and check label of the first-type object. If the aggregate capacity is greater than or equal to the aggregation upper limit, the next second-type object in the second storage pool is taken as the new current second-type object, and the remaining first-type objects of this aggregation are aggregated into the new current second-type object.
  • the technical solution provided in this embodiment constructs the first storage pool directly on the internal storage space of the object storage platform using solid-state disk technology, and constructs the second storage pool using conventional disk technology.
  • the data read performance of the first storage pool is therefore far higher than that of the second storage pool.
  • the first storage pool stores the first-type objects not aggregated into second-type objects, together with the aggregation mapping relationships between first-type objects that have been aggregated and the second-type objects they were aggregated into; the second storage pool stores the second-type objects obtained by aggregating first-type objects. No third-party storage system is needed to store the aggregation mapping relationships between first-type objects and second-type objects.
  • this storage structure improves the read performance of first-type objects not aggregated into second-type objects. At the same time, the object storage gateway periodically aggregates the not-yet-aggregated first-type objects in the first storage pool into second-type objects in the second storage pool, achieving dynamic aggregation of first-type objects into second-type objects, preventing excessive storage of unaggregated first-type objects in the first storage pool, and improving the storage performance of the first storage pool.
  • FIG. 4 is a flowchart of an object aggregation method provided in Embodiment 4 of this application. This embodiment is described on the basis of the foregoing embodiment. As shown in FIG. 4, this embodiment mainly explains the multiple operation processes on objects of the first type and the recovery process of objects of the second type that exist in the process of aggregating objects of the first type into objects of the second type.
  • the method may include the following steps:
  • S410 Periodically search for objects of the first type that are not aggregated into objects of the second type in the first storage pool.
  • S420 Aggregate the first-type objects not aggregated into second-type objects into second-type objects in the second storage pool, and in the first storage pool replace each first-type object aggregated this time with the aggregation mapping relationship between that first-type object and the second-type object it was aggregated into.
  • S430 Detect the reclaimable capacity of each second-type object in the second storage pool in real time, rewrite into the first storage pool the valid first-type objects aggregated into any target second-type object whose reclaimable capacity exceeds the preset reclaim upper limit, and delete both the aggregation mapping relationships between those valid first-type objects and the target second-type object in the first storage pool and the target second-type object in the second storage pool.
  • according to the operation request, the object storage gateway reads, writes, modifies or deletes the corresponding object information in the first storage pool and the second storage pool. Therefore, if the object storage gateway needs to delete a first-type object according to user needs and that object has already been aggregated into a second-type object in the second storage pool, the content data of the first-type object needs to be deleted from the second-type object it was aggregated into. This leaves a corresponding storage gap in the second-type object, and because the content data of different first-type objects differs, the gaps differ as well.
  • a reclaim process is configured for the second storage pool, and the reclaimable capacity of each second-type object is stored in the second storage pool.
  • the aggregation mapping relationship, stored in the first storage pool, between the deleted first-type object and the second-type object it was aggregated into is deleted, and the reclaimable capacity of that second-type object in the second storage pool is modified according to the content data volume of the deleted first-type object.
  • the reclaim process subsequently detects the reclaimable capacity of each second-type object in the second storage pool in real time, so that the deleted first-type objects among those aggregated into a second-type object are reclaimed in time, ensuring that the storage space of deleted first-type objects is recovered promptly.
  • the aggregation mapping relationship between the deleted first type of object and the aggregated second type of object has been deleted in the first storage pool, it can be based on the remaining records of the first storage pool and The aggregation mapping relationship related to the target second-type object, find out the multiple valid first-type objects that have not been deleted among the multiple first-type objects aggregated by the target second-type object, and then the target second-type object
  • the aggregated multiple valid first-type objects are rewritten into the first storage pool, and the aggregated mapping relationship between multiple valid first-type objects in the first storage pool and the target second-type object is deleted, and the second Target second-type objects in the storage pool, that is, re-migrate multiple valid first-type objects aggregated by the target second-type objects to the first storage pool, and delete all the target second-type objects in the second storage pool Class objects, so as to ensure that the storage space of the deleted first-type objects can be reclaimed in time, and avoid waste of storage space on the second-class objects aggregated by the deleted first-type objects.
  • Rewriting the valid first-type objects aggregated by a target second-type object whose reclaimable capacity exceeds the preset recycling upper limit into the first storage pool may include: looking up, in the first storage pool, the aggregation mapping relationship between each first-type object aggregated by the target second-type object and that target object; and rewriting each first-type object whose aggregation mapping relationship is non-empty into the first storage pool as a valid first-type object aggregated by the target second-type object.
  • each first-type object aggregated by the target second-type object By sequentially determining each first-type object aggregated by the target second-type object, and judging whether there is still an aggregation mapping relationship between the first-type object and the target second-type object in the first storage pool, if in the first storage pool, There is no aggregation mapping relationship between the first type of object and the target second type of object in a storage pool, indicating that the first type of object has been deleted without any processing, if the first type of object exists in the first storage pool
  • the aggregation mapping relationship between the object and the target second-type object it can be determined that the first-type object whose aggregation mapping relationship is non-empty is the first-type object that has not been deleted in the target second-type object, and it is regarded as the target second-type object
  • the valid first-type objects aggregated by the objects are rewritten into the first storage pool, and the aggregated mapping relationship between multiple valid first-type objects in the first storage pool and the target second-type object is deleted
  • this embodiment will use the object storage gateway to detect in real time whether the user has an operation requirement for the first type of object. If an operation request for the first type of object is received, it will directly enter the The object information pointed to by the operation request is found under the pool.
  • the object information can be the content data of the first-type object that is not aggregated to the second-type object in the first storage pool, or the first-type object and the first-type object that has been aggregated to the second-type object in the first storage pool.
  • the object information pointed to by the operation request is updated related to this operation to ensure the dynamic update of the object information on the first storage pool and the second storage pool under multiple operations, thereby improving the accuracy of the object operation.
  • this embodiment will have multiple operations such as writing, reading, deleting, and modifying the stored objects of the first type according to user requirements.
  • the following operations are performed on the objects of the first type in this embodiment.
  • the first type of object written this time is regarded as the first type of object that is not aggregated into the second type of object, and it is directly stored in the first storage pool and stored in the The log record pool records the write log of the first type of object written this time.
  • the first type of object written this time is regarded as the first type of object that is not aggregated into the second type of object, and it is directly stored in the first storage pool , And correspondingly record the write log of the first type of object written this time in the log record pool, so as to periodically replay multiple write logs and modification logs after the aggregate checkpoint log in the log record pool.
  • the write log of the first-type object written will be played back, and the first-type object written this time will be aggregated into the second-type object in the second storage pool.
  • The content data of the first-type object being read is first read from the first storage pool; if the content data read from the first storage pool is empty, the aggregation mapping relationship between that first-type object and the second-type object it was aggregated into is looked up in the first storage pool, the second-type object is located in the second storage pool according to that mapping, and the content data of the first-type object is then read from that second-type object.
  • the first type of object that is not aggregated to the second type of object will be stored in the first storage pool, and the first type of object that has been aggregated to the second type of object will be stored in the first storage pool
  • the aggregation mapping relationship with the aggregated objects of the second type so first read the content data of the first type of object read this time in the first storage pool, if the content of the first type of object can be read Data, it means that the first type of object read this time has not been aggregated into the second type of object in the second storage pool, and then the read content data is directly fed back to the user; and if it is read in the first storage pool If the content data is empty, it means that the first type of object read this time has been aggregated into the second type of object in the second storage pool, so it is necessary to find the first type of object read this time in the first storage pool
  • the aggregation mapping relationship with the aggregated second-type object such as the offset position of the first-type
  • the delete request for the first-type object is received through the object storage gateway, and the first-type object deleted this time is the first-type object that is not aggregated into the second-type object, it means that the first-type object storage read this time In the first storage pool, the write log or modification log of the first type of object deleted this time is recorded in the log record pool, so that the first type of object can be aggregated into the second type of object when the log is periodically replayed.
  • this embodiment directly deletes the content data of the first-type objects deleted this time in the first storage pool, and at the same time, in order to avoid aggregation errors caused by the aggregation of the first-type objects deleted this time during the subsequent periodic log playback, Delete the write log or modification log of the first-type object deleted this time from the log record pool, so that the write log or modification log of the first-type object that has been deleted will not be replayed later, so as to ensure the accurate aggregation of the first-type object sex.
  • the delete request for the first-type object is received through the object storage gateway, and the first-type object deleted this time is the first-type object that has been aggregated into the second-type object, it means the first-type object deleted this time Among the objects of the second type stored in the second storage pool, the write log or modification log of the first type of object deleted this time has been replayed in the log record pool, and will not be replayed, so it only needs to be in the first storage pool Delete the aggregation mapping relationship between the first type of object deleted this time and the aggregated second type of object, without the need for the second type of object aggregated by the first type of object deleted this time in the second storage pool
  • the recycling process will be configured on the object storage gateway to detect the reclaimable capacity of each second type of object in the second storage pool to uniformly perform the corresponding recycling operation, in order to ensure recycling Accuracy, only need to update the reclaimable capacity of the second type of objects aggregated from
  • the modified first type of this time in the first storage pool is updated The content data of the object, at the same time, record the modification log of the first type object modified this time in the log record pool, and delete the write log of the first type object modified this time; if the modification request of the first type object is received, And the first type of object modified this time is the first type of object aggregated to the second type of object, then the first type of object after this modification is directly written in the first storage pool, and the modified first type of object is directly written in the first storage pool.
  • the aggregation mapping relationship between the objects of the first type before this modification and the aggregated objects of the second type is deleted in the first storage pool.
  • the modification request of the first type of object is received through the object storage gateway, and the first type of object modified this time is the first type of object that is not aggregated into the second type of object, it means that the first type of object stored this time is read In the first storage pool, the write log of the first-type object deleted this time is recorded in the log record pool, so the content data of the first-type object modified this time in the first storage pool can be directly updated, and at the same time, to avoid subsequent When the log is played back regularly, the content data originally written by the first-type object modified this time is aggregated and the aggregation error is caused. The modification log of the first-type object modified this time will also be recorded in the log record pool, and this time will be deleted.
  • the write log of the modified first-type object so that only the modified log of the first-type object modified this time will be replayed later, and the write log of the first-type object modified this time will not be played back before the modification, so as to ensure this modification
  • the modification request of the first type of object is received through the object storage gateway, and the first type of object modified this time is the first type of object aggregated to the second type of object, it means that the modified first type of object storage In the second type of object in the second storage pool, the write log of the first type of object that has been modified this time will not be replayed in the logging pool. Therefore, it only needs to be directly written in the first storage pool.
  • the corresponding content data will be first read in the first storage pool. If the content data can be read, the first-type object will not be paid attention to. Aggregate mapping relationship with objects of the second type. Therefore, even if the modified first type of object is written in the first storage pool in this embodiment, the first type of object that has been stored in the first storage pool is not deleted.
  • the aggregation mapping relationship between the content data before the modification and the aggregated second-type object is also guaranteed, the accuracy of reading the first-type object of this modification can also be guaranteed, so this modification is written in the first storage pool
  • the modification log of the first type of object is recorded in the log record pool, so that when the log is periodically replayed, the modified first type of object can be re-aggregated into the second type of object in the second storage pool according to the modification log.
  • the technical solution provided in this embodiment directly uses solid-state disk technology to construct the first storage pool on the internal storage space of the object storage platform, and uses conventional disk technology to construct the second storage pool.
  • the data read performance of the first storage pool is far higher than that of the second storage pool.
  • the first type of objects that are not aggregated to the second type of objects are stored in the first storage pool, and the first type of objects that have been aggregated to the second type of objects and the aggregated first type of objects are stored in the first storage pool.
  • the aggregation mapping relationship between the two types of objects, the second type of objects after the aggregation of the first type of objects are stored in the second storage pool, without the need to use a third-party storage system to store the aggregation mapping between the first type of objects and the second type of objects Relationship, adjust the storage structure of the first type of object, improve the read performance of the first type of object that is not aggregated to the second type of object; at the same time, the object storage gateway can periodically aggregate the first storage pool to the second type of object
  • the objects of the first type are aggregated into the objects of the second type in the second storage pool to realize the dynamic aggregation between the objects of the first type to the objects of the second type, and prevent the first type of objects in the first storage pool from not being aggregated to the first type of the second type of objects Excessive storage of objects improves the storage performance of the first storage pool; at the same time, the reclaiming process detects the reclaimable capacity of each second-type object in the second storage pool in real time to ensure
  • FIG. 5 is a schematic structural diagram of an object aggregation device provided in the fifth embodiment of the application, which is set in the object storage platform provided in the foregoing embodiment. As shown in FIG. 5, the device may include:
  • the object search module 510 is set to periodically search for objects of the first type that are not aggregated into objects of the second type in the first storage pool; the object aggregation module 520 is set to aggregate objects of the first type that are not aggregated into objects of the second type to Among the objects of the second type in the second storage pool, in the first storage pool, replace the objects of the first type aggregated this time with the aggregation mapping relationship between the objects of the first type and the objects of the second type aggregated; where ,
  • the first storage pool is a storage space built with solid-state disk technology on the internal storage space of the object storage platform; it stores the first-type objects that have not been aggregated into second-type objects, and the aggregation mapping relationships between first-type objects that have been aggregated and the second-type objects they were aggregated into.
  • the second storage pool is a storage space built with conventional disk technology on the internal storage space of the object storage platform; it stores the second-type objects formed by aggregating first-type objects.
  • the technical solution provided in this embodiment directly uses solid-state disk technology to construct the first storage pool on the internal storage space of the object storage platform, and uses conventional disk technology to construct the second storage pool.
  • the data read performance of the first storage pool is far higher than that of the second storage pool.
  • the first storage pool stores the first-type objects that have not been aggregated into second-type objects, and the aggregation mapping relationships between aggregated first-type objects and the second-type objects they were aggregated into.
  • the second storage pool stores the second-type objects formed by aggregating first-type objects. There is no need to use a third-party storage system to store the aggregation mapping relationship between first-type and second-type objects.
  • the storage structure of the first type of object improves the read performance of the first type of object that is not aggregated into the second type of object; at the same time, the object storage gateway can periodically collect the first type of object that is not aggregated to the second type of object in the first storage pool Aggregate into the second type of objects in the second storage pool to realize the dynamic aggregation between the first type of object to the second type of object, and prevent the excessive storage of the first type of object that is not aggregated to the second type of object in the first storage pool, Improve the storage performance of the first storage pool.
  • the object aggregation device provided in this embodiment is applicable to the object aggregation method provided in any of the above embodiments, and has corresponding functions and effects.
  • FIG. 6 is a schematic structural diagram of a server provided in Embodiment 6 of the application.
  • the server includes a processor 60, a storage device 61, and a communication device 62; the number of processors 60 in the server may be one or more.
  • a processor 60 is taken as an example in FIG. 6; the processor 60, the storage device 61, and the communication device 62 in the server may be connected through a bus or in other ways. In FIG. 6, the connection through a bus is taken as an example.
  • the server provided in this embodiment can be configured to execute the object aggregation method provided in any of the foregoing embodiments, and has corresponding functions and effects.
  • the seventh embodiment of the present application also provides a computer-readable storage medium on which a computer program is stored, and when the program is executed by a processor, the object aggregation method in any of the foregoing embodiments can be implemented.
  • the method may include the following steps: periodically searching for objects of the first type that are not aggregated into objects of the second type in the first storage pool; aggregate the objects of the first type that are not aggregated into objects of the second type to the second type of objects in the second storage pool.
  • the first storage pool is a storage space built with solid-state disk technology on the internal storage space of the object storage platform, and stores the first-type objects that have not been aggregated into second-type objects, as well as the aggregation mapping relationships between first-type objects that have been aggregated and the second-type objects they were aggregated into.
  • the second storage pool is a storage space constructed using conventional disk technology on the internal storage space of the object storage platform, and stores objects of the second type after aggregation of the first type of objects.
  • An embodiment of the present application provides a storage medium containing computer-executable instructions.
  • the computer-executable instructions are not limited to the method operations described above, and can also perform related operations in the object aggregation method provided by any embodiment of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

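An object storage platform, and an object aggregation method, apparatus, and server. The object storage platform (10) comprises an object storage gateway (110), a first storage pool (120), and a second storage pool (130). The first storage pool (120) is configured to store first-type objects that have not been aggregated into second-type objects, together with the aggregation mapping relationships between first-type objects that have been aggregated and the second-type objects they were aggregated into. The second storage pool (130) is configured to store second-type objects, each formed by aggregating multiple first-type objects. The data read/write performance supported by the first storage pool (120) is higher than that supported by the second storage pool. The object storage gateway is configured to periodically aggregate the un-aggregated first-type objects in the first storage pool (120) into second-type objects in the second storage pool (130), and, in the first storage pool (120), to replace each first-type object aggregated in this round with the aggregation mapping relationship between that object and the second-type object it was aggregated into.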

Description

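Object storage platform, and object aggregation method, apparatus, and server
This application claims priority to Chinese patent application No. 202010450286.9, filed with the China Patent Office on May 25, 2020, the entire contents of which are incorporated herein by reference.
Technical Field
This application relates to the field of data processing technology, for example to an object storage platform and an object aggregation method, apparatus, and server.
Background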
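As multimedia resources (such as pictures, audio, and video) keep growing, the open-source Ceph file system faces storage and operation requirements for file objects of different sizes. Ceph allocates a minimum space unit for data storage; even a small object whose data volume is below that unit occupies the whole unit, which wastes a great deal of storage space. In addition, during capacity expansion or hardware failure, Ceph usually moves the stored file objects in units of its minimum supported operation granularity. For small objects whose data volume is below that granularity, Ceph risks losing data while performing the conversion operation, and moving a large number of small objects greatly increases Ceph's data read/write load.
For small objects whose data volume is below the minimum operation granularity, when Ceph merges multiple small objects into one large object for storage, it uses a third-party storage system to additionally store the mapping between each small object's user-defined object name and its merge position within the large object. Relying on a third-party storage system in this way increases the complexity and maintenance difficulty of object storage and degrades the performance of operations on the merged objects.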
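Summary
This application provides an object storage platform and an object aggregation method, apparatus, and server that adjust the storage structure of first-type objects and reduce the complexity of object storage.
An object storage platform is provided. The platform includes an object storage gateway, a first storage pool, and a second storage pool, where:
the first storage pool is a storage space built with solid-state disk technology on the internal storage space of the object storage platform, and is configured to store first-type objects that have not been aggregated into second-type objects, as well as the aggregation mapping relationships between first-type objects that have been aggregated and the second-type objects they were aggregated into;
the second storage pool is a storage space built with a designated disk technology on the internal storage space of the object storage platform, and is configured to store second-type objects formed by aggregating multiple first-type objects, where the data read/write performance supported by the first storage pool is higher than that supported by the second storage pool;
the object storage gateway is configured to periodically aggregate the un-aggregated first-type objects in the first storage pool into second-type objects in the second storage pool, and, in the first storage pool, to replace each first-type object aggregated in this round with the aggregation mapping relationship between that object and the second-type object it was aggregated into.
An object aggregation method is also provided, applied to the object storage platform described above, including:
periodically finding, in the first storage pool, the first-type objects that have not been aggregated into second-type objects;
aggregating those first-type objects into second-type objects in the second storage pool, and, in the first storage pool, replacing each first-type object aggregated in this round with the aggregation mapping relationship between that object and the second-type object it was aggregated into;
where the first storage pool is a storage space built with solid-state disk technology on the internal storage space of the object storage platform and is configured to store first-type objects that have not been aggregated into second-type objects, as well as the aggregation mapping relationships between aggregated first-type objects and the second-type objects they were aggregated into; the second storage pool is a storage space built with a designated disk technology on the internal storage space of the object storage platform and is configured to store second-type objects formed by aggregating multiple first-type objects; and the data read/write performance supported by the first storage pool is higher than that supported by the second storage pool.
An object aggregation apparatus is also provided, arranged in the object storage platform described above, including:
an object search module, configured to periodically find, in the first storage pool, the first-type objects that have not been aggregated into second-type objects;
an object aggregation module, configured to aggregate those first-type objects into second-type objects in the second storage pool, and, in the first storage pool, to replace each first-type object aggregated in this round with the aggregation mapping relationship between that object and the second-type object it was aggregated into;
where the first storage pool is a storage space built with solid-state disk technology on the internal storage space of the object storage platform and is configured to store first-type objects that have not been aggregated into second-type objects, as well as the aggregation mapping relationships between aggregated first-type objects and the second-type objects they were aggregated into; the second storage pool is a storage space built with a designated disk technology on the internal storage space of the object storage platform and is configured to store second-type objects formed by aggregating multiple first-type objects; and the data read/write performance supported by the first storage pool is higher than that supported by the second storage pool.
A server is also provided, including:
one or more processors;
a storage device, configured to store one or more programs;
where the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the object aggregation method described above.
A computer-readable storage medium is also provided, on which a computer program is stored; when executed by a processor, the program implements the object aggregation method described above.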
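Brief Description of the Drawings
FIG. 1 is a schematic architecture diagram of an object storage platform provided in Embodiment 1 of this application;
FIG. 2 is a schematic structural diagram of an object storage platform provided in Embodiment 2 of this application;
FIG. 3 is a flowchart of an object aggregation method provided in Embodiment 3 of this application;
FIG. 4 is a flowchart of an object aggregation method provided in Embodiment 4 of this application;
FIG. 5 is a schematic structural diagram of an object aggregation apparatus provided in Embodiment 5 of this application;
FIG. 6 is a schematic structural diagram of a server provided in Embodiment 6 of this application.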
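Detailed Description
This application is described below with reference to the drawings and embodiments.
Embodiment 1
FIG. 1 is a schematic architecture diagram of an object storage platform provided in Embodiment 1 of this application. This embodiment applies to storing any object under the open-source Ceph file system. Referring to FIG. 1, the object storage platform 10 may include an object storage gateway 110, a first storage pool 120, and a second storage pool 130.
The first storage pool 120 is a storage space built with solid-state disk technology on the internal storage space of the object storage platform 10. It stores first-type objects that have not been aggregated into second-type objects, as well as the aggregation mapping relationships between first-type objects that have been aggregated and the second-type objects they were aggregated into. The second storage pool 130 is a storage space built with conventional disk technology on the internal storage space of the object storage platform 10. It stores the second-type objects formed by aggregating first-type objects.
The object storage gateway 110 periodically aggregates the un-aggregated first-type objects in the first storage pool 120 into second-type objects in the second storage pool 130, and, in the first storage pool 120, replaces each first-type object aggregated in this round with the aggregation mapping relationship between that object and the second-type object it was aggregated into.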
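Optionally, the minimum operation granularity supported by the Ceph file system is an object of a specific size. For objects whose data volume is above this granularity, Ceph can perform operations such as migration or conversion without error. For objects whose data volume is below it, Ceph risks losing data during migration or conversion and cannot support complete operations on the object. To solve this problem, this embodiment aggregates multiple objects whose data volume is below the minimum operation granularity into one object whose data volume is above it, stores that object, and later migrates or converts it as a whole. Accordingly, a first-type object in this embodiment is an object for which the operation granularity cannot support complete object operations, that is, an object in Ceph whose data volume is below the minimum supported operation granularity. A second-type object is a container for which the operation granularity does support complete object operations and which is used to aggregate first-type objects, that is, a preset object container in Ceph whose data volume is above the minimum supported operation granularity and which can aggregate a large number of first-type objects. Taking media resources as an example, the first-type objects may be the many kinds of pictures stored in an Internet system, and a second-type object may be a video obtained by aggregating a large number of pictures in that system.
When aggregating first-type objects into second-type objects, this embodiment needs to define a storage structure and an operation flow for the aggregation so that multiple first-type objects are accurately aggregated together into one second-type object. Afterwards, only the storage of second-type objects needs attention, and operations such as migration or conversion of the aggregated first-type objects are carried out uniformly through the second-type object. This avoids the risk of data loss when operating on first-type objects and improves the efficiency of migrating or converting first-type objects in Ceph.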
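Object storage would otherwise require an additional third-party storage system to hold the mapping between the object names of multiple small objects and their merge positions within the large object, which increases the complexity and maintenance difficulty of object storage. This embodiment instead uses disk technologies with different read/write performance directly within Ceph to divide its internal storage space into two storage spaces: the first storage pool 120 is built on the internal storage space with solid-state disk technology, and the second storage pool 130 is built with conventional disk technology. The data read/write performance supported by the first storage pool 120 is much higher than that supported by the second storage pool 130. To preserve the read/write performance of some first-type objects while still aggregating, every first-type object newly written into Ceph is first treated as a first-type object that has not been aggregated into a second-type object and is stored directly in the first storage pool 120. The object storage gateway 110 then periodically aggregates the un-aggregated first-type objects in the first storage pool 120 into second-type objects in the second storage pool 130. No third-party storage system is needed, and the read/write performance of un-aggregated first-type objects is preserved.
This embodiment mainly targets object storage under Ceph, so the object storage gateway 110 may be the RADOS Gateway (RGW) configured under Ceph, where RADOS stands for Reliable, Autonomic Distributed Object Store.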
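In this embodiment, a second-type object in the second storage pool 130 may consist of the mapping tag, attribute tag, content data, and check tag of each first-type object it aggregates. The mapping tag indicates the aggregation mapping relationship between an aggregated first-type object and the second-type object, for example the aggregation offset of the first-type object within the second-type object and the data length of the first-type object. The attribute tag indicates metadata such as the object identifiers and index categories of the aggregated first-type objects. The object storage platform 10 of this embodiment may further include an index record pool 140 configured to record the index categories of first-type objects: whenever a first-type object is written into the platform 10, its content category is analyzed first, and its object identifier is then entered under the corresponding index category label in the index record pool 140 to indicate that the object belongs to that category. The content data is the actual content of each aggregated first-type object. The check tag is used to determine whether each aggregated first-type object was aggregated incorrectly. When a first-type object is aggregated into a second-type object, it is appended, in the format mapping tag, attribute tag, content data, check tag, directly after the first-type object most recently aggregated into that second-type object. This preserves the order of the aggregated first-type objects and avoids aggregation errors caused by writing first-type objects repeatedly at the same position in the second-type object.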
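The append-only record layout described above can be illustrated with a minimal Python sketch. The patent names the four parts of a record but does not specify their encoding, so the fixed-size header, the use of CRC32 as the check tag, and the function names `append_record` and `read_payload` are all assumptions made for illustration.

```python
import struct
import zlib

# Hypothetical header: payload offset (Q), payload length (I),
# object-id length (H), category length (H). The encoding is an assumption.
HEADER = struct.Struct("<QIHH")


def append_record(container: bytearray, object_id: str, category: str, payload: bytes):
    """Append one small object to the end of the container.

    Returns (offset, length) of the payload, i.e. the aggregation mapping.
    """
    oid, cat = object_id.encode(), category.encode()
    payload_offset = len(container) + HEADER.size + len(oid) + len(cat)
    container += HEADER.pack(payload_offset, len(payload), len(oid), len(cat))  # mapping tag
    container += oid + cat                                # attribute tag
    container += payload                                  # content data
    container += struct.pack("<I", zlib.crc32(payload))   # check tag (CRC32 is an assumption)
    return payload_offset, len(payload)


def read_payload(container, offset: int, length: int) -> bytes:
    """Read a payload back and verify it against the check tag that follows it."""
    payload = bytes(container[offset:offset + length])
    (stored,) = struct.unpack_from("<I", container, offset + length)
    if zlib.crc32(payload) != stored:
        raise ValueError("check tag mismatch")
    return payload
```

Because every record is appended after the previous one, two objects never share a position in the container, which is the property the paragraph relies on.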
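During aggregation, the second storage pool 130 is mainly responsible for storing the content data of second-type objects. Since a second-type object is formed by aggregating multiple first-type objects, its content data is the content data of those first-type objects. To ensure that first-type objects are aggregated successfully and to prevent too many first-type objects from being aggregated into one second-type object, this embodiment analyzes the minimum operation granularity supported by Ceph and stores an aggregation upper limit for second-type objects on the second storage pool 130 in advance. During aggregation, once the space occupied by the first-type objects aggregated in a second-type object reaches this upper limit, no more first-type objects are aggregated into that second-type object, and aggregation switches to a new second-type object.
For example, while aggregating the un-aggregated first-type objects in the first storage pool 120 into second-type objects in the second storage pool 130, the object storage gateway 110 first checks, before aggregating each first-type object, whether the aggregated capacity of the first-type objects already aggregated in the current second-type object is greater than or equal to the aggregation upper limit. If it is less than the upper limit, the gateway appends the first-type object to the current second-type object in the format mapping tag, attribute tag, content data, check tag. If it is greater than or equal to the upper limit, the next second-type object in the second storage pool 130 becomes the new current second-type object, and the remaining first-type objects of this round are aggregated into it.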
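The upper-limit check can be sketched as follows. This is a simplified illustration, not the patented implementation: containers are plain `bytearray` buffers holding only content data, and the name `aggregate` and the tuple form of the mapping are hypothetical.

```python
def aggregate(objects, upper_limit):
    """Aggregate (name, payload) pairs into containers.

    Returns (containers, mapping), where mapping[name] is
    (container_index, offset, length).
    """
    containers = [bytearray()]
    mapping = {}
    for name, payload in objects:
        # Check BEFORE appending: once the current container has reached the
        # upper limit, switch to the next container.
        if len(containers[-1]) >= upper_limit:
            containers.append(bytearray())
        index = len(containers) - 1
        mapping[name] = (index, len(containers[index]), len(payload))
        containers[index] += payload
    return containers, mapping
```

Since the check runs before each append, as in the text, a container can exceed the limit by at most one object; a stricter variant would compare the size after adding the payload.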
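In response to users' read/write needs, the object storage gateway 110 receives operation requests for the first-type objects stored on the platform and, according to each request, reads, writes, modifies, or deletes the object information that the request points to in the first storage pool 120 and the second storage pool 130. If the gateway must delete a first-type object that has already been aggregated into a second-type object in the second storage pool 130, the object's content data would have to be deleted inside that second-type object, leaving a storage gap. Because different first-type objects have different content data, a new first-type object cannot simply be added into the gap, so the storage space of the second-type object cannot be fully used. To solve this problem, this embodiment configures a recycling process 111 on the object storage gateway 110 and stores the reclaimable capacity of each second-type object on the second storage pool 130. When a first-type object aggregated in a second-type object is deleted, only the aggregation mapping relationship between that first-type object and its second-type object, stored in the first storage pool 120, needs to be deleted at that time, and the reclaimable capacity of the second-type object is updated on the second storage pool 130 according to the amount of content data of the deleted first-type object. The recycling process 111 then monitors the reclaimable capacity of each second-type object in the second storage pool 130 in real time and reclaims the deleted first-type objects in good time, so that their storage space is recovered promptly.
The recycling process 111 monitors the reclaimable capacity of each second-type object in the second storage pool 130 in real time and finds each target second-type object whose reclaimable capacity exceeds the preset recycling upper limit, meaning that most of the first-type objects it aggregates have been deleted. Because the aggregation mapping relationships of deleted first-type objects have already been removed from the first storage pool 120, the valid (not deleted) first-type objects among those aggregated by the target second-type object can be identified from the mapping relationships related to the target that remain in the first storage pool 120. These valid first-type objects are rewritten into the first storage pool 120, their aggregation mapping relationships with the target second-type object are deleted from the first storage pool 120, and the target second-type object is deleted from the second storage pool 130. In other words, the valid first-type objects are migrated back to the first storage pool 120 and the target second-type object is deleted entirely from the second storage pool 130, so that the storage space of deleted first-type objects is recovered promptly and is not wasted inside the second-type objects that held them.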
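A rough sketch of one recycling pass, under an assumed in-memory data model: `pool1` maps an object name to `{"data", "mapping"}`, where `mapping` is `(container_id, offset, length)` or `None`, and `pool2` maps a container id to `{"data", "members", "reclaimable"}`. The `members` list stands in for the mapping tags stored inside the container. None of these names come from the patent.

```python
def reclaim(pool1, pool2, recycle_limit):
    """Recycle every container whose reclaimable capacity exceeds the limit.

    Returns the ids of the containers that were recycled.
    """
    targets = [cid for cid, c in pool2.items() if c["reclaimable"] > recycle_limit]
    for cid in targets:
        container = pool2[cid]
        for name in container["members"]:
            entry = pool1.get(name)
            # No mapping left for this container means the object was deleted.
            if entry is None or entry["mapping"] is None or entry["mapping"][0] != cid:
                continue
            _, offset, length = entry["mapping"]
            if entry["data"] is None:
                # Rewrite the still-valid object into the fast pool.
                entry["data"] = container["data"][offset:offset + length]
            entry["mapping"] = None          # drop the aggregation mapping
        del pool2[cid]                       # delete the whole container
    return targets
```

Objects moved back this way would be picked up again by a later aggregation round, which is how the space held by deleted objects ends up compacted away.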
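In the technical solution of this embodiment, the first storage pool is built directly on the internal storage space of the object storage platform with solid-state disk technology and the second storage pool with conventional disk technology, so the data read performance of the first storage pool is much higher than that of the second. The first storage pool stores the first-type objects that have not been aggregated into second-type objects, as well as the aggregation mapping relationships between aggregated first-type objects and the second-type objects they were aggregated into; the second storage pool stores the second-type objects formed by aggregating first-type objects. No third-party storage system is needed to store the aggregation mapping relationships, the storage structure of first-type objects is adjusted, and the read performance of un-aggregated first-type objects is improved. In addition, the object storage gateway periodically aggregates the un-aggregated first-type objects in the first storage pool into second-type objects in the second storage pool. This achieves dynamic aggregation of first-type objects into second-type objects, prevents un-aggregated first-type objects from accumulating excessively in the first storage pool, and improves the storage performance of the first storage pool.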
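Embodiment 2
FIG. 2 is a schematic structural diagram of an object storage platform provided in Embodiment 2 of this application. This embodiment builds on the technical solution of the preceding embodiment. Referring to FIG. 2, the object storage platform 20 may include an object storage gateway 210, a first storage pool 220, a second storage pool 230, an index record pool 240, and a log record pool 250.
The first storage pool 220 is a storage space built with solid-state disk technology on the internal storage space of the object storage platform 20. It stores first-type objects that have not been aggregated into second-type objects, as well as the aggregation mapping relationships between aggregated first-type objects and the second-type objects they were aggregated into. The second storage pool 230 is a storage space built with conventional disk technology on the internal storage space of the object storage platform 20. It stores the second-type objects formed by aggregating first-type objects, and also stores the aggregation upper limit of second-type objects. A recycling process 211 is configured on the object storage gateway 210, and the second storage pool 230 also stores the reclaimable capacity of each second-type object. The index record pool 240 is configured to record the index categories of first-type objects. The log record pool 250 is configured to record the write logs and modification logs of first-type objects and to mark the corresponding aggregation checkpoint log.
The object storage gateway 210 periodically aggregates the un-aggregated first-type objects in the first storage pool 220 into second-type objects in the second storage pool 230, and, in the first storage pool 220, replaces each first-type object aggregated in this round with the aggregation mapping relationship between that object and the second-type object it was aggregated into. During this aggregation, if the aggregated capacity of the first-type objects in the current second-type object in the second storage pool 230 is greater than or equal to the aggregation upper limit, the next second-type object in the second storage pool becomes the new current second-type object, and the remaining first-type objects of this round are aggregated into it. In addition, the recycling process 211 configured on the object storage gateway 210 finds, in real time, each target second-type object in the second storage pool 230 whose reclaimable capacity exceeds the preset recycling upper limit, rewrites the valid (not deleted) first-type objects aggregated by the target second-type object into the first storage pool 220, deletes the aggregation mapping relationships between those valid first-type objects and the target second-type object from the first storage pool 220, and deletes the target second-type object from the second storage pool 230.
Optionally, to drive the process of aggregating the un-aggregated first-type objects in the first storage pool 220 into second-type objects and to avoid missing any of them, this embodiment additionally configures a search process 212 and at least one aggregation process 213 on the object storage gateway 210. The search process 212 and the aggregation processes 213 cooperate to find un-aggregated first-type objects in the first storage pool 220 and aggregate them into second-type objects in the second storage pool 230.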
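The search process 212 configured on the object storage gateway 210 periodically searches the first storage pool 220 for un-aggregated first-type objects and records each one found in a corresponding aggregation shard of the object storage gateway 210. Multiple aggregation shards are set up on the gateway in advance, in one-to-one correspondence with the aggregation processes 213 configured on them. After the search process 212 has recorded the un-aggregated first-type objects found in the first storage pool 220 into the shards, the aggregation processes 213 concurrently read the first-type objects recorded in their respective shards and aggregate them into second-type objects in the second storage pool 230, while replacing each aggregated first-type object in the first storage pool 220 with the aggregation mapping relationship between that object and the second-type object it was aggregated into. As a result, the first storage pool 220 no longer stores the content data of aggregated first-type objects, only their aggregation mapping relationships. This avoids overfilling the first storage pool 220 while keeping the aggregation of un-aggregated first-type objects into second-type objects in the second storage pool 230 accurate.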
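The division of labour between the search process and the aggregation processes can be sketched with a thread pool. The patent does not say how objects are assigned to shards, so the round-robin assignment, the use of threads, and the names `split_into_shards` and `run_workers` are assumptions for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_shards(names, shard_count):
    """Search side: record each un-aggregated object in one shard (round-robin)."""
    shards = [[] for _ in range(shard_count)]
    for i, name in enumerate(names):
        shards[i % shard_count].append(name)
    return shards


def run_workers(shards, aggregate_one):
    """Aggregation side: one worker per shard, all shards processed concurrently."""
    def drain(shard):
        return [aggregate_one(name) for name in shard]

    with ThreadPoolExecutor(max_workers=len(shards)) as pool:
        # map() returns results in shard order, regardless of completion order.
        return list(pool.map(drain, shards))
```

Here `aggregate_one` is a caller-supplied callable that stands in for aggregating a single object into the second storage pool.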
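To ensure that all un-aggregated first-type objects in the first storage pool 220 are found, this embodiment sets up a log record pool 250 in advance. A newly written or modified first-type object is treated as un-aggregated: it is first stored in the first storage pool 220 and later aggregated periodically into a second-type object in the second storage pool 230, which preserves the read/write performance of some first-type objects. Therefore, whenever the object storage gateway 210 writes a new first-type object into the first storage pool 220 or modifies an existing one at the user's request, it records a write log or modification log for that object in the log record pool 250. By periodically replaying the logs recorded in the log record pool 250, the gateway finds the replayed log of each un-aggregated first-type object, treats the object that each such log refers to as un-aggregated, and aggregates it into a second-type object in the second storage pool 230.
To allow the replayed logs to determine accurately whether the corresponding first-type objects have already been aggregated into second-type objects in the second storage pool 230, each time the logs in the log record pool 250 are periodically replayed, the last log replayed in that round is marked as the aggregation checkpoint log. Because the object storage gateway 210 replays the logs in the order in which they were recorded, every log before the marked checkpoint has already been replayed, and the first-type objects those logs refer to were aggregated into second-type objects in the second storage pool 230 during replay. The first-type objects referred to by the write and modification logs after the checkpoint have not yet been aggregated. Therefore, when looking for un-aggregated first-type objects, the gateway only needs to replay the write and modification logs located after the aggregation checkpoint log. The object each replayed log refers to is an un-aggregated first-type object and is aggregated into a second-type object in the second storage pool 230. The checkpoint is then re-marked in the log record pool 250 according to the state of this round of aggregation, and the next replay continues from it. For example, if all logs of this periodic replay have been replayed, the last one becomes the new aggregation checkpoint log and is marked in the log record pool 250, and the next replay continues from there. This improves the efficiency and accuracy of finding un-aggregated first-type objects.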
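The checkpoint-based replay can be sketched as follows, assuming the log is an ordered list of `(operation, object_name)` tuples and the checkpoint is the index of the last replayed entry (`-1` if nothing has been replayed). Both representations and the name `replay` are assumptions; the patent only states that the checkpoint log is marked in the log record pool.

```python
def replay(log, checkpoint):
    """Replay only the entries after the checkpoint.

    Returns (pending, new_checkpoint): the names of objects that still need
    aggregating, in log order, and the index of the last entry replayed.
    """
    pending = []
    for operation, name in log[checkpoint + 1:]:
        if operation in ("write", "modify") and name not in pending:
            pending.append(name)
    return pending, len(log) - 1
```

For instance, with four log entries and a checkpoint at index 0, only entries 1 to 3 are replayed and the new checkpoint becomes 3; replaying again from 3 yields nothing to aggregate.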
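As an example, the following describes how the un-aggregated first-type objects in the first storage pool 220 are found when the search process 212 and at least one aggregation process 213 configured on the object storage gateway 210 cooperate by periodically replaying the write and modification logs located after the aggregation checkpoint log in the log record pool 250:
The search process 212 configured on the object storage gateway 210 periodically replays the write and modification logs located after the aggregation checkpoint log in the log record pool 250 and writes the replayed logs into the aggregation shards. Each aggregation process 213 concurrently reads the replayed logs from its own shard and aggregates the first-type object that each replayed log refers to, as an un-aggregated first-type object, into a second-type object in the second storage pool 230. When aggregating each first-type object, the aggregation process 213 first checks whether the aggregated capacity of the first-type objects already aggregated in the current second-type object in the second storage pool 230 is greater than or equal to the aggregation upper limit of second-type objects. If it is less than the upper limit, the first-type object is appended to the current second-type object in the format mapping tag, attribute tag, content data, check tag. If it is greater than or equal to the upper limit, the next second-type object in the second storage pool 230 becomes the new current second-type object, and the remaining first-type objects of this round are aggregated into it.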
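In this embodiment, first-type objects in the object storage platform 20 are subject to write, read, delete, and modify operations according to user needs. Each operation is explained below:
1) Write operations on first-type objects. To preserve the read/write performance of some first-type objects and to avoid overfilling the first storage pool 220, on the initial write of each first-type object this embodiment treats the newly written object as un-aggregated and stores its content data directly in the first storage pool 220. The object storage gateway 210 later periodically aggregates the un-aggregated first-type objects together into second-type objects in the second storage pool 230. This improves the read/write performance of first-type objects in the initial write stage while avoiding overfilling the first storage pool 220.
If the object storage gateway 210 receives a write request for a first-type object, it first treats the object as un-aggregated, stores it directly in the first storage pool 220, and records a write log for it in the log record pool 250. When the write and modification logs after the aggregation checkpoint log are later replayed periodically, this write log is replayed and the object is aggregated into a second-type object in the second storage pool 230.
2) Read operations on first-type objects. Because a first-type object may be stored either in the first storage pool 220 or inside a second-type object in the second storage pool 230, to ensure that the correct object is read, this embodiment first reads the object's content data from the first storage pool 220. If the content data read from the first storage pool 220 is empty, the aggregation mapping relationship between the object and the second-type object it was aggregated into is looked up in the first storage pool 220, the second-type object is located in the second storage pool 230 according to that mapping, and the object's content data is then read from that second-type object.
If the object storage gateway 210 receives a read request for a first-type object, it first reads the object's content data from the first storage pool 220, since that pool stores both un-aggregated first-type objects and the aggregation mapping relationships of aggregated ones. If content data can be read, the object has not yet been aggregated into a second-type object in the second storage pool 230, and the content data is returned to the user directly. If the content data read from the first storage pool 220 is empty, the object has already been aggregated into a second-type object in the second storage pool 230. The gateway then looks up, in the first storage pool 220, the aggregation mapping relationship between the object and its second-type object, such as the object's offset within the second-type object and its data length, locates the second-type object in the second storage pool 230 according to that mapping, and reads content data of the recorded length at the recorded offset as the content data of the object. In addition, to ensure that the content data read is correct, this embodiment also reads the object's check tag and uses it to determine whether the content data read is in error, which improves the read accuracy of first-type objects.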
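The two-step read path can be sketched as follows. The in-memory model is an assumption: `pool1` maps an object name to `{"data", "mapping"}`, where `data` is `None` once the object has been aggregated and `mapping` is `(container_id, offset, length)`; `pool2` maps a container id to its raw bytes. Check-tag verification is omitted for brevity.

```python
def read_object(name, pool1, pool2):
    """Read from the fast pool first; fall back to the container via the mapping."""
    entry = pool1[name]
    if entry["data"] is not None:
        # Content found in the first pool: the mapping (if any) is ignored.
        return entry["data"]
    container_id, offset, length = entry["mapping"]
    return pool2[container_id][offset:offset + length]
```

An un-aggregated object is served in one lookup, while an aggregated one costs a lookup in each pool.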
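3) Delete operation on a first-type object. Because a first-type object may be stored either in the first storage pool 220 or inside a second-type object in the second storage pool 230, the gateway must first determine whether the object to be deleted is an unaggregated or an aggregated first-type object, so that the appropriate delete operation is performed and the deletion is accurate.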
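When the object storage gateway 210 receives a delete request for a first-type object and the object is an unaggregated first-type object, the object to be deleted is stored in the first storage pool 220, and the log record pool 250 holds its write log or modification log so that a later periodic replay can aggregate it into a second-type object. This embodiment therefore deletes the object's content data directly from the first storage pool 220. To prevent a later replay from aggregating the deleted object and causing an aggregation error, it also deletes the object's write log or modification log from the log record pool 250, so that the logs of a deleted object are never replayed and aggregation remains accurate. If, however, the object to be deleted is an aggregated first-type object, it is stored inside a second-type object in the second storage pool 230, and its write log or modification log in the log record pool 250 has already been replayed and will not be replayed again. In this case only the aggregation mapping between the object and its second-type object needs to be deleted from the first storage pool 220. The object's content data does not need to be deleted from the second-type object in the second storage pool 230. Instead, the reclamation process 211 configured on the object storage gateway 210 performs reclamation collectively by checking the reclaimable capacity of each second-type object in the second storage pool 230. To keep reclamation accurate, the gateway only needs to update the reclaimable capacity of the affected second-type object in the second storage pool 230, that is, add the data length of the deleted first-type object to the existing reclaimable capacity. Later, when the reclaimable capacity of a second-type object exceeds the preset reclamation upper limit, that target second-type object is reclaimed: the valid first-type objects aggregated in it are rewritten to the first storage pool 220, and the aggregation mappings between those valid first-type objects and the target second-type object in the first storage pool 220, as well as the target second-type object in the second storage pool 230, are deleted.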
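The two delete branches can be sketched in Python as below. The dictionaries and the list of `(kind, name)` log entries are hypothetical stand-ins for the pools; the sketch also assumes, for simplicity, that an object is either unaggregated (present in `first_data`) or aggregated (present in `first_map`), not both.

```python
def delete_object(name, first_data, first_map, log_pool, reclaimable):
    """Delete a first-type object; returns which branch was taken."""
    if name in first_data:
        # Unaggregated: remove the content and its pending write/modify logs
        # so that a later replay does not try to aggregate a deleted object.
        del first_data[name]
        log_pool[:] = [(kind, n) for kind, n in log_pool if n != name]
        return "unaggregated"
    # Aggregated: drop only the mapping and account for the freed bytes.
    aggregate_id, _offset, length = first_map.pop(name)
    reclaimable[aggregate_id] = reclaimable.get(aggregate_id, 0) + length
    return "aggregated"


first_data = {"a": b"1"}
first_map = {"b": (0, 0, 4)}
log_pool = [("write", "a"), ("write", "c")]
reclaimable = {0: 10}

branch_a = delete_object("a", first_data, first_map, log_pool, reclaimable)
branch_b = delete_object("b", first_data, first_map, log_pool, reclaimable)
```

Note that the aggregated branch never touches the second-type object itself; the bytes stay in place until the reclamation process decides the aggregate is worth rewriting.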
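4) Modify operation on a first-type object. In this embodiment, a modify operation mainly changes the content data of a first-type object that has already been written. Because a first-type object may be stored either in the first storage pool 220 or inside a second-type object in the second storage pool 230, the gateway must first determine whether the object to be modified is an unaggregated or an aggregated first-type object, so that the appropriate modify operation is performed and the modification is accurate. If the object has already been aggregated into a second-type object in the second storage pool 230, its content data cannot be modified in place inside the second-type object, because the content length differs before and after the modification. The modify operation can therefore be treated as a combination of a delete operation and a write operation: the object storage gateway 210 performs the delete operation on the original, already written first-type object and, once the deletion succeeds, performs the write operation for the new first-type object, which ensures that the modification is carried out correctly.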
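When the object storage gateway 210 receives a modify request for a first-type object and the object is an unaggregated first-type object, the object to be modified is stored in the first storage pool 220 and the log record pool 250 holds its write log. The object's content data in the first storage pool 220 can therefore be updated directly. To prevent a later periodic replay from aggregating the originally written content data and causing an aggregation error, the gateway also records a modification log for the object in the log record pool 250 and deletes the object's write log. Later replays then replay only the modification log and not the pre-modification write log, which ensures that the correct content data is aggregated. If, however, the object to be modified is an aggregated first-type object, it is stored inside a second-type object in the second storage pool 230, and its write log in the log record pool 250 has already been replayed and will not be replayed again. In this case the modified first-type object only needs to be written directly into the first storage pool 220. A read of a first-type object first reads the content data from the first storage pool 220 and, if content data is found, does not consult the aggregation mapping between the first-type object and the second-type object. Consequently, reads of the modified object remain correct even though the aggregation mapping between its pre-modification content data and the second-type object is still stored in the first storage pool 220, so that mapping need not be deleted when the modified object is written. A modification log for the object must, however, be recorded in the log record pool 250 so that a later periodic replay can re-aggregate the modified first-type object into a second-type object in the second storage pool 230. Only when the modified object is aggregated into a second-type object does the pre-modification aggregation mapping need to be deleted from the first storage pool 220, after which the aggregation mapping between the modified first-type object and its new second-type object is stored in the first storage pool 220. This ensures that the first-type object is read correctly after modification.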
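The modify path, including the deferred replacement of the stale mapping, can be sketched as follows. The function names `modify_object` and `finish_reaggregation`, the dictionaries, and the log list are assumptions for illustration only.

```python
def modify_object(name, new_data, first_data, first_map, log_pool):
    """Apply a modification; returns True if the object was already aggregated."""
    aggregated = name in first_map
    # In both cases the new content goes to the first pool, where reads look first.
    first_data[name] = new_data
    if not aggregated:
        # Drop the original write log so only the modification is replayed.
        log_pool[:] = [entry for entry in log_pool if entry != ("write", name)]
    # The stale mapping of an aggregated object is deliberately left in place.
    log_pool.append(("modify", name))
    return aggregated


def finish_reaggregation(name, new_mapping, first_data, first_map):
    """Called once the modified object has been aggregated again."""
    first_map[name] = new_mapping  # replaces the pre-modification mapping
    del first_data[name]           # content now lives in the second pool


first_data = {"a": b"v1"}
first_map = {"b": (0, 0, 2)}
log_pool = [("write", "a")]

was_aggregated_a = modify_object("a", b"v2", first_data, first_map, log_pool)
was_aggregated_b = modify_object("b", b"new", first_data, first_map, log_pool)
stale_mapping_kept = first_map["b"] == (0, 0, 2)
finish_reaggregation("b", (1, 0, 3), first_data, first_map)
```

Leaving the stale mapping untouched until re-aggregation is safe only because the read path prefers content found in the first pool.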
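In the technical solution of this embodiment, marking an aggregation checkpoint log in the log record pool accurately distinguishes unaggregated first-type objects from aggregated first-type objects. The write logs and modification logs located after the aggregation checkpoint log are replayed periodically, and the first-type object targeted by each replayed log is treated directly as an unaggregated first-type object and aggregated into a second-type object in the second storage pool. This ensures accurate aggregation of first-type objects into second-type objects, achieves dynamic aggregation from first-type objects to second-type objects, prevents over-storage of unaggregated first-type objects in the first storage pool, and improves the storage performance of the first storage pool.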
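Embodiment 3
FIG. 3 is a flowchart of an object aggregation method provided in Embodiment 3 of the present application. This embodiment is applicable to storing any object under the open-source Ceph file system and is applied in the object storage platform provided in the foregoing embodiments. The object aggregation method provided in this embodiment may be performed by the object aggregation apparatus provided in the embodiments of the present application; the apparatus may be implemented in software and/or hardware and is integrated in the server that performs the method.
Referring to FIG. 3, the method may include the following steps:
S310: Periodically find, in the first storage pool, the first-type objects that have not been aggregated into a second-type object.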
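The first storage pool is a storage space built on the internal storage space of the object storage platform using solid-state disk technology. It stores the unaggregated first-type objects as well as the aggregation mappings between aggregated first-type objects and the second-type objects they were aggregated into. The second storage pool is a storage space built on the internal storage space of the object storage platform using conventional disk technology. It stores the second-type objects formed by aggregating first-type objects.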
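Because each first-type object newly written into the Ceph file system is first stored directly in the first storage pool as an unaggregated first-type object, this embodiment has the object storage gateway analyze how the first-type objects are stored in the first storage pool and periodically find the first-type objects that were newly written to the object storage platform during the period and have not yet been aggregated, so that they can subsequently be aggregated accurately into second-type objects in the second storage pool.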
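As an example, consider how to ensure that all unaggregated first-type objects in the first storage pool are found. A newly written or modified first-type object is treated as an unaggregated first-type object: it is first stored in the first storage pool and later periodically aggregated into a second-type object in the second storage pool, which preserves the read/write performance of some first-type objects. This embodiment therefore sets up a log record pool in advance. When a new first-type object is written into the first storage pool, or an existing one is modified, according to user needs, a write log or modification log for that object is recorded in the log record pool. The object storage gateway periodically replays the logs recorded in the log record pool, finds the replayed logs of unaggregated first-type objects, treats the first-type objects targeted by those logs as unaggregated first-type objects, and aggregates them in turn into second-type objects in the second storage pool.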
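In this embodiment, periodically finding the unaggregated first-type objects in the first storage pool may include: periodically replaying the write logs and modification logs located after the aggregation checkpoint log in the log record pool; and treating the first-type object targeted by each replayed log as an unaggregated first-type object.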
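To ensure that a replayed log reliably indicates whether its first-type object has already been aggregated into a second-type object in the second storage pool, this embodiment marks the last log replayed in each periodic replay of the log record pool as the aggregation checkpoint log. Because the object storage gateway replays the logs in the log record pool in the order in which they were recorded, every log before the marked aggregation checkpoint log is known to have been replayed, and the first-type objects those logs target were aggregated into second-type objects in the second storage pool during replay. In other words, the first-type objects targeted by the write logs and modification logs before the aggregation checkpoint log have all been aggregated, while those targeted by the write logs and modification logs after it have not. Therefore, when the object storage gateway periodically replays the log record pool to find unaggregated first-type objects, it only needs to replay the write logs and modification logs located after the aggregation checkpoint log. The first-type object targeted by each replayed log is an unaggregated first-type object and is aggregated into a second-type object in the second storage pool. In addition, after the unaggregated first-type objects have been aggregated into second-type objects in the second storage pool, the method may further include: re-marking the aggregation checkpoint log among the write logs and modification logs in the log record pool according to the state of the current aggregation, so that the next replay continues from that checkpoint. For example, if all logs in the current periodic replay are replayed successfully, the last log replayed becomes the new aggregation checkpoint log and is marked in the log record pool, and the next replay continues from it. This improves both the efficiency and the accuracy of finding unaggregated first-type objects.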
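S320: Aggregate the unaggregated first-type objects into second-type objects in the second storage pool, and, in the first storage pool, replace each first-type object aggregated this time with the aggregation mapping between that first-type object and the second-type object it was aggregated into.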
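After the unaggregated first-type objects have been found in the first storage pool, each of them is aggregated in turn into a second-type object in the second storage pool. The first-type object being aggregated can be appended, in the data format of mapping tag, attribute tag, content data, and check tag, directly after the last first-type object aggregated in that second-type object. This preserves the order of the first-type objects within the second-type object and avoids aggregation errors caused by writing first-type objects repeatedly at the same position of a second-type object.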
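One possible encoding of a record in the order mapping tag, attribute tag, content data, check tag is sketched below in Python. The application does not define field widths or a checksum algorithm, so the length-prefixed header (`<HHI`) and the CRC-32 check tag used here are purely assumptions for illustration.

```python
import struct
import zlib

HEADER = "<HHI"  # assumed layout: lengths of mapping tag, attribute tag, content


def pack_record(mapping_tag: bytes, attr_tag: bytes, content: bytes) -> bytes:
    """Serialize one first-type object as mapping tag, attribute tag, content, check tag."""
    header = struct.pack(HEADER, len(mapping_tag), len(attr_tag), len(content))
    check_tag = struct.pack("<I", zlib.crc32(content))
    return header + mapping_tag + attr_tag + content + check_tag


def unpack_record(buf: bytes, offset: int = 0):
    """Parse one record at `offset`; returns the fields and the next offset."""
    m_len, a_len, c_len = struct.unpack_from(HEADER, buf, offset)
    pos = offset + struct.calcsize(HEADER)
    mapping_tag = buf[pos:pos + m_len]
    pos += m_len
    attr_tag = buf[pos:pos + a_len]
    pos += a_len
    content = buf[pos:pos + c_len]
    pos += c_len
    (crc,) = struct.unpack_from("<I", buf, pos)
    if zlib.crc32(content) != crc:
        raise ValueError("check tag mismatch")
    return mapping_tag, attr_tag, content, pos + 4


# Appending records back to back keeps them in aggregation order.
aggregate = pack_record(b"obj-1", b"type=text", b"hello")
second_offset = len(aggregate)
aggregate += pack_record(b"obj-2", b"type=bin", b"\x00\x01")
first = unpack_record(aggregate)
second = unpack_record(aggregate, second_offset)
```

Because every record carries its own lengths, a reader can walk the aggregate sequentially or jump straight to a known offset taken from the aggregation mapping.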
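During aggregation, the second storage pool is mainly responsible for storing the content data of second-type objects. A second-type object is formed by aggregating multiple first-type objects, and its content data is the content data of those first-type objects. To ensure that first-type objects are aggregated successfully and to prevent over-aggregation under a single second-type object, this embodiment analyzes the minimum operation granularity supported by the Ceph file system and stores an aggregation upper limit for second-type objects in the second storage pool in advance. Aggregating the unaggregated first-type objects into second-type objects in the second storage pool may include: for each unaggregated first-type object in the first storage pool, if the aggregated capacity of the first-type objects in the current second-type object in the second storage pool is less than the aggregation upper limit, appending the first-type object directly to the current second-type object; and if that aggregated capacity is not less than the aggregation upper limit, taking the next second-type object in the second storage pool as the new current second-type object and appending the first-type object to it. As a result, once the space occupied by the first-type objects aggregated in a second-type object reaches the aggregation upper limit, no further first-type objects are aggregated into it, and aggregation switches to the next new second-type object.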
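As an example, while aggregating the unaggregated first-type objects from the first storage pool into second-type objects in the second storage pool, the object storage gateway checks, before aggregating each first-type object, whether the aggregated capacity of the first-type objects already aggregated in the current second-type object used for this aggregation is greater than or equal to the aggregation upper limit. If the aggregated capacity is less than the aggregation upper limit, the gateway appends the first-type object to the current second-type object in the data format of mapping tag, attribute tag, content data, and check tag. If the aggregated capacity is not less than the aggregation upper limit, the next second-type object in the second storage pool becomes the new current second-type object, and the remaining first-type objects of this aggregation are aggregated into it.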
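To direct the aggregation of unaggregated first-type objects from the first storage pool into second-type objects and to avoid missing any of them, this embodiment uses the lookup process configured on the object storage gateway to periodically find the unaggregated first-type objects in the first storage pool and record them one by one in the corresponding aggregation shards. Multiple aggregation shards are set up on the object storage gateway in advance, and the aggregation shards correspond one-to-one to the aggregation processes configured for them. After the lookup process has recorded the unaggregated first-type objects found in the first storage pool into the aggregation shards, the corresponding aggregation processes concurrently read the first-type objects recorded in their respective shards and aggregate them into second-type objects in the second storage pool. At the same time, each first-type object aggregated this time is replaced in the first storage pool with the aggregation mapping between it and the second-type object it was aggregated into. The first storage pool thus no longer stores the content data of aggregated first-type objects, only their aggregation mappings, which avoids over-storage in the first storage pool while ensuring that the unaggregated first-type objects are aggregated accurately into second-type objects in the second storage pool.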
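The split between one lookup step and several concurrent aggregation workers can be sketched in Python as follows. Several details are assumptions not stated in the application: objects are assigned to shards by a CRC-32 hash of their name, threads stand in for the aggregation processes, and each worker writes to its own second-type object.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def assign_shards(names, shard_count):
    """Lookup step: record each unaggregated object in one aggregation shard."""
    shards = [[] for _ in range(shard_count)]
    for name in names:
        shards[zlib.crc32(name.encode()) % shard_count].append(name)
    return shards


def aggregate_concurrently(names, first_data, shard_count=2):
    """Run one worker per shard; returns (mappings, second-pool aggregates)."""
    first_map = {}

    def worker(aggregate_id, shard):
        blob = bytearray()
        for name in shard:
            data = first_data.pop(name)  # content leaves the first pool ...
            first_map[name] = (aggregate_id, len(blob), len(data))  # ... mapping stays
            blob += data
        return aggregate_id, bytes(blob)

    shards = assign_shards(names, shard_count)
    with ThreadPoolExecutor(max_workers=shard_count) as executor:
        second_pool = dict(executor.map(worker, range(shard_count), shards))
    return first_map, second_pool


objects = {f"obj-{i}": f"data-{i}".encode() for i in range(10)}
first_data = dict(objects)
first_map, second_pool = aggregate_concurrently(list(objects), first_data, shard_count=3)
```

Since every object lands in exactly one shard, no two workers ever handle the same object, which is what allows them to run without coordinating on individual objects.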
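As an example, the following describes how the unaggregated first-type objects in the first storage pool are found, taking the case in which the lookup process and at least one aggregation process configured on the object storage gateway cooperate to determine them by periodically replaying the write logs and modification logs located after the aggregation checkpoint log in the log record pool: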
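Optionally, the lookup process configured on the object storage gateway periodically replays the write logs and modification logs located after the aggregation checkpoint log in the log record pool and writes the replayed logs into the corresponding aggregation shards. Each aggregation process concurrently reads replayed logs from its own aggregation shard and aggregates the first-type object targeted by each replayed log, as an unaggregated first-type object, into a second-type object in the second storage pool. Before aggregating each first-type object, the aggregation process first checks whether the aggregated capacity of the first-type objects already aggregated in the current second-type object used for this aggregation is greater than or equal to the aggregation upper limit of a second-type object. If the aggregated capacity is less than the aggregation upper limit, the process appends the first-type object to the current second-type object in the data format of mapping tag, attribute tag, content data, and check tag. If the aggregated capacity is greater than or equal to the aggregation upper limit, the next second-type object in the second storage pool becomes the new current second-type object, and the remaining first-type objects of this aggregation are aggregated into it.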
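In the technical solution of this embodiment, the first storage pool is built directly on the internal storage space of the object storage platform using solid-state disk technology and the second storage pool using conventional disk technology, so the data read performance of the first storage pool is far higher than that of the second storage pool. The first storage pool stores the unaggregated first-type objects and the aggregation mappings between aggregated first-type objects and their second-type objects, and the second storage pool stores the second-type objects formed by aggregating first-type objects. No third-party storage system is needed to store the aggregation mappings between first-type and second-type objects, the storage structure of first-type objects is adjusted, and the read performance of unaggregated first-type objects is improved. In addition, the object storage gateway periodically aggregates the unaggregated first-type objects in the first storage pool into second-type objects in the second storage pool, which achieves dynamic aggregation from first-type objects to second-type objects, prevents over-storage of unaggregated first-type objects in the first storage pool, and improves the storage performance of the first storage pool.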
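Embodiment 4
FIG. 4 is a flowchart of an object aggregation method provided in Embodiment 4 of the present application. This embodiment builds on the foregoing embodiments. As shown in FIG. 4, it mainly explains the operations performed on first-type objects while they are being aggregated into second-type objects, and the reclamation of second-type objects.
Referring to FIG. 4, the method may include the following steps:
S410: Periodically find, in the first storage pool, the first-type objects that have not been aggregated into a second-type object.
S420: Aggregate the unaggregated first-type objects into second-type objects in the second storage pool, and, in the first storage pool, replace each first-type object aggregated this time with the aggregation mapping between that first-type object and the second-type object it was aggregated into.
S430: Check the reclaimable capacity of each second-type object in the second storage pool in real time, rewrite to the first storage pool the valid first-type objects aggregated in each target second-type object whose reclaimable capacity exceeds the preset reclamation upper limit, and delete the aggregation mappings between those valid first-type objects and the target second-type object in the first storage pool, as well as the target second-type object in the second storage pool.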
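Optionally, according to users' read/write needs, the platform receives operation requests for the first-type objects stored on it, and the object storage gateway reads, writes, modifies, or deletes the object information that the request points to in the first storage pool and the second storage pool. If the gateway is to delete a first-type object that has already been aggregated into a second-type object in the second storage pool, deleting the object's content data from inside that second-type object would leave a storage gap there. Because the content data of different first-type objects differs, a new first-type object cannot be added at the position of that gap, so the storage space of the second-type object cannot be fully used. To solve this problem, this embodiment configures a reclamation process on the object storage gateway and stores the reclaimable capacity of each second-type object in the second storage pool. When a first-type object aggregated in a second-type object is deleted, only the aggregation mapping between that first-type object and its second-type object stored in the first storage pool needs to be deleted at that time, and the reclaimable capacity of that second-type object in the second storage pool is updated according to the amount of content data of the deleted first-type object. The reclamation process then checks the reclaimable capacity of each second-type object in the second storage pool in real time and promptly reclaims the deleted first-type objects among those aggregated in a second-type object, which ensures that the storage space of deleted first-type objects is reclaimed in time.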
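The reclamation process checks the reclaimable capacity of each second-type object in the second storage pool in real time and finds the target second-type objects whose reclaimable capacity exceeds the preset reclamation upper limit. Most of the first-type objects aggregated in such a target second-type object have already been deleted. Because the aggregation mappings of deleted first-type objects have already been removed from the first storage pool, the valid (not yet deleted) first-type objects among those aggregated in the target second-type object can be identified from the aggregation mappings that remain in the first storage pool for that target second-type object. These valid first-type objects are rewritten to the first storage pool, and the aggregation mappings between them and the target second-type object in the first storage pool, as well as the target second-type object in the second storage pool, are deleted. In other words, the valid first-type objects aggregated in the target second-type object are migrated back to the first storage pool and the target second-type object is deleted entirely from the second storage pool. This ensures that the storage space of deleted first-type objects is reclaimed in time and avoids wasting storage space in the second-type objects that held them.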
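As an example, rewriting to the first storage pool the valid first-type objects aggregated in a target second-type object whose reclaimable capacity exceeds the preset reclamation upper limit may include: looking up, in the first storage pool, the aggregation mapping between each first-type object aggregated in the target second-type object and the target second-type object; and rewriting to the first storage pool, as valid first-type objects of the target second-type object, the first-type objects whose aggregation mapping is non-empty. Each first-type object aggregated in the target second-type object is examined in turn to determine whether its aggregation mapping to the target second-type object still exists in the first storage pool. If the mapping does not exist, the first-type object has already been deleted and nothing is done for it. If the mapping exists, the first-type object has not been deleted; it is rewritten to the first storage pool as a valid first-type object of the target second-type object. The aggregation mappings between the valid first-type objects and the target second-type object in the first storage pool, and the target second-type object in the second storage pool, are then deleted.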
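The reclamation pass can be sketched in Python as below. The dictionaries modelling the two pools, the `(aggregate_id, offset, length)` mapping tuple, and the numeric limit are assumptions for illustration; a remaining mapping is taken as the sign that a first-type object is still valid, as described above.

```python
def reclaim(second_pool, reclaimable, first_map, first_data, limit):
    """Reclaim every aggregate whose reclaimable capacity exceeds `limit`."""
    reclaimed = []
    targets = [agg for agg, size in reclaimable.items() if size > limit]
    for aggregate_id in targets:
        blob = second_pool[aggregate_id]
        for name, (agg, offset, length) in list(first_map.items()):
            if agg != aggregate_id:
                continue
            # A surviving mapping means the object was not deleted: move it back.
            first_data[name] = bytes(blob[offset:offset + length])
            del first_map[name]
        del second_pool[aggregate_id]  # drop the whole target aggregate
        del reclaimable[aggregate_id]
        reclaimed.append(aggregate_id)
    return reclaimed


second_pool = {0: b"aaabbbccc", 1: b"dddeee"}
first_map = {"b": (0, 3, 3), "d": (1, 0, 3), "e": (1, 3, 3)}  # "a" and "c" were deleted
reclaimable = {0: 6, 1: 0}
first_data = {}

reclaimed = reclaim(second_pool, reclaimable, first_map, first_data, limit=5)
```

Only aggregate 0 crosses the limit here, so its single surviving object is copied back to the first pool and the aggregate is removed, while aggregate 1 is left untouched.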
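S440: If an operation request for a first-type object is received, update accordingly the object information that the operation request points to in the first storage pool and the second storage pool.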
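Optionally, according to user needs, the stored first-type objects are subject to write, read, delete, and modify operations. To keep the information stored in the first storage pool and the second storage pool dynamically updated under these operations, this embodiment has the object storage gateway detect in real time whether a user has an operation request for a first-type object. If an operation request for a first-type object is received, the gateway looks up the object information that the request points to in the first storage pool and the second storage pool. This object information may be the content data of an unaggregated first-type object in the first storage pool, the aggregation mapping between an aggregated first-type object and its second-type object in the first storage pool, or the content data of an aggregated first-type object in the second storage pool. The gateway then updates that object information as required by the operation, which keeps the object information in both pools dynamically updated and improves the accuracy of object operations.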
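As an example, according to user needs, the stored first-type objects are subject to write, read, delete, and modify operations. Each of these operations is explained below:
1) If a write request for a first-type object is received, the first-type object being written is stored directly in the first storage pool as an unaggregated first-type object, and its write log is recorded in the log record pool.
When a write request for a first-type object is received through the object storage gateway, the object is first stored directly in the first storage pool as an unaggregated first-type object and a corresponding write log is recorded in the log record pool. When the write logs and modification logs after the aggregation checkpoint log are later replayed periodically, this write log is replayed and the newly written first-type object is aggregated into a second-type object in the second storage pool.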
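2) If a read request for a first-type object is received, the content data of the requested first-type object is read from the first storage pool. If the content data read from the first storage pool is empty, the aggregation mapping between the requested first-type object and the second-type object it was aggregated into is looked up in the first storage pool, and the object's content data is read, according to that mapping, from that second-type object in the second storage pool.
When a read request for a first-type object is received through the object storage gateway, the object's content data is first read from the first storage pool, because the first storage pool stores both the unaggregated first-type objects and the aggregation mappings between aggregated first-type objects and their second-type objects. If the content data can be read, the object has not yet been aggregated into a second-type object in the second storage pool, and the content data is returned to the user directly. If the content data read from the first storage pool is empty, the object has already been aggregated into a second-type object in the second storage pool. The object's aggregation mapping is therefore looked up in the first storage pool, such as the offset of the first-type object within the second-type object and the object's data length. The second-type object is then located in the second storage pool according to that mapping, and content data of the given length is read at the given offset as the content data of the requested object.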
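3) If a delete request for a first-type object is received and the object to be deleted is an unaggregated first-type object, the object is deleted from the first storage pool and its write log is deleted from the log record pool. If a delete request for a first-type object is received and the object to be deleted is an aggregated first-type object, the aggregation mapping between the object and its second-type object is deleted from the first storage pool, and the reclaimable capacity of that second-type object is updated in the second storage pool.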
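When a delete request for a first-type object is received through the object storage gateway and the object is an unaggregated first-type object, the object to be deleted is stored in the first storage pool, and the log record pool holds its write log or modification log so that a later periodic replay can aggregate it into a second-type object. This embodiment therefore deletes the object's content data directly from the first storage pool. To prevent a later replay from aggregating the deleted object and causing an aggregation error, it also deletes the object's write log or modification log from the log record pool, so that the logs of a deleted object are never replayed and aggregation remains accurate. If, however, the object to be deleted is an aggregated first-type object, it is stored inside a second-type object in the second storage pool, and its write log or modification log in the log record pool has already been replayed and will not be replayed again. In this case only the aggregation mapping between the object and its second-type object needs to be deleted from the first storage pool. The object's content data does not need to be deleted from the second-type object in the second storage pool. Instead, the reclamation process configured on the object storage gateway performs reclamation collectively by checking the reclaimable capacity of each second-type object in the second storage pool. To keep reclamation accurate, only the reclaimable capacity of the affected second-type object needs to be updated in the second storage pool, that is, the data length of the deleted first-type object is added to the existing reclaimable capacity. Later, when the reclaimable capacity of a second-type object exceeds the preset reclamation upper limit, that target second-type object is reclaimed: the valid first-type objects aggregated in it are rewritten to the first storage pool, and the aggregation mappings between those valid first-type objects and the target second-type object in the first storage pool, as well as the target second-type object in the second storage pool, are deleted.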
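4) If a modify request for a first-type object is received and the object to be modified is an unaggregated first-type object, the object's content data in the first storage pool is updated, a modification log for the object is recorded in the log record pool, and the object's write log is deleted. If a modify request for a first-type object is received and the object to be modified is an aggregated first-type object, the modified first-type object is written directly into the first storage pool, and when the modified first-type object is aggregated into a second-type object, the aggregation mapping between the pre-modification first-type object and the second-type object it was aggregated into is deleted from the first storage pool.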
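When a modify request for a first-type object is received through the object storage gateway and the object is an unaggregated first-type object, the object to be modified is stored in the first storage pool and the log record pool holds its write log. The object's content data in the first storage pool can therefore be updated directly. To prevent a later periodic replay from aggregating the originally written content data and causing an aggregation error, a modification log for the object is also recorded in the log record pool and the object's write log is deleted. Later replays then replay only the modification log and not the pre-modification write log, which ensures that the correct content data is aggregated. If, however, the object to be modified is an aggregated first-type object, it is stored inside a second-type object in the second storage pool, and its write log in the log record pool has already been replayed and will not be replayed again. In this case the modified first-type object only needs to be written directly into the first storage pool. A read of a first-type object first reads the content data from the first storage pool and, if content data is found, does not consult the aggregation mapping between the first-type object and the second-type object. Consequently, reads of the modified object remain correct even though the aggregation mapping between its pre-modification content data and the second-type object is still stored in the first storage pool, so that mapping need not be deleted when the modified object is written. A modification log for the object must, however, be recorded in the log record pool so that a later periodic replay can re-aggregate the modified first-type object into a second-type object in the second storage pool. Only when the modified object is aggregated into a second-type object does the pre-modification aggregation mapping need to be deleted from the first storage pool, after which the aggregation mapping between the modified first-type object and its new second-type object is stored in the first storage pool. This ensures that the first-type object is read correctly after modification.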
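In the technical solution of this embodiment, the first storage pool is built directly on the internal storage space of the object storage platform using solid-state disk technology and the second storage pool using conventional disk technology, so the data read performance of the first storage pool is far higher than that of the second storage pool. The first storage pool stores the unaggregated first-type objects and the aggregation mappings between aggregated first-type objects and their second-type objects, and the second storage pool stores the second-type objects formed by aggregating first-type objects. No third-party storage system is needed to store the aggregation mappings between first-type and second-type objects, the storage structure of first-type objects is adjusted, and the read performance of unaggregated first-type objects is improved. In addition, the object storage gateway periodically aggregates the unaggregated first-type objects in the first storage pool into second-type objects in the second storage pool, which achieves dynamic aggregation from first-type objects to second-type objects, prevents over-storage of unaggregated first-type objects in the first storage pool, and improves the storage performance of the first storage pool. Finally, the reclamation process checks the reclaimable capacity of each second-type object in the second storage pool in real time, which ensures that the storage space of deleted first-type objects is reclaimed in time and avoids wasting storage space in the second-type objects that held them.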
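Embodiment 5
FIG. 5 is a schematic structural diagram of an object aggregation apparatus provided in Embodiment 5 of the present application. The apparatus is arranged in the object storage platform provided in the foregoing embodiments. As shown in FIG. 5, the apparatus may include:
An object lookup module 510, configured to periodically find, in the first storage pool, first-type objects that have not been aggregated into a second-type object; and an object aggregation module 520, configured to aggregate the unaggregated first-type objects into second-type objects in the second storage pool and to replace, in the first storage pool, each first-type object aggregated this time with the aggregation mapping between that first-type object and the second-type object it was aggregated into. The first storage pool is a storage space built on the internal storage space of the object storage platform using solid-state disk technology; it stores the unaggregated first-type objects as well as the aggregation mappings between aggregated first-type objects and the second-type objects they were aggregated into. The second storage pool is a storage space built on the internal storage space of the object storage platform using conventional disk technology; it stores the second-type objects formed by aggregating first-type objects.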
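In the technical solution of this embodiment, the first storage pool is built directly on the internal storage space of the object storage platform using solid-state disk technology and the second storage pool using conventional disk technology, so the data read performance of the first storage pool is far higher than that of the second storage pool. The first storage pool stores the unaggregated first-type objects and the aggregation mappings between aggregated first-type objects and their second-type objects, and the second storage pool stores the second-type objects formed by aggregating first-type objects. No third-party storage system is needed to store the aggregation mappings between first-type and second-type objects, the storage structure of first-type objects is adjusted, and the read performance of unaggregated first-type objects is improved. In addition, the object storage gateway periodically aggregates the unaggregated first-type objects in the first storage pool into second-type objects in the second storage pool, which achieves dynamic aggregation from first-type objects to second-type objects, prevents over-storage of unaggregated first-type objects in the first storage pool, and improves the storage performance of the first storage pool.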
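The object aggregation apparatus provided in this embodiment is applicable to the object aggregation method provided in any of the foregoing embodiments and has the corresponding functions and effects.
Embodiment 6
FIG. 6 is a schematic structural diagram of a server provided in Embodiment 6 of the present application. As shown in FIG. 6, the server includes a processor 60, a storage device 61, and a communication device 62. The server may have one or more processors 60; one processor 60 is used as an example in FIG. 6. The processor 60, the storage device 61, and the communication device 62 in the server may be connected by a bus or in other ways; connection by a bus is used as an example in FIG. 6.
The server provided in this embodiment can be configured to perform the object aggregation method provided in any of the foregoing embodiments and has the corresponding functions and effects.
Embodiment 7
Embodiment 7 of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the object aggregation method of any of the foregoing embodiments. The method may include the following steps: periodically finding, in the first storage pool, first-type objects that have not been aggregated into a second-type object; and aggregating the unaggregated first-type objects into second-type objects in the second storage pool, and replacing, in the first storage pool, each first-type object aggregated this time with the aggregation mapping between that first-type object and the second-type object it was aggregated into. The first storage pool is a storage space built on the internal storage space of the object storage platform using solid-state disk technology; it stores the unaggregated first-type objects as well as the aggregation mappings between aggregated first-type objects and the second-type objects they were aggregated into. The second storage pool is a storage space built on the internal storage space of the object storage platform using conventional disk technology; it stores the second-type objects formed by aggregating first-type objects.
In the storage medium containing computer-executable instructions provided in the embodiments of the present application, the computer-executable instructions are not limited to the method operations described above and can also perform related operations in the object aggregation method provided in any embodiment of the present application.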

Claims (23)

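  1. An object storage platform, comprising: an object storage gateway, a first storage pool, and a second storage pool; wherein
    the first storage pool is a storage space built on the internal storage space of the object storage platform using solid-state disk technology, and is configured to store first-type objects that have not been aggregated into a second-type object, and aggregation mappings between first-type objects that have been aggregated into second-type objects and the second-type objects into which they were aggregated;
    the second storage pool is a storage space built on the internal storage space of the object storage platform using a set disk technology, and is configured to store second-type objects formed by aggregating multiple first-type objects; wherein the data read/write performance supported by the first storage pool is higher than that supported by the second storage pool;
    the object storage gateway is configured to periodically aggregate the first-type objects in the first storage pool that have not been aggregated into a second-type object into second-type objects in the second storage pool, and to replace, in the first storage pool, each first-type object aggregated this time with the aggregation mapping between that first-type object and the second-type object into which it was aggregated.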
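  2. The object storage platform according to claim 1, wherein a second-type object in the second storage pool consists of the mapping tag, attribute tag, content data, and check tag of each first-type object aggregated in it.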
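  3. The object storage platform according to claim 1, wherein a lookup process and at least one aggregation process are configured on the object storage gateway; wherein
    the object storage gateway is configured to use the lookup process to periodically find, in the first storage pool, first-type objects that have not been aggregated into a second-type object, and to record them in corresponding aggregation shards of the object storage gateway;
    the object storage gateway is configured to use the at least one aggregation process to concurrently read the first-type objects recorded in the aggregation shards that have not been aggregated into a second-type object, to aggregate the first-type objects read into second-type objects in the second storage pool, and at the same time to replace, in the first storage pool, each first-type object aggregated this time with the aggregation mapping between that first-type object and the second-type object into which it was aggregated.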
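  4. The object storage platform according to claim 1, further comprising a log record pool configured to record write logs and modification logs of the first-type objects and to mark an aggregation checkpoint log;
    the object storage gateway is further configured to periodically replay the write logs and modification logs located after the aggregation checkpoint log in the log record pool, wherein the aggregation checkpoint log is the last log replayed in the previous replay; to treat the first-type object targeted by each replayed log as a first-type object that has not been aggregated into a second-type object; and to re-mark the aggregation checkpoint log in the log record pool according to the state of the current aggregation.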
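  5. The object storage platform according to claim 1, wherein the second storage pool is further configured to record an aggregation upper limit of the second-type objects;
    the object storage gateway is configured, when aggregating the first-type objects in the first storage pool that have not been aggregated into a second-type object into a current second-type object in the second storage pool, and in a case where the aggregated capacity of the first-type objects already aggregated in the current second-type object in the second storage pool is greater than or equal to the aggregation upper limit, to take the next second-type object in the second storage pool as a new current second-type object and to aggregate into the new current second-type object the first-type objects of this aggregation other than those already aggregated.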
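  6. The object storage platform according to any one of claims 1 to 5, wherein a reclamation process is configured on the object storage gateway, and the second storage pool is further configured to record the reclaimable capacity of each second-type object; wherein
    the object storage gateway is further configured to use the reclamation process to find, in real time, a target second-type object in the second storage pool whose reclaimable capacity exceeds a preset reclamation upper limit, to rewrite to the first storage pool the valid first-type objects that have not been deleted among the first-type objects aggregated in the target second-type object, and to delete the aggregation mappings between the valid first-type objects and the target second-type object in the first storage pool, as well as the target second-type object in the second storage pool.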
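  7. The object storage platform according to any one of claims 1 to 5, further comprising an index record pool configured to record index classifications of the first-type objects.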
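  8. An object aggregation method, applied in the object storage platform according to any one of claims 1 to 7, comprising:
    periodically finding, in a first storage pool, first-type objects that have not been aggregated into a second-type object;
    aggregating the first-type objects that have not been aggregated into a second-type object into second-type objects in a second storage pool, and replacing, in the first storage pool, each first-type object aggregated this time with the aggregation mapping between that first-type object and the second-type object into which it was aggregated;
    wherein the first storage pool is a storage space built on the internal storage space of the object storage platform using solid-state disk technology, and is configured to store first-type objects that have not been aggregated into a second-type object, and aggregation mappings between first-type objects that have been aggregated into second-type objects and the second-type objects into which they were aggregated; the second storage pool is a storage space built on the internal storage space of the object storage platform using a set disk technology, and is configured to store second-type objects formed by aggregating multiple first-type objects; and the data read/write performance supported by the first storage pool is higher than that supported by the second storage pool.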
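  9. The method according to claim 8, wherein periodically finding, in the first storage pool, first-type objects that have not been aggregated into a second-type object comprises:
    periodically replaying the write logs and modification logs located after an aggregation checkpoint log in a log record pool, wherein the aggregation checkpoint log is the last log replayed in the previous replay;
    treating the first-type object targeted by each replayed log as a first-type object that has not been aggregated into a second-type object.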
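  10. The method according to claim 9, further comprising, after aggregating the first-type objects that have not been aggregated into a second-type object into second-type objects in the second storage pool:
    re-marking the aggregation checkpoint log among the write logs and modification logs in the log record pool according to the state of the current aggregation.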
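  11. The method according to claim 8, wherein aggregating the first-type objects that have not been aggregated into a second-type object into second-type objects in the second storage pool comprises:
    for each first-type object in the first storage pool that has not been aggregated into a second-type object, in a case where the aggregated capacity of the first-type objects already aggregated in a current second-type object in the second storage pool is less than an aggregation upper limit, appending that first-type object directly to the current second-type object; and in a case where the aggregated capacity of the first-type objects already aggregated in the current second-type object in the second storage pool is not less than the aggregation upper limit, taking the next second-type object in the second storage pool as a new current second-type object and appending that first-type object to the new current second-type object.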
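  12. The method according to claim 8, further comprising:
    checking the reclaimable capacity of each second-type object in the second storage pool in real time, rewriting to the first storage pool the valid first-type objects aggregated in a target second-type object whose reclaimable capacity exceeds a preset reclamation upper limit, and deleting the aggregation mappings between the valid first-type objects and the target second-type object in the first storage pool, as well as the target second-type object in the second storage pool.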
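  13. The method according to claim 12, wherein rewriting to the first storage pool the valid first-type objects aggregated in the target second-type object whose reclaimable capacity exceeds the preset reclamation upper limit comprises:
    looking up, in the first storage pool, the aggregation mapping between each first-type object aggregated in the target second-type object and the target second-type object;
    rewriting to the first storage pool, as valid first-type objects aggregated in the target second-type object, the first-type objects whose aggregation mapping is non-empty.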
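  14. The method according to claim 8, further comprising:
    in a case where an operation request for the first-type object is received, updating the object information that the operation request points to in the first storage pool and the second storage pool.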
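  15. The method according to claim 14, wherein, in a case where an operation request for a first-type object is received, updating the object information that the operation request points to in the first storage pool and the second storage pool comprises:
    in a case where a write request for the first-type object is received, storing the first-type object being written directly in the first storage pool as a first-type object that has not been aggregated into a second-type object.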
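  16. The method according to claim 15, further comprising, after storing the first-type object being written directly in the first storage pool as a first-type object that has not been aggregated into a second-type object:
    recording a write log of the first-type object being written in a log record pool.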
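  17. The method according to claim 14, wherein, in a case where an operation request for a first-type object is received, updating the object information that the operation request points to in the first storage pool and the second storage pool comprises:
    in a case where a read request for the first-type object is received, reading the content data of the requested first-type object from the first storage pool;
    in a case where the content data read from the first storage pool is empty, looking up in the first storage pool the aggregation mapping between the requested first-type object and the second-type object into which it was aggregated, and reading, according to the aggregation mapping read, the content data of the requested first-type object from that second-type object in the second storage pool.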
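  18. The method according to claim 14, wherein, in a case where an operation request for a first-type object is received, updating the object information that the operation request points to in the first storage pool and the second storage pool comprises:
    in a case where a delete request for the first-type object is received and the first-type object to be deleted is a first-type object that has not been aggregated into a second-type object, deleting the first-type object from the first storage pool and deleting the write log of the first-type object from a log record pool;
    in a case where a delete request for the first-type object is received and the first-type object to be deleted is a first-type object that has been aggregated into a second-type object, deleting from the first storage pool the aggregation mapping between the first-type object and the second-type object into which it was aggregated, and updating in the second storage pool the reclaimable capacity of that second-type object.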
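  19. The method according to claim 14, wherein, in a case where an operation request for a first-type object is received, updating the object information that the operation request points to in the first storage pool and the second storage pool comprises:
    in a case where a modify request for the first-type object is received and the first-type object to be modified is a first-type object that has not been aggregated into a second-type object, updating the content data of the first-type object in the first storage pool;
    in a case where a modify request for the first-type object is received and the first-type object to be modified is a first-type object that has been aggregated into a second-type object, writing the modified first-type object directly into the first storage pool, and, in a case where the modified first-type object is aggregated into a second-type object, deleting from the first storage pool the aggregation mapping between the pre-modification first-type object and the second-type object into which it was aggregated.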
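  20. The method according to claim 19, further comprising, after updating the content data of the modified first-type object in the first storage pool:
    recording a modification log of the modified first-type object in a log record pool, and deleting the write log of that first-type object.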
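  21. An object aggregation apparatus, arranged in the object storage platform according to any one of claims 1 to 7, comprising:
    an object lookup module, configured to periodically find, in a first storage pool, first-type objects that have not been aggregated into a second-type object;
    an object aggregation module, configured to aggregate the first-type objects that have not been aggregated into a second-type object into second-type objects in a second storage pool, and to replace, in the first storage pool, each first-type object aggregated this time with the aggregation mapping between that first-type object and the second-type object into which it was aggregated;
    wherein the first storage pool is a storage space built on the internal storage space of the object storage platform using solid-state disk technology, and is configured to store first-type objects that have not been aggregated into a second-type object, and aggregation mappings between first-type objects that have been aggregated into second-type objects and the second-type objects into which they were aggregated; the second storage pool is a storage space built on the internal storage space of the object storage platform using a set disk technology, and is configured to store second-type objects formed by aggregating multiple first-type objects; and the data read/write performance supported by the first storage pool is higher than that supported by the second storage pool.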
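  22. A server, comprising:
    at least one processor;
    a storage device, configured to store at least one program;
    wherein the at least one program, when executed by the at least one processor, causes the at least one processor to implement the object aggregation method according to any one of claims 8 to 20.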
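  23. A computer-readable storage medium storing a computer program, wherein the program, when executed by a processor, implements the object aggregation method according to any one of claims 8 to 20.
PCT/CN2021/085236 2020-05-25 2021-04-02 Object storage platform, and object aggregation method, apparatus, and server WO2021238408A1 (zh)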

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010450286.9 2020-05-25
CN202010450286.9A CN111610936B (zh) 2020-05-25 2020-05-25 Object storage platform, and object aggregation method, apparatus, and server

Publications (1)

Publication Number Publication Date
WO2021238408A1 true WO2021238408A1 (zh) 2021-12-02

Family

ID=72200758

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/085236 WO2021238408A1 (zh) 2020-05-25 2021-04-02 Object storage platform, and object aggregation method, apparatus, and server

Country Status (2)

Country Link
CN (1) CN111610936B (zh)
WO (1) WO2021238408A1 (zh)

Also Published As

Publication number Publication date
CN111610936A (zh) 2020-09-01
CN111610936B (zh) 2023-04-14

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21812660

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21812660

Country of ref document: EP

Kind code of ref document: A1