CN114546261B - Object movement optimization method and system in distributed object storage - Google Patents

Publication number
CN114546261B
Authority
CN
China
Prior art keywords
metadata
data
source
bucket
head
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210018247.0A
Other languages
Chinese (zh)
Other versions
CN114546261A (en)
Inventor
李欢欢
武模仁
赵煜
陶桐桐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2022-01-07
Filing date
2022-01-07
Publication date
2023-08-08

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614 Improving the reliability of storage systems
    • G06F3/0619 Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/0647 Migration mechanisms
    • G06F3/0652 Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5011 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
    • G06F9/5016 Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method and a system for optimizing object movement in distributed object storage. The head object data and metadata of a source object are read and modified, and a new field whose value is the unique identifier of the source bucket is added to the metadata. With this field, the tail object data can still be read correctly when the object is read, which ensures the integrity of data access. The data and the modified metadata are then flushed to a new head object, named by the unique identifier of the destination bucket and the object name, in the same storage pool, which completes the move quickly. The code logic is simpler and the operation steps are fewer. Because the existing data is reused rather than copied, the logic flow is greatly simplified, movement efficiency is improved, the performance of the distributed object storage system is improved, and users get a better experience.

Description

Object movement optimization method and system in distributed object storage
Technical Field
The invention relates to the technical field of distributed object storage, in particular to a method and a system for optimizing object movement in distributed object storage.
Background
An object stored in the storage system consists mainly of two parts: data and metadata. For larger objects, the metadata makes up only a small percentage of the total. Essentially all operations on an object involve these two parts, and moving an object is no exception. The object's data is stored in the data pool, while its metadata is stored in two places: in the object's head object in the data pool, and in a bucket shard of the object's bucket in the index pool.
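The layout described above can be pictured with a small in-memory model. This is only a sketch under stated assumptions: the pools are plain dictionaries, and the naming helpers `head_name` and `tail_name`, the chunk size, and the `put_object` function are hypothetical illustrations, not names or formats taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass
class HeadObject:
    """Head object: first chunk of data plus the object's metadata."""
    data: bytes
    metadata: dict = field(default_factory=dict)


def head_name(bucket_id: str, object_name: str) -> str:
    # Assumed naming scheme: "<bucket id>_<object name>".
    return f"{bucket_id}_{object_name}"


def tail_name(bucket_id: str, object_name: str, part: int) -> str:
    # Assumed naming scheme for tail objects; the text only implies that
    # the name depends on the bucket's unique identifier.
    return f"{bucket_id}__tail_{part}_{object_name}"


def put_object(data_pool, index_pool, bucket_id, shard_no, object_name,
               payload: bytes, chunk: int = 4):
    """Store payload as one head object plus zero or more tail objects."""
    meta = {"size": len(payload), "content_type": "application/octet-stream"}
    data_pool[head_name(bucket_id, object_name)] = HeadObject(payload[:chunk], meta)
    for part, start in enumerate(range(chunk, len(payload), chunk), start=1):
        data_pool[tail_name(bucket_id, object_name, part)] = payload[start:start + chunk]
    # The index pool keeps a second copy of the metadata in a bucket shard.
    index_pool.setdefault((bucket_id, shard_no), {})[object_name] = dict(meta)


data_pool, index_pool = {}, {}
put_object(data_pool, index_pool, "bkt-src", 0, "photo.jpg", b"0123456789")
```

In this toy model the ten-byte payload ends up as one head object and two tail objects in the data pool, with a copy of the metadata in shard 0 of the bucket's index.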
In current distributed object storage systems, moving an object from one bucket to another is a common operation. It involves several major tasks: reading the object, uploading the object, and deleting the source object. Besides being cumbersome, the process is extremely resource-intensive when the stored object is large. If objects are moved in batches, a memory storm may occur, which affects the client's front-end service and degrades the user experience.
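For contrast with the optimized method, the conventional read, upload, delete sequence can be sketched as follows. This is a toy model, assuming a nested dictionary stands in for the object store; the function name `conventional_move` and the bucket and object names are hypothetical.

```python
def conventional_move(store, src_bucket, dst_bucket, name):
    """Read, re-upload, then delete: the mover handles the whole payload."""
    payload = store[src_bucket][name]                  # 1. read the object
    store.setdefault(dst_bucket, {})[name] = payload   # 2. upload to destination
    del store[src_bucket][name]                        # 3. delete the source
    # In a real system every one of these bytes would be buffered in memory
    # and written again, which is the cost the invention tries to avoid.
    return len(payload)


store = {"bkt-src": {"video.bin": bytes(8 * 1024 * 1024)}}
handled = conventional_move(store, "bkt-src", "bkt-dst", "video.bin")
```

The number of bytes handled grows with the object size, so moving many large objects at once multiplies the memory pressure.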
Disclosure of Invention
The invention aims to provide an object movement optimization method and system in distributed object storage that solve the prior-art problems of a cumbersome object movement process and high memory consumption, and that improve the movement efficiency and performance of a distributed object storage system.
In order to achieve the above technical object, the present invention provides a method for optimizing object movement in a distributed object store, the method comprising the following operations:
acquiring the head object data and metadata of a source object from the data pool into memory according to the object name and the unique identifier of the source bucket in which the object is located;
modifying the original metadata in memory, and adding a new field to the metadata;
deleting the head object data and metadata of the source object, and flushing the in-memory data and the modified metadata to a new head object, named by the unique identifier of the destination bucket and the object name, in the same storage pool;
and writing the metadata information of the object into a bucket shard of the destination bucket in the index pool, and deleting the metadata information of the object from the source bucket shard in the index pool, thereby completing the movement of the object.
Preferably, the new field is used as follows: when a client reads the content of the object, the head object is read first; if the metadata of the head object contains the new field, the object has been moved, so when the tail object data is read, the name of each tail object is determined from the value of the new field, and the remaining data is then read from the tail objects.
Preferably, the metadata includes the attributes etag, tag, content_type, pg_ver and source_zone.
Preferably, the number of the bucket shard is calculated from the name of the object by a hash algorithm.
The invention also provides an object movement optimization system in the distributed object storage, which comprises:
a source object acquisition module, used for acquiring the head object data and metadata of a source object from the data pool into memory according to the object name and the unique identifier of the source bucket in which the object is located;
a metadata modification module, used for modifying the original metadata in memory and adding a new field to the metadata;
a data pool replacement module, used for deleting the head object data and metadata of the source object, and flushing the in-memory data and the modified metadata to a new head object, named by the unique identifier of the destination bucket and the object name, in the same storage pool;
and an index pool replacement module, used for writing the metadata information of the object into a bucket shard of the destination bucket in the index pool, and deleting the metadata information of the object from the source bucket shard in the index pool, thereby completing the movement of the object.
Preferably, the new field is used as follows: when a client reads the content of the object, the head object is read first; if the metadata of the head object contains the new field, the object has been moved, so when the tail object data is read, the name of each tail object is determined from the value of the new field, and the remaining data is then read from the tail objects.
Preferably, the metadata includes the attributes etag, tag, content_type, pg_ver and source_zone.
Preferably, the number of the bucket shard is calculated from the name of the object by a hash algorithm.
The effects described in this summary are merely the effects of embodiments, not all effects of the invention. One of the above technical solutions has the following advantages or beneficial effects:
Compared with the prior art, the invention modifies the head object data and metadata of the source object and adds a new field to the metadata whose value is the unique identifier of the source bucket. With this field, the tail object data can still be read correctly when the object is read, which ensures the integrity of data access. The data and the modified metadata are flushed to a new head object, named by the unique identifier of the destination bucket and the object name, in the same storage pool, which completes the move quickly. The code logic is simpler and the operation steps are fewer. Because the existing data is reused, the logic flow is greatly simplified, movement efficiency is improved, the performance of the distributed object storage system is improved, and users get a better experience.
Drawings
FIG. 1 is a flowchart of an object movement optimization method in a distributed object store according to an embodiment of the present invention;
FIG. 2 is a block diagram of an object movement optimization system in distributed object storage according to an embodiment of the present invention.
Detailed Description
In order to clearly illustrate the technical features of the present solution, the present invention will be described in detail below with reference to the following detailed description and the accompanying drawings. The following disclosure provides many different embodiments, or examples, for implementing different structures of the invention. In order to simplify the present disclosure, components and arrangements of specific examples are described below. Furthermore, the present invention may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed. It should be noted that the components illustrated in the figures are not necessarily drawn to scale. Descriptions of well-known components and processing techniques and processes are omitted so as to not unnecessarily obscure the present invention.
The following describes in detail an object movement optimization method and system in distributed object storage according to an embodiment of the present invention with reference to the accompanying drawings.
As shown in FIG. 1, the invention discloses a method for optimizing object movement in a distributed object store, which comprises the following operations:
acquiring the head object data and metadata of a source object from the data pool into memory according to the object name and the unique identifier of the source bucket in which the object is located;
modifying the original metadata in memory, and adding a new field to the metadata;
deleting the head object data and metadata of the source object, and flushing the in-memory data and the modified metadata to a new head object, named by the unique identifier of the destination bucket and the object name, in the same storage pool;
and writing the metadata information of the object into a bucket shard of the destination bucket in the index pool, and deleting the metadata information of the object from the source bucket shard in the index pool, thereby completing the movement of the object.
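The four operations can be sketched end to end on an in-memory model. This is a minimal illustration under stated assumptions, not the patented implementation: the pools are dictionaries, the field name `moved_from_bucket`, the naming scheme in `head_name`, the shard count and the CRC32-based `shard_no` helper are all hypothetical.

```python
import zlib

NUM_SHARDS = 4                    # hypothetical shard count per bucket
MOVED_FROM = "moved_from_bucket"  # hypothetical name for the added field


def head_name(bucket_id, object_name):
    # Assumed naming scheme: "<bucket id>_<object name>".
    return f"{bucket_id}_{object_name}"


def shard_no(object_name):
    # Stand-in hash; the text does not say which hash algorithm is used.
    return zlib.crc32(object_name.encode("utf-8")) % NUM_SHARDS


def move_object(data_pool, index_pool, src_id, dst_id, name):
    # Step 1: read the head object's data and metadata into memory.
    head = data_pool[head_name(src_id, name)]
    data, meta = head["data"], dict(head["metadata"])

    # Step 2: modify the metadata in memory and add the new field.
    # setdefault keeps an existing value so the field still points at the
    # bucket holding the tail objects (an assumption for repeated moves).
    meta.setdefault(MOVED_FROM, src_id)

    # Step 3: delete the source head object, then flush the in-memory data
    # and modified metadata to a new head object in the same pool.
    del data_pool[head_name(src_id, name)]
    data_pool[head_name(dst_id, name)] = {"data": data, "metadata": meta}

    # Step 4: add the index entry to the destination bucket shard and
    # remove it from the source bucket shard.
    shard = shard_no(name)
    entry = index_pool[(src_id, shard)].pop(name)
    index_pool.setdefault((dst_id, shard), {})[name] = entry


name = "report.pdf"
data_pool = {
    head_name("src", name): {"data": b"HEAD", "metadata": {"etag": "abc"}},
    "src__tail_1_" + name: b"TAIL",   # tail object, never touched by the move
}
index_pool = {("src", shard_no(name)): {name: {"size": 8, "owner": "alice"}}}
move_object(data_pool, index_pool, "src", "dst", name)
```

Only the head object and the two index entries change; the tail object keeps its original name and contents.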
For object movement between different buckets in the same data pool of an object storage system, the embodiment of the invention simplifies the move by moving and modifying only the metadata of the object, without moving the data.
The data of the object is stored in the data pool, while the metadata is stored both in the object's head object in the data pool and in a bucket shard of the object's bucket in the index pool. Because the metadata accounts for only a small proportion of the object, the embodiment moves and modifies only the metadata, updating it in the data pool and in the index pool respectively.
The head object of the target object is found in the data pool according to the object name and the unique identifier of the source bucket in which the object is located, and the data and metadata in the head object are then read into memory. The original metadata of the object is modified directly in memory. Most importantly, a new field is added whose value is the unique identifier of the source bucket. The field is used as follows: when a client reads the content of the object, the head object is read first; if the metadata of the head object contains the new field, the object has been moved, so when the tail object data is read, the name of each tail object is determined from the value of the new field, and the remaining data is then read from the tail objects.
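The read path described above can be sketched as follows. This is an illustration under stated assumptions: the field name `moved_from_bucket` and the `<bucket id>__tail_<n>_<name>` naming scheme are hypothetical, and tail objects are discovered by probing consecutive part numbers, whereas a real system would more likely consult a manifest.

```python
MOVED_FROM = "moved_from_bucket"  # hypothetical name for the added field


def read_object(data_pool, bucket_id, name):
    """Read a whole object, following the added field if the object was moved."""
    head = data_pool[f"{bucket_id}_{name}"]
    # If the field is present the object was moved, and its tail objects are
    # still named after the source bucket; otherwise use the current bucket.
    tail_bucket = head["metadata"].get(MOVED_FROM, bucket_id)
    chunks = [head["data"]]
    part = 1
    while f"{tail_bucket}__tail_{part}_{name}" in data_pool:
        chunks.append(data_pool[f"{tail_bucket}__tail_{part}_{name}"])
        part += 1
    return b"".join(chunks)


data_pool = {
    # Moved object: head lives under "dst", tails still live under "src".
    "dst_a.txt": {"data": b"hello ", "metadata": {MOVED_FROM: "src"}},
    "src__tail_1_a.txt": b"wor",
    "src__tail_2_a.txt": b"ld",
    # Object that was never moved and has no tail objects.
    "src_b.txt": {"data": b"unmoved", "metadata": {}},
}
moved = read_object(data_pool, "dst", "a.txt")
plain = read_object(data_pool, "src", "b.txt")
```

The same function serves moved and unmoved objects, because the absence of the field simply means the tail objects share the head object's bucket identifier.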
The data and metadata in the source head object are deleted, and the in-memory data and the new metadata (attribute information such as etag, tag, content_type, pg_ver and source_zone) are then flushed to a new head object, named by the unique identifier of the destination bucket and the object name, in the same storage pool. Metadata information such as the object's name, size, category and owner is written into a bucket shard of the destination bucket in the index pool; the number of the bucket shard is calculated from the name of the object by a hash algorithm.
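Computing a shard number from an object name can be sketched as below. The text does not name the hash algorithm or the shard count, so SHA-256 and the parameter `num_shards` are stand-ins chosen for illustration, and `bucket_shard` is a hypothetical helper.

```python
import hashlib


def bucket_shard(object_name: str, num_shards: int) -> int:
    """Map an object name to a shard number in the range [0, num_shards)."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # Stand-in hash: take the first 8 bytes of SHA-256 as an integer.
    digest = hashlib.sha256(object_name.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


shard = bucket_shard("photos/2022/cat.jpg", 16)
```

Because the result depends only on the object name and the shard count, the same name always maps to the same shard number, so the index entry can be located without scanning every shard of a bucket.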
The metadata information of the object is then deleted from the source bucket shard in the index pool.
In the invention, only the data and metadata in the head object are moved and modified during object movement; the tail object data is not moved. A field whose value is the unique identifier of the source bucket is added to the head object metadata, so the tail object data can still be read correctly when the object data is read, which ensures the integrity of data access.
According to the embodiment of the invention, the head object data and metadata of the source object are modified, and a new field whose value is the unique identifier of the source bucket is added to the metadata. With this field, the tail object data can still be read correctly when the object is read, which ensures the integrity of data access. The data and the modified metadata are flushed to a new head object, named by the unique identifier of the destination bucket and the object name, in the same storage pool, which completes the move quickly. The code logic is simpler and the operation steps are fewer. Because the existing data is reused, the logic flow is greatly simplified, movement efficiency is improved, the performance of the distributed object storage system is improved, and users get a better experience.
As shown in FIG. 2, the embodiment of the invention further discloses an object movement optimization system in distributed object storage, which comprises:
a source object acquisition module, used for acquiring the head object data and metadata of a source object from the data pool into memory according to the object name and the unique identifier of the source bucket in which the object is located;
a metadata modification module, used for modifying the original metadata in memory and adding a new field to the metadata;
a data pool replacement module, used for deleting the head object data and metadata of the source object, and flushing the in-memory data and the modified metadata to a new head object, named by the unique identifier of the destination bucket and the object name, in the same storage pool;
and an index pool replacement module, used for writing the metadata information of the object into a bucket shard of the destination bucket in the index pool, and deleting the metadata information of the object from the source bucket shard in the index pool, thereby completing the movement of the object.
The head object of the target object is found in the data pool according to the object name and the unique identifier of the source bucket in which the object is located, and the data and metadata in the head object are then read into memory. The original metadata of the object is modified directly in memory. Most importantly, a new field is added whose value is the unique identifier of the source bucket. The field is used as follows: when a client reads the content of the object, the head object is read first; if the metadata of the head object contains the new field, the object has been moved, so when the tail object data is read, the name of each tail object is determined from the value of the new field, and the remaining data is then read from the tail objects.
The data and metadata in the source head object are deleted, and the in-memory data and the new metadata (attribute information such as etag, tag, content_type, pg_ver and source_zone) are then flushed to a new head object, named by the unique identifier of the destination bucket and the object name, in the same storage pool. Metadata information such as the object's name, size, category and owner is written into a bucket shard of the destination bucket in the index pool; the number of the bucket shard is calculated from the name of the object by a hash algorithm.
The metadata information of the object is then deleted from the source bucket shard in the index pool.
In the invention, only the data and metadata in the head object are moved and modified during object movement; the tail object data is not moved. A field whose value is the unique identifier of the source bucket is added to the head object metadata, so the tail object data can still be read correctly when the object data is read, which ensures the integrity of data access.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.

Claims (6)

1. A method for optimizing movement of objects in a distributed object store, the method comprising the acts of:
acquiring the head object data and metadata of a source object from the data pool into memory according to the object name and the unique identifier of the source bucket in which the object is located;
modifying the original metadata in memory, and adding a new field to the metadata;
wherein the new field is used as follows: when a client reads the content of the object, the head object is read first; if the metadata of the head object contains the new field, the object has been moved, so when the tail object data is read, the name of each tail object is determined from the value of the new field, and the remaining data is then read from the tail objects;
deleting the head object data and metadata of the source object, and flushing the in-memory data and the modified metadata to a new head object, named by the unique identifier of the destination bucket and the object name, in the same storage pool;
and writing the metadata information of the object into a bucket shard of the destination bucket in the index pool, and deleting the metadata information of the object from the source bucket shard in the index pool, thereby completing the movement of the object.
2. The method of claim 1, wherein the metadata includes the attributes etag, tag, content_type, pg_ver and source_zone.
3. The method for optimizing object movement in a distributed object store according to claim 1, wherein the number of the bucket shard is calculated from the name of the object by a hash algorithm.
4. An object movement optimization system in a distributed object store, the system comprising:
a source object acquisition module, used for acquiring the head object data and metadata of a source object from the data pool into memory according to the object name and the unique identifier of the source bucket in which the object is located;
a metadata modification module, used for modifying the original metadata in memory and adding a new field to the metadata; wherein the new field is used as follows: when a client reads the content of the object, the head object is read first; if the metadata of the head object contains the new field, the object has been moved, so when the tail object data is read, the name of each tail object is determined from the value of the new field, and the remaining data is then read from the tail objects;
a data pool replacement module, used for deleting the head object data and metadata of the source object, and flushing the in-memory data and the modified metadata to a new head object, named by the unique identifier of the destination bucket and the object name, in the same storage pool;
and an index pool replacement module, used for writing the metadata information of the object into a bucket shard of the destination bucket in the index pool, and deleting the metadata information of the object from the source bucket shard in the index pool, thereby completing the movement of the object.
5. The system of claim 4, wherein the metadata includes the attributes etag, tag, content_type, pg_ver and source_zone.
6. The system of claim 4, wherein the number of the bucket shard is calculated from the name of the object by a hash algorithm.
CN202210018247.0A 2022-01-07 2022-01-07 Object movement optimization method and system in distributed object storage Active CN114546261B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210018247.0A CN114546261B (en) 2022-01-07 2022-01-07 Object movement optimization method and system in distributed object storage

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210018247.0A CN114546261B (en) 2022-01-07 2022-01-07 Object movement optimization method and system in distributed object storage

Publications (2)

Publication Number Publication Date
CN114546261A (en) 2022-05-27
CN114546261B (en) 2023-08-08

Family

ID=81669392

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210018247.0A Active CN114546261B (en) 2022-01-07 2022-01-07 Object movement optimization method and system in distributed object storage

Country Status (1)

Country Link
CN (1) CN114546261B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110018897A (en) * 2018-01-09 2019-07-16 阿里巴巴集团控股有限公司 Data processing method, device and calculating equipment
CN113886331A (en) * 2021-12-03 2022-01-04 苏州浪潮智能科技有限公司 Distributed object storage method and device, electronic equipment and readable storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20150227414A1 (en) * 2012-08-31 2015-08-13 Pradeep Varma Systems And Methods Of Memory And Access Management
US11409720B2 (en) * 2019-11-13 2022-08-09 Western Digital Technologies, Inc. Metadata reduction in a distributed storage system

Also Published As

Publication number Publication date
CN114546261A (en) 2022-05-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant