CN114546261B - Object movement optimization method and system in distributed object storage - Google Patents
Object movement optimization method and system in distributed object storage
- Publication number: CN114546261B
- Application number: CN202210018247.0A
- Authority: CN (China)
- Prior art keywords: metadata, data, source, bucket, head
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0646—Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0602—Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
- G06F3/0614—Improving the reliability of storage systems
- G06F3/0619—Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0646—Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
- G06F3/0647—Migration mechanisms
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0628—Interfaces specially adapted for storage systems making use of a particular technique
- G06F3/0646—Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
- G06F3/0652—Erasing, e.g. deleting, data cleaning, moving of data to a wastebasket
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
- G06F3/06—Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
- G06F3/0601—Interfaces specially adapted for storage systems
- G06F3/0668—Interfaces specially adapted for storage systems adopting a particular infrastructure
- G06F3/067—Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5005—Allocation of resources, e.g. of the central processing unit [CPU] to service a request
- G06F9/5011—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals
- G06F9/5016—Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resources being hardware resources other than CPUs, Servers and Terminals the resource being the memory
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Human Computer Interaction (AREA)
- Software Systems (AREA)
- Computer Security & Cryptography (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a method and a system for optimizing object movement in distributed object storage. The head object data and metadata of a source object are modified, and a new field is added to the metadata whose value is the unique identifier of the source bucket. With this field, the tail object data can be read correctly when the object is read, which ensures the integrity of data access. The data and the modified metadata are flushed to a new head object, named by the unique identifier of the destination bucket and the object name, in the same storage pool, which completes a fast move of the object. The code logic is simpler and fewer operation steps are needed. Because the existing data is reused rather than copied, the logic flow is greatly simplified, movement efficiency is improved, the performance of the distributed object storage system is improved, and users get a better experience.
Description
Technical Field
The invention relates to the technical field of distributed object storage, in particular to a method and a system for optimizing object movement in distributed object storage.
Background
An object stored in the storage system consists mainly of two parts: data and metadata. For larger objects, the metadata makes up only a small percentage. Essentially all operations on an object involve these two parts, and moving an object is no exception. The data of the object is stored in the data pool. The metadata is stored in two places: in the head object of the object in the data pool, and in one bucket shard of the object's bucket in the index pool.
In current distributed object storage systems, moving an object, i.e., moving it from one bucket to another, is a frequently used operation. It involves several major tasks: reading the object, uploading the object, and deleting the source object. Besides being cumbersome, performing the move on a large stored object is extremely resource-intensive. If objects are moved in batches, a memory storm may occur, which affects the client's front-end service and degrades the user experience.
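The conventional three-task move can be sketched as follows. This is only an illustration under stated assumptions, not code from the patent: buckets are modelled as plain Python dicts, and the function and variable names are hypothetical. It shows that the whole payload has to be read and written again, which is what makes the move expensive for large objects.

```python
# Hypothetical in-memory model: bucket name -> {object name -> payload bytes}.

def naive_move(buckets, src_bucket, dst_bucket, name):
    """Conventional move: read the object, upload it, delete the source.

    Returns the payload size, i.e. the number of bytes a real system would
    have to read from the source and write again to the destination.
    """
    payload = buckets[src_bucket][name]   # 1. read the whole object
    buckets[dst_bucket][name] = payload   # 2. upload it to the destination bucket
    del buckets[src_bucket][name]         # 3. delete the source object
    return len(payload)


buckets = {"bucket-a": {"video.bin": b"x" * 4_000_000}, "bucket-b": {}}
bytes_rewritten = naive_move(buckets, "bucket-a", "bucket-b", "video.bin")
```

In this model the cost grows linearly with the object size, and a batch of such moves multiplies it by the number of objects.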
Disclosure of Invention
The invention aims to provide an object movement optimization method and system in distributed object storage that solve the problems of the cumbersome object movement process and high memory consumption in the prior art, and thereby improve movement efficiency and the performance of the distributed object storage system.
In order to achieve the above technical object, the present invention provides a method for optimizing object movement in a distributed object store, the method comprising the following operations:
acquiring the head object data and metadata of a source object from the data pool into memory, according to the object name and the unique identifier of the source bucket where the object is located;
modifying the original metadata in memory, and adding a new field to the metadata;
deleting the head object data and the metadata of the source object, and flushing the data in memory and the modified metadata to a new head object, named by the unique identifier of the destination bucket and the object name, in the same storage pool;
and writing the metadata information of the object into a bucket shard of the destination bucket in the index pool, deleting the metadata information of the object from the source bucket shard in the index pool, and completing the movement of the object.
Preferably, the new field is used as follows: when a client reads the content of the object, the head object of the object is read first. If the metadata information of the head object contains the new field, the object has been moved; when the tail object data of the object is read, the name of the tail object is then determined according to the value of the new field, and the remaining data is read from the tail object.
Preferably, the metadata includes the attribute information etag, tag, content_type, pg_ver and source_zone.
Preferably, the number of the bucket shard is calculated from the name of the object using a hash algorithm.
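A minimal sketch of how a shard number could be derived from an object name is given below. The source does not name the hash algorithm, so CRC32 from Python's standard `zlib` module is used purely as an assumed stand-in, and `shard_number` and `num_shards` are hypothetical names.

```python
import zlib


def shard_number(object_name, num_shards):
    """Map an object name to a shard number in the range [0, num_shards)."""
    # CRC32 is an assumed stand-in for the unspecified hash algorithm. It is
    # deterministic across processes, unlike Python's built-in hash() for str.
    return zlib.crc32(object_name.encode("utf-8")) % num_shards


shard = shard_number("photos/cat.jpg", 16)
```

Because the number depends only on the object name and the shard count, any node can locate the index entry of an object without a lookup table.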
The invention also provides an object movement optimization system in the distributed object storage, which comprises:
the source object acquisition module, used for acquiring the head object data and metadata of a source object from the data pool into memory, according to the object name and the unique identifier of the source bucket where the object is located;
the metadata modification module, used for modifying the original metadata in memory and adding a new field to the metadata;
the data pool replacement module, used for deleting the head object data and the metadata of the source object, and flushing the data in memory and the modified metadata to a new head object, named by the unique identifier of the destination bucket and the object name, in the same storage pool;
and the index pool replacement module, used for writing the metadata information of the object into a bucket shard of the destination bucket in the index pool, deleting the metadata information of the object from the source bucket shard in the index pool, and completing the movement of the object.
Preferably, the new field is used as follows: when a client reads the content of the object, the head object of the object is read first. If the metadata information of the head object contains the new field, the object has been moved; when the tail object data of the object is read, the name of the tail object is then determined according to the value of the new field, and the remaining data is read from the tail object.
Preferably, the metadata includes the attribute information etag, tag, content_type, pg_ver and source_zone.
Preferably, the number of the bucket shard is calculated from the name of the object using a hash algorithm.
The effects described in this summary are merely effects of the embodiments, not all effects of the invention. One of the above technical solutions has the following advantages or beneficial effects:
compared with the prior art, the invention modifies the head object data and metadata of the source object and adds a new field to the metadata whose value is the unique identifier of the source bucket. With this field, the tail object data can be read correctly when the object is read, which ensures the integrity of data access. The data and the modified metadata are flushed to a new head object, named by the unique identifier of the destination bucket and the object name, in the same storage pool, which completes a fast move of the object. The code logic is simpler and fewer operation steps are needed. Because the existing data is reused rather than copied, the logic flow is greatly simplified, movement efficiency is improved, the performance of the distributed object storage system is improved, and users get a better experience.
Drawings
FIG. 1 is a flowchart of an object movement optimization method in a distributed object store according to an embodiment of the present invention;
FIG. 2 is a block diagram of an object movement optimization system in distributed object storage according to an embodiment of the present invention.
Detailed Description
In order to clearly illustrate the technical features of the present solution, the present invention will be described in detail below with reference to the following detailed description and the accompanying drawings. The following disclosure provides many different embodiments, or examples, for implementing different structures of the invention. In order to simplify the present disclosure, components and arrangements of specific examples are described below. Furthermore, the present invention may repeat reference numerals and/or letters in the various examples. This repetition is for the purpose of simplicity and clarity and does not in itself dictate a relationship between the various embodiments and/or configurations discussed. It should be noted that the components illustrated in the figures are not necessarily drawn to scale. Descriptions of well-known components and processing techniques and processes are omitted so as to not unnecessarily obscure the present invention.
The following describes in detail an object movement optimization method and system in distributed object storage according to an embodiment of the present invention with reference to the accompanying drawings.
As shown in FIG. 1, the invention discloses a method for optimizing object movement in a distributed object store, which comprises the following operations:
acquiring the head object data and metadata of a source object from the data pool into memory, according to the object name and the unique identifier of the source bucket where the object is located;
modifying the original metadata in memory, and adding a new field to the metadata;
deleting the head object data and the metadata of the source object, and flushing the data in memory and the modified metadata to a new head object, named by the unique identifier of the destination bucket and the object name, in the same storage pool;
and writing the metadata information of the object into a bucket shard of the destination bucket in the index pool, deleting the metadata information of the object from the source bucket shard in the index pool, and completing the movement of the object.
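The four operations above can be sketched over a toy in-memory model. Everything concrete here is an assumption made for illustration: the pools are Python dicts, head objects are named `<bucket id>_<object name>`, the added field is called `src_bucket_id`, and the index pool is simplified to one dict per bucket with no shards.

```python
NEW_FIELD = "src_bucket_id"  # hypothetical name for the added metadata field


def head_name(bucket_id, object_name):
    # Assumed naming scheme: "<bucket id>_<object name>".
    return f"{bucket_id}_{object_name}"


def move_object(data_pool, index_pool, src_id, dst_id, object_name):
    """Move an object between buckets by rewriting only its head object."""
    # Step 1: load the head object's data and metadata into memory.
    head = data_pool[head_name(src_id, object_name)]
    data, meta = head["data"], dict(head["meta"])

    # Step 2: modify the metadata in memory and record the source bucket ID.
    meta[NEW_FIELD] = src_id

    # Step 3: delete the source head object, then flush the in-memory data
    # and modified metadata to a new head object in the same pool.
    del data_pool[head_name(src_id, object_name)]
    data_pool[head_name(dst_id, object_name)] = {"data": data, "meta": meta}

    # Step 4: add the index entry to the destination bucket and remove it
    # from the source bucket.
    index_pool[dst_id][object_name] = index_pool[src_id].pop(object_name)


data_pool = {
    "b1_report.pdf": {"data": b"head-bytes", "meta": {"etag": "abc"}},
    "b1_tail_report.pdf_0": {"data": b"tail-bytes", "meta": {}},
}
index_pool = {"b1": {"report.pdf": {"size": 20}}, "b2": {}}
move_object(data_pool, index_pool, "b1", "b2", "report.pdf")
```

After the call, the tail object is still stored under its original name; only the head object and the index entries have changed.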
For object movement between different buckets in the same data pool of an object storage system, the embodiment of the invention simplifies the move by moving and modifying only the metadata of the object, without moving the data.
The data of the object is stored in the data pool. The metadata is stored both in the head object of the object in the data pool and in one bucket shard of the object's bucket in the index pool, and it makes up only a small proportion of the object. In the embodiment of the invention, therefore, only the metadata is moved and modified, and it is updated in both the data pool and the index pool.
The head object of the target object is located in the data pool according to the object name and the unique identifier of the source bucket where the object is located, and the data and metadata in the head object are then read into memory. The original metadata of the object is modified directly in memory; most importantly, a new field is added whose value is the unique identifier of the source bucket. The field is used as follows: when a client reads the content of the object, the head object of the object is read first. If the metadata information of the head object contains the new field, the object has been moved; when the tail object data of the object is read, the name of the tail object is then determined according to the value of the new field, and the remaining data is read from the tail object.
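The read path described in this paragraph can be sketched as follows. The field name `src_bucket_id`, the `tail_count` metadata key, and the tail naming scheme `<bucket id>_tail_<object name>_<n>` are all assumptions for illustration; the source only states that the tail object name is determined from the value of the added field.

```python
NEW_FIELD = "src_bucket_id"  # hypothetical name for the added metadata field


def read_object(data_pool, bucket_id, object_name):
    """Return the full content: head data followed by all tail data."""
    head = data_pool[f"{bucket_id}_{object_name}"]
    # If the field is present the object was moved, so its tail objects are
    # still named after the source bucket; otherwise use the current bucket.
    tail_bucket = head["meta"].get(NEW_FIELD, bucket_id)
    parts = [head["data"]]
    for i in range(head["meta"].get("tail_count", 0)):
        parts.append(data_pool[f"{tail_bucket}_tail_{object_name}_{i}"]["data"])
    return b"".join(parts)


# An object that was moved from bucket "b1" to bucket "b2".
data_pool = {
    "b2_a.txt": {"data": b"HEAD", "meta": {"tail_count": 2, NEW_FIELD: "b1"}},
    "b1_tail_a.txt_0": {"data": b"-T0", "meta": {}},
    "b1_tail_a.txt_1": {"data": b"-T1", "meta": {}},
}
content = read_object(data_pool, "b2", "a.txt")
```

An object that was never moved has no such field, so the same function falls back to the bucket it is read from.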
The data and metadata in the source head object are deleted, and the data and new metadata in memory, including attribute information such as etag, tag, content_type, pg_ver and source_zone, are then flushed to a new head object, named by the unique identifier of the destination bucket and the object name, in the same storage pool. Metadata information such as the name, size, classification and owner of the object is written into a bucket shard of the destination bucket in the index pool; the number of the bucket shard is calculated from the name of the object using a hash algorithm.
The metadata information of the object in the source bucket shard in the index pool is then deleted.
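The index pool update can be sketched as below. The `IndexEntry` fields follow the attributes listed in the text (name, size, classification, owner); the class itself, the list-of-dicts shard layout, the `"STANDARD"` value, and the use of CRC32 as the hash are assumptions for illustration only.

```python
import zlib
from dataclasses import dataclass


@dataclass
class IndexEntry:
    name: str
    size: int
    category: str  # the object's "classification"
    owner: str


def shard_of(name, num_shards):
    # Assumed hash: CRC32 of the object name, reduced to a shard number.
    return zlib.crc32(name.encode("utf-8")) % num_shards


def relink_index(index_pool, src_id, dst_id, entry):
    """Write the entry to the destination shard, then drop it from the source shard."""
    dst_shards, src_shards = index_pool[dst_id], index_pool[src_id]
    dst_shards[shard_of(entry.name, len(dst_shards))][entry.name] = entry
    del src_shards[shard_of(entry.name, len(src_shards))][entry.name]


entry = IndexEntry(name="a.txt", size=10, category="STANDARD", owner="alice")
# Each bucket owns a list of shards; the two buckets may have different counts.
index_pool = {"b1": [{} for _ in range(4)], "b2": [{} for _ in range(8)]}
index_pool["b1"][shard_of("a.txt", 4)]["a.txt"] = entry
relink_index(index_pool, "b1", "b2", entry)
```

Writing the destination entry before deleting the source entry mirrors the order given in the text, so the object is never absent from both buckets in this model.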
In the invention, only the data and metadata in the head object are moved and modified during object movement; the tail object data is not moved. A field whose value is the unique identifier of the source bucket is added to the head object metadata, so that the tail object data can be read correctly when the object data is read, which ensures the integrity of data access.
According to the embodiment of the invention, the head object data and metadata of the source object are modified, and a new field is added to the metadata whose value is the unique identifier of the source bucket. With this field, the tail object data can be read correctly when the object is read, which ensures the integrity of data access. The data and the modified metadata are flushed to a new head object, named by the unique identifier of the destination bucket and the object name, in the same storage pool, which completes a fast move of the object. The code logic is simpler and fewer operation steps are needed. Because the existing data is reused rather than copied, the logic flow is greatly simplified, movement efficiency is improved, the performance of the distributed object storage system is improved, and users get a better experience.
As shown in FIG. 2, the embodiment of the invention further discloses an object movement optimization system in distributed object storage, which comprises:
the source object acquisition module, used for acquiring the head object data and metadata of a source object from the data pool into memory, according to the object name and the unique identifier of the source bucket where the object is located;
the metadata modification module, used for modifying the original metadata in memory and adding a new field to the metadata;
the data pool replacement module, used for deleting the head object data and the metadata of the source object, and flushing the data in memory and the modified metadata to a new head object, named by the unique identifier of the destination bucket and the object name, in the same storage pool;
and the index pool replacement module, used for writing the metadata information of the object into a bucket shard of the destination bucket in the index pool, deleting the metadata information of the object from the source bucket shard in the index pool, and completing the movement of the object.
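One possible way to arrange the four modules is shown below as a sketch. The class names mirror the module names in the text, but the interfaces, the dict-based pools, the `<bucket id>_<object name>` naming scheme, and the field name `src_bucket_id` are assumptions; index shards are omitted for brevity.

```python
FIELD = "src_bucket_id"  # hypothetical name for the added metadata field


class SourceObjectAcquisition:
    def run(self, data_pool, src_id, name):
        head = data_pool[f"{src_id}_{name}"]
        return head["data"], dict(head["meta"])  # copies into "memory"


class MetadataModification:
    def run(self, meta, src_id):
        meta[FIELD] = src_id  # record where the tail objects still live
        return meta


class DataPoolReplacement:
    def run(self, data_pool, src_id, dst_id, name, data, meta):
        del data_pool[f"{src_id}_{name}"]
        data_pool[f"{dst_id}_{name}"] = {"data": data, "meta": meta}


class IndexPoolReplacement:
    def run(self, index_pool, src_id, dst_id, name):
        index_pool[dst_id][name] = index_pool[src_id].pop(name)


class MoveSystem:
    """Runs the four modules in the order given in the text."""

    def __init__(self):
        self.acquire = SourceObjectAcquisition()
        self.modify = MetadataModification()
        self.replace_data = DataPoolReplacement()
        self.replace_index = IndexPoolReplacement()

    def move(self, data_pool, index_pool, src_id, dst_id, name):
        data, meta = self.acquire.run(data_pool, src_id, name)
        meta = self.modify.run(meta, src_id)
        self.replace_data.run(data_pool, src_id, dst_id, name, data, meta)
        self.replace_index.run(index_pool, src_id, dst_id, name)


data_pool = {"b1_k": {"data": b"h", "meta": {"etag": "e1"}}}
index_pool = {"b1": {"k": {"size": 1}}, "b2": {}}
MoveSystem().move(data_pool, index_pool, "b1", "b2", "k")
```

Keeping each module behind a single `run` method makes it straightforward to test the data pool and index pool updates separately.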
The head object of the target object is located in the data pool according to the object name and the unique identifier of the source bucket where the object is located, and the data and metadata in the head object are then read into memory. The original metadata of the object is modified directly in memory; most importantly, a new field is added whose value is the unique identifier of the source bucket. The field is used as follows: when a client reads the content of the object, the head object of the object is read first. If the metadata information of the head object contains the new field, the object has been moved; when the tail object data of the object is read, the name of the tail object is then determined according to the value of the new field, and the remaining data is read from the tail object.
The data and metadata in the source head object are deleted, and the data and new metadata in memory, including attribute information such as etag, tag, content_type, pg_ver and source_zone, are then flushed to a new head object, named by the unique identifier of the destination bucket and the object name, in the same storage pool. Metadata information such as the name, size, classification and owner of the object is written into a bucket shard of the destination bucket in the index pool; the number of the bucket shard is calculated from the name of the object using a hash algorithm.
The metadata information of the object in the source bucket shard in the index pool is then deleted.
In the invention, only the data and metadata in the head object are moved and modified during object movement; the tail object data is not moved. A field whose value is the unique identifier of the source bucket is added to the head object metadata, so that the tail object data can be read correctly when the object data is read, which ensures the integrity of data access.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.
Claims (6)
1. A method for optimizing object movement in a distributed object store, the method comprising the following operations:
acquiring the head object data and metadata of a source object from the data pool into memory, according to the object name and the unique identifier of the source bucket where the object is located;
modifying the original metadata in memory, and adding a new field to the metadata;
wherein, when a client reads the content of the object, the head object of the object is read, and if the metadata information of the head object contains the new field, indicating that the object has been moved, the name of the tail object is determined according to the value of the new field when the tail object data of the object is read, and the remaining data is then read from the tail object;
deleting the head object data and the metadata of the source object, and flushing the data in memory and the modified metadata to a new head object, named by the unique identifier of the destination bucket and the object name, in the same storage pool;
and writing the metadata information of the object into a bucket shard of the destination bucket in the index pool, deleting the metadata information of the object from the source bucket shard in the index pool, and completing the movement of the object.
2. The method for optimizing object movement in a distributed object store according to claim 1, wherein the metadata includes the attribute information etag, tag, content_type, pg_ver and source_zone.
3. The method for optimizing object movement in a distributed object store according to claim 1, wherein the number of the bucket shard is calculated from the name of the object using a hash algorithm.
4. An object movement optimization system in a distributed object store, the system comprising:
the source object acquisition module, used for acquiring the head object data and metadata of a source object from the data pool into memory, according to the object name and the unique identifier of the source bucket where the object is located;
the metadata modification module, used for modifying the original metadata in memory and adding a new field to the metadata; wherein, when a client reads the content of the object, the head object of the object is read, and if the metadata information of the head object contains the new field, indicating that the object has been moved, the name of the tail object is determined according to the value of the new field when the tail object data of the object is read, and the remaining data is then read from the tail object;
the data pool replacement module, used for deleting the head object data and the metadata of the source object, and flushing the data in memory and the modified metadata to a new head object, named by the unique identifier of the destination bucket and the object name, in the same storage pool;
and the index pool replacement module, used for writing the metadata information of the object into a bucket shard of the destination bucket in the index pool, deleting the metadata information of the object from the source bucket shard in the index pool, and completing the movement of the object.
5. The system of claim 4, wherein the metadata includes the attribute information etag, tag, content_type, pg_ver and source_zone.
6. The system of claim 4, wherein the number of the bucket shard is calculated from the name of the object using a hash algorithm.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210018247.0A CN114546261B (en) | 2022-01-07 | 2022-01-07 | Object movement optimization method and system in distributed object storage |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114546261A (en) | 2022-05-27 |
CN114546261B (en) | 2023-08-08 |
Family
ID=81669392
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210018247.0A Active CN114546261B (en) | 2022-01-07 | 2022-01-07 | Object movement optimization method and system in distributed object storage |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114546261B (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110018897A (en) * | 2018-01-09 | 2019-07-16 | 阿里巴巴集团控股有限公司 | Data processing method, device and calculating equipment |
CN113886331A (en) * | 2021-12-03 | 2022-01-04 | 苏州浪潮智能科技有限公司 | Distributed object storage method and device, electronic equipment and readable storage medium |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150227414A1 (en) * | 2012-08-31 | 2015-08-13 | Pradeep Varma | Systems And Methods Of Memory And Access Management |
US11409720B2 (en) * | 2019-11-13 | 2022-08-09 | Western Digital Technologies, Inc. | Metadata reduction in a distributed storage system |
Also Published As
Publication number | Publication date |
---|---|
CN114546261A (en) | 2022-05-27 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN101826107B (en) | Hash data processing method and device | |
US20210081128A1 (en) | Key-value store using journaling with selective data storage format | |
CN103488709B (en) | A kind of index establishing method and system, search method and system | |
CN110147204B (en) | Metadata disk-dropping method, device and system and computer-readable storage medium | |
CN103164490B (en) | A kind of efficient storage implementation method of not fixed-length data and device | |
JP6161266B2 (en) | Information processing apparatus, control method therefor, electronic device, program, and storage medium | |
US8495286B2 (en) | Write buffer for improved DRAM write access patterns | |
CN1645516B (en) | Data recovery apparatus and method used for flash memory | |
CN106599091B (en) | RDF graph structure storage and index method based on key value storage | |
CN107122130A (en) | A kind of data delete method and device again | |
CN110888837A (en) | Object storage small file merging method and device | |
CN109460404A (en) | A kind of efficient Hbase paging query method based on redis | |
KR101226600B1 (en) | Memory System And Memory Mapping Method thereof | |
CN108664577B (en) | File management method and system based on FLASH idle area | |
CN111897828A (en) | Data batch processing implementation method, device, equipment and storage medium | |
CN114546261B (en) | Object movement optimization method and system in distributed object storage | |
CN116662327B (en) | Data fusion cleaning method for database | |
CN109325022B (en) | Data processing method and device | |
CN106909623B (en) | A kind of data set and date storage method for supporting efficient mass data to analyze and retrieve | |
KR101344649B1 (en) | Hash-based skyline query processing method and apparatus thereof | |
CN110515897B (en) | Method and system for optimizing reading performance of LSM storage system | |
CN109189696B (en) | SSD (solid State disk) caching system and caching method | |
CN117035000A (en) | Evolutionary dual-task feature selection method based on mixed initialization particle swarm optimization | |
CN107450859B (en) | Method and device for reading file data | |
CN112328630B (en) | Data query method, device, equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||