CN114415954A - Optimization method and device for Ceph object storage metadata processing - Google Patents

Info

Publication number
CN114415954A
Authority
CN
China
Prior art keywords
metadata
memory
database
storage
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210006071.7A
Other languages
Chinese (zh)
Inventor
李毅
李海静
晁飞
谢福平
季小庭
徐博
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fiberhome Telecommunication Technologies Co Ltd
Original Assignee
Fiberhome Telecommunication Technologies Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fiberhome Telecommunication Technologies Co Ltd
Priority to CN202210006071.7A
Publication of CN114415954A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601: Interfaces specially adapted for storage systems
    • G06F3/0602: Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0614: Improving the reliability of storage systems
    • G06F3/0619: Improving the reliability of storage systems in relation to data integrity, e.g. data losses, bit errors
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601: Interfaces specially adapted for storage systems
    • G06F3/0602: Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604: Improving or facilitating administration, e.g. storage management
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00: Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06: Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601: Interfaces specially adapted for storage systems
    • G06F3/0602: Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/061: Improving I/O performance

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Human Computer Interaction (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computer Security & Cryptography (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention belongs to the technical field of object storage of distributed storage, and particularly relates to an optimization method and device for Ceph object storage metadata processing, wherein the optimization method and device comprise the following steps: judging the data type of the request operation; if the data type of the requested operation is metadata, calling a metadata storage engine to interact with a database in the memory to complete corresponding operation; and if the data type of the requested operation is object data, calling a native storage engine to interact with the physical hard disk to complete corresponding operation. The invention separates the operation of the metadata and the operation of the object data, and because the operation of the metadata is finished by using the newly added metadata storage engine and the database in the memory in an interactive way, the processing efficiency can be effectively improved, thereby supporting the object storage application scene under the mass files.

Description

Optimization method and device for Ceph object storage metadata processing
[ technical field ]
The invention relates to the technical field of object storage of distributed storage, in particular to an optimization method and device for Ceph object storage metadata processing.
[ background of the invention ]
With the rapid development of cloud computing, big data, the mobile internet, and social networks, enterprises' unstructured data is growing explosively. Faced with this growth, traditional storage has become a bottleneck in enterprise digital transformation, and choosing a more suitable storage technology has become a necessity in IT infrastructure construction. Compared with traditional file system storage, object storage uses the object as the unit of data storage, abandons file-system-style metadata management, and stores all objects in a flat key-value manner, which greatly reduces the complexity of metadata management.
Ceph is a unified open source distributed storage system that supports both traditional block and file storage protocols and the newer object storage protocols. It offers high availability, high scalability, and high performance, and is therefore widely used in many fields.
The data in Ceph object storage falls into two major classes. One is metadata: data that describes the attributes of object data and supports functions such as indicating the storage location of object data, keeping historical data, resource search, and file records. It mainly includes user metadata, bucket metadata, object index metadata, and the like. The other is the actual object data, that is, the data uploaded by users, which is meaningful and directly visible to them, such as pictures, documents, and videos. Currently, Ceph object storage manages metadata and object data in a unified way, reading and writing both through the same storage engine (the native storage engine) and storing both types of data on the physical hard disk. To operate on a piece of object data, the native storage engine must first load the related metadata from the physical hard disk, then search the index of the object data, and finally use the index to find the storage location of the target object data on the disk and perform the read or write. When the data volume is large (for example, when a massive number of small-file objects are stored), reading a huge index from the hard disk and searching it is time-consuming, which leads to poor object storage read/write performance in scenarios with massive numbers of files.
In view of the above, overcoming the drawbacks of the prior art is an urgent problem in the art.
[ summary of the invention ]
The technical problem to be solved by the invention is as follows:
In the prior art, Ceph object storage manages metadata and object data in a unified way, reading and writing both through the same storage engine (the native storage engine) and storing both types of data on the physical hard disk. To operate on a piece of object data, the native storage engine must first load the related metadata from the physical hard disk, then search the index of the object data, and finally use the index to find the storage location of the target object data on the disk and perform the read or write. When the data volume is large (for example, when a massive number of small-file objects are stored), reading a huge index from the hard disk and searching it is time-consuming, which leads to poor object storage read/write performance in scenarios with massive numbers of files.
The invention adopts the following technical scheme:
in a first aspect, the present invention provides a method for optimizing Ceph object storage metadata processing, including:
judging the data type of the request operation;
if the data type of the requested operation is metadata, calling a metadata storage engine to interact with a database in the memory to complete corresponding operation;
and if the data type of the requested operation is object data, calling a native storage engine to interact with the physical hard disk to complete corresponding operation.
Preferably, the metadata storage engine is at the same logical layer as the native storage engine;
the upper-layer application calls the designated classes and functions in the interface adaptation layer, and, with the data type (metadata or object data) as the branch condition, the metadata storage engine or the native storage engine is called under the corresponding branch to process the operation request;
the branch condition and the actions called under each branch are added inside the implementation of the interface adaptation layer, so that the externally visible definition of the interface adaptation layer remains unchanged.
Preferably, the metadata storage engine includes an external interface layer, an intermediate conversion layer, and a driver layer, specifically:
the external interface layer provides callable standard functional interfaces to the interface adaptation layer above it;
the intermediate conversion layer abstracts a uniform standard operation interface over each in-memory database for the external interface layer to call, and also interfaces with the driver module of each in-memory database in the driver layer;
and the driver layer implements, for each in-memory database, a driver module for accessing that database.
Preferably, to write or read metadata, the corresponding standard functional interface of the metadata storage engine is called, and the metadata is written to or read from the in-memory database via the intermediate conversion layer and the driver layer.
Preferably, if the object data is read, calling a metadata storage engine according to the information carried in the operation request, and acquiring corresponding metadata from a database in the memory;
and calling a native storage engine to read the object data from the disk according to the metadata acquired from the database in the memory.
Preferably, if the object data is written, calling a metadata storage engine according to the information carried in the operation request, and acquiring corresponding metadata from a database in the memory;
calling a native storage engine to write object data in a disk according to metadata acquired from a database in a memory, and obtaining metadata which is newly added when the object data is written in the disk;
and calling a metadata storage engine to write the newly added metadata into a database in the memory.
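As an illustration only, the object read and write flows just described could be sketched in C++ as follows. All names here (MetadataEngine, NativeEngine, put_object, get_object) are hypothetical, and the std::set, std::map, and std::vector containers merely stand in for the in-memory database and the physical hard disk; this is not the disclosed implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the metadata storage engine and its in-memory database.
struct MetadataEngine {
    std::set<std::string> buckets;             // bucket metadata
    std::map<std::string, std::size_t> index;  // "bucket/key" -> location on disk
};

// Hypothetical stand-in for the native storage engine and the physical hard disk.
struct NativeEngine {
    std::vector<std::string> disk;
    std::size_t write(const std::string& data) {
        disk.push_back(data);
        return disk.size() - 1;  // location of the newly written object
    }
    const std::string& read(std::size_t location) const { return disk.at(location); }
};

// Write path: look up metadata in memory, write the object to disk,
// then record the newly added metadata in memory.
void put_object(MetadataEngine& meta, NativeEngine& native,
                const std::string& bucket, const std::string& key,
                const std::string& data) {
    if (meta.buckets.count(bucket) == 0) throw std::runtime_error("no such bucket");
    const std::size_t location = native.write(data);
    assert(location < native.disk.size());
    meta.index[bucket + "/" + key] = location;
}

// Read path: look up metadata in memory, then read the object from disk.
std::string get_object(const MetadataEngine& meta, const NativeEngine& native,
                       const std::string& bucket, const std::string& key) {
    const auto it = meta.index.find(bucket + "/" + key);
    if (it == meta.index.end()) throw std::runtime_error("no such object");
    return native.read(it->second);
}
```

In this sketch the index lookup never touches the disk stand-in, which is the point of keeping metadata in memory; only the object bytes themselves go through the native engine.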
Preferably, the method further comprises an initialization process, specifically:
during initialization, the corresponding in-memory database is started through the external interface layer, the intermediate conversion layer, and the driver module in the driver layer provided by the metadata storage engine;
while the in-memory database is starting, all metadata persisted on the physical hard disk is read and loaded into the in-memory database being started.
Preferably, the method further comprises the step of periodically writing the metadata in the database in the memory into the physical hard disk for persistent storage through an asynchronous thread in the metadata storage engine, so as to prevent the metadata from being lost due to abnormal power failure of the node or abnormal exit of service.
Preferably, the metadata includes cluster metadata and user metadata information;
the cluster metadata comprises one or more of realm (region), zonegroup (namespace group), and zone (namespace) metadata;
the user metadata information includes one or more of user metadata, acl (access control list) metadata, quota metadata, object metadata, and bucket_index (bucket index) metadata.
In a second aspect, the present invention further provides an optimization apparatus for Ceph object storage metadata processing, including at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor and programmed to perform the method of optimizing Ceph object storage metadata processing of the first aspect.
The invention has the beneficial effects that:
the method separates the operation of metadata from the operation of object data, and if the data type of the requested operation is metadata, a newly added metadata storage engine is called to interact with a database in a memory to complete corresponding operation; and if the data type of the requested operation is object data, calling a native storage engine to interact with the physical hard disk to complete corresponding operation. Because the operation of the metadata is finished by using the newly added metadata storage engine to interact with the database in the memory, and the memory reading and writing efficiency is far higher than that of a physical hard disk, the processing efficiency can be effectively improved, and the object storage application scene under the mass files is supported.
Furthermore, after the metadata storage engine is newly added, only the internal implementation of the interface adaptation layer needs to be modified, and the interface definition remains unchanged, so that the upper-layer service system directly calls the interface of the interface adaptation layer, does not sense the change, and minimizes the influence on the service system. In addition, the metadata storage engine adopts a driving layer in design, the driving layer can be adapted to databases in various memories and is flexible and configurable, and a user can select the database in the memory used by the back end according to actual needs.
[ description of the drawings ]
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings required to be used in the embodiments of the present invention will be briefly described below. It is obvious that the drawings described below are only some embodiments of the invention, and that for a person skilled in the art, other drawings can be derived from them without inventive effort.
Fig. 1 is a flowchart of an optimization method for Ceph object storage metadata processing according to an embodiment of the present invention;
fig. 2 is an architecture diagram of a metadata storage engine of an optimization method for Ceph object storage metadata processing according to an embodiment of the present invention;
fig. 3 is an architecture diagram of an optimization method for Ceph object storage metadata processing according to an embodiment of the present invention;
fig. 4 is a timing diagram of an optimization method for Ceph object storage metadata processing according to an embodiment of the present invention;
fig. 5 is an architecture diagram of an optimization method for Ceph object storage metadata processing according to an embodiment of the present invention;
fig. 6 is a block diagram of an optimization apparatus for Ceph object storage metadata processing according to an embodiment of the present invention.
[ detailed description ]
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
In the description of the present invention, the terms "inner", "outer", "longitudinal", "lateral", "upper", "lower", "top", "bottom", and the like indicate orientations or positional relationships based on those shown in the drawings, and are for convenience only to describe the present invention without requiring the present invention to be necessarily constructed and operated in a specific orientation, and thus should not be construed as limiting the present invention.
In addition, the technical features involved in the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
Example 1:
embodiment 1 of the present invention provides an optimization method for Ceph object storage metadata processing, as shown in fig. 1, including:
Step 1: Determine the data type of the requested operation. After the interface adaptation layer receives the operation request, it calls a different storage engine, depending on the data type of the requested operation, to execute the request.
Step 2: If the data type of the requested operation is metadata, call the metadata storage engine to interact with the in-memory database to complete the corresponding operation. Architecturally, the metadata storage engine is divided into 3 layers: an external interface layer, an intermediate conversion layer, and a driver layer.
The external interface layer is the outward-facing window of the metadata storage engine. By defining a set of standard functional interfaces, it exposes the functions the metadata storage engine can provide to the interface adaptation layer above it. When the interface adaptation layer needs a function provided by the metadata storage engine, it only has to call the standard functional interface provided by the external interface layer, and need not be concerned with the implementation and processing details inside the metadata storage engine. The standard functional interfaces include one or more of: synchronizing data to the physical hard disk, managing the start and stop of the back-end database service, a user metadata management class, a bucket metadata management class, an object metadata management class, and an index metadata management class. The user metadata management class defines the create, delete, update, and query operations for user metadata; the bucket metadata management class defines them for bucket metadata; the object metadata management class defines them for object metadata; and the index metadata management class defines them for index metadata.
The intermediate conversion layer interfaces with the driver files of the in-memory databases in the driver layer and abstracts a uniform standard operation interface over each back-end in-memory database (an in-memory database is a database that uses memory as its storage medium, i.e., the database in the memory) for the external interface layer to call, so that an in-memory database can be integrated into the object storage system in a plug-and-play manner simply by implementing the standard operation interface. The external interface layer always calls the standard operation interface that matches the operation type. The operation types include create, delete, query, update, and so on, and the standard operation interface likewise includes create, delete, query, and update. For example, if the operation type is query, the external interface layer calls the DBOp::driver->get_key() function of the standard operation interface (the query interface). Because of the intermediate conversion layer, the external interface layer is unaware of the change when a new database is adapted; it simply keeps calling the uniform standard operation interface abstracted by the intermediate conversion layer.
The driver layer encapsulates, for each kind of in-memory database, the corresponding database operation interfaces according to the standard operation interface defined by the intermediate conversion layer, for the intermediate conversion layer to call. Each in-memory database corresponds to one driver file.
Step 3: If the data type of the requested operation is object data, call the native storage engine to interact with the physical hard disk to complete the corresponding operation.
The embodiment provides a manner that can be implemented in an actual scenario, as shown in fig. 2-3, a Ceph object store includes an object store gateway, and the object store gateway includes a front-end processing layer and a back-end data processing layer, where the front-end processing layer includes an HTTP Server module, a RestAPI general processing layer, a Handler module, and an OP Operation module (Operation module); the back-end data processing layer comprises an interface adaptation layer, a metadata storage engine and a native storage engine, wherein the metadata storage engine and the native storage engine are in the same logic layer, the metadata storage engine comprises an external interface layer, a middle conversion layer and a drive layer, and the specific steps are as follows:
Suppose the operation request is to query the metadata of the user whose key is "user_key", as shown in fig. 4. To write or read metadata, the corresponding standard functional interface of the metadata storage engine is called, and the metadata is written to or read from the in-memory database via the intermediate conversion layer and the driver layer.
After receiving an operation request for inquiring a certain user sent by a client, the object storage gateway judges that the data type of the operation requested this time is metadata.
After the operation request sequentially passes through the HTTP Server module, the RestAPI universal processing layer, the Handler module and the OP operation module of the object storage gateway, the OP operation module calls a storage engine corresponding to the interface adaptation layer to execute the operation.
The interface adaptation layer selects different code branches according to different data types of the requested operation, so as to call different storage engines to execute the operation request of query (namely reading):
Since the requested operation is a query of user metadata, the interface adaptation layer calls the corresponding standard functional interface defined by the external interface layer of the metadata storage engine. Specifically, it calls the UserMetaOp function in the user metadata management class of the standard functional interface, where user_key is the key of the user, to execute the query request.
Since the operation type is a query, the UserMetaOp function (called with user_key) goes on to call the DBOp::driver->get_key() function in the standard operation interface of the intermediate conversion layer to execute the query. The intermediate conversion layer then, according to the preconfigured driver file, calls the corresponding database operation interface defined in that driver file to query the user metadata from the in-memory database that the driver file corresponds to. Assuming the preconfigured driver file is redis.cc, which contains the database operation interfaces of the Redis database, the intermediate conversion layer calls the corresponding database operation interface in redis.cc to query the user metadata from the in-memory Redis database, and the queried user metadata is returned upward layer by layer. The in-memory databases include the Redis database, LevelDB, Rocks, the TiKV database, Xedis, and so on, which are not described in detail here. This embodiment is merely an example and is not intended to limit the present invention.
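For illustration, the query path through the three layers could be sketched in C++ roughly as follows. Only the names UserMetaOp, DBOp, driver, and get_key come from the description above; the Driver base class, the MapDriver stand-in (a std::map playing the role of a driver such as redis.cc), the set_key operation, and the get/put method names and signatures are assumptions made for this sketch.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Driver layer: one implementation per in-memory database (hypothetical interface).
struct Driver {
    virtual ~Driver() = default;
    virtual bool get_key(const std::string& key, std::string& out) const = 0;
    virtual void set_key(const std::string& key, const std::string& value) = 0;
};

// Stand-in for a driver file such as redis.cc; a std::map replaces the real database.
struct MapDriver : Driver {
    std::map<std::string, std::string> kv;
    bool get_key(const std::string& key, std::string& out) const override {
        const auto it = kv.find(key);
        if (it == kv.end()) return false;
        out = it->second;
        return true;
    }
    void set_key(const std::string& key, const std::string& value) override {
        kv[key] = value;
    }
};

// Intermediate conversion layer: uniform operations forwarded to the configured driver.
struct DBOp {
    std::unique_ptr<Driver> driver;
    explicit DBOp(std::unique_ptr<Driver> d) : driver(std::move(d)) {
        assert(driver != nullptr);
    }
};

// External interface layer: user metadata management class (method names assumed).
class UserMetaOp {
public:
    explicit UserMetaOp(DBOp& db) : db_(db) {}
    bool get(const std::string& user_key, std::string& out) const {
        return db_.driver->get_key(user_key, out);  // query goes down layer by layer
    }
    void put(const std::string& user_key, const std::string& meta) {
        db_.driver->set_key(user_key, meta);
    }
private:
    DBOp& db_;
};
```

Because UserMetaOp only sees DBOp and the abstract Driver, swapping MapDriver for another driver would not change the external interface layer in this sketch, which mirrors the plug-and-play property described above.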
The metadata storage engine is in the same logical layer as the native storage engine; the upper application calls a designated class and a function in the interface adaptation layer, and calls a metadata storage engine or a native storage engine to process an operation request under a corresponding branch according to the condition that the data type is metadata or object data; wherein the branch condition and the action called under the branch are added in the interface adaptation layer implementation, thereby keeping the interface adaptation layer definition unchanged.
The OP operation module interacts directly with the interface adaptation layer. Inside the interface adaptation layer, whether the data type is metadata or object data serves as the branch condition, and the metadata storage engine or the native storage engine is called under the corresponding branch to process the operation request. For example, suppose a native interface function of the interface adaptation layer is func A, whose parameters are defined in advance and therefore do not change when a branch is added to its internal implementation. The OP operation module calls func A directly. Internally, func A splits into two branches according to whether the data type is metadata or object data: one branch calls the native storage engine to execute operation requests on object data, and the other calls the metadata storage engine to execute operation requests on metadata. Because the branch condition and the actions called under each branch are added inside the implementation of the interface adaptation layer, the definition of the interface adaptation layer remains unchanged, so the OP operation module perceives no change when calling func A. This helps to reduce the impact on the business system.
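A minimal C++ sketch of this branching, under stated assumptions, might look like the following. The spelling func_A, the OpRequest structure, the DataType enumeration, and the two engine classes with their handle methods are all hypothetical; the engines simply report which of them handled the request.

```cpp
#include <cassert>
#include <string>

enum class DataType { Metadata, ObjectData };

// Hypothetical request passed down from the OP operation module.
struct OpRequest {
    DataType type;
    std::string key;
};

// Hypothetical engines; each only reports that it handled the request.
struct NativeStorageEngine {
    std::string handle(const OpRequest& req) { return "native:" + req.key; }
};
struct MetadataStorageEngine {
    std::string handle(const OpRequest& req) { return "metadata:" + req.key; }
};

class InterfaceAdaptationLayer {
public:
    // The public signature is the same before and after the metadata engine is added;
    // only the body gains a branch.
    std::string func_A(const OpRequest& req) {
        if (req.type == DataType::Metadata) {
            return metadata_.handle(req);  // newly added branch
        }
        return native_.handle(req);        // original path
    }
private:
    NativeStorageEngine native_;
    MetadataStorageEngine metadata_;
};
```

A caller such as the OP operation module would keep invoking func_A with the same arguments as before, so nothing on the calling side needs to be recompiled against a new interface in this sketch.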
The metadata storage engine comprises an external interface layer, an intermediate conversion layer, and a driver layer. The external interface layer provides callable standard functional interfaces to the interface adaptation layer above it.
The intermediate conversion layer abstracts a uniform standard operation interface over each back-end in-memory database, and the external interface layer always calls this standard operation interface for processing; the intermediate conversion layer also interfaces with the driver module of each in-memory database in the driver layer. Whenever a new in-memory database is adapted, the external interface layer is unaware of the adaptation.
The driver layer implements, for each in-memory database, a driver module for accessing that database: it encapsulates the corresponding database operation interfaces for each kind of in-memory database according to the standard operation interface defined by the intermediate conversion layer, and each in-memory database corresponds to one driver file.
The method further comprises an initialization process, specifically: starting a corresponding database in a memory through an external interface layer, a middle conversion layer and a driving module in a driving layer provided by a metadata storage engine in an initialization process; in the starting process of the database in the memory, all metadata which are persistently stored on the physical hard disk are read, and all metadata read from the physical hard disk are loaded into the database in the memory to be started.
After the object storage gateway of the Ceph object store starts, it initializes various global variables and worker threads and registers various resource managers and resource handlers. Worker thread initialization includes initializing the metadata management worker thread. During that initialization, the corresponding in-memory database is started by way of the function for managing the start and stop of the back-end database service among the standard functional interfaces of the external interface layer, the intermediate conversion layer, and the driver file in the driver layer. While starting, the in-memory database reads all metadata persisted on the physical hard disk and loads it into the in-memory database being started.
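The load step at start-up could be sketched in C++ as follows. This is only an illustration: the on-disk format is not specified above, so the sketch assumes one "key<TAB>value" record per line, and the names MemoryDb and load_all_metadata are hypothetical. A std::istream stands in for the persisted file.

```cpp
#include <cassert>
#include <cstddef>
#include <istream>
#include <map>
#include <sstream>
#include <string>

using MemoryDb = std::map<std::string, std::string>;

// Load every "key<TAB>value" record from the persisted snapshot into the
// in-memory database and return the number of records loaded.
// The record format is an assumption made for this sketch.
std::size_t load_all_metadata(std::istream& snapshot, MemoryDb& db) {
    std::size_t loaded = 0;
    std::string line;
    while (std::getline(snapshot, line)) {
        const auto tab = line.find('\t');
        if (tab == std::string::npos) continue;  // skip malformed records
        db[line.substr(0, tab)] = line.substr(tab + 1);
        ++loaded;
    }
    return loaded;
}
```

In a real gateway the stream would be opened on the persisted metadata file (for example with std::ifstream) before any request is served, so that every later metadata query can be answered from memory.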
To prevent the metadata in the in-memory database from being lost because a node loses power abnormally or the service exits abnormally, the method further comprises periodically writing the metadata in the in-memory database to the physical hard disk for persistent storage through an asynchronous thread in the metadata storage engine.
Specifically: while the main thread of the object storage gateway is running, an asynchronous thread of the metadata storage engine periodically (for example, once an hour) writes the metadata in the in-memory database to the physical hard disk for persistent storage. When the main thread of the object storage gateway exits, the asynchronous thread of the metadata storage engine is also triggered to flush all metadata in the in-memory database to disk, so that the next time the object storage gateway starts, the in-memory database can be initialized by loading all the metadata persisted on the physical hard disk.
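One possible shape for such an asynchronous flush thread is sketched below in C++. The names MetaStore and PeriodicFlusher are hypothetical, two std::map objects stand in for the in-memory database and the persisted copy, and copying the whole map is an assumption made for brevity rather than the disclosed persistence mechanism.

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <map>
#include <mutex>
#include <string>
#include <thread>

using KV = std::map<std::string, std::string>;

// Hypothetical stand-ins: `memory` is the in-memory database, `disk` the persisted copy.
struct MetaStore {
    std::mutex mu;
    KV memory;
    KV disk;
    void flush() {
        std::lock_guard<std::mutex> lock(mu);
        disk = memory;  // full snapshot; a real engine would write to a file
    }
};

class PeriodicFlusher {
public:
    PeriodicFlusher(MetaStore& store, std::chrono::milliseconds period)
        : store_(store), period_(period), worker_([this] { run(); }) {}
    ~PeriodicFlusher() { stop(); }

    // Called when the gateway's main thread exits: stop the timer, then flush everything.
    void stop() {
        {
            std::lock_guard<std::mutex> lock(mu_);
            if (stopped_) return;
            stopped_ = true;
        }
        cv_.notify_one();
        worker_.join();
        store_.flush();
    }

private:
    void run() {
        std::unique_lock<std::mutex> lock(mu_);
        // wait_for returns false on timeout (time to flush) and true once stop is requested.
        while (!cv_.wait_for(lock, period_, [this] { return stopped_; })) {
            lock.unlock();
            store_.flush();
            lock.lock();
        }
    }

    MetaStore& store_;
    std::chrono::milliseconds period_;
    std::mutex mu_;
    std::condition_variable cv_;
    bool stopped_ = false;
    std::thread worker_;  // declared last so it starts after the other members exist
};
```

With a period of one hour, anything written since the last flush would still be lost on an abnormal power failure, so the period is a trade-off between flush overhead and the amount of metadata at risk; the final flush in stop() only covers an orderly exit.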
The metadata comprises cluster metadata and user metadata information. The cluster metadata comprises one or more of realm (region), zonegroup (namespace group), and zone (namespace) metadata; the user metadata information includes one or more of user metadata, acl (access control list) metadata, quota metadata, object metadata, and bucket_index (bucket index) metadata.
Example 2:
To aid further understanding of the present invention, this embodiment builds on embodiment 1 and describes an implementation in a practical scenario, assuming that an in-memory Redis database is used to store the metadata. As shown in fig. 5, the system includes three gateway nodes: gateway node 1, gateway node 2, and gateway node 3. Each gateway node includes a Redis Server (a service instance of the Redis database) and an object storage gateway, and the three gateway nodes form a Redis database cluster deployed with one master and two slaves. Within the Redis database cluster, gateway node 1 acts as the master node and serves external requests, while gateway node 2 and gateway node 3 act as standby nodes that periodically synchronize metadata from the master node. The standby nodes serve only as hot standbys; when the master node fails, the system automatically switches over from the master node to a standby node. The Redis Server and the object storage gateway are deployed on 3 servers, and the OSDs and Monitors of the RADOS system are deployed on other servers (at least 3). The object storage gateway completes metadata operation requests through interaction between the metadata storage engine and the Redis database cluster, and completes object data operation requests through interaction between the native storage engine and the OSDs.
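The split between the two request paths (metadata to the metadata storage engine, object data to the native storage engine) can be illustrated with a small dispatch sketch. All names here (`DataType`, `DictBackedEngine`, `InterfaceAdaptationLayer.dispatch`) are hypothetical, and both engines are modelled as dictionaries rather than a Redis cluster or OSDs.

```python
from enum import Enum


class DataType(Enum):
    METADATA = "metadata"
    OBJECT_DATA = "object_data"


class DictBackedEngine:
    """Toy engine: a dictionary stands in for the real back end (assumption)."""

    def __init__(self, name):
        self.name = name
        self.store = {}

    def handle(self, op, key, value=None):
        if op == "put":
            self.store[key] = value
            return value
        if op == "get":
            return self.store.get(key)
        raise ValueError(f"unsupported operation: {op}")


class InterfaceAdaptationLayer:
    """Routes each request to one engine according to its data type."""

    def __init__(self, metadata_engine, native_engine):
        self._engines = {
            DataType.METADATA: metadata_engine,    # in-memory database path
            DataType.OBJECT_DATA: native_engine,   # physical hard disk path
        }

    def dispatch(self, data_type, op, key, value=None):
        # The branch lives inside the layer, so callers see one unchanged API.
        return self._engines[data_type].handle(op, key, value)


metadata_engine = DictBackedEngine("metadata")
native_engine = DictBackedEngine("native")
layer = InterfaceAdaptationLayer(metadata_engine, native_engine)
layer.dispatch(DataType.METADATA, "put", "demo", {"id": "demo"})
layer.dispatch(DataType.OBJECT_DATA, "put", "file1.obj", b"payload")
```

Keeping the branch inside the adaptation layer mirrors the source's point that upper-layer callers do not need to change.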
Assume that this embodiment creates a user for the S3 protocol.
First, an S3 user needs to be created through the radosgw-admin tool (a command-line management tool for Ceph object storage), and the metadata of the created user is stored in the in-memory Redis database. The following describes the specific flow of this embodiment, taking the creation of the user demo as an example.
Step S11: After the Ceph object storage gateway service starts, it initializes various global variables and worker threads and registers various resource managers and resource processors.
Step S12: The object storage gateway initializes the metadata management worker thread. During initialization, the thread starts the Redis database by calling the back-end database service start/stop management function in the standard functional interface provided by the metadata storage engine, through the intermediate conversion layer and the Redis driver file. While starting, the Redis database reads all metadata persisted on the physical hard disk and loads it into memory. As shown in fig. 5, the standby nodes periodically synchronize metadata from the master node. The metadata is stored in Key-Value form in the memory of the node running the Redis Server.
Step S13: The radosgw-admin tool sends a user creation request to the object storage gateway; the request carries the name of the user to be created, demo. After receiving the operation request, the object storage gateway parses it. Operation requests sent by the radosgw-admin tool are handled by the admin-type resource manager. Because the request is a create operation on the User resource, the corresponding resource manager instance RGWRESTMgr_User, resource processor instance RGWHandler_User, and operation object instance RGWOp_User_Create are obtained.
Step S14: The operation object instance RGWOp_User_Create preprocesses the operation request, mainly performing authentication and operation permission checks.
Step S14-1: The user creation request is authenticated. Because the radosgw-admin tool operates through the built-in admin account, the request passes authentication directly.
Step S14-2: Then, using the user name demo carried in the operation request as the Key, the interface adaptation layer calls the query operation defined by the user metadata management class in the standard functional interface provided by the metadata storage engine, that is, the function UserMetaOp::get(demo). The intermediate conversion layer and the Redis driver file convert the call into a statement suitable for the Redis database, which is executed to check whether the user demo already exists.
Step S14-3: If the user does not exist, execution of the user creation request continues.
Step S15: a demo user is created and user metadata is saved to the Redis database.
Step S15-1: The operation object instance RGWOp_User_Create is executed to create the user demo. After the user is created, the metadata of the user demo is obtained, including the id (demo), the AccessKey (used to identify the user, e.g., 9ISH6L9KS061DX0BGD1J), the SecretKey (stored on the server as a private key, e.g., Uk9C4WYQUHXDo78gj3t3 eL8 UPPaAh 49 HXX), the user operation permission mask (e.g., "read, write, delete"), the user quota ("user_quota": {"enabled": false, "max_size_kb": 1, "max_objects": 1}), the bucket quota ("bucket_quota": {"enabled": false, "max_size_kb": -1, "max_objects": …}), and so on.
Step S15-2: For the metadata obtained in step S15-1, the interface adaptation layer calls the user metadata management class in the standard functional interface provided by the metadata storage engine; the intermediate conversion layer and the driver file redis.cc of the driver layer convert the call into an add statement suitable for the Redis database and execute it. The metadata of the user demo is then stored in the Redis database in the form Key = "demo", Value = {{"id", "demo"}, {"AccessKey", "9ISH6L9KS061DX0BGD1J"}, …{…}}.
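Steps S14-2 through S15-2 (look up the user, then store the new user's metadata as a Key with field-value pairs) can be sketched with three small layers. This is an illustration under assumptions: `InMemoryDriver`, `ConversionLayer`, and the `create` method are hypothetical names, and the driver is an in-process fake whose `hset`/`hgetall` methods only mimic the shape of Redis hash commands; no Redis client is used. Only `UserMetaOp` and its `get` operation are named in the source.

```python
class InMemoryDriver:
    """Driver-layer stand-in: emulates hash-style set/get on a dictionary."""

    def __init__(self):
        self._hashes = {}

    def hset(self, key, mapping):
        self._hashes.setdefault(key, {}).update(mapping)

    def hgetall(self, key):
        return dict(self._hashes.get(key, {}))


class ConversionLayer:
    """Intermediate layer: one uniform put/get interface over any driver."""

    def __init__(self, driver):
        self._driver = driver

    def put(self, key, fields):
        # The back end is assumed to store flat strings, so stringify values.
        self._driver.hset(key, {name: str(value) for name, value in fields.items()})

    def get(self, key):
        # An empty result means the key does not exist.
        return self._driver.hgetall(key) or None


class UserMetaOp:
    """External interface layer: user metadata management class."""

    def __init__(self, conversion_layer):
        self._conv = conversion_layer

    def get(self, user_id):
        return self._conv.get(user_id)

    def create(self, user_id, fields):
        # Step S14-2/S14-3: only continue if the user does not exist yet.
        if self.get(user_id) is not None:
            raise ValueError(f"user already exists: {user_id}")
        self._conv.put(user_id, {"id": user_id, **fields})


user_meta = UserMetaOp(ConversionLayer(InMemoryDriver()))
user_meta.create("demo", {"AccessKey": "9ISH6L9KS061DX0BGD1J", "op_mask": "read, write, delete"})
```

Because `UserMetaOp` only talks to `ConversionLayer`, swapping the fake driver for a real database driver would not change the calling code, which is the point of the layered design the source describes.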
Step S16: An asynchronous thread in the metadata storage engine periodically writes the user metadata held in the in-memory Redis database to the physical hard disk for persistent storage, preventing metadata loss caused by abnormal power failure of a node or abnormal exit of a service.
Example 3:
To aid further understanding of the present invention, this embodiment builds on embodiment 1 and describes an implementation in a practical scenario, taking the upload of the local file file1.txt as object data to bucket01 as an example and assuming that an in-memory Redis database is used to store the metadata.
A bucket is a container for object data and must exist before object data is uploaded. Suppose that the user demo has already created a bucket named bucket01 with public read-write permissions and a capacity quota of 100 GB. The s3cmd tool (a command-line program for creating S3 buckets and for uploading, retrieving, and managing data in an object storage system) is installed on the client server and configured, mainly by setting the address and port of the S3 storage server and the AccessKey and SecretKey of the user demo, so that the operation request for uploading object data can be executed. The following describes the specific flow of this embodiment, taking the upload of the local file file1.txt as object data to bucket01 as an example.
Step S21: After the Ceph object storage gateway service starts, it initializes various global variables and worker threads and registers various resource managers and resource processors.
Step S22: The object storage gateway initializes the metadata management worker thread. During initialization, the thread starts the Redis database by calling the back-end database service start/stop management function in the standard functional interface provided by the metadata storage engine, through the intermediate conversion layer and the Redis driver file. While starting, the Redis database reads all metadata persisted on the physical hard disk and loads it into memory. As shown in fig. 5, the standby nodes periodically synchronize metadata from the master node. The metadata is stored in Key-Value form in the memory of the node running the Redis Server.
Step S23: The s3cmd tool sends the object storage gateway the operation request s3cmd put file1.txt s3://bucket01/file1.obj, which uploads the local file file1.txt as object data to bucket01. After receiving the operation request, the object storage gateway parses it. Parsing shows that the request carries the bucket name bucket01, the name file1.obj under which the object data is stored in the RADOS system (this name may differ from the name of the source file being uploaded and is chosen by the user), and the source file file1.txt.
Step S24: An operation request sent via the s3cmd tool is handled by the S3-type resource manager. Accordingly, the resource manager instance RGWRESTMgr_S3, the resource processor instances RGWHandler_REST_Service_S3 (for listing all buckets), RGWHandler_REST_Bucket_S3 (for operating on and managing a single bucket), and RGWHandler_REST_Obj_S3 (for operating on and managing object data), and the operation object instance RGWPutObj for creating object data are obtained.
Step S25: The operation object instance RGWPutObj authenticates and authorizes the operation request for uploading the local file file1.txt as object data to bucket01. The header of the operation request carries the client's signature information, for example the client signature 9ISu + ty4SOQZ +9xcSkpSe5 nRpdM. The server computes the valid signature for the corresponding user from the stored SecretKey and compares it with the signature carried in the operation request. If the two match, authentication and authorization succeed and the flow proceeds to step S26. Otherwise, processing of the operation request is terminated.
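The signature comparison in step S25 can be sketched as below. The source does not name the signing algorithm or the fields that are signed, so this sketch assumes HMAC-SHA1 with Base64 encoding and a hypothetical `string_to_sign`; the function names are likewise illustrative.

```python
import base64
import hashlib
import hmac


def compute_signature(secret_key: str, string_to_sign: str) -> str:
    """Derive a signature from the stored SecretKey (assumed HMAC-SHA1 + Base64)."""
    digest = hmac.new(
        secret_key.encode("utf-8"),
        string_to_sign.encode("utf-8"),
        hashlib.sha1,
    ).digest()
    return base64.b64encode(digest).decode("ascii")


def verify_request(stored_secret_key: str, string_to_sign: str, client_signature: str) -> bool:
    """Recompute the signature server-side and compare it with the client's."""
    expected = compute_signature(stored_secret_key, string_to_sign)
    # compare_digest takes time independent of where the strings differ.
    return hmac.compare_digest(expected, client_signature)


# Hypothetical request description; real protocols define the exact fields.
request_description = "PUT\n/bucket01/file1.obj"
client_sig = compute_signature("example-secret", request_description)
```

Using `hmac.compare_digest` instead of `==` is a choice made here to avoid leaking timing information; the source only says the two signatures are compared.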
Step S26: if the object data is written, calling a metadata storage engine according to the information carried in the operation request, and acquiring corresponding metadata from a database in the memory;
taking a user name demo, a storage bucket name bucket01 and a name file1.obj of object data carried in an operation request as keys, and calling query operations defined in a user metadata management class, a storage bucket metadata management class and an object metadata management class in a standard functional interface provided by a metadata storage engine through an interface adaptation layer, namely calling a user metadata op, a get (demo) function, a BuckMetaOp, a get (bucket01) function and an ObjMetaOp (bucket 1.obj) function respectively, converting the user name demo, the storage bucket name bucket01 and the name file1.obj of the object data into statements suitable for a Redis database through an intermediate conversion layer and a drive layer, and inquiring to obtain metadata of a demo user and a storage bucket01, wherein the metadata mainly comprises quota information of the user, an operation mask of the user, index information of the storage bucket, quota information of the storage bucket, access control information of the storage bucket acl and the like; and inquiring whether the metadata of file1.obj exists in a Redis database or not, and returning the acquired metadata to the interface adaptation layer.
Step S27: Based on the user operation permission mask op_mask acquired in step S26, the interface adaptation layer determines whether the user demo has permission to write data to the bucket, and checks whether the size of the object data to be written exceeds the quota limits of the user demo and of bucket01. If the operation request passes these checks, the flow jumps to step S28; otherwise, processing of the operation request is terminated.
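The two checks in step S27 can be sketched as follows. The comma-separated mask format follows the "read, write, delete" example given earlier; the `Quota` fields, the convention that a negative limit means unlimited, and the function name `can_write` are assumptions made for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Quota:
    enabled: bool
    max_size_bytes: int   # assumption: a negative value means "no limit"
    used_bytes: int = 0

    def allows(self, extra_bytes: int) -> bool:
        if not self.enabled or self.max_size_bytes < 0:
            return True
        return self.used_bytes + extra_bytes <= self.max_size_bytes


def can_write(op_mask: str, object_size: int, user_quota: Quota, bucket_quota: Quota) -> bool:
    """Return True only if the permission mask and both quotas allow the write."""
    allowed_ops = {op.strip() for op in op_mask.split(",")}
    if "write" not in allowed_ops:
        return False
    return user_quota.allows(object_size) and bucket_quota.allows(object_size)


GIB = 1024 ** 3
user_quota = Quota(enabled=False, max_size_bytes=-1)
bucket_quota = Quota(enabled=True, max_size_bytes=100 * GIB, used_bytes=0)
```

For example, with the 100 GB bucket quota from this embodiment, a 100-byte object passes the check, while an object larger than the remaining capacity is rejected before any data reaches the disk.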
Step S28: calling a native storage engine to write object data in a disk according to metadata acquired from a database in a memory, and obtaining metadata which is newly added when the object data is written in the disk;
the interface adaptation layer further calculates a target physical location of the object data storage, including information of the storage node (IP or host name of the node) and information of the disk (ID corresponding to the OSD service), using a CRUSH algorithm according to the index information of the bucket acquired in step S26 and the name of the object data to be uploaded, file1. txt. And then the interface adaptation layer establishes TCP connection with an OSD daemon corresponding to the target storage node by calling a native storage engine, performs interaction, writes the object data into a position corresponding to the target disk, and obtains the newly added metadata by writing the object data file1.txt in the target disk.
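The idea of computing a target location deterministically from the bucket index information and the object name, rather than looking it up in a table, can be illustrated with a much simpler technique. The sketch below uses rendezvous (highest-random-weight) hashing, which is NOT the CRUSH algorithm: it ignores cluster topology, weights, and replication. The function name `pick_osd`, the example host names, and the OSD list format are hypothetical.

```python
import hashlib


def pick_osd(bucket_index_id: str, object_name: str, osds: list) -> dict:
    """Choose one OSD deterministically via rendezvous hashing (not CRUSH)."""

    def score(osd: dict) -> int:
        token = f"{bucket_index_id}/{object_name}/{osd['osd_id']}".encode("utf-8")
        # Use the first 8 bytes of the SHA-256 digest as a comparable integer.
        return int.from_bytes(hashlib.sha256(token).digest()[:8], "big")

    # Every client with the same inputs computes the same winner.
    return max(osds, key=score)


example_osds = [
    {"host": "storage-node-1", "osd_id": 0},
    {"host": "storage-node-2", "osd_id": 1},
    {"host": "storage-node-3", "osd_id": 2},
]
target = pick_osd("bucket01-index", "file1.txt", example_osds)
```

The result carries the same two pieces of information the text mentions, the storage node and the OSD ID, so a caller would know where to open its connection.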
Step S29: calling a metadata storage engine to write the newly added metadata into a database in the memory;
after the object data file1.txt is written, newly added metadata including the size (unit: byte) of the object data, modification time, entity tag ETAG information, etc. after the object data file1.txt is written into the target disk can be obtained, then an object metadata management class in a standard functional interface provided by a metadata storage engine is called through an interface adaptation layer, a Redis drive file redis.cc passing through an intermediate conversion layer and a drive layer is converted into an added statement suitable for a Redis database and is subjected to an adding operation, and metadata corresponding to the object data file1.obj is saved in a form of "file Key 1. obj", Value { "obj _ size", "100" }, { "mtime 202", "1-07-0417: 26: 30.000000" }, "{ ETtag", "45 a62d3d5d3e946250904697486591 bc" }, … { … } }. And after the metadata is added, returning the execution result of the written operation request to the client. In addition, user metadata of the Redis database in the memory is periodically written into the physical hard disk for persistent storage through an asynchronous thread in the metadata storage engine, and metadata loss caused by abnormal power failure of the node or abnormal exit of service is prevented.
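The write path of steps S26 to S29 (read metadata, write the object, record the newly generated metadata) can be summarised in a short sketch. Dictionaries stand in for the in-memory database and the disk, the function name `put_object` is hypothetical, and computing the ETag as the MD5 hex digest of the payload is an assumption; the source does not say how the ETag is derived.

```python
import hashlib
from datetime import datetime, timezone


def put_object(metadata_db: dict, object_disk: dict, bucket: str,
               object_name: str, payload: bytes) -> dict:
    """Write object data, then record its newly generated metadata."""
    # Step S26: the bucket's metadata must already be in the in-memory database.
    if bucket not in metadata_db:
        raise KeyError(f"unknown bucket: {bucket}")

    # Step S28: the native storage engine writes the object data to disk.
    object_disk[(bucket, object_name)] = payload

    # Step S29: store the metadata produced by the write, as string fields.
    new_meta = {
        "obj_size": str(len(payload)),
        "mtime": datetime.now(timezone.utc).strftime("%Y-%m-%d %H:%M:%S.%f"),
        "ETag": hashlib.md5(payload).hexdigest(),  # assumed derivation
    }
    metadata_db[object_name] = new_meta
    return new_meta


metadata_db = {"bucket01": {"owner": "demo"}}
object_disk = {}
result = put_object(metadata_db, object_disk, "bucket01", "file1.obj", b"x" * 100)
```

The field names `obj_size`, `mtime`, and `ETag` follow the example Value shown in the text; only the metadata, not the payload, ever touches the in-memory database.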
If the operation request reads object data, the metadata storage engine is called according to the information carried in the operation request, and the corresponding metadata is acquired from the in-memory database; the native storage engine is then called to read the object data from disk according to that metadata.
Example 4:
On the basis of the optimization method for Ceph object storage metadata processing provided in embodiments 1 to 3, the present invention further provides an optimization apparatus for Ceph object storage metadata processing that can implement the method. Fig. 6 is a schematic diagram of the apparatus architecture in an embodiment of the present invention. The optimization apparatus for Ceph object storage metadata processing of this embodiment includes one or more processors 31 and a memory 32. In fig. 6, one processor 31 is taken as an example.
The processor 31 and the memory 32 may be connected by a bus or other means, and fig. 6 illustrates the connection by a bus as an example.
The memory 32, as a non-volatile computer-readable storage medium, may be used to store non-volatile software programs, non-volatile computer-executable programs, and modules, such as those implementing the optimization method for Ceph object storage metadata processing in embodiments 1 to 3. The processor 31 executes the various functional applications and data processing of the optimization apparatus by running the non-volatile software programs, instructions, and modules stored in the memory 32, thereby implementing the optimization method for Ceph object storage metadata processing of embodiments 1 to 3.
The memory 32 may include high speed random access memory and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some embodiments, the memory 32 may optionally include memory located remotely from the processor 31, and these remote memories may be connected to the processor 31 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The program instructions/modules are stored in the memory 32 and, when executed by the one or more processors 31, perform the optimization method for Ceph object storage metadata processing in embodiments 1-3 above, for example, perform the steps illustrated in fig. 1 described above.
Those of ordinary skill in the art will appreciate that all or part of the steps of the various methods of the embodiments may be implemented by a program instructing the associated hardware. The program may be stored on a computer-readable storage medium, which may include: Read-Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (10)

1. A method for optimizing Ceph object storage metadata processing is characterized by comprising the following steps:
judging the data type of the request operation;
if the data type of the requested operation is metadata, calling a metadata storage engine to interact with a database in the memory to complete corresponding operation;
and if the data type of the requested operation is object data, calling a native storage engine to interact with the physical hard disk to complete corresponding operation.
2. The method of claim 1, wherein the metadata storage engine is at the same logical level as the native storage engine;
the upper-layer application calls designated classes and functions in the interface adaptation layer, and, depending on whether the data type is metadata or object data, the metadata storage engine or the native storage engine is called under the corresponding branch to process the operation request;
and the branch conditions and the actions called under each branch are added within the implementation of the interface adaptation layer, so that the external definition of the interface adaptation layer remains unchanged.
3. The method for optimizing Ceph object storage metadata processing according to claim 1, wherein the metadata storage engine includes an external interface layer, an intermediate translation layer, and a driver layer, specifically:
the external interface layer provides a standard functional interface which can be called to an upper-level interface adaptation layer;
the intermediate conversion layer abstracts a uniform standard operation interface for the database in each memory for the external interface layer to call, and is also butted with a driving module of the database in each memory in the driving layer;
and the drive layer is used for realizing a drive module for correspondingly accessing the database in each memory aiming at the database in each memory.
4. The method of claim 3, wherein, if metadata is to be written or read, the corresponding standard functional interface of the metadata storage engine is invoked, and the metadata is written to or read from the database in the memory via the intermediate translation layer and the driver layer.
5. The optimization method for Ceph object storage metadata processing according to claim 1, wherein if object data is read, a metadata storage engine is called according to information carried in the operation request, and corresponding metadata is obtained from a database in the memory;
and calling a native storage engine to read the object data from the disk according to the metadata acquired from the database in the memory.
6. The optimization method for Ceph object storage metadata processing according to claim 1, wherein if the object data is written, the metadata storage engine is called according to information carried in the operation request, and corresponding metadata is obtained from a database in the memory;
calling a native storage engine to write object data in a disk according to metadata acquired from a database in a memory, and obtaining metadata which is newly added when the object data is written in the disk;
and calling a metadata storage engine to write the newly added metadata into a database in the memory.
7. The method for optimizing Ceph object storage metadata processing according to claim 3, further comprising an initialization process, specifically:
starting a corresponding database in a memory through an external interface layer, a middle conversion layer and a driving module in a driving layer provided by a metadata storage engine in an initialization process;
in the starting process of the database in the memory, all metadata which are persistently stored on the physical hard disk are read, and all metadata read from the physical hard disk are loaded into the database in the memory to be started.
8. The optimization method for Ceph object storage metadata processing according to claim 1, further comprising periodically writing metadata in a database in the memory into a physical hard disk for persistent storage through an asynchronous thread in the metadata storage engine, so as to prevent the metadata from being lost due to abnormal power failure of a node or abnormal exit from service.
9. The method for optimizing Ceph object storage metadata processing according to any one of claims 1 to 8, wherein the metadata includes cluster metadata and user metadata information;
the cluster metadata comprises one or more of realm (region), zonegroup (namespace group), and zone (namespace) metadata;
the user metadata information includes one or more of user metadata, acl (access control list) metadata, quota metadata, object metadata, and bucket_index (bucket index) metadata.
10. An optimization device for Ceph object storage metadata processing is characterized by comprising at least one processor; and a memory communicatively coupled to the at least one processor; wherein the memory stores instructions executable by the at least one processor and programmed to perform the method for optimizing Ceph object storage metadata processing of any of claims 1 to 9.

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210006071.7A CN114415954A (en) 2022-01-04 2022-01-04 Optimization method and device for Ceph object storage metadata processing

Publications (1)

Publication Number Publication Date
CN114415954A true CN114415954A (en) 2022-04-29

Family

ID=81271402

Country Status (1)

Country Link
CN (1) CN114415954A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination