CN110825715A - Multi-object data second combination implementation method based on Ceph object storage - Google Patents

Info

Publication number
CN110825715A
Authority
CN
China
Prior art keywords
data
combination
ceph
metadata information
data distribution
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201911087463.5A
Other languages
Chinese (zh)
Other versions
CN110825715B (en)
Inventor
刘勇
谢赟
孙卓峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Tak Billiton Information Technology Ltd By Share Ltd
Original Assignee
Shanghai Tak Billiton Information Technology Ltd By Share Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Tak Billiton Information Technology Ltd By Share Ltd
Priority to CN201911087463.5A
Publication of CN110825715A
Application granted
Publication of CN110825715B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/11 File system administration, e.g. details of archiving or snapshots

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Storage Device Security (AREA)

Abstract

The invention discloses a method for implementing multi-object data second combination (merging multiple objects into one within seconds) based on Ceph object storage, comprising the following steps: a client initiates a multi-object data second-combination request to a Ceph object storage system, where the request parameters comprise the objects to be second-combined and the name of the second-combination object to be newly generated; the system checks the validity of the request parameters and proceeds to the next step if the check passes, otherwise it returns an error; the system checks the access permission of each object named in the request and proceeds to the next step if every check passes, otherwise it returns an error; the system creates a second-combination object under the given name, merges the data distribution metadata information of the objects to be second-combined, and writes the merged data distribution metadata information into the second-combination object; the system supplements the data distribution metadata information in the second-combination object and returns to the client a message that the second-combination object data was created successfully. New object data is thereby synthesized rapidly without copying data.

Description

Multi-object data second combination implementation method based on Ceph object storage
Technical Field
The invention relates to the technical field of object storage, and in particular to a method for implementing multi-object data second combination based on Ceph (a unified distributed storage system) object storage.
Background
With the development of new technologies such as big data, cloud computing, the Internet of Things and 5G, and with changing applications in industries such as telecommunications, the internet, government and enterprise, and healthcare, the rapid growth of massive data poses many challenges to traditional storage systems. Object storage, as an emerging storage technology, is being used in more and more industries and application scenarios.
Compared with traditional file system storage, object storage abandons the complex semantics and directory design of a file system and stores data as flat key-value pairs (a value is retrieved by its key). This greatly simplifies metadata management and leaves almost no technical limit on storage capacity, so object storage is well suited to mass data storage across industries, particularly in big data applications. Data management in object storage differs from that in a traditional file system, and some of its IO characteristics impose restrictions on applications: for example, in-place modification is not supported and only overwriting is allowed. Object storage is therefore better suited to "write once, read many" application scenarios.
Object storage is also increasingly used in security and media services. These scenarios usually process relatively large video files, such as media files captured by video surveillance and high-definition video footage. In particular, with the spread of 5G and 4K/8K ultra-high-definition bit rates, the size of a single video file has grown from tens of GB to hundreds of GB, and the capacity required of a single storage system has grown from the traditional hundreds of TB to the PB level, or even tens of PB. Such systems generally adopt object storage to support the storage and use of mass data.
In a media application scenario, because a single workstation or server has limited performance when processing a media file, the file must be split into segments, which are stored as separate objects and processed on different workstations or servers, and the results are then merged and written back into the storage system. As media files grow, this approach inevitably causes a large amount of data migration and copying and consumes valuable IO resources of the storage system. The problem to be solved is therefore how to eliminate the performance cost of data copying and improve storage efficiency.
Disclosure of Invention
The invention aims to provide a method for implementing multi-object data second combination that rapidly synthesizes new object data without copying data.
The technical solution for achieving this aim is as follows:
A method for implementing multi-object data second combination based on Ceph object storage comprises the following steps:
step S1, a client initiates a multi-object data second-combination request to the Ceph object storage system, wherein the request parameters of the request comprise the plurality of objects to be second-combined and the name of the second-combination object to be newly generated;
step S2, the Ceph object storage system checks the validity of the parameters of the multi-object data second-combination request, and proceeds to the next step if the check passes; otherwise, it returns an error;
step S3, the Ceph object storage system checks the access permission of each object named in the multi-object data second-combination request, and proceeds to the next step if the check passes; otherwise, it returns an error;
step S4, the Ceph object storage system creates a second-combination object according to the second-combination object name, merges the data distribution metadata information of the objects to be second-combined, and writes the merged data distribution metadata information into the second-combination object;
and step S5, the Ceph object storage system supplements the data distribution metadata information in the second-combination object, and returns to the client a message that the second-combination object data was created successfully.
Preferably, step S4 includes:
the Ceph object storage system creates the second-combination object according to the second-combination object name;
traversing each object to be second-combined and acquiring its data distribution metadata information;
storing the data distribution metadata information in a unified list in the order in which the objects appear in the request parameters;
parsing the data distribution metadata in each list entry, re-encoding it onto a unified logical address space in list order, and merging the entries into a single unified set of data distribution metadata information;
and writing the merged unified data distribution metadata information into the second-combination object.
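The sub-steps of S4 can be illustrated with a minimal Python sketch. The patent does not describe the layout of the data distribution metadata, so the `Extent` record (a logical offset, a length, and a key naming where the bytes are stored) and the function `merge_layouts` are hypothetical names introduced only for illustration; this is not Ceph's actual on-disk metadata format.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Extent:
    """One contiguous piece of stored data (hypothetical layout record)."""
    logical_offset: int  # position within the owning object's address space
    length: int          # number of bytes
    backing_key: str     # where the bytes actually live in the store


def merge_layouts(ordered_names: List[str],
                  layouts: Dict[str, List[Extent]]) -> List[Extent]:
    """Re-encode per-object extents onto one unified logical address space.

    Objects are laid end to end in request order. No data is copied;
    only the extent records are rewritten with shifted offsets.
    """
    merged: List[Extent] = []
    base = 0  # unified logical address at which the next object starts
    for name in ordered_names:
        size = 0
        for e in sorted(layouts[name], key=lambda x: x.logical_offset):
            merged.append(Extent(base + e.logical_offset, e.length, e.backing_key))
            size = max(size, e.logical_offset + e.length)
        base += size
    return merged


# Assumed example data: two member objects of 8 and 6 bytes.
layouts = {
    "part-1": [Extent(0, 4, "part-1.chunk0"), Extent(4, 4, "part-1.chunk1")],
    "part-2": [Extent(0, 6, "part-2.chunk0")],
}
merged = merge_layouts(["part-1", "part-2"], layouts)
for e in merged:
    print(e.logical_offset, e.length, e.backing_key)
```

In this sketch the order of names in the request alone decides where each member lands in the unified address space, which mirrors the requirement that the list follow the order of the objects in the request parameters.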
Preferably, the method further comprises the following steps:
step S6, the client sends a read request to the Ceph object storage system, wherein the request parameters of the read request comprise the object to be read; the Ceph object storage system checks the validity of the parameters of the read request and the access permission of the object, and proceeds to the next step if both checks pass; otherwise, it returns an error;
step S7, the Ceph object storage system acquires the data distribution metadata information of the object and determines from the corresponding flag information whether the object being accessed is a second-combination object or a non-second-combination object; if it is a non-second-combination object, its data is read directly; if it is a second-combination object, the data distribution metadata information is parsed according to the unified logical address to obtain the data distribution metadata information of each object that took part in the second-combination operation, data is read from each of those objects accordingly, and the data is merged and sent to the client.
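The branch in step S7 can be sketched as follows. The names `ObjectMeta`, `Segment`, `is_second_combined` and `plan_read` are assumptions made for this example, as is the support for reading a byte range; the sketch also assumes the segments are non-empty, sorted by start address, and contiguous.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Segment:
    start: int   # unified logical address where this member begins
    length: int  # member size in bytes
    source: str  # name of the member object that holds the data


@dataclass
class ObjectMeta:
    name: str
    size: int
    is_second_combined: bool = False  # the flag information checked in S7
    segments: List[Segment] = field(default_factory=list)


def plan_read(meta: ObjectMeta, offset: int, length: int) -> List[Tuple[str, int, int]]:
    """Return (object name, offset within it, byte count) reads in logical order."""
    end = min(offset + length, meta.size)
    if offset < 0 or offset >= end:
        return []
    if not meta.is_second_combined:
        return [(meta.name, offset, end - offset)]  # ordinary object: read directly
    starts = [s.start for s in meta.segments]
    i = bisect_right(starts, offset) - 1  # index of the segment containing `offset`
    reads = []
    pos = offset
    while pos < end:
        seg = meta.segments[i]
        take = min(end, seg.start + seg.length) - pos
        reads.append((seg.source, pos - seg.start, take))
        pos += take
        i += 1
    return reads


combined = ObjectMeta("movie", 14, True,
                      [Segment(0, 8, "part-1"), Segment(8, 6, "part-2")])
# A read that spans the boundary between the two members:
print(plan_read(combined, 6, 5))  # [('part-1', 6, 2), ('part-2', 0, 3)]
```

The caller never needs to know which kind of object it is reading; the same entry point serves both cases, which is consistent with the claim that the client interface is unchanged.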
Preferably, the Ceph object storage system provides a RESTful API, and the client calls the RESTful API to access the Ceph object storage system.
Preferably, in step S5, the supplemented data distribution metadata information includes: creator, owner, and ACL (access control list) permission controls.
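As a rough illustration of the supplementing step, the sketch below attaches a creator, an owner and an ACL to a freshly created second-combination object. The `ObjectMetadata` class, the `supplement_metadata` function, the permission names, and the default policy (the requester becomes creator and owner with full rights) are all assumptions for this example; the patent does not specify them.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class ObjectMetadata:
    layout: List[tuple]  # merged data distribution metadata (opaque here)
    creator: str = ""
    owner: str = ""
    acl: Dict[str, Set[str]] = field(default_factory=dict)  # user -> permissions


def supplement_metadata(meta: ObjectMetadata, requester: str) -> ObjectMetadata:
    """Fill in the ordinary-object metadata after the layout has been written."""
    meta.creator = requester
    meta.owner = requester
    # Assumed default: the requester receives read and write permission.
    meta.acl = {requester: {"READ", "WRITE"}}
    return meta


meta = supplement_metadata(ObjectMetadata(layout=[(0, 8, "part-1")]), "alice")
print(meta.owner, sorted(meta.acl["alice"]))
```

Because these fields are the same ones an ordinary object carries, later updates to the ACL or owner can go through the existing interfaces without any special case for second-combination objects.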
The invention has the following beneficial effects. Based on Ceph object storage, it synthesizes multiple service-related objects into a single, complete object within seconds, so that new object data is produced without copying data; this saves storage system resources and greatly improves the service efficiency of the application system.
The second-combination object has all the attributes and permissions of an ordinary object, and the original object data is unaffected and can be reused. The system cost of a second combination is low: when some of the member objects are updated, performing the second-combination operation again regenerates a single, complete object, so service requirements are met quickly.
Clients access the result through the standard object storage interface. No other object storage access interface needs to be modified and the corresponding object storage functions remain compatible, which gives maximum compatibility with existing object storage systems. The IO operations that the object storage layer performs on a second-combination object are transparent to the application system, which reduces the complexity of using it.
In addition, within the object storage, the metadata information of a second-combination object is compatible with the original implementation, apart from the newly added data distribution metadata information that covers the data of multiple objects. Metadata is written, updated and deleted through the original operation interfaces, so the original metadata design and implementation are unaffected.
A service system mainly uses a second-combination object for updating metadata, setting permissions and reading data. At the interface level, metadata updates and permission settings are identical to the original interfaces, so no new interface is required: the backend determines from the metadata whether an object is a second-combination object and processes it accordingly, and the client continues to use the standard RESTful API.
Drawings
FIG. 1 is a flowchart of creating a second-combination object and its data in the present invention;
FIG. 2 is a flowchart of reading a second-combination object in the present invention.
Detailed Description
The invention will be further explained with reference to the drawings.
Referring to FIG. 1 and FIG. 2, the method of the present invention for implementing multi-object data second combination based on Ceph object storage includes the following steps:
Step S1, the Ceph object storage system provides a standard RESTful API (a Web application programming interface based on the HTTP protocol), through which the client initiates a multi-object data second-combination request. The request parameters comprise the multiple objects to be second-combined and the name of the second-combination object to be newly generated.
Step S2, the request enters the Web processing layer of the Ceph object storage system, which first checks the validity of the request parameters. If the check fails, an error is returned; if it succeeds, the request passes to the object storage processing layer of the Ceph object storage system.
Step S3, the object storage processing layer parses the parameter list to obtain each object to be second-combined, then traverses all of these objects and checks access permission for each. If the check fails for any object, an error is returned. Once all access permission checks succeed, second-combination processing begins.
Step S4, a second-combination object containing no user data is first created under the name specified in the request parameters. Each object to be second-combined is then traversed to obtain its data distribution metadata information, which is the metadata used to address and access the corresponding object data. Once this information has been obtained, all of it is merged onto a unified logical address space in the order in which the objects appear in the request parameters, and the final merged data distribution metadata information is written into the second-combination object. Specifically, the data distribution metadata information is stored in a unified list in request order, the data distribution metadata in each list entry is parsed and re-encoded onto the unified logical address space in list order, and the entries are finally merged into the unified data distribution metadata of a single object (the second-combination object).
Step S5, the remaining metadata information (such as creator, owner and ACL permission controls) is created and written into the newly generated second-combination object. At this point the data of the second-combination object has been generated successfully, and a message that the second-combination object data was created successfully is returned to the client.
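The control flow of steps S1 to S5 can be sketched with an in-memory stand-in for the object storage processing layer. `ToyStore`, `SecondCombineError`, the method names, and the decision to treat a missing source object as a parameter error are assumptions for illustration; a real deployment would perform these checks inside the Ceph gateway, and the patent does not define the error format.

```python
from typing import Dict, List, Set, Tuple


class SecondCombineError(Exception):
    """Raised when a request is rejected in step S2 or S3."""


class ToyStore:
    """In-memory stand-in for the object storage processing layer (hypothetical)."""

    def __init__(self) -> None:
        self.sizes: Dict[str, int] = {}        # object name -> size in bytes
        self.readers: Dict[str, Set[str]] = {}  # object name -> users allowed to read
        self.owner: Dict[str, str] = {}
        # second-combination object name -> (start, length, member name) entries
        self.combined: Dict[str, List[Tuple[int, int, str]]] = {}

    def put(self, name: str, size: int, owner: str) -> None:
        self.sizes[name] = size
        self.owner[name] = owner
        self.readers[name] = {owner}

    def second_combine(self, user: str, sources: List[str], target: str) -> str:
        # S2: parameter validity check
        if not target or not sources:
            raise SecondCombineError("invalid parameters")
        for s in sources:
            if s not in self.sizes:
                raise SecondCombineError(f"no such object: {s}")
        # S3: access permission check on every source object
        for s in sources:
            if user not in self.readers[s]:
                raise SecondCombineError(f"access denied: {s}")
        # S4: create the target and write merged layout metadata only
        layout, base = [], 0
        for s in sources:
            layout.append((base, self.sizes[s], s))
            base += self.sizes[s]
        self.combined[target] = layout
        self.sizes[target] = base
        # S5: supplement ownership metadata and report success
        self.owner[target] = user
        self.readers[target] = {user}
        return "created"


store = ToyStore()
store.put("clip-a", 100, "alice")
store.put("clip-b", 50, "alice")
store.put("clip-c", 10, "bob")
print(store.second_combine("alice", ["clip-a", "clip-b"], "movie"))
try:
    store.second_combine("alice", ["clip-a", "clip-c"], "other")
except SecondCombineError as err:
    print("error:", err)
```

All checks run before anything is written, so a rejected request leaves no partially created object behind; this ordering follows the S2, S3, S4 sequence in the text.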
Step S6, the client calls the RESTful API to read either an ordinary object (a non-second-combination object) or a second-combination object; that is, the client sends a read request to the Ceph object storage system, and the request parameters comprise the object to be read. As before, the parameters and the object access permission are checked, and once the checks pass the read request is handled by the object storage processing layer.
Step S7, after receiving the read request, the object storage processing layer first obtains the data distribution metadata information of the object together with the flag information for the object data type, which indicates whether the object being accessed is a second-combination object or an ordinary object. If it is an ordinary object, a request for the corresponding object data is sent directly to the underlying layer (the data of the non-second-combination object is read directly). If it is a second-combination object, the data distribution metadata information is parsed according to the unified logical address to obtain the data distribution metadata information of each object that took part in the second-combination operation, and data is read from each of those objects accordingly.
Step S8, after the reads for the second-combination object complete, the data obtained from each object is sorted and combined according to the unified logical address, the complete data is returned to the client, and processing of the data request ends.
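The sorting and combining in step S8 can be sketched as below. The sketch assumes, purely for illustration, that the per-object reads are issued concurrently and may therefore complete in any order; the patent does not say how the reads are scheduled. `read_combined` and the `fetch` callback are hypothetical names.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from typing import Callable, List, Tuple

# (unified logical offset, member object name, offset within member, length)
Read = Tuple[int, str, int, int]


def read_combined(reads: List[Read],
                  fetch: Callable[[str, int, int], bytes]) -> bytes:
    """Fetch every piece, then reassemble by unified logical address."""
    pieces = []
    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = {pool.submit(fetch, name, off, n): logical
                   for logical, name, off, n in reads}
        for fut in as_completed(futures):  # completion order is arbitrary
            pieces.append((futures[fut], fut.result()))
    pieces.sort(key=lambda p: p[0])        # restore unified logical order
    return b"".join(data for _, data in pieces)


# Assumed example data standing in for the stored member objects.
members = {"part-1": b"ABCDEFGH", "part-2": b"ijklmn"}


def fetch(name: str, off: int, n: int) -> bytes:
    return members[name][off:off + n]


reads = [(6, "part-1", 6, 2), (8, "part-2", 0, 3)]
print(read_combined(reads, fetch))  # b'GHijk'
```

Keying each piece by its unified logical offset makes the result independent of arrival order, so the client always receives the bytes in the order fixed when the second-combination object was created.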
In conclusion, the invention remains compatible with the way most Ceph object storage systems on the market are used. The metadata and data of the original objects are unaffected, new object data is synthesized quickly without migrating the original data, the occupation of storage resources is reduced, and service efficiency is improved. The second-combination object data produced by the invention not only supports read operations but also has the attributes and permissions of ordinary object data, such as access control (ACL), owner, creator, modification time and other metadata information, and it can be accessed and updated through the corresponding RESTful API.
The above embodiments are provided only to illustrate the present invention and not to limit it. Those skilled in the art can make various changes and modifications without departing from the spirit and scope of the present invention; all equivalent technical solutions therefore also fall within the scope of the present invention, which is defined by the claims.

Claims (5)

1. A method for implementing multi-object data second combination based on Ceph object storage, characterized by comprising the following steps:
step S1, a client initiates a multi-object data second-combination request to the Ceph object storage system, wherein the request parameters of the request comprise the plurality of objects to be second-combined and the name of the second-combination object to be newly generated;
step S2, the Ceph object storage system checks the validity of the parameters of the multi-object data second-combination request, and proceeds to the next step if the check passes; otherwise, it returns an error;
step S3, the Ceph object storage system checks the access permission of each object named in the multi-object data second-combination request, and proceeds to the next step if the check passes; otherwise, it returns an error;
step S4, the Ceph object storage system creates a second-combination object according to the second-combination object name, merges the data distribution metadata information of the objects to be second-combined, and writes the merged data distribution metadata information into the second-combination object;
and step S5, the Ceph object storage system supplements the data distribution metadata information in the second-combination object, and returns to the client a message that the second-combination object data was created successfully.
2. The method for implementing multi-object data second combination based on Ceph object storage according to claim 1, wherein step S4 includes:
the Ceph object storage system creates the second-combination object according to the second-combination object name;
traversing each object to be second-combined and acquiring its data distribution metadata information;
storing the data distribution metadata information in a unified list in the order in which the objects appear in the request parameters;
parsing the data distribution metadata in each list entry, re-encoding it onto a unified logical address space in list order, and merging the entries into a single unified set of data distribution metadata information;
and writing the merged unified data distribution metadata information into the second-combination object.
3. The method for implementing multi-object data second combination based on Ceph object storage according to claim 2, further comprising:
step S6, the client sends a read request to the Ceph object storage system, wherein the request parameters of the read request comprise the object to be read; the Ceph object storage system checks the validity of the parameters of the read request and the access permission of the object, and proceeds to the next step if both checks pass; otherwise, it returns an error;
step S7, the Ceph object storage system acquires the data distribution metadata information of the object and determines from the corresponding flag information whether the object being accessed is a second-combination object or a non-second-combination object; if it is a non-second-combination object, its data is read directly; if it is a second-combination object, the data distribution metadata information is parsed according to the unified logical address to obtain the data distribution metadata information of each object that took part in the second-combination operation, data is read from each of those objects accordingly, and the data is merged and sent to the client.
4. The method for implementing multi-object data second combination based on Ceph object storage according to claim 1, wherein the Ceph object storage system provides a RESTful API, and the client calls the RESTful API to access the Ceph object storage system.
5. The method for implementing multi-object data second combination based on Ceph object storage according to claim 1, wherein in step S5 the supplemented data distribution metadata information includes: creator, owner, and ACL permission controls.
CN201911087463.5A 2019-11-08 2019-11-08 Multi-object data second combination implementation method based on Ceph object storage Active CN110825715B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911087463.5A CN110825715B (en) 2019-11-08 2019-11-08 Multi-object data second combination implementation method based on Ceph object storage

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911087463.5A CN110825715B (en) 2019-11-08 2019-11-08 Multi-object data second combination implementation method based on Ceph object storage

Publications (2)

Publication Number Publication Date
CN110825715A true CN110825715A (en) 2020-02-21
CN110825715B CN110825715B (en) 2020-11-03

Family

ID=69553528

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911087463.5A Active CN110825715B (en) 2019-11-08 2019-11-08 Multi-object data second combination implementation method based on Ceph object storage

Country Status (1)

Country Link
CN (1) CN110825715B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111930708A (en) * 2020-07-14 2020-11-13 上海德拓信息技术股份有限公司 Extension system and method of object tag based on Ceph object storage

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103399941A (en) * 2013-08-13 2013-11-20 广州中国科学院软件应用技术研究所 Distributed file processing method, device and system
CN103577123A (en) * 2013-11-12 2014-02-12 河海大学 Small file optimization storage method based on HDFS
US20160366225A1 (en) * 2015-06-09 2016-12-15 Electronics And Telecommunications Research Institute Shuffle embedded distributed storage system supporting virtual merge and method thereof
CN107948334A (en) * 2018-01-09 2018-04-20 无锡华云数据技术服务有限公司 Data processing method based on distributed memory system
CN108776578A (en) * 2018-06-01 2018-11-09 南京紫光云信息科技有限公司 A kind of method and system of quick combining objects

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111930708A (en) * 2020-07-14 2020-11-13 上海德拓信息技术股份有限公司 Extension system and method of object tag based on Ceph object storage
CN111930708B (en) * 2020-07-14 2023-07-11 上海德拓信息技术股份有限公司 Ceph object storage-based object tag expansion system and method

Also Published As

Publication number Publication date
CN110825715B (en) 2020-11-03

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant