KR102033383B1 - Method and system for managing data geographically distributed - Google Patents
Method and system for managing data geographically distributed
- Publication number
- KR102033383B1 (application number KR1020160019221A)
- Authority
- KR
- South Korea
- Prior art keywords
- data
- user
- slave
- manager
- request
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/27—Replication, distribution or synchronisation of data between databases or within a distributed database system; Distributed database system architectures therefor
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/10—File systems; File servers
- G06F16/18—File system types
- G06F16/182—Distributed file systems
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/21—Design, administration or maintenance of databases
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/23—Updating
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/20—Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
- G06F16/25—Integrating or interfacing systems involving database management systems
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
- Computing Systems (AREA)
Abstract
According to an aspect of the present invention, a data management method includes: requesting, by a user, a data upload to a management unit of a local data center; confirming, by the management unit, the write permission of the user and the capacity of a node of the local data center; uploading the data to the node if the check shows that the user has write permission and the capacity of the node is greater than the size of the data; updating, by the management unit of the local data center, the data map and the metadata database of the local data center according to the upload result; transmitting, by the management unit of the local data center, the data map and metadata update information of the local data center to a management unit of a central data center; and updating, by the management unit of the central data center, the data map and metadata database of the central data center with the update information, and transmitting the updated metadata database to the management units of all local data centers connected to the central data center to maintain synchronization.
Description
The present invention relates to a data management method, and more particularly, to a method and system for efficiently managing big data in a geographically dispersed data environment.
Recently, as interest in big data has increased, many improvements have been made in the big data field by experts in various fields such as theorists, system builders, scientists, and application developers.
As the amount of data to be processed grows exponentially and the demand for data is diversified, more and more data centers are being deployed in various regions.
Because these data centers differ in purpose, structure, and software specifications, the connections between them are not organized. As a result, the data in the data centers cannot be fully utilized, and efficiency falls, particularly when sharing or accessing data.
This problem becomes even worse when all of the data distributed across the data centers must be used.
To solve these problems, the prior art employs a mediation server that handles users' requests and coordinates their work.
However, a mediation server has the following limitations in processing a large amount of data.
First, there is a scalability problem: if a large number of user requests arrive at the same time, the system may not be able to handle them all.
Second, it is difficult to provide a data view for close coordination and integrated management between data centers.
Finally, it cannot provide complex analysis that connects all of the data centers.
SUMMARY OF THE INVENTION The present invention has been made against the technical background described above, and an object thereof is to provide a system and method that enable the data processing, storage, and access requests of various users to be handled across geographically dispersed data centers.
The object of the present invention is not limited to the above-mentioned object, and other objects that are not mentioned will be clearly understood by those skilled in the art from the following description.
A data management method according to a first aspect of the present invention for achieving the above object includes: requesting, by a user, a data upload to a management unit of a local data center; confirming, by the management unit, the write permission of the user and the capacity of a node of the local data center; uploading the data to the node if the check shows that the user has write permission and the capacity of the node is greater than the size of the data; updating, by the management unit of the local data center, the data map and the metadata database of the local data center according to the upload result; transmitting, by the management unit of the local data center, the data map and metadata update information of the local data center to a management unit of a central data center; and updating, by the management unit of the central data center, the data map and metadata database of the central data center with the update information, and transmitting the updated metadata database to the management units of all local data centers connected to the central data center to maintain synchronization.
In addition, a data management system in a distributed data environment according to a second aspect of the present invention includes: a plurality of slave data centers, each including a local data map in which data and node information are stored, a slave manager for processing a user's request and managing the data stored in the local data map, and a metadata database storing information for determining whether a node or a user has access to the data; and an integrated data center including a central data map composed of the plurality of local data maps, a master manager for processing requests of the slave managers and managing metadata of the slave data centers based on the central data map, and a metadata database storing information for determining whether a local data center, node, or user has access to the data.
The data stored in the local data map may include at least one of an ID, a location of a slave data center, a location of a node, a format of a node, and size of a node.
The user's request may be a data upload request. On receiving a data upload request from the user, the slave manager confirms the user's write permission and the capacity of the node of the local data center, and uploads the data to the node based on the result of the confirmation.
The slave manager may upload the data to the node if the user has write permission and the capacity of the node is greater than the size of the data.
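The upload condition can be illustrated with a minimal sketch. The names `Node`, `free_bytes`, `write_acl`, and `can_upload` are hypothetical and chosen only for illustration; the patent does not specify how permissions or capacity are represented.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """A storage node in a local data center (illustrative fields only)."""
    node_id: str
    free_bytes: int  # remaining capacity of the node, an assumed representation


def can_upload(user_id, write_acl, node, data_size):
    """Return True only if the user may write and the node has room.

    Mirrors the two conditions in the text: the user has write permission,
    and the node capacity is strictly greater than the size of the data.
    """
    has_write_permission = user_id in write_acl
    has_capacity = node.free_bytes > data_size
    return has_write_permission and has_capacity


node = Node(node_id="node-1", free_bytes=1000)
allowed = can_upload("alice", {"alice", "bob"}, node, data_size=400)
```

In this sketch a request that fails either check is simply rejected, which corresponds to the user having to submit the upload request again.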
According to the upload result, the slave manager updates the local data map and the metadata database and transmits update information for the local data map and the metadata database to the master manager. The master manager applies the update information to the central data map and its metadata database, and maintains synchronization by transmitting the updated metadata database to the slave managers of all slave data centers connected to the integrated data center.
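The update and synchronization flow can be sketched as follows. This is a simplified in-memory model under stated assumptions: the class and method names (`MasterManager`, `SlaveManager`, `apply_update`, `record_upload`) are hypothetical, the data maps are plain dictionaries, and network transport is replaced by direct method calls.

```python
class MasterManager:
    """Holds the central data map and the central metadata database."""

    def __init__(self):
        self.central_data_map = {}  # data_id -> (data_center_id, node_id)
        self.metadata_db = {}       # data_id -> set of user IDs with access
        self.slaves = []

    def register(self, slave):
        self.slaves.append(slave)

    def apply_update(self, data_id, location, access_list):
        # Apply the update information received from one slave manager.
        self.central_data_map[data_id] = location
        self.metadata_db[data_id] = set(access_list)
        # Send the updated metadata database to every connected slave.
        for slave in self.slaves:
            slave.metadata_db = {k: set(v) for k, v in self.metadata_db.items()}


class SlaveManager:
    """Holds the local data map and a synchronized copy of the metadata."""

    def __init__(self, dc_id, master):
        self.dc_id = dc_id
        self.master = master
        self.local_data_map = {}    # data_id -> node_id
        self.metadata_db = {}
        master.register(self)

    def record_upload(self, data_id, node_id, access_list):
        # Update local state first, then report the change to the master.
        self.local_data_map[data_id] = node_id
        self.metadata_db[data_id] = set(access_list)
        self.master.apply_update(data_id, (self.dc_id, node_id), access_list)


master = MasterManager()
dc_a = SlaveManager("dc-a", master)
dc_b = SlaveManager("dc-b", master)
dc_a.record_upload("d1", "node-1", ["alice"])
```

After the call, `dc_b` knows who may access `d1` even though the data itself is only listed in the local data map of `dc-a`.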
The user's request may be a data access request. On receiving a data access request from the user, the slave manager checks whether the user has access authority by using access information stored in the metadata database. If the user has access authority, the slave manager checks whether the requested data exists based on the local data map, and may transmit the location information of the data to the user based on the result.
If the check shows that the data does not exist locally, the slave manager requests the master manager to provide the location information of the data requested by the user. The master manager retrieves the location information from the central data map, and the slave manager may receive the retrieved location information and transmit it to the user.
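The access lookup with its fallback to the master manager can be sketched as one function. The function name `locate` and the dictionary layouts are assumptions made for illustration, and the request to the master manager is modeled as a direct lookup in the central data map.

```python
def locate(user_id, data_id, metadata_db, local_data_map, central_data_map, dc_id):
    """Return (data_center_id, node_id), or None if denied or unknown.

    metadata_db:      data_id -> set of user IDs with access (assumed layout)
    local_data_map:   data_id -> node_id within this data center
    central_data_map: data_id -> (data_center_id, node_id)
    """
    # Step 1: check access authority against the metadata database.
    if user_id not in metadata_db.get(data_id, set()):
        return None
    # Step 2: look for the data in the local data map.
    if data_id in local_data_map:
        return (dc_id, local_data_map[data_id])
    # Step 3: not found locally, so ask the master (central data map).
    return central_data_map.get(data_id)


metadata_db = {"d1": {"alice"}, "d2": {"alice"}}
local_map = {"d1": "node-1"}
central_map = {"d1": ("dc-a", "node-1"), "d2": ("dc-b", "node-7")}

local_hit = locate("alice", "d1", metadata_db, local_map, central_map, "dc-a")
remote_hit = locate("alice", "d2", metadata_db, local_map, central_map, "dc-a")
denied = locate("bob", "d1", metadata_db, local_map, central_map, "dc-a")
```

The returned pair corresponds to the data center ID and node ID that the user then uses to reach the data.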
The user's request may be a data processing request. On receiving a data processing request from the user, the slave manager transmits the request to the master manager based on the access information stored in the metadata database and on whether the capacity of the node of the local database is exceeded. The master manager searches for a node corresponding to the data processing request and allocates a job to process the data, and the slave manager may then transmit the data processing result to the user.
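The text does not say how the master manager chooses a node for a job, so the following is only one possible sketch. It assumes that each node record carries a status and CPU/RAM figures, and it picks the active node with the most resources that satisfies the request; the function name `pick_node` and the selection policy are hypothetical.

```python
def pick_node(nodes, required_cpu, required_ram_gb):
    """Pick an active node that satisfies the job's resource needs.

    The policy (largest CPU count, then largest RAM) is an assumption
    for illustration, not something the source specifies.
    """
    candidates = [
        n for n in nodes
        if n["status"] == "active"
        and n["cpu"] >= required_cpu
        and n["ram_gb"] >= required_ram_gb
    ]
    if not candidates:
        return None
    return max(candidates, key=lambda n: (n["cpu"], n["ram_gb"]))


nodes = [
    {"node_id": "n1", "status": "active", "cpu": 4, "ram_gb": 16},
    {"node_id": "n2", "status": "down", "cpu": 32, "ram_gb": 128},
    {"node_id": "n3", "status": "active", "cpu": 8, "ram_gb": 32},
]
chosen = pick_node(nodes, required_cpu=2, required_ram_gb=8)
```

A real scheduler would also weigh data locality, since the input data may reside in a different slave data center than the chosen node.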
The user's request may be a data copy request. On receiving a data copy request from the user, the slave manager transmits the request to the master manager based on the access information stored in the metadata database and on whether the capacity of the node of the local database is exceeded. The master manager checks the location of the source data and the available resources in the central data map, determines the slave data center that will act as the source and the slave data center that will act as the destination, and transmits a data copy request to the determined slave data centers. When the copy is completed, the slave manager of the determined slave data center updates its local data map and metadata database and may transmit the updated information to the master manager.
On receiving the updated information, the master manager may update the central data map and the metadata database, and maintain synchronization by transmitting the updated metadata database to the slave managers of all slave data centers connected to the integrated data center.
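The way the master manager determines the source and destination of a copy can be sketched as below. Several points are assumptions: that the requested data center ID names the destination, that available resources are tracked as free bytes per data center, and the names `plan_copy`, `free_bytes_by_dc`, and `sizes`.

```python
def plan_copy(data_id, target_dc, central_data_map, free_bytes_by_dc, sizes):
    """Decide the source and destination data centers for a copy.

    Returns a plan dictionary, or None if the source is unknown or the
    destination lacks room. All structures are illustrative.
    """
    source = central_data_map.get(data_id)
    if source is None:
        return None  # the central data map has no record of this data
    source_dc, source_node = source
    # The destination must have more free space than the data needs.
    if free_bytes_by_dc.get(target_dc, 0) <= sizes[data_id]:
        return None
    return {
        "source_dc": source_dc,
        "source_node": source_node,
        "destination_dc": target_dc,
    }


central_map = {"d1": ("dc-a", "node-1")}
free_space = {"dc-a": 500, "dc-b": 5000, "dc-c": 100}
sizes = {"d1": 1024}
plan = plan_copy("d1", "dc-b", central_map, free_space, sizes)
```

Once the copy finishes, the destination's slave manager would update its local data map and report back, after which the master manager rebroadcasts the metadata database as in the upload case.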
According to the present invention, locally distributed data centers are managed centrally through an integrated data center that scales to the various requirements of users, which makes it possible to provide a system and method for the integrated management of big data.
FIG. 1 is a structural diagram of an entire system according to an embodiment of the present invention.
FIG. 2 is a flowchart of a data upload method according to an embodiment of the present invention.
FIG. 3 is a flowchart of a data download method according to an embodiment of the present invention.
FIG. 4 is a flowchart of a data processing method according to an embodiment of the present invention.
FIG. 5 is a flowchart of a data copying method according to an embodiment of the present invention.
FIG. 6 is a structural diagram of a computer system in which a data management method is executed according to another embodiment of the present invention.
Advantages and features of the present invention, and methods for achieving them, will become apparent with reference to the embodiments described below in detail together with the accompanying drawings. However, the present invention is not limited to the embodiments disclosed below and may be implemented in various forms. The embodiments are provided only to complete the disclosure of the present invention and to fully convey the scope of the invention to those of ordinary skill in the art to which the present invention pertains, and the present invention is defined only by the scope of the claims. Meanwhile, the terminology used herein is for the purpose of describing the embodiments and is not intended to limit the invention. In this specification, the singular also includes the plural unless specifically stated otherwise. As used herein, “comprises” and/or “comprising” does not exclude the presence or addition of one or more components, steps, operations, and/or elements other than those mentioned.
Hereinafter, exemplary embodiments of the present invention will be described in detail with reference to the accompanying drawings. FIG. 1 shows the overall structure of a system according to the invention.
The system consists of an integrated data center (100) and slave data centers (110 to 130) in each region.
There may be a plurality of devices in the
The data includes the identity (ID), the location of the local data center, the location of the node and other information (format, size, etc.).
Nodes refer to devices in a data center and include information such as ID, data center location, IP address, status, resources (CPU, RAM, etc.).
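The data and node records described above can be sketched as simple data structures. The class and field names are illustrative assumptions, and the example values (such as the status strings) are not taken from the patent.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class DataRecord:
    """One entry of a data map (field names are illustrative)."""
    data_id: str
    data_center: str  # location of the local (slave) data center
    node_id: str      # location of the node that holds the data
    fmt: str          # data format, e.g. "csv"
    size_bytes: int


@dataclass
class NodeRecord:
    """One device in a data center (field names are illustrative)."""
    node_id: str
    data_center: str
    ip_address: str
    status: str  # e.g. "active" or "down" (assumed values)
    resources: dict = field(default_factory=dict)  # e.g. {"cpu": 8, "ram_gb": 32}


# A local data map can then be a plain dictionary keyed by data ID.
local_data_map = {}
record = DataRecord("d-001", "dc-seoul", "node-3", "csv", 2048)
local_data_map[record.data_id] = record

node = NodeRecord("node-3", "dc-seoul", "10.0.0.3", "active", {"cpu": 8, "ram_gb": 32})
```

A central data map could reuse the same `DataRecord` type, since each record already names the data center that holds the data.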
The
The
The connection between the
FIG. 2 is a flowchart illustrating a method for writing data by a user according to the present invention.
The user first transmits a data upload request to the slave manager 114 (S210).
The user transmits a user ID, a data ID, and an access list to the
If there is a problem of authority or capacity, the user repeats step S210 of requesting the
After uploading the data of the user, the
After the update is completed, the
The
FIG. 3 is a flowchart illustrating a method for reading data by a user according to the present invention.
The user transmits a data access request including the user ID and the data ID to the
Upon receiving the request, the
If the user does not have access rights, the user repeats the step of transmitting an access request again (S310), and if there is access authority, the
If the data exists in the
On the other hand, if the data requested by the user does not exist in the corresponding
When receiving the location information including the data center ID and the node ID where the data is located, the user can access the desired data using the location information (S370).
FIG. 4 is a flowchart illustrating a data processing method according to the present invention.
The user transmits a data processing request including a user ID, a job file (Job file), an input data ID, an output data ID, and the like to the
Upon receiving the request, the
If there is no access authority or the capacity is exceeded, the process of waiting for a data processing request is repeated again (S410). If there is no such violation, the
The
After the job is assigned and the data processing is completed, the
FIG. 5 shows a flowchart of a data copying method according to the present invention.
The user transmits a data copy request including the user ID, the data ID, and the data center ID to the slave manager 114 (S510).
The
If there is a problem, the process returns to the previous step (S510) and waits for the user's request. If there is no problem with the access right or the capacity excess, the
The
After copying data, the slave manager of the target data center updates the local data map and the meta DB (S560), and transmits the updated information to the master manager 104 (S570).
The
This data management method not only manages data effectively across geographically dispersed data centers, but also enables users to access data efficiently regardless of which region's data center holds it.
On the other hand, the data management method according to an embodiment of the present invention may be implemented in a computer system or recorded on a recording medium. As shown in FIG. 6, a computer system includes at least one
The computer system can further include a
The
Therefore, the data management method according to the embodiment of the present invention can be implemented as a computer-executable method. When the data management method according to an embodiment of the present invention is performed on a computer device, computer-readable instructions may perform the data management method according to the present invention.
On the other hand, the data management method according to the present invention described above can be implemented as computer-readable code on a computer-readable recording medium. Computer-readable recording media include all kinds of recording media that store data readable by a computer system. Examples include a read only memory (ROM), a random access memory (RAM), a magnetic tape, a magnetic disk, a flash memory, and an optical data storage device. The computer-readable recording medium can also be distributed over computer systems connected through a computer network, and stored and executed as readable code in a distributed fashion.
In the above, the configuration of the present invention has been described in detail with reference to the accompanying drawings, but this is merely an example, and those skilled in the art to which the present invention pertains can of course make various modifications and changes within the scope of the technical idea of the present invention. Therefore, the protection scope of the present invention should not be limited to the above-described embodiment but should be defined by the following claims.
Claims (10)
A data management system comprising: a plurality of slave data centers, each including a local data map in which data and node information are stored, a slave manager for processing a user's request and managing the data stored in the local data map, and a metadata database storing information for determining whether the user has access to the data; and
an integrated data center including a central data map composed of a combination of the plurality of local data maps, a master manager for processing a request of the slave manager and managing metadata of the slave data centers based on the central data map, and a metadata database storing information for determining whether the user has access to the data.
And the data stored in the local data map includes at least one of an ID, a location of a slave data center, a location of a node, a format of a node, and size of a node.
The user's request is a user's data upload request,
and when the slave manager receives a data upload request from the user, the slave manager verifies the write permission of the user and the capacity of the node of the slave data center, and uploads the data to the node based on the result of the check.
And the slave manager uploads the data to the node if the user has write permission and the capacity of the node is greater than the capacity of the data.
The slave manager updates the regional data map and the metadata database according to the uploaded result, and transmits the update information of the regional data map and the metadata database to the master manager.
And the master manager updates the update information in the central data map and the metadata database and transmits the updated metadata database to the slave manager of all slave data centers connected to the integrated data center to maintain synchronization.
The user's request is a user's data access request,
As the slave manager receives a data access request from the user, the slave manager verifies whether the user has access authority by using access information stored in the metadata database.
And confirming the presence or absence of data requested by the user based on the local data map when the access result is authorized, and transmitting the location information of the data to the user based on the confirmation result.
The slave manager requests the master manager to provide the location information of the data requested by the user, if the confirmation result data does not exist;
And the slave manager receives the retrieved location information and transmits the location information of the retrieved data to the user as the master manager retrieves the location information from a central data map.
The user's request is a user's data processing request,
as the slave manager receives a data processing request from the user, the slave manager transmits the data processing request to the master manager based on access information stored in the metadata database and on whether the capacity of a node of the metadata database included in the slave data center is exceeded,
and the slave manager transmits the data processing result to the user as the master manager searches for a node corresponding to the data processing request and allocates a job to process the data.
The user's request is a user's request to copy data,
as the slave manager receives a data copy request from the user, the slave manager transmits the data copy request to the master manager based on access information stored in the metadata database and on whether the capacity of a node of the metadata database included in the slave data center is exceeded,
the master manager checks the location of the source data and the available resources in the central data map to determine the slave data center of the source role and the slave data center of the destination role, and transmits a data copy request to the determined slave data centers,
the slave manager of the determined slave data center updates the local data map and the metadata database as data copying is completed, and transmits the updated information to the master manager,
and the master manager updates the central data map and the metadata database in response to receiving the updated information, and transmits the updated metadata database to the slave managers of all slave data centers connected to the integrated data center to maintain synchronization.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020160019221A KR102033383B1 (en) | 2016-02-18 | 2016-02-18 | Method and system for managing data geographically distributed |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020160019221A KR102033383B1 (en) | 2016-02-18 | 2016-02-18 | Method and system for managing data geographically distributed |
Publications (2)
Publication Number | Publication Date |
---|---|
KR20170097448A (en) | 2017-08-28 |
KR102033383B1 (en) | 2019-10-17 |
Family
ID=59759635
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020160019221A KR102033383B1 (en) | 2016-02-18 | 2016-02-18 | Method and system for managing data geographically distributed |
Country Status (1)
Country | Link |
---|---|
KR (1) | KR102033383B1 (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102038527B1 (en) * | 2018-03-28 | 2019-11-26 | 주식회사 리얼타임테크 | Distributed cluster management system and method for thereof |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101115793B1 (en) * | 2010-05-19 | 2012-03-09 | 삼성에스디에스 주식회사 | System for virtual data center based on client hypervisor |
KR20150029918A (en) * | 2013-09-11 | 2015-03-19 | 한국전자통신연구원 | System of synchronizing contents in a cloud system having a plurality of distributed servers |
- 2016-02-18: KR application KR1020160019221A, patent KR102033383B1, active (IP Right Grant)
Also Published As
Publication number | Publication date |
---|---|
KR20170097448A (en) | 2017-08-28 |
Similar Documents
Publication | Title |
---|---|
EP3667500B1 (en) | Using a container orchestration service for dynamic routing |
CN101568919B (en) | Single view of data in a networked computer system with distributed storage |
CN109936571B (en) | Mass data sharing method, open sharing platform and electronic equipment |
US20150215405A1 (en) | Methods of managing and storing distributed files based on information-centric network |
US20080243847A1 (en) | Separating central locking services from distributed data fulfillment services in a storage system |
US20100161657A1 (en) | Metadata server and metadata management method |
KR101265856B1 (en) | Automated state migration while deploying an operating system |
CN105933376A (en) | Data manipulation method, server and storage system |
KR102372424B1 (en) | Apparatus for distributed processing through remote direct memory access and method for the same |
CN104580439B (en) | Method for uniformly distributing data in cloud storage system |
KR20120072907A (en) | Distribution storage system of distributively storing objects based on position of plural data nodes, position-based object distributive storing method thereof, and computer-readable recording medium |
US10503693B1 (en) | Method and system for parallel file operation in distributed data storage system with mixed types of storage media |
JP5848339B2 (en) | Leader arbitration for provisioning services |
CN110022338B (en) | File reading method and system, metadata server and user equipment |
US8196179B2 (en) | Storage controller for controlling access based on location of controller |
US20100161585A1 (en) | Asymmetric cluster filesystem |
US10545667B1 (en) | Dynamic data partitioning for stateless request routing |
CN101483668A (en) | Network storage and access method, device and system for hot spot data |
KR102033383B1 (en) | Method and system for managing data geographically distributed |
JP6035934B2 (en) | Data store management device, data providing system, and data providing method |
US6519610B1 (en) | Distributed reference links for a distributed directory server system |
JP5824519B2 (en) | Distributed metadata cache |
CN114615263A (en) | Cluster online migration method, device, equipment and storage medium |
CN111212138B (en) | Cross-site storage system and data information access method |
JP2009193502A (en) | Computer system, storage device, and processing alternative method |
Legal Events
Code | Title |
---|---|
A201 | Request for examination |
E902 | Notification of reason for refusal |
E701 | Decision to grant or registration of patent right |