WO2021057108A1 - Data reading method, data writing method, and server - Google Patents

Data reading method, data writing method, and server

Info

Publication number
WO2021057108A1
WO2021057108A1 (PCT/CN2020/096124, CN2020096124W)
Authority
WO
WIPO (PCT)
Prior art keywords
data
copy
management server
target
target data
Prior art date
Application number
PCT/CN2020/096124
Other languages
English (en)
French (fr)
Inventor
黄爽
洪福成
Original Assignee
华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 华为技术有限公司 (Huawei Technologies Co., Ltd.)
Priority to EP20869429.9A (EP3951607A4)
Publication of WO2021057108A1
Priority to US17/526,659 (US12038879B2)

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/10 File systems; File servers
    • G06F16/17 Details of further file system functions
    • G06F16/172 Caching, prefetching or hoarding of files
    • G06F16/178 Techniques for file synchronisation in file systems
    • G06F16/18 File system types
    • G06F16/182 Distributed file systems
    • G06F16/184 Distributed file systems implemented as replicated file system
    • G06F16/1844 Management specifically adapted to replicated file systems
    • G06F16/1824 Distributed file systems implemented using Network-attached Storage [NAS] architecture
    • G06F16/183 Provision of network file services by network file servers, e.g. by using NFS, CIFS
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Definitions

  • the embodiments of the present application relate to the storage field, and in particular to a method for reading data, a method for writing data, and a server.
  • In an existing Hadoop cluster, all copies of a piece of data are stored in only one DC. For example, all copies of file 1 are stored only in DC1, and all copies of file 2 are stored only in DC2.
  • In addition, the yet another resource negotiator (YARN) application allocated for a client's read data request runs only within that same DC.
  • The embodiments of the present application provide a method for reading data, a method for writing data, and a server, so that after a single DC fails, the client's data access is not affected and the client's requests can still be responded to.
  • According to a first aspect, a method for reading data is provided, which includes: a resource management server receives a read data request from a client, where the read data request is used to request multiple files; and the resource management server reads a copy of target data from a first data center.
  • The target data includes data of different files among the multiple files.
  • The first data center is the data center with the highest data localization among the multiple data centers storing copies of the target data, and data localization indicates how close the copy of the target data stored in a data center is to the target data.
  • Finally, the resource management server sends the copy of the target data read from the first data center to the client.
  • In the embodiments of the present application, copies are not stored in only one DC but across DCs, so the copies of a piece of data can be stored in multiple DCs.
  • In addition, based on the actual distribution of the copies, the client's read data request is always executed on the DC with the highest data localization, which prevents cross-DC reads and writes by the client from occupying too much inter-DC bandwidth. After a single DC fails, the client can still access copies on the other DCs storing the target data, so its requests do not go unanswered.
  • In a first possible implementation of the first aspect, the read data request carries directory information of the target data.
  • The resource management server may determine, according to the directory information of the multiple files, the multiple data centers that store copies of the target data.
  • Further, the resource management server calculates how close the copies of the target data stored in these data centers are to the target data, and determines the data center whose stored copies are closest to the target data as the first data center.
  • In the embodiments of the present application, copies are placed across DCs; the actual distribution of the copies of the target data can be determined from the directory information of the target data, and the data localization of each DC with respect to the target data can then be calculated, so that the copies of the target data are accessed in the DC with the highest data localization.
  • Copies that are as close as possible to the target data are accessed within a single DC, which avoids excessive use of inter-DC communication bandwidth for cross-DC access to copies and thereby improves the performance of the entire system.
  • In a second possible implementation of the first aspect, the method further includes: when the first data center fails, the resource management server reads a copy of the target data from a second data center, where the second data center is the data center with the highest data localization among the multiple data centers other than the first data center; and the resource management server sends the copy of the target data read from the second data center to the client.
  • In the embodiments of the present application, after a single DC fails, the actual distribution of the copies can still be used to determine, among the remaining DCs storing the target copies, the data center with the highest data localization, so that copies as close as possible to the target data are accessed in that data center; this avoids excessive use of inter-DC communication bandwidth for cross-DC access to copies and thereby improves the performance of the entire system.
  • the copy of the target data is a copy stored in the data center where the client is located.
  • In the embodiments of the present application, when copies are written across DCs, a copy is first written in the DC where the client is located, which reduces the traffic of the HDFS write operation.
  • According to a second aspect, a method for writing data is provided, including: a name management server receives a write data request from a client, where the write data request carries target data; and the name management server writes copies of the target data into multiple data centers according to the target data.
  • In the embodiments of the present application, copies are not stored in only one DC but across DCs, so the copies of a piece of data can be stored in multiple DCs. On this basis, combined with the actual distribution of the copies, the client's read and write requests are always executed on the DC with the highest data localization, which prevents cross-DC reads and writes by the client from occupying too much inter-DC bandwidth. After a single DC fails, the client can still access copies on the other DCs storing the target data, so its requests do not go unanswered.
  • In a first possible implementation of the second aspect, that the name management server writes copies of the target data into multiple data centers according to the target data includes: the name management server writes the first copy of the target data in the data center where the client is located.
  • In the embodiments of the present application, when copies are written across DCs, a copy is first written in the DC where the client is located, which reduces the traffic of the HDFS write operation.
  • In another possible implementation of the second aspect, the method further includes: when the data centers where the copies of the target data are actually distributed are inconsistent with the multiple data centers indicated by the copy placement strategy, the name management server adjusts the copies of the target data onto the multiple data centers indicated by the copy placement strategy.
  • the actual distribution of the copies is monitored to ensure that the actual distribution of the copies is consistent with the copy placement strategy.
  • If a second data center where the target copy is actually distributed does not belong to the multiple data centers indicated by the copy placement strategy, the target copy in that second data center is deleted; if a third data center among the multiple data centers indicated by the copy placement strategy does not contain the target copy, the target copy is written into that third data center.
  • the embodiment of the present application also provides a specific method for monitoring the distribution of replicas, which can ensure that the actual distribution of replicas is consistent with the replica placement strategy.
  • A resource management server is provided, including: a transceiver unit, configured to receive a read data request from a client, where the read data request is used to request multiple files; and a processing unit, configured to read a copy of target data from a first data center, where the target data includes data of different files among the multiple files. The first data center is the data center with the highest data localization among the multiple data centers storing copies of the target data, and data localization indicates how close the copy of the target data stored in a data center is to the target data.
  • the transceiver unit is also used to send a copy of the target data read from the first data center to the client.
  • The processing unit is further configured to: determine, according to the directory information of the multiple files carried in the read data request, the multiple data centers storing copies of the target data; calculate how close the copies of the target data stored in these data centers are to the target data; and determine the data center whose stored copies are closest to the target data as the first data center.
  • The processing unit is further configured to: when the first data center fails, read a copy of the target data from a second data center, where the second data center is the data center with the highest data localization among the multiple data centers other than the first data center. The transceiver unit is further configured to send the copy of the target data read from the second data center to the client.
  • the copy of the target data is a copy stored in the data center where the client is located.
  • A name management server is provided, including: a transceiver unit, configured to receive a write data request from a client, where the write data request carries target data; and a processing unit, configured to write copies of the target data into multiple data centers.
  • The processing unit is further configured to write the first copy of the target data in the data center where the client is located.
  • The processing unit is further configured to: when the data centers where the copies of the target data are actually distributed are inconsistent with the multiple data centers indicated by the copy placement strategy, adjust the copies of the target data onto the multiple data centers indicated by the copy placement strategy.
  • the present application provides a resource management server, which includes a processor and a memory.
  • the memory stores computer instructions; when the processor executes the computer instructions stored in the memory, the resource management server executes the foregoing first aspect or the methods provided by various possible implementations of the first aspect.
  • the present application provides a name management server.
  • the name management server includes a processor and a memory.
  • the memory stores computer instructions; the processor executes the computer instructions stored in the memory, and the name management server executes the foregoing second aspect or the methods provided by various possible implementations of the second aspect.
  • The present application provides a computer-readable storage medium in which computer instructions are stored, and the computer instructions instruct the resource management server to execute the method provided by the foregoing first aspect or the various possible implementations of the first aspect.
  • The present application provides a computer-readable storage medium in which computer instructions are stored, and the computer instructions instruct the name management server to execute the method provided by the foregoing second aspect or the various possible implementations of the second aspect.
  • the present application provides a computer program product.
  • the computer program product includes computer instructions, and the computer instructions are stored in a computer-readable storage medium.
  • the processor of the resource management server can read the computer instruction from a computer readable storage medium.
  • the resource management server executes the foregoing first aspect or the methods provided by various possible implementations of the first aspect.
  • this application provides a computer program product.
  • the computer program product includes computer instructions, and the computer instructions are stored in a computer-readable storage medium.
  • the processor of the name management server can read the computer instruction from the computer-readable storage medium.
  • the name management server executes the above-mentioned second aspect or the methods provided by various possible implementations of the second aspect.
  • Figure 1 is an architecture diagram of HDFS in the prior art
  • FIG. 2 is an architecture diagram of a data storage system provided by an embodiment of the application
  • FIG. 3 is a structural block diagram of a server provided by an embodiment of the application.
  • FIG. 4 is a schematic flowchart of a data writing method provided by an embodiment of the application.
  • FIG. 5 is a schematic diagram of copy writing provided by an embodiment of the application.
  • FIG. 6 is another schematic diagram of copy writing provided by an embodiment of the application.
  • Figure 7 is a schematic diagram of copy monitoring provided by an embodiment of the application.
  • FIG. 8 is a schematic flowchart of a data reading method provided by an embodiment of the application.
  • FIG. 9 is a schematic diagram of copy distribution provided by an embodiment of the application.
  • FIG. 10 is a schematic diagram of another copy distribution provided by an embodiment of the application.
  • FIG. 11 is a schematic diagram of application program distribution provided by an embodiment of the application.
  • FIG. 12 is a schematic diagram of another application program distribution provided by an embodiment of the application.
  • FIG. 13 is another structural block diagram of a server provided by an embodiment of the application.
  • FIG. 14 is another structural block diagram of a server provided by an embodiment of the application.
  • Hadoop distributed file system (HDFS)
  • HDFS is a file system of Hadoop, used to store massive amounts of data, and has the characteristics of high fault tolerance and high throughput.
  • HDFS includes a YARN Resource Manager, an HDFS client, a name node (NameNode), and many data nodes (DataNode, DN).
  • One rack in HDFS includes multiple data nodes, and multiple racks constitute a data center (DC).
  • When the HDFS client writes a file, the file can be divided into multiple data blocks; multiple copies can be stored for each data block, and different copies are stored on different data nodes.
  • YARN Resource Manager is used to manage YARN, and the data in Hadoop can be uniformly managed through YARN Resource Manager.
  • YARN Resource Manager receives a read and write request from a client, and allocates MapReduce tasks to the request.
  • For example, a YARN application reads data on data node 1 and data node 2.
  • YARN is the resource scheduling framework of Hadoop, which can allocate computing resources for the operation of the computing engine.
  • the client is used for users to create files, access files, and modify files.
  • the name node is responsible for managing the namespace of the file and storing the metadata of the file.
  • the metadata may be the file name of the file, the directory of the file, the block list of the file, and the data node information corresponding to the block.
  • the directory of the file is the path to access the file;
  • the block list of the file indicates the data blocks included in the file;
  • the data node information corresponding to the block is used to indicate which data node the data block is stored on.
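  • As an illustration only (not part of the patent text), the following minimal Java sketch models the metadata just described: a file's block list plus, for each block, the data nodes holding its copies, together with a data-node-to-DC mapping standing in for the cluster topology. All class, record, and field names are hypothetical.

```java
import java.util.List;
import java.util.Map;

public class FileMetadataSketch {
    // A data block and the data nodes holding its copies, e.g. "1a" -> ["DN 1-1", "DN 2-1"].
    record BlockInfo(String blockId, List<String> dataNodes) {}

    // File name, access path (directory), and block list, mirroring the metadata listed above.
    record FileMetadata(String fileName, String directory, List<BlockInfo> blockList) {}

    public static void main(String[] args) {
        // Mapping from a data node to its DC stands in for the cluster topology.
        Map<String, String> dataNodeToDc = Map.of("DN 1-1", "DC 1", "DN 2-1", "DC 2");

        FileMetadata file1 = new FileMetadata("target file 1", "/folder 1",
                List.of(new BlockInfo("1a", List.of("DN 1-1", "DN 2-1"))));

        // The DC of each copy of a block follows from the data nodes recorded for it.
        for (BlockInfo block : file1.blockList()) {
            for (String dn : block.dataNodes()) {
                System.out.println("block " + block.blockId() + " has a copy on " + dn
                        + " in " + dataNodeToDc.get(dn));
            }
        }
    }
}
```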
  • the data node is used to store data and receive data read requests from the client (can also be referred to as data read and write requests).
  • the cross-DC cluster includes multiple DCs, which can store massive amounts of data.
  • the Hadoop shown in Figure 1 is a cross-DC cluster Hadoop, that is, the data nodes of Hadoop are distributed in three or more data centers.
  • A DC refers to a physical space for centralized processing, storage, transmission, exchange, and management of data information; computers, servers, network devices, communication devices, storage devices, and the like are the key equipment of a data center.
  • MapReduce is the computing framework (computing engine) of Hadoop, which allocates applications for the read data requests submitted by clients. Through MapReduce, applications can run on YARN to execute the HDFS client's read and write requests on HDFS.
  • In other words, MapReduce is a computing engine: after the client submits a read data request and MapReduce tasks are allocated for that request, YARN allocates computing resources for those MapReduce tasks.
  • The rules for placing copies are called the block placement policy (BPP).
  • In the prior art, all copies of the same data block are placed in the same DC. Therefore, the existing BPP may be a copy placement rule within one DC, for example, a rule for placing copies on different racks of the same DC.
  • In the embodiments of the present application, copies of the same data block may be placed across DCs, and the BPP may indicate the rules for placing the copies in different data centers (DCs).
  • different copies of the same data block can be placed on different racks in the DC cluster, which can prevent all the copies from being lost when a rack fails and improve the reliability of the system.
  • Data localization can represent the degree of "closeness" between HDFS data and the MapReduce task that processes the data. The higher the data localization, the closer the data node where the copy is located is to the client. Reading a copy on a node with high data localization reduces the bandwidth consumption and latency of reading data.
  • the file requested by the client includes data block 1, data block 2, and data block 3.
  • YARN application 1 is used to access DC 1
  • DC 1 includes a copy of data block 1
  • YARN application 2 is used to access DC 2
  • DC 2 includes a copy of data block 1, a copy of data block 2, and a copy of data block 3
  • YARN application 3 is used to access DC 3, which includes a copy of data block 2 and a copy of data block 3. It can be seen that the entire file can be accessed through YARN application 2, with no need to consume cross-DC traffic to access copies.
  • Therefore, DC 2 has the highest data localization.
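  • To make the notion of data localization concrete, here is a small Java sketch that, for the example just given, counts how many of the requested blocks each DC holds locally; treating this count as the localization score is an assumption made only for illustration, since the patent does not fix a formula.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

public class DataLocalizationSketch {
    /** For each DC, count how many of the requested blocks have at least one copy there. */
    static Map<String, Integer> localizationPerDc(Set<String> requestedBlocks,
                                                  Map<String, Set<String>> blocksPerDc) {
        Map<String, Integer> score = new HashMap<>();
        for (Map.Entry<String, Set<String>> e : blocksPerDc.entrySet()) {
            int hits = 0;
            for (String block : requestedBlocks) {
                if (e.getValue().contains(block)) {
                    hits++;
                }
            }
            score.put(e.getKey(), hits);
        }
        return score;
    }

    public static void main(String[] args) {
        // The example above: the requested file consists of data blocks 1, 2 and 3.
        Set<String> requested = Set.of("block 1", "block 2", "block 3");
        Map<String, Set<String>> blocksPerDc = Map.of(
                "DC 1", Set.of("block 1"),
                "DC 2", Set.of("block 1", "block 2", "block 3"),
                "DC 3", Set.of("block 2", "block 3"));
        // DC 2 covers all three requested blocks, so it has the highest data localization.
        System.out.println(localizationPerDc(requested, blocksPerDc));
    }
}
```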
  • The recovery point objective (RPO) refers to the length of time between the first moment (the point in time to which data can be recovered) and the moment when the disaster occurs.
  • RTO refers to the length of time from the moment when the system (for example, HDFS) downtime causes the business to stop to the moment when the system is restored and the business resumes operation after a disaster occurs.
  • Figure 1 is a diagram of the existing Hadoop system architecture. As shown in Figure 1, all copies of a piece of data in the existing HDFS are stored in only one DC. For example, all copies of file 1 (three copies, on DN 1-1, DN 1-2, and DN 1-3) are stored in DC 1, and all copies of file 2 (three copies, on DN 2-1, DN 2-2, and DN 2-3) are stored in DC 2.
  • the YARN Resource Manager receives the client's data read request, and the YARN application allocated for the data read request runs on the same DC to ensure data localization and avoid cross-DC access to copies.
  • FIG. 2 is an architectural diagram of the data storage system provided by an embodiment of the present application, which includes an HDFS client, resource management servers (i.e., YARN Resource Managers), HDFS NameNodes, HDFS DataNodes, and at least three DCs.
  • the HDFS client exists in a certain DC in the data storage system.
  • For example, the HDFS client is located in DC 2 shown in Figure 2.
  • HDFS NameNode is deployed across DCs.
  • For example, the HDFS NameNodes of DC 1 in Figure 2 are HDFS NameNode 1 and HDFS NameNode 2, which are deployed on DC 1 and DC 2 respectively; both HDFS NameNode 1 and HDFS NameNode 2 can manage the NameSpace of DC 1.
  • For example, HDFS NameNode 1 is the primary NameNode, and HDFS NameNode 2 is the corresponding standby NameNode.
  • Resource management servers are also deployed across DCs. For example, the YARN Resource Managers of DC 1 in Figure 2 are YARN Resource Manager 1 and YARN Resource Manager 2, which are deployed on DC 1 and DC 2 respectively; both YARN Resource Manager 1 and YARN Resource Manager 2 can perform resource management for DC 1.
  • the HDFS NameNode of DC2 and DC3 can also be deployed across DCs.
  • the HDFS NameNode of DC 2 is HDFS NameNode 3 and HDFS NameNode 4 respectively deployed in DC 2 and DC 3
  • HDFS NameNode 3 and HDFS NameNode 4 can manage the NameSpace of DC 2.
  • the HDFS NameNode of DC 3 is HDFS NameNode 5 and HDFS NameNode 6 respectively deployed in DC 3 and DC 1, and HDFS NameNode 5 and HDFS NameNode 6 can manage the NameSpace of DC 3.
  • YARN Resource Manager of DC 2 and DC 3 can also be deployed across DCs.
  • After a single DC fails, the ZooKeeper (an open-source distributed application coordination service) and JournalNode (log node) instances of the other DCs can still work normally, and data access is not affected.
  • a DC can include multiple data nodes.
  • DC 1 includes multiple data nodes such as DN 1-1, DN 1-2, and DN 1-3;
  • DC 2 includes multiple data nodes such as DN 2-1, DN 2-2, and DN 2-3;
  • DC 3 includes multiple data nodes such as DN 3-1, DN 3-2, and DN 3-3.
  • copies of the same data block are stored across DCs.
  • a copy of file 1 is stored in DC 1 and DC 2
  • a copy of file 2 is stored in DC 1, DC 2.
  • When a single DC fails, a copy of the data block can still be accessed on other DCs, ensuring that the client can read, write, and access data normally.
  • The resource management server may be the YARN Resource Manager described in the embodiments of this application. It receives the client's read data request and, according to the read data request and the actual distribution of the copies of the files requested by the client, calculates the DC with the highest data localization and assigns an Application to execute the client's read data request on that DC.
  • When that DC fails, the resource management server recalculates the DC with the highest data localization among the remaining DCs and migrates the Application to the newly determined DC to execute the client's read data request.
  • the client may need to occupy bandwidth between DCs to read data or write data on data nodes of different DCs.
  • the communication bandwidth between different DCs is often limited.
  • If the client occupies the inter-DC communication bandwidth to read or write data across DCs, this will greatly affect normal communication between the DCs and degrade the performance of the entire data storage system. For example, as shown in Figure 2, the client is in DC 2; assuming that the copy requested by the client is stored on data node DN 1-3, the client will occupy the communication bandwidth between DC 2 and DC 1 to read the copy from DN 1-3 of DC 1.
  • In the embodiments of the present application, copies are not stored in only one DC but across DCs, so the copies of a piece of data can be stored in multiple DCs.
  • Combined with the actual distribution of the copies, the client's read and write requests are always executed on the DC with the highest data localization, which prevents cross-DC reads and writes by the client from occupying too much inter-DC bandwidth. After a single DC fails, the client can still access copies on the other DCs storing the target data, so its requests do not go unanswered.
  • FIG. 3 is a schematic diagram of the hardware structure of the server 30 provided by an embodiment of the application.
  • the server 30 may be the resource management server or the name management server described in the embodiment of the present application.
  • The server 30 includes a processor 301, a memory 302, and at least one network interface (in FIG. 3, only the network interface 303 is shown as an example for description).
  • the processor 301, the memory 302, and the network interface 303 are connected to each other.
  • The processor 301 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the programs of this application.
  • the network interface 303 is an interface of the server 30 for communicating with other devices or communication networks, such as Ethernet, radio access network (RAN), wireless local area networks (WLAN), and so on.
  • The memory 302 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, or a random access memory (RAM) or another type of dynamic storage device that can store information and instructions.
  • It may also be an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage, optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, and the like), a magnetic disk storage medium or another magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and that can be accessed by a computer, but is not limited thereto.
  • The memory may exist independently and be connected to the processor through a communication line, or the memory may be integrated with the processor.
  • the memory 302 is used to store computer-executable instructions for executing the solution of the present application, and the processor 301 controls the execution.
  • The processor 301 is configured to execute the computer-executable instructions stored in the memory 302, so as to implement the data reading method and data writing method provided in the following embodiments of the present application.
  • the computer-executable instructions in the embodiments of the present application may also be referred to as application program codes, which are not specifically limited in the embodiments of the present application.
  • the processor 301 may include one or more CPUs, such as CPU0 and CPU1 in FIG. 3.
  • the server 30 may include multiple processors, such as the processor 301 and the processor 306 in FIG. 3. Each of these processors can be a single-CPU (single-CPU) processor or a multi-core (multi-CPU) processor.
  • the processor here may refer to one or more devices, circuits, and/or processing cores for processing data (for example, computer program instructions).
  • the server 30 may further include an output device 304 and an input device 305.
  • the output device 304 communicates with the processor 301 and can display information in a variety of ways.
  • The output device 304 may be a liquid crystal display (LCD), a light emitting diode (LED) display device, a cathode ray tube (CRT) display device, a projector, or the like.
  • the input device 305 communicates with the processor 301, and can receive user input in a variety of ways.
  • the input device 305 may be a mouse, a keyboard, a touch screen device, a sensor device, or the like.
  • the aforementioned server 30 may be a general-purpose device or a special-purpose device.
  • the server 30 may be a desktop computer, a network server, an embedded device, or other devices with a similar structure in FIG. 3.
  • the embodiment of the present application does not limit the type of the server 30.
  • An embodiment of the present application provides a method for writing data. As shown in FIG. 4, the method includes the following steps 401 and 402.
  • Step 401 The name management server receives a data write request from a client; the data write request carries target data.
  • The name management server described in the embodiments of the present application may be the HDFS NameNode shown in FIG. 2.
  • Step 402 The name management server writes a target copy into multiple data centers according to the target data; the target copy is a copy of the target data.
  • Optionally, the copy is first written in the data center where the client is located. For example, the first target copy is written in the first data center, where the first data center is the data center where the client is located.
  • Specifically, a copy placement strategy (BPP) for storing copies across DCs is preset, and the name management server can write copies across DCs according to the BPP: the first copy is written in the DC where the HDFS client is located, that is, the first copy is stored on the DC where the HDFS client is located, and the other copies are subsequently written to data nodes of different DCs in sequence. It can be understood that writing the first copy to the DC where the HDFS client is located reduces the traffic occupied by the client to write data, thereby improving the performance of the HDFS write operation.
  • For example, the BPP instructs that copies of the same data block be written into DC 1, DC 2, and DC 3.
  • When the name node writes copies according to the BPP, it first writes a copy in DC 2, where the HDFS client is located, for example, into DN 2-2 of DC 2. Then, the copy in DN 2-2 is replicated and written into DN 1-3 of DC 1. Finally, the copy in DN 1-3 is replicated and written into DN 3-1 of DC 3.
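  • The write order in this example can be sketched as follows (an illustration under simplifying assumptions, not HDFS code): the DCs named by the copy placement strategy are ordered so that the client's DC comes first, and each copy is replicated from the previously written one, mirroring the DN 2-2 to DN 1-3 to DN 3-1 pipeline above. chooseDataNode() and writeCopy() are hypothetical stubs.

```java
import java.util.ArrayList;
import java.util.List;

public class CrossDcWriteSketch {
    /** Put the client's DC first, keeping the order of the remaining DCs from the strategy. */
    static List<String> planWriteOrder(List<String> dcsInStrategy, String clientDc) {
        List<String> order = new ArrayList<>(dcsInStrategy);
        if (order.remove(clientDc)) {
            order.add(0, clientDc); // the first copy goes to the DC where the HDFS client is
        }
        return order;
    }

    static void writeAcrossDcs(byte[] block, List<String> dcsInStrategy, String clientDc) {
        String previousNode = null;
        for (String dc : planWriteOrder(dcsInStrategy, clientDc)) {
            String targetNode = chooseDataNode(dc);      // e.g. DN 2-2 in DC 2
            writeCopy(block, previousNode, targetNode);  // replicate from the previous copy
            previousNode = targetNode;
        }
    }

    // Hypothetical helpers, standing in for data-node selection and the actual block transfer.
    static String chooseDataNode(String dc) {
        return dc + "/DN-x";
    }

    static void writeCopy(byte[] block, String sourceNode, String targetNode) {
        System.out.println("write copy to " + targetNode
                + (sourceNode == null ? " (from the client)" : " (replicated from " + sourceNode + ")"));
    }

    public static void main(String[] args) {
        // BPP from the example: copies go to DC 1, DC 2 and DC 3; the client is in DC 2.
        writeAcrossDcs(new byte[0], List.of("DC 1", "DC 2", "DC 3"), "DC 2");
    }
}
```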
  • The copy placement strategy is a directory-level strategy. For example, a copy placement strategy is set for an HDFS directory, and the copies of all files in that directory are written according to this strategy. After the name node writes a file according to the copy placement strategy, the actual distribution of the copies of each data block included in the file is recorded in the metadata of the file.
  • the metadata of the file may include the block list of the file and the data node information corresponding to the block.
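  • A directory-level strategy of this kind could be looked up as in the following sketch; the map-based lookup and the single-DC fallback are assumptions made only for illustration.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class DirectoryLevelBppSketch {
    // Hypothetical mapping from an HDFS directory to the DCs its files' copies must span.
    private final Map<String, List<String>> strategyByDirectory = new HashMap<>();

    void setStrategy(String directory, List<String> dcs) {
        strategyByDirectory.put(directory, dcs);
    }

    /** All files under the same directory share one strategy, so look it up by the file's directory. */
    List<String> strategyForFile(String filePath) {
        int slash = filePath.lastIndexOf('/');
        String directory = slash >= 0 ? filePath.substring(0, slash) : "/";
        return strategyByDirectory.getOrDefault(directory, List.of("DC 1"));
    }

    public static void main(String[] args) {
        DirectoryLevelBppSketch bpp = new DirectoryLevelBppSketch();
        bpp.setStrategy("/folder 1", List.of("DC 1", "DC 2", "DC 3"));
        System.out.println(bpp.strategyForFile("/folder 1/target file 1")); // [DC 1, DC 2, DC 3]
    }
}
```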
  • Optionally, the method shown in FIG. 4 further includes: determining whether the data centers where the target copies are actually distributed are consistent with the multiple data centers indicated by the copy placement strategy; if they are inconsistent, the copies of the target data are adjusted onto the multiple data centers indicated by the copy placement strategy.
  • Adjusting the actual distribution of the target copies according to the copy placement strategy includes the following.
  • If a second data center where the target copy is actually distributed does not belong to the multiple data centers indicated by the copy placement strategy, the target copy in that second data center is deleted; that is, a copy cannot be placed in a data center that is not indicated by the copy placement strategy.
  • The actual distribution of the copies of the target data needs to be consistent with the data centers indicated by the copy placement strategy.
  • If a third data center among the multiple data centers indicated by the copy placement strategy does not contain the target copy, the target copy is written into that third data center. That is to say, if a data center indicated by the copy placement strategy stores no copy, the copies in other data centers need to be replicated into this data center, so that the actual distribution of the copies of the target data is consistent with the data centers indicated by the copy placement strategy.
  • The name node can also check whether the DCs where the copies are actually distributed are consistent with the DCs indicated by the BPP. If the actual distribution of the copies is inconsistent with the BPP, the copies in each DC are adjusted, including replicating copies or deleting copies, to ensure that the actual distribution of the copies is consistent with the BPP.
  • For example, the BPP indicates that copies of file 1 are stored in DC 1, DC 2, and DC 3; that is, a copy of each data block of file 1 needs to be stored in DC 1, DC 2, and DC 3.
  • If the name node detects that the copy in DC 3 is missing, it can replicate the copy to DC 3 from other DCs, for example, replicate the copy of the data block stored in DC 1 or DC 2 to DC 3.
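  • The monitoring and adjustment just described can be summarized by the following sketch, which compares the DCs actually holding a copy with the DCs required by the strategy, deletes extra copies, and re-replicates missing ones. deleteCopy() and replicateCopy() are hypothetical stand-ins for the name node's actions, not real NameNode APIs.

```java
import java.util.Set;

public class ReplicaReconcilerSketch {
    static void reconcile(String blockId, Set<String> actualDcs, Set<String> strategyDcs) {
        // A copy placed in a DC that the strategy does not indicate must be deleted.
        for (String dc : actualDcs) {
            if (!strategyDcs.contains(dc)) {
                deleteCopy(blockId, dc);
            }
        }
        if (actualDcs.isEmpty()) {
            return; // nothing left to replicate from in this simplified sketch
        }
        // Prefer a source DC that the strategy also names, so the source copy is kept.
        String source = actualDcs.stream()
                .filter(strategyDcs::contains)
                .findFirst()
                .orElse(actualDcs.iterator().next());
        // A DC indicated by the strategy that stores no copy gets one replicated from another DC.
        for (String dc : strategyDcs) {
            if (!actualDcs.contains(dc)) {
                replicateCopy(blockId, source, dc);
            }
        }
    }

    static void deleteCopy(String blockId, String dc) {
        System.out.println("delete copy of " + blockId + " in " + dc);
    }

    static void replicateCopy(String blockId, String fromDc, String toDc) {
        System.out.println("replicate " + blockId + " from " + fromDc + " to " + toDc);
    }

    public static void main(String[] args) {
        // Strategy says DC 1, DC 2, DC 3; the copy currently sits only in DC 1 and DC 2.
        reconcile("block of file 1", Set.of("DC 1", "DC 2"), Set.of("DC 1", "DC 2", "DC 3"));
    }
}
```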
  • An embodiment of the present application provides a data reading method, which is applicable to the data storage system shown in FIG. 2. As shown in FIG. 8, the method includes the following steps.
  • Step 801 The resource management server receives a data read request from the client, where the read data request is for multiple files.
  • the resource management server is used to perform resource management on the data storage system, including allocating applications for client read data requests.
  • the resource management server may be the YARN Resource Manager described in the embodiment of this application.
  • In response to a user's operation on the client, the client runs an application to request access to multiple files; the data of different files among the multiple files that the client requests to access may be referred to as target data.
  • For example, if the client requests access to M files, the target data may be the data of N different files, where N is a positive integer less than or equal to M.
  • When the client writes a file, copies of the file are written across DCs. Therefore, the copies of the multiple files that the client requests to access may be stored in multiple different DCs.
  • the multiple files requested by the client come from different directories.
  • the multiple files requested to be accessed by the client are: target file 1 and target file 2.
  • The directory of target file 1 is: "desktop/computer/E disk/folder 1/target file 1";
  • the directory of target file 2 is: "computer/D disk/folder 2/target file 2”.
  • the target data may be the data of target file 1 and target file 2.
  • the client can send a data read request to the resource management server, and the resource management server allocates a corresponding YARN application to execute the data read request, so that the client can read the data of target file 1 and target file 2.
  • the read data request may include directory information of the target data.
  • the directory information of the target data is used to indicate the access paths of multiple files that the client requests to access.
  • an application triggered by the user on the client requests access to the data of target file 1, target file 2, and the client submits a read data request to the resource management server.
  • the read data request carries the directory information of target file 1, target file 2.
  • the directory information carried in the read data request is: "desktop/computer/E disk/folder 1/target file 1", "computer/D disk/folder 2/target file 2".
  • Step 802 The resource management server determines multiple data centers that store copies of the target data.
  • the copy of the target data is the copy of multiple files requested to be accessed by the client.
  • the client requests access to the data of target file 1 and target file 2
  • the copy of the target data may be a copy of target file 1 and a copy of target file 2.
  • the copies of the files in the embodiments of the present application are placed across DCs, so the copies of the files are stored in at least two different data centers.
  • the copy placement strategy corresponding to the file determines the DC where the copies of the file are actually distributed.
  • the copy placement strategies for multiple files requested by the client are the same, and the DCs where the copies of the multiple files are actually distributed may be the same;
  • the copy placement strategies of each file are different, and the DCs where the copies of the multiple files are actually distributed may be different.
  • The copy placement strategy in the embodiments of the present application is at the directory level, and the copy placement strategy for files in the same directory is the same.
  • That is, when the multiple files requested by the client do not belong to the same directory, that is, when they come from different directories, the copy placement strategies corresponding to the multiple files may be different.
  • the client requests access to target file 1 and target file 2, and target file 1 and target file 2 come from different directories.
  • the copy placement strategy corresponding to the target file 1 is: the copy is placed in DC 1, DC 2, and DC 3; that is, the copies of all data blocks of the target file 1 need to be placed in DC 1, DC 2, and DC 3.
  • the copy placement strategy corresponding to target file 2 is: the copy is placed in DC 2 and DC 3; that is, the copies of all data blocks of the target file 2 need to be placed in DC 2 and DC 3.
  • the copy placement strategy for multiple target files is the same.
  • the client requests access to target file 1 and target file 2, where the copy placement strategy for target file 1 is: the copy is placed in DC 1, DC 2, DC 3; that is, copies of all data blocks of target file 1 are required Place in DC 1, DC 2, DC 3.
  • the copy placement strategy corresponding to target file 2 is also: the copy is placed in DC 1, DC 2, and DC 3; that is, the copies of all data blocks of the target file 2 need to be placed in DC 1, DC 2, and DC 3.
  • The data centers corresponding to the multiple target files that the client requests to access can be determined according to the directory information in the read data request, that is, the data centers corresponding to the target data requested by the client; these data centers store copies of the target data.
  • the metadata of the file records the DC where the copies of the file are actually distributed.
  • By querying the metadata of a file, it is possible to determine which data centers store the copies of the file.
  • Specifically, the multiple files requested by the client can be determined according to the directory information carried in the read data request.
  • The metadata of the multiple files can then be queried to determine which data centers store copies of the data of the multiple files (i.e., the target data), that is, the DCs where the copies of the target data are actually distributed.
  • More specifically, the block list of each file and the data node information corresponding to each block are queried to determine the data blocks included in the files requested by the client and the data node where each data block is located; the data center to which each such data node belongs is then a data center that stores a copy of the data of the file (a sketch of this lookup is given after the example below).
  • the resource management server queries the metadata of the target file 1, and can determine multiple data centers storing copies of the target file 1.
  • the block list of target file 1 includes data block 1a, data block 1b, and data block 1c.
  • The data node information corresponding to data block 1a is "DN 1-1, DN 2-1", that is, data block 1a has two copies, which are stored in data node DN 1-1 of DC 1 and data node DN 2-1 of DC 2;
  • the data node information corresponding to data block 1b is "DN 1-2, DN 2-2", that is, data block 1b has two copies, which are stored in data node DN 1-2 of DC 1 and data node DN 2-2 of DC 2;
  • the data node information corresponding to data block 1c is "DN 1-3, DN 2-3", that is, data block 1c has two copies, which are stored in data node DN 1-3 of DC 1 and data node DN 2-3 of DC 2.
  • the block list of target file 2 includes data block 2a and data block 2b.
  • The data node information corresponding to data block 2a is "DN 2-4, DN 3-1", that is, data block 2a has two copies, which are stored in data node DN 2-4 of DC 2 and data node DN 3-1 of DC 3;
  • the data node information corresponding to data block 2b is "DN 2-5, DN 3-2", that is, data block 2b has two copies, which are stored in data node DN 2-5 of DC 2 and data node DN 3-2 of DC 3.
  • Figure 9 shows the actual distribution of the target file 1 and the copy of the target file 2.
  • DC 1 stores a copy of data block 1a, data block 1b, and data block 1c
  • DC 2 stores a copy of data block 1a, data block 1b, and data block 1c. That is to say, the data centers where the copy of the target file 1 is stored are DC1 and DC2, that is, the data centers where the copy of the target file 1 is actually distributed are DC1 and DC2.
  • DC 2 stores a copy of data block 2a and data block 2b
  • DC 3 stores a copy of data block 2a and data block 2b. That is to say, the data centers where the copy of the target file 2 is stored are DC 2 and DC 3, that is, the data centers where the copy of the target file 2 is actually distributed are DC 2 and DC 3.
  • the data centers corresponding to the multiple files that the client requests to access are DC1, DC2, and DC3, that is, the multiple data centers that store copies of the target data are DC1, DC2, and DC3.
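  • Here is a sketch of the lookup described in step 802, applied to the example distribution above: for every requested file, walk its block list, map each listed data node to its DC, and collect those DCs. The maps used here stand in for the name node metadata and the cluster topology and are purely illustrative.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

public class CopyLocatorSketch {
    /** Resolve the DCs that actually hold copies of the requested files' data blocks. */
    static Set<String> dcsStoringTargetData(List<String> requestedFiles,
                                            Map<String, Map<String, List<String>>> blockListsByFile,
                                            Map<String, String> dataNodeToDc) {
        Set<String> dcs = new HashSet<>();
        for (String file : requestedFiles) {
            for (List<String> dataNodes : blockListsByFile.getOrDefault(file, Map.of()).values()) {
                for (String dn : dataNodes) {
                    dcs.add(dataNodeToDc.get(dn)); // DC of the data node holding this copy
                }
            }
        }
        return dcs;
    }

    public static void main(String[] args) {
        // Target file 1: blocks 1a/1b/1c with copies in DC 1 and DC 2 (see the example above).
        Map<String, List<String>> file1 = Map.of(
                "1a", List.of("DN 1-1", "DN 2-1"),
                "1b", List.of("DN 1-2", "DN 2-2"),
                "1c", List.of("DN 1-3", "DN 2-3"));
        // Target file 2: blocks 2a/2b with copies in DC 2 and DC 3.
        Map<String, List<String>> file2 = Map.of(
                "2a", List.of("DN 2-4", "DN 3-1"),
                "2b", List.of("DN 2-5", "DN 3-2"));
        Map<String, String> dnToDc = Map.of(
                "DN 1-1", "DC 1", "DN 1-2", "DC 1", "DN 1-3", "DC 1",
                "DN 2-1", "DC 2", "DN 2-2", "DC 2", "DN 2-3", "DC 2",
                "DN 2-4", "DC 2", "DN 2-5", "DC 2",
                "DN 3-1", "DC 3", "DN 3-2", "DC 3");
        // Prints a set containing DC 1, DC 2 and DC 3 (iteration order may vary).
        System.out.println(dcsStoringTargetData(
                List.of("target file 1", "target file 2"),
                Map.of("target file 1", file1, "target file 2", file2),
                dnToDc));
    }
}
```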
  • Step 803 The resource management server determines the first data center with the highest data localization among the multiple data centers.
  • A copy of the target data can be accessed in each of the multiple data centers, but not all of the target data may be accessible in a given data center. The data center in which the accessible copies are closest to the target data has the highest data localization; that data center is the first data center described in the embodiments of the present application.
  • In a possible implementation, the copy placement strategies corresponding to the multiple files requested by the client are inconsistent, that is, the DCs where the copies of the multiple files are actually distributed are different; therefore the data localization of different DCs is different, and the DC with the highest data localization is selected as the first data center described in the embodiments of the present application (a sketch of this selection is given after the examples below).
  • the target file 1 includes data block 1a, data block 1b, and data block 1c.
  • In DC 1, a copy of data block 1a, a copy of data block 1b, and a copy of data block 1c can be accessed; in DC 2, a copy of data block 1a, a copy of data block 1b, and a copy of data block 1c can be accessed.
  • Target file 2 includes data block 2a and data block 2b, and a copy of data block 2a and a copy of data block 2b can be accessed in DC 1; a copy of data block 2a and a copy of data block 2b can be accessed in DC 2.
  • The copies accessed in DC 2 are closest to the client's request, that is, all the data of target file 1 and target file 2 can be accessed in DC 2. Therefore, DC 2 is the data center with the highest data localization among DC 1, DC 2, and DC 3, and DC 2 is the first data center described in the embodiments of the application.
  • In another possible implementation, the multiple files requested by the client come from the same directory and the corresponding copy placement strategies are the same, that is, the DCs where the copies of the target files are actually distributed are the same; therefore the data localization of these DCs is the same, and any one of them can be selected as the first data center described in the embodiments of the present application.
  • the target file requested by the client is target file 1 and target file 2, where target file 1 includes data block 1a, data block 1b, and data block 1c.
  • the copy placement strategy for target file 1 is: the copy is placed in DC 1, DC 2.
  • DC 1 stores a copy of data block 1a, data block 1b, and data block 1c;
  • DC 2 stores a copy of data block 1a, data block 1b, and data block 1c. That is to say, the data centers where the copies of target file 1 are stored are DC 1 and DC 2, that is, the data centers where the copies of target file 1 are actually distributed are DC 1 and DC 2.
  • Target file 2 includes data block 2a and data block 2b.
  • the copy placement strategy corresponding to the target file 2 is the same as the copy placement strategy corresponding to the target file 1, that is, the copy is placed in DC 1 and DC 2.
  • DC 1 stores a copy of data block 2a and data block 2b
  • DC 2 stores a copy of data block 2a and data block 2b. That is to say, the data centers where the copy of the target file 2 is stored are DC1 and DC2, that is, the data centers where the copy of the target file 2 is actually distributed are DC1 and DC2.
  • the data centers corresponding to the multiple files that the client requests to access are DC1 and DC2.
  • all the data of the target file 1 and the target file 2 can be accessed in DC 1 and DC 2, and the data localization degree of DC 1 and DC 2 is the same.
  • DC 1 or DC 2 is the first data center described in the embodiment of the application.
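  • The selection in step 803 can be sketched as below: the first data center is taken to be the DC whose locally stored copies cover the most requested blocks, with ties broken by whichever DC is examined first (consistent with "any one of them can be selected" when several DCs are equally good). The counting metric and the failed-DC parameter are illustrative assumptions, not the ResourceManager's actual logic.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

public class FirstDcSelectorSketch {
    /** Pick the non-failed DC with the highest data localization for the requested blocks. */
    static Optional<String> selectDc(Set<String> requestedBlocks,
                                     Map<String, Set<String>> blocksPerDc,
                                     Set<String> failedDcs) {
        String best = null;
        int bestScore = -1;
        for (Map.Entry<String, Set<String>> e : blocksPerDc.entrySet()) {
            if (failedDcs.contains(e.getKey())) {
                continue; // used for step 805: a failed DC is never selected
            }
            int score = 0;
            for (String block : requestedBlocks) {
                if (e.getValue().contains(block)) {
                    score++;
                }
            }
            if (score > bestScore) {
                bestScore = score;
                best = e.getKey();
            }
        }
        return Optional.ofNullable(best);
    }
}
```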
  • Step 804 The resource management server sends a copy of the target data read from the first data center to the client.
  • Specifically, the resource management server submits the read data request to the YARN resources of the DC with the highest data localization (that is, the first data center described in the embodiments of this application) to run; in other words, the resource management server allocates a YARN application on that DC to execute the client's read data request.
  • For example, the resource management server allocates YARN application 2 to access the copies of data block 1a, data block 1b, data block 1c, data block 2a, and data block 2b, and the accessed copies are returned to the client.
  • That is, the resource management server sends the copies of the target data read in the first data center to the client, for example, copies of all the data blocks of target file 1 and target file 2.
  • If the resource management server cannot access all copies of the target data in the first data center, it may return the copies of the target data accessed from the first data center to the client, and the client can also access the other copies of the target data on other DCs.
  • the multiple files requested by the client include data block 1, data block 2, and data block 3.
  • A copy of data block 1 and a copy of data block 3 can be accessed on DC 1, which has the highest data localization, and a copy of data block 2 can be accessed on DC 3.
  • Step 805 When the first data center fails, the resource management server reads a copy of the target data in the second data center, and sends the copy of the target data read from the second data center to the client.
  • The second data center is, among the multiple data centers storing copies of the target files and excluding the first data center, the data center with the highest data localization.
  • Specifically, the client's read data request is submitted to the YARN resources of the second data center to run; that is, the resource management server allocates a YARN application on the second data center to execute the client's read data request. This ensures that the client's read data request is responded to, and because the data localization of the second data center is high, most copies of the target data can be accessed locally in the second data center, minimizing cross-DC data access and saving bandwidth (see also the code sketch after the example below).
  • DC 2 is the data center with the highest data localization.
  • the copy of data block 1a, the copy of data block 1b, and the copy of data block 1c can be accessed from DC 1.
  • In DC 3, a copy of data block 2a and a copy of data block 2b can be accessed.
  • the client requests to access target file 1, target file 2, including data block 1a, data block 1b, data block 1c, data block 2a, and data block 2b.
  • The copies accessed in DC 1 are closer to the data requested by the client; therefore, the data localization of DC 1 is higher than that of DC 3, and DC 1 is the second data center described in the embodiments of the application.
  • The resource management server can access a copy of data block 1a, a copy of data block 1b, and a copy of data block 1c on DC 1 through YARN application 1, and return the accessed copies to the client.
  • The client can also access the remaining copies across DCs, for example, by occupying the communication bandwidth between DC 1 and DC 3 to access the copies of data block 2a and data block 2b in DC 3.
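  • Continuing the selection sketch given after step 803 (still an illustration, not the ResourceManager's real behaviour), the failover of step 805 amounts to rerunning the same selection with the failed DC excluded:

```java
import java.util.Map;
import java.util.Set;

public class FailoverExampleSketch {
    public static void main(String[] args) {
        // Distribution from the example: blocks 1a/1b/1c in DC 1 and DC 2, blocks 2a/2b in DC 2 and DC 3.
        Map<String, Set<String>> blocksPerDc = Map.of(
                "DC 1", Set.of("1a", "1b", "1c"),
                "DC 2", Set.of("1a", "1b", "1c", "2a", "2b"),
                "DC 3", Set.of("2a", "2b"));
        Set<String> requested = Set.of("1a", "1b", "1c", "2a", "2b");

        // Normal case: DC 2 holds every requested block, so it is chosen as the first data center.
        System.out.println(FirstDcSelectorSketch.selectDc(requested, blocksPerDc, Set.of()));

        // After DC 2 fails: DC 1 covers three of the five blocks and DC 3 only two, so DC 1 becomes
        // the second data center, and the remaining blocks are read from DC 3 across DCs.
        System.out.println(FirstDcSelectorSketch.selectDc(requested, blocksPerDc, Set.of("DC 2")));
    }
}
```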
  • the method shown in FIG. 8 in the embodiment of the present application can be implemented by the plug-in DC Scheduler in the YARN ResourceManager.
  • it can also be implemented by other functional modules in the YARN ResourceManager, which is not limited in the embodiment of the present application.
  • In the embodiments of the present application, the copies of files are placed across DCs rather than being limited to a single DC, so after a single DC fails, the data of a file is not entirely lost.
  • The client can still access a copy of the file in other DCs, which ensures as far as possible that the client's services are not affected by the DC outage and that the client's read data requests are responded to in a timely manner.
  • FIG. 13 shows a possible structural schematic diagram of the server involved in the foregoing embodiment.
  • the server shown in FIG. 13 may be the resource management server or the name management server described in the embodiment of the present application, or may be a component in the resource management server or the name management server that implements the above method.
  • the server includes a processing unit 1301 and a transceiver unit 1302.
  • the processing unit may be one or more processors, and the transceiving unit 1302 may be a network interface.
  • the processing unit 1301 is configured to support the resource management server to perform step 802 and step 803, to support the name management server to perform step 402, and/or other processes used in the technology described herein.
  • The transceiver unit 1302 is configured to support, for example, the resource management server to perform step 801, step 804, and step 805, to support the name management server to perform step 401, and/or other processes used in the technology described herein.
  • the server shown in FIG. 13 may also be a chip used in a resource management server or a name management server.
  • the chip may be a System-On-a-Chip (SOC).
  • the above transceiver unit 1302 for receiving/sending may be a network interface of the server for receiving signals from other servers.
  • the server includes: a processing module 1401 and a communication module 1402.
  • the processing module 1401 is used to control and manage the actions of the server, for example, to perform the steps performed by the above-mentioned processing unit 1301, and/or to perform other processes of the technology described herein.
  • the communication module 1402 is configured to perform the steps performed by the above-mentioned transceiver unit 1302, and supports interaction between the server and other devices, such as interaction with other terminal servers.
  • the server may further include a storage module 1403, and the storage module 1403 is used to store program codes and data of the server.
  • the processing module 1401 is a processor
  • the communication module 1402 is a network interface
  • the storage module 1403 is a memory
  • the server is the server shown in FIG. 3.
  • the present application provides a computer-readable storage medium in which computer instructions are stored.
  • the computer instruction instructs the resource management server to execute the above-mentioned data reading method, or the computer instruction is used to implement the functional units included in the resource management server.
  • the present application provides a computer-readable storage medium in which computer instructions are stored.
  • the computer instruction instructs the name management server to execute the above-mentioned data writing method, or the computer instruction is used to implement the functional units included in the name management server.
  • This application provides a computer program product, which includes computer instructions.
  • the computer instruction instructs the resource management server to execute the above-mentioned data reading method, or the computer instruction is used to implement the functional units included in the resource management server.
  • This application provides a computer program product, which includes computer instructions.
  • the computer instruction instructs the name management server to execute the above-mentioned data writing method, or the computer instruction is used to implement the functional units included in the name management server.
  • It should be understood that the disclosed devices and methods may be implemented in other ways.
  • The device embodiments described above are only illustrative.
  • The division into modules or units is only a logical function division, and there may be other division manners in actual implementation; for example, multiple units or components may be combined or integrated into another device, or some features may be omitted or not implemented.
  • The displayed or discussed mutual coupling, direct coupling, or communication connection may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be in electrical, mechanical, or other forms.
  • the units described as separate parts may or may not be physically separate.
  • The parts displayed as units may be one physical unit or multiple physical units, that is, they may be located in one place or distributed to multiple different places. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
  • the functional units in the various embodiments of the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
  • the above-mentioned integrated unit can be implemented in the form of hardware or software functional unit.
  • the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it can be stored in a readable storage medium.
  • the technical solutions of the embodiments of the present application are essentially or the part that contributes to the prior art, or all or part of the technical solutions can be embodied in the form of a software product, and the software product is stored in a storage medium. It includes several instructions to make a device (which may be a single-chip microcomputer, a chip, etc.) or a processor execute all or part of the steps of the method described in each embodiment of the present application.
  • the aforementioned storage media include: U disk, mobile hard disk, ROM, RAM, magnetic disk or optical disk and other media that can store program codes.


Abstract

The embodiments of this application provide a data reading method, a data writing method, and a server, relating to the storage field. After a single DC fails, data access by a client on this DC is not affected, and the client's requests can still be responded to. The method includes: a resource management server receives a read data request from a client, where the read data request is used to request multiple files; the resource management server reads a copy of target data from a first data center, where the target data includes data of different files among the multiple files; the first data center is the data center with the highest data locality among the multiple data centers storing copies of the target data, and the data locality indicates how close the copy of the target data stored in a data center is to the target data; and the resource management server sends the copy of the target data read from the first data center to the client.

Description

Data reading method, data writing method, and server. Technical field
The embodiments of this application relate to the storage field, and in particular to a data reading method, a data writing method, and a server.
Background
Today's enterprise big data platforms carry ever larger volumes of data and numbers of services, and enterprise Hadoop clusters keep growing in scale, so enterprises increasingly tend to build Hadoop clusters in single-cluster mode. In addition, a Hadoop cluster can be deployed across multiple data centers (data center, DC) to carry large-scale data and services.
In an existing Hadoop cluster, all copies of a piece of data are stored in only one DC. For example, all copies of file 1 are stored only in DC 1, and all copies of file 2 are stored only in DC 2. In addition, to ensure data locality and avoid accessing copies across DCs, the yet another resource negotiator (YARN) application allocated for a client's read data request runs in only one DC.
If a DC in the Hadoop cluster fails, the YARN application running on that DC is terminated, and the client's requests are not responded to. In addition, the data on that DC cannot be read or written, and after the DC fully recovers, most of the data is lost, so reliability is hard to guarantee.
Summary
The embodiments of this application provide a data reading method, a data writing method, and a server. After a single DC fails, data access by a client on this DC is not affected, and the client's requests can still be responded to.
To achieve the above objective, the embodiments of this application adopt the following technical solutions:
According to a first aspect, a data reading method is provided, including: a resource management server receives a read data request from a client, where the read data request is used to request multiple files; the resource management server may further read a copy of target data from a first data center, where the target data includes data of different files among the multiple files. The first data center is the data center with the highest data locality among the multiple data centers storing copies of the target data, where the data locality indicates how close the copy of the target data stored in a data center is to the target data. Finally, the resource management server sends the copy of the target data read from the first data center to the client.
In the embodiments of this application, copies are not stored in only one DC but across DCs; copies of a piece of data can be stored in multiple DCs. In addition, based on the actual distribution of the copies, the client's read data request is always executed on the DC with the highest data locality, which prevents the client from occupying too much inter-DC bandwidth by reading and writing data across DCs. After a single DC fails, the client can still access copies on other DCs that store the target data, so the client's request does not go unanswered.
With reference to the first aspect, in a first possible implementation of the first aspect, the read data request carries directory information of the target data. The resource management server may determine, based on the directory information of the multiple files, the multiple data centers that store copies of the target data, then compute how close the copies of the target data stored in each of the multiple data centers are to the target data, and determine the data center whose stored copy of the target data is closest to the target data as the first data center.
In the embodiments of this application, copies are placed across DCs. The actual distribution of the copies of the target data can be determined from the directory information of the target data, and the data locality of each DC with respect to the target data can then be computed, so that copies of the target data are accessed in the DC with the highest data locality. Copies closest to the target data are accessed within a single DC as far as possible, avoiding excessive use of inter-DC communication bandwidth for cross-DC copy access, which improves the performance of the whole system.
With reference to the first aspect or the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the method further includes: when the first data center fails, the resource management server reads the copy of the target data from a second data center, where the second data center is the data center with the highest data locality among the multiple data centers other than the first data center; and the resource management server sends the copy of the target data read from the second data center to the client.
In the embodiments of this application, after a single DC fails, the data center with the highest data locality among the remaining DCs that store the target copies can still be determined based on the actual distribution of the copies, so that copies closest to the target data are accessed in that data center as far as possible, avoiding excessive use of inter-DC communication bandwidth for cross-DC copy access and improving the performance of the whole system.
With reference to the first aspect or the first or second possible implementation of the first aspect, in a third possible implementation of the first aspect, the copy of the target data is a copy stored in the data center where the client is located.
In the embodiments of this application, when copies are written across DCs, a copy is first written in the DC where the client is located, which reduces the traffic of HDFS write operations.
According to a second aspect, a data writing method is provided, including: a name management server receives a write data request from a client, where the write data request carries target data; the name management server may further write copies of the target data into multiple data centers according to the target data.
In the embodiments of this application, copies are not stored in only one DC but across DCs; copies of a piece of data can be stored in multiple DCs. On this basis, combined with the actual distribution of the copies, the client's read data requests are always executed on the DC with the highest data locality, which prevents the client from occupying too much inter-DC bandwidth by reading and writing data across DCs. After a single DC fails, the client can still access copies on other DCs that store the target data, so the client's request does not go unanswered.
With reference to the second aspect, in a first possible implementation of the second aspect, the name management server writing copies of the target data into multiple data centers according to the target data includes: the name management server writes the first copy of the target data in the data center where the client is located.
In the embodiments of this application, when copies are written across DCs, a copy is first written in the DC where the client is located, which reduces the traffic of HDFS write operations.
With reference to the second aspect or the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the method further includes: when the data centers in which the copies of the target data are actually distributed are inconsistent with the multiple data centers indicated by a copy placement policy, the name management server adjusts the copies of the target data to the multiple data centers indicated by the copy placement policy.
In the embodiments of this application, the actual distribution of copies is monitored to ensure that the actual distribution of copies stays consistent with the copy placement policy.
In a specific implementation, if a second data center in which a target copy is actually distributed does not belong to the multiple data centers indicated by the copy placement policy, the target copy in the second data center is deleted; if a third data center among the multiple data centers indicated by the copy placement policy does not contain a copy of the target data, the target copy is written into the third data center.
The embodiments of this application further provide a specific method for monitoring copy distribution, which ensures that the actual distribution of copies stays consistent with the copy placement policy.
According to a third aspect, a resource management server is provided, including: a transceiver unit, configured to receive a read data request from a client, where the read data request is used to request multiple files; and a processing unit, configured to read a copy of target data from a first data center, where the target data includes data of different files among the multiple files; the first data center is the data center with the highest data locality among the multiple data centers storing copies of the target data, and the data locality indicates how close the copy of the target data stored in a data center is to the target data. The transceiver unit is further configured to send the copy of the target data read from the first data center to the client.
With reference to the third aspect, in a first possible implementation of the third aspect, the processing unit is further configured to: determine, based on directory information of the multiple files, the multiple data centers that store copies of the target data, where the read data request carries the directory information of the target data; and compute how close the copies of the target data stored in each of the multiple data centers are to the target data, and determine the data center whose stored copy of the target data is closest to the target data as the first data center.
With reference to the third aspect or the first possible implementation of the third aspect, in a second possible implementation of the third aspect, the processing unit is further configured to: when the first data center fails, read the copy of the target data from a second data center, where the second data center is the data center with the highest data locality among the multiple data centers other than the first data center; and the transceiver unit is further configured to send the copy of the target data read from the second data center to the client.
With reference to the third aspect or the first or second possible implementation of the third aspect, in a third possible implementation of the third aspect, the copy of the target data is a copy stored in the data center where the client is located.
According to a fourth aspect, a name management server is provided, including: a transceiver unit, configured to receive a write data request from a client, where the write data request carries target data; and a processing unit, configured to write copies of the target data into multiple data centers according to the target data.
With reference to the fourth aspect, in a first possible implementation of the fourth aspect, the processing unit is further configured to write the first copy of the target data in the data center where the client is located.
With reference to the fourth aspect or the first possible implementation of the fourth aspect, in a second possible implementation of the fourth aspect, the processing unit is further configured to: when the data centers in which the copies of the target data are actually distributed are inconsistent with the multiple data centers indicated by a copy placement policy, adjust the copies of the target data to the multiple data centers indicated by the copy placement policy.
According to a fifth aspect, this application provides a resource management server, which includes a processor and a memory. The memory stores computer instructions; when the processor executes the computer instructions stored in the memory, the resource management server performs the method provided by the first aspect or the various possible implementations of the first aspect.
According to a sixth aspect, this application provides a name management server, which includes a processor and a memory. The memory stores computer instructions; when the processor executes the computer instructions stored in the memory, the name management server performs the method provided by the second aspect or the various possible implementations of the second aspect.
According to a seventh aspect, this application provides a computer-readable storage medium that stores computer instructions, where the computer instructions instruct a resource management server to perform the method provided by the first aspect or the various possible implementations of the first aspect.
According to an eighth aspect, this application provides a computer-readable storage medium that stores computer instructions, where the computer instructions instruct a name management server to perform the method provided by the second aspect or the various possible implementations of the second aspect.
According to a ninth aspect, this application provides a computer program product that includes computer instructions stored in a computer-readable storage medium. A processor of a resource management server can read the computer instructions from the computer-readable storage medium; when the processor executes the computer instructions, the resource management server performs the method provided by the first aspect or the various possible implementations of the first aspect.
According to a tenth aspect, this application provides a computer program product that includes computer instructions stored in a computer-readable storage medium. A processor of a name management server can read the computer instructions from the computer-readable storage medium; when the processor executes the computer instructions, the name management server performs the method provided by the second aspect or the various possible implementations of the second aspect.
Brief description of drawings
FIG. 1 is an architecture diagram of HDFS in the prior art;
FIG. 2 is an architecture diagram of a data storage system according to an embodiment of this application;
FIG. 3 is a structural block diagram of a server according to an embodiment of this application;
FIG. 4 is a schematic flowchart of a data writing method according to an embodiment of this application;
FIG. 5 is a schematic diagram of copy writing according to an embodiment of this application;
FIG. 6 is another schematic diagram of copy writing according to an embodiment of this application;
FIG. 7 is a schematic diagram of copy monitoring according to an embodiment of this application;
FIG. 8 is a schematic flowchart of a data reading method according to an embodiment of this application;
FIG. 9 is a schematic diagram of copy distribution according to an embodiment of this application;
FIG. 10 is another schematic diagram of copy distribution according to an embodiment of this application;
FIG. 11 is a schematic diagram of application allocation according to an embodiment of this application;
FIG. 12 is another schematic diagram of application allocation according to an embodiment of this application;
FIG. 13 is another structural block diagram of a server according to an embodiment of this application;
FIG. 14 is another structural block diagram of a server according to an embodiment of this application.
Detailed description of embodiments
The technical solutions of this application are described below with reference to the accompanying drawings.
First, the terms used in the embodiments of this application are explained:
(1) Hadoop distributed file system (HDFS)
HDFS is the file system of Hadoop, used to store massive amounts of data, and is characterized by high fault tolerance and high throughput. Referring to FIG. 1, HDFS includes a YARN Resource Manager, an HDFS client, a NameNode, and numerous DataNodes (DN). In HDFS, a rack includes multiple data nodes, and multiple racks constitute a data center (data center, DC). When the HDFS client writes a file, it can split the file into multiple data blocks; multiple copies can be stored for each data block, and different copies are stored on different data nodes.
The YARN Resource Manager manages YARN and can provide unified management of the data in Hadoop. For example, the YARN Resource Manager receives read and write requests from clients and allocates MapReduce tasks for these requests, for example, having a YARN application read data on data node 1 and data node 2. YARN is Hadoop's resource scheduling framework and can allocate computing resources for running compute engines.
The client is used by users to create, access, and modify files.
The NameNode is responsible for managing the namespace of files and stores file metadata. The metadata may include the file name, the file's directory, the file's block list, and the data node information corresponding to each block. It should be noted that the file's directory is the path used to access the file; the file's block list indicates the data blocks the file comprises; and the data node information corresponding to a block indicates which data node the data block is stored on.
Data nodes are used to store data and receive the client's read data requests (which may also be called data read/write requests).
In addition, a cross-DC cluster includes multiple DCs and can store massive amounts of data. For example, the Hadoop shown in FIG. 1 is a cross-DC-cluster Hadoop, that is, Hadoop's data nodes are distributed across three or more data centers. A DC is a physical space in which data and information are processed, stored, transmitted, exchanged, and managed in a centralized manner; computers, servers, network devices, communication devices, and storage devices are the key equipment of a data center.
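As an illustrative sketch of the HDFS client interface described above (not part of the claimed embodiments), the following Java fragment uses the public org.apache.hadoop.fs API to write and then read back a file; the path name and payload are arbitrary examples.

    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class HdfsClientExample {
        public static void main(String[] args) throws Exception {
            Configuration conf = new Configuration();      // picks up core-site.xml / hdfs-site.xml
            FileSystem fs = FileSystem.get(conf);          // HDFS client handle

            Path file = new Path("/folder1/targetFile1");  // example path only

            // Write a file; HDFS splits it into data blocks and replicates each block
            // according to the configured replication factor and placement policy.
            try (FSDataOutputStream out = fs.create(file, true)) {
                out.writeUTF("example payload");
            }

            // Read the file back through the same client interface.
            try (FSDataInputStream in = fs.open(file)) {
                System.out.println(in.readUTF());
            }
            fs.close();
        }
    }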
(2) MapReduce
MapReduce is Hadoop's computing framework (compute engine), which allocates applications for the read data requests submitted by clients. Through MapReduce, applications can run on YARN to execute the HDFS client's read and write requests on HDFS.
MapReduce is thus a compute engine: after the client submits a read data request and a MapReduce task is allocated for the client's read data request, YARN can allocate computing resources for the MapReduce task.
(3) Block placement policy (BPP)
The rule followed when placing copies as HDFS stores data is called the BPP. In the prior art, all copies of the same data block are placed in the same DC, so an existing BPP may be a rule for placing copies within one DC, for example, a rule for placing copies on different racks of the same DC.
In the embodiments of this application, copies of the same data block can be placed across DCs, and the BPP may indicate the rule for placing copies in different data centers (data center, DC).
For example, different copies of the same data block can be placed on different racks in the DC cluster, which prevents all copies from being lost when a rack fails and improves system reliability.
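As a minimal sketch (assuming a directory-level policy of the kind described in these embodiments; the class and method names are hypothetical and not part of HDFS), a cross-DC BPP can be modeled as a mapping from an HDFS directory to the set of DCs that must hold a copy of every block under that directory:

    import java.util.*;

    // Sketch of a directory-level block placement policy (BPP) that names target DCs.
    public class CrossDcPlacementPolicy {
        // e.g. "/warehouse/tableA" -> {"DC1", "DC2", "DC3"}
        private final Map<String, Set<String>> policyByDirectory = new HashMap<>();

        public void setPolicy(String directory, Set<String> targetDcs) {
            policyByDirectory.put(directory, new HashSet<>(targetDcs));
        }

        // Longest matching directory prefix wins; files inherit their directory's policy.
        public Set<String> policyFor(String filePath) {
            String best = "";
            for (String dir : policyByDirectory.keySet()) {
                if (filePath.startsWith(dir + "/") && dir.length() > best.length()) {
                    best = dir;
                }
            }
            return policyByDirectory.getOrDefault(best, Collections.emptySet());
        }
    }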
(4) Data locality (DL)
To minimize the bandwidth consumption and latency of the client reading data in Hadoop, the copy closest to the node where the client is located is selected whenever possible. Data locality can represent how "close" HDFS data is to the MapReduce task that processes it; the higher the data locality, the closer the data node holding the copy is to the client. Reading copies on a node with higher data locality reduces the bandwidth consumption and latency of reading data.
For example, the file requested by the client includes data block 1, data block 2, and data block 3. YARN application 1 is used to access DC 1, and DC 1 contains a copy of data block 1; YARN application 2 is used to access DC 2, and DC 2 contains copies of data block 1, data block 2, and data block 3; YARN application 3 is used to access DC 3, and DC 3 contains copies of data block 2 and data block 3. It can be seen that the whole file can be accessed through YARN application 2 without consuming cross-DC traffic to access copies, so DC 2 has the highest data locality.
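The notion of data locality in this example can be made concrete as a per-DC score: the number of requested data blocks for which a DC holds at least one copy. The sketch below is illustrative only and assumes a precomputed map from block to the DCs holding its copies; all names are hypothetical.

    import java.util.*;

    public class DataLocality {
        // blockToDcs: block id -> DCs holding a copy; requested: blocks the client needs.
        // Returns the DC that can serve the most requested blocks locally.
        public static String highestLocalityDc(Map<String, Set<String>> blockToDcs,
                                               Collection<String> requested) {
            Map<String, Integer> score = new HashMap<>();
            for (String block : requested) {
                for (String dc : blockToDcs.getOrDefault(block, Collections.emptySet())) {
                    score.merge(dc, 1, Integer::sum);
                }
            }
            return score.entrySet().stream()
                    .max(Map.Entry.comparingByValue())
                    .map(Map.Entry::getKey)
                    .orElse(null);
        }

        public static void main(String[] args) {
            Map<String, Set<String>> blockToDcs = new HashMap<>();
            blockToDcs.put("block1", new HashSet<>(Arrays.asList("DC1", "DC2")));
            blockToDcs.put("block2", new HashSet<>(Arrays.asList("DC2", "DC3")));
            blockToDcs.put("block3", new HashSet<>(Arrays.asList("DC2", "DC3")));
            // DC2 holds copies of all three requested blocks, so it scores highest.
            System.out.println(highestLocalityDc(blockToDcs,
                    Arrays.asList("block1", "block2", "block3")));
        }
    }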
(5) Recovery point objective (RPO)
After a disaster occurs, suppose the system (for example, HDFS) is restored to the data as of a first moment before the disaster. The RPO is the length of time between that first moment and the moment the disaster occurred.
(6) Recovery time objective (RTO)
The RTO is the length of time, after a disaster, from the moment the system (for example, HDFS) goes down and the service stops to the moment the system is restored and the service resumes operation.
FIG. 1 is a system architecture diagram of existing Hadoop. As shown in FIG. 1, in the existing HDFS all copies of a piece of data are stored in only one DC. For example, all copies of file 1 (three copies: DN 1-1, DN 1-2, DN 1-3) are stored in DC 1, and all copies of file 2 (three copies: DN 2-1, DN 2-2, DN 2-3) are stored in DC 2.
In addition, the YARN Resource Manager receives the client's read data request, and the YARN application allocated for the read data request runs in only one DC, thereby ensuring data locality and avoiding cross-DC copy access.
However, after a DC in the existing HDFS system fails, the application running on that DC is terminated, that is, RTO > 0. Furthermore, the data on that DC cannot be read or written, and after the DC fully recovers, most of the data is lost, that is, RPO > 0. Clearly, the reliability of HDFS in the prior art is hard to guarantee.
FIG. 2 is an architecture diagram of a data storage system according to an embodiment of this application, including an HDFS client, a resource management server (that is, a YARN Resource Manager), HDFS NameNodes, HDFS DataNodes, and at least three DCs; FIG. 2 takes only the three DCs DC 1, DC 2, and DC 3 as an example. It should be noted that the HDFS client resides in one of the DCs of the data storage system, for example, DC 2 shown in FIG. 2. In addition, the HDFS NameNodes are deployed across DCs. For example, in FIG. 2 the HDFS NameNodes of DC 1 are HDFS NameNode 1 and HDFS NameNode 2, which are deployed in DC 1 and DC 2 respectively and can manage the NameSpace of DC 1; HDFS NameNode 1 is the active NameNode, and correspondingly HDFS NameNode 2 is the standby NameNode. The resource management server is also deployed across DCs; for example, in FIG. 2 the YARN Resource Managers of DC 1 are YARN Resource Manager 1 and YARN Resource Manager 2, which are deployed in DC 1 and DC 2 respectively and can perform resource management for DC 1.
Although not shown in FIG. 2, it can be understood that the HDFS NameNodes of DC 2 and DC 3 can also be deployed across DCs. For example, the HDFS NameNodes of DC 2 are HDFS NameNode 3 and HDFS NameNode 4, deployed in DC 2 and DC 3 respectively, which can manage the NameSpace of DC 2; the HDFS NameNodes of DC 3 are HDFS NameNode 5 and HDFS NameNode 6, deployed in DC 3 and DC 1 respectively, which can manage the NameSpace of DC 3.
Similarly, the YARN Resource Managers of DC 2 and DC 3 can also be deployed across DCs. In addition, ZooKeeper (an open-source distributed application coordination service) and JournalNodes are deployed across the three DCs, so when a single DC fails, the ZooKeeper and JournalNodes of the other DCs continue to work normally and data access is not affected.
A DC may include multiple data nodes. Referring to FIG. 2, DC 1 includes data nodes such as DN 1-1, DN 1-2, and DN 1-3; DC 2 includes data nodes such as DN 2-1, DN 2-2, and DN 2-3; and DC 3 includes data nodes such as DN 3-1, DN 3-2, and DN 3-3. In addition, copies of the same data block are stored across DCs; for example, in FIG. 2 the copies of file 1 are stored in DC 1 and DC 2, and the copies of file 2 are stored in DC 1 and DC 2. When a DC fails, copies of the data block can still be accessed on other DCs, ensuring that the client can read, write, and access data normally.
The resource management server may be the YARN Resource Manager described in the embodiments of this application. It is configured to receive the client's read data request, compute the DC with the highest data locality according to the client's read data request and the actual distribution of the copies of the files requested by the client, and allocate an application to execute the client's read data request on that DC. When that DC fails, the resource management server recomputes another DC with the highest data locality and migrates the application to the newly determined DC to execute the client's read data request.
It should be noted that when copies are stored across DCs, the client may need to occupy inter-DC bandwidth to read or write data on data nodes in different DCs. However, the communication bandwidth between different DCs is often limited; if the client occupies the inter-DC communication bandwidth to read or write data across DCs, normal communication between DCs is greatly affected, which affects the performance of the whole data storage system. For example, as shown in FIG. 2, the client is in DC 2; assuming the copy the client requests is stored on data node DN 1-3, the client has to occupy the communication bandwidth between DC 2 and DC 1 to read that copy from DN 1-3 in DC 1.
In the embodiments of this application, copies are not stored in only one DC but across DCs; copies of a piece of data can be stored in multiple DCs. In addition, based on the actual distribution of the copies, the client's read data requests are always executed on the DC with the highest data locality, which prevents the client from occupying too much inter-DC bandwidth by reading and writing data across DCs. After a single DC fails, the client can still access copies on other DCs that store the target data, so the client's requests do not go unanswered.
FIG. 3 is a schematic diagram of the hardware structure of a server 30 according to an embodiment of this application. The server 30 may be the resource management server or the name management server described in the embodiments of this application. Referring to FIG. 3, the server 30 includes a processor 301, a memory 302, and at least one network interface (FIG. 3 uses only the network interface 303 as an example). The processor 301, the memory 302, and the network interface 303 are connected to each other.
The processor 301 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling the execution of the programs of the solutions of this application.
The network interface 303 is an interface of the server 30 for communicating with other devices or communication networks, such as Ethernet, a radio access network (RAN), or a wireless local area network (WLAN).
The memory 302 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions, a random access memory (RAM) or another type of dynamic storage device that can store information and instructions, an electrically erasable programmable read-only memory (EEPROM), a compact disc read-only memory (CD-ROM) or other optical disc storage, optical disc storage (including compressed optical discs, laser discs, optical discs, digital versatile discs, Blu-ray discs, and the like), a magnetic disk storage medium or another magnetic storage device, or any other medium that can be used to carry or store desired program code in the form of instructions or data structures and can be accessed by a computer, but is not limited thereto. The memory may exist independently and be connected to the processor through a communication line, or the memory may be integrated with the processor.
The memory 302 is configured to store the computer-executable instructions for executing the solutions of this application, and the execution is controlled by the processor 301. The processor 301 is configured to execute the computer-executable instructions stored in the memory 302, thereby implementing the methods provided by the following embodiments of this application.
Optionally, the computer-executable instructions in the embodiments of this application may also be called application program code, which is not specifically limited in the embodiments of this application.
In a specific implementation, as an embodiment, the processor 301 may include one or more CPUs, such as CPU0 and CPU1 in FIG. 3.
In a specific implementation, as an embodiment, the server 30 may include multiple processors, such as the processor 301 and the processor 306 in FIG. 3. Each of these processors may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. A processor here may refer to one or more devices, circuits, and/or processing cores for processing data (for example, computer program instructions).
In a specific implementation, as an embodiment, the server 30 may further include an output device 304 and an input device 305. The output device 304 communicates with the processor 301 and can display information in multiple ways. For example, the output device 304 may be a liquid crystal display (LCD), a light emitting diode (LED) display device, a cathode ray tube (CRT) display device, or a projector. The input device 305 communicates with the processor 301 and can receive user input in multiple ways. For example, the input device 305 may be a mouse, a keyboard, a touchscreen device, or a sensing device.
The server 30 may be a general-purpose device or a dedicated device. In a specific implementation, the server 30 may be a desktop computer, a network server, an embedded device, or another device with a structure similar to that in FIG. 3. The embodiments of this application do not limit the type of the server 30.
An embodiment of this application provides a data writing method. As shown in FIG. 4, the method includes the following step 401 and step 402.
Step 401: The name management server receives a write data request from a client; the write data request carries target data.
It should be noted that the name management server described in the embodiments of this application may be the HDFS NameNode shown in FIG. 2.
Step 402: The name management server writes target copies into multiple data centers according to the target data; the target copies are copies of the target data.
It should be noted that, to reduce the traffic of HDFS write operations, a copy is first written in the data center where the client is located. For example, the first target copy is written in a first data center, where the first data center is the data center where the client is located.
In a specific implementation, a block placement policy (BPP) for storing copies across DCs is set in advance. The name management server can write copies across DCs according to the BPP, and first writes a copy in the DC where the HDFS client is located, that is, the first copy is stored in the DC where the HDFS client is located. The remaining copies are then written in turn to data nodes in different DCs. It can be understood that writing a copy first in the DC where the HDFS client is located reduces the traffic occupied by the client's data writing, thereby improving the performance of HDFS write operations.
For example, referring to FIG. 5, the BPP indicates that copies of the same data block are to be written to DC 1, DC 2, and DC 3. When the NameNode writes copies according to the BPP, it first writes a copy in DC 2, where the HDFS client is located, for example, on DN 2-2 in DC 2. The copy is then replicated from DN 2-2 and the replicated copy is written to DN 1-3 in DC 1. Finally, the copy is replicated from DN 1-3 and the replicated copy is written to DN 3-1 in DC 3.
It should be noted that when the NameNode writes copies according to the BPP and finds that a DC that should receive a copy has failed, it can ignore that DC when writing copies and write copies only to the other valid DCs indicated by the BPP. For example, referring to FIG. 6, according to the preset BPP a copy should be written to DC 3, but DC 3 has failed and is unavailable, so the NameNode ignores DC 3 when writing copies and writes copies only to DC 1 and DC 2 as indicated by the BPP.
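The write order just described (first copy in the DC where the client is located, then the remaining copies to the other DCs named by the BPP, skipping any DC that is currently down) can be sketched as follows; this is an illustrative model with assumed names, not the NameNode's actual implementation.

    import java.util.*;

    public class CrossDcWriteOrder {
        // Returns the DCs to write to, in order: the client's DC first, then the
        // other DCs indicated by the BPP, omitting DCs known to have failed.
        public static List<String> writeOrder(String clientDc,
                                              List<String> bppDcs,
                                              Set<String> failedDcs) {
            List<String> order = new ArrayList<>();
            if (bppDcs.contains(clientDc) && !failedDcs.contains(clientDc)) {
                order.add(clientDc);               // first copy stays local to the client
            }
            for (String dc : bppDcs) {
                if (!failedDcs.contains(dc) && !order.contains(dc)) {
                    order.add(dc);                 // remaining copies go to the other valid DCs
                }
            }
            return order;
        }

        public static void main(String[] args) {
            // BPP names DC1, DC2, DC3; the client is in DC2; DC3 is down.
            System.out.println(writeOrder("DC2",
                    Arrays.asList("DC1", "DC2", "DC3"),
                    new HashSet<>(Collections.singletonList("DC3"))));  // prints [DC2, DC1]
        }
    }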
In addition, the copy placement policy is a directory-level policy; for example, one copy placement policy is set for an HDFS directory, and all files under that directory follow this policy when copies are written. After the NameNode writes a file according to the copy placement policy, the actual distribution of the copies of each data block of the file is recorded in the file's metadata, which may include the file's block list and the data node information corresponding to each block.
Optionally, the method shown in FIG. 4 further includes: determining whether the data centers in which the target copies are actually distributed are consistent with the multiple data centers indicated by the copy placement policy;
and, if the data centers in which the target copies are actually distributed are inconsistent with the multiple data centers indicated by the copy placement policy, adjusting the copies of the target data to the multiple data centers indicated by the copy placement policy.
Specifically, adjusting the actual distribution of the target copies according to the copy placement policy includes:
if a second data center in which a target copy is actually distributed does not belong to the multiple data centers, deleting the target copy in the second data center; in other words, copies cannot be placed in data centers not indicated by the copy placement policy, and the actual distribution of the copies of the target data must stay consistent with the data centers indicated by the copy placement policy;
and, if a third data center among the multiple data centers does not contain a copy of the target data, writing the target copy into the third data center. In other words, if a data center indicated by the copy placement policy does not store a copy, a copy needs to be replicated from another data center and stored in this data center, so that the actual distribution of the copies of the target data stays consistent with the data centers indicated by the copy placement policy.
In the embodiments of this application, the NameNode can also check whether the DCs in which copies are actually distributed are consistent with the DCs indicated by the BPP. If the actual distribution of copies is inconsistent with the BPP, the copies in each DC are adjusted, including replicating copies or deleting copies, to ensure that the DCs in which copies are actually distributed stay consistent with the BPP.
For example, referring to FIG. 7, the BPP indicates that copies of file 1 are stored in DC 1, DC 2, and DC 3, that is, a copy of every data block of file 1 needs to be stored in DC 1, DC 2, and DC 3. After DC 3 recovers from a failure, the NameNode detects that DC 3 is missing copies and can replicate copies from other DCs to DC 3, for example, replicate the copy of a data block stored in DC 1 or DC 2 to DC 3.
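The reconciliation step described above, deleting copies from DCs that the policy does not name and re-replicating into policy DCs that are missing a copy, might look like the following sketch; deleteCopy and copyFromAnyHealthyDc are hypothetical callbacks standing in for the NameNode's replication machinery.

    import java.util.*;
    import java.util.function.BiConsumer;

    public class CopyReconciler {
        // actualDcs: DCs that currently hold a copy of the block.
        // policyDcs: DCs the copy placement policy (BPP) says must hold a copy.
        public static void reconcile(String blockId,
                                     Set<String> actualDcs,
                                     Set<String> policyDcs,
                                     BiConsumer<String, String> deleteCopy,           // (blockId, dc)
                                     BiConsumer<String, String> copyFromAnyHealthyDc  // (blockId, dc)
        ) {
            for (String dc : actualDcs) {
                if (!policyDcs.contains(dc)) {
                    deleteCopy.accept(blockId, dc);            // extra copy not allowed by the policy
                }
            }
            for (String dc : policyDcs) {
                if (!actualDcs.contains(dc)) {
                    copyFromAnyHealthyDc.accept(blockId, dc);  // missing copy: re-replicate into this DC
                }
            }
        }
    }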
An embodiment of this application provides a data access method, applicable to the data storage system shown in FIG. 2. As shown in FIG. 8, the method includes the following steps:
Step 801: The resource management server receives a read data request from a client, where the read data request is used to request multiple files.
It should be noted that the resource management server performs resource management for the data storage system, including allocating applications for clients' read data requests. For example, the resource management server may be the YARN Resource Manager described in the embodiments of this application.
In a specific implementation, in response to a user's operation on the client, an application running on the client requests access to multiple files; the data of different files among the multiple files requested by the client may be called target data. For example, the client requests access to M files, and the target data may be the data of N different files among them, where N is a positive integer less than or equal to M. It should also be noted that when the client writes a file, copies of the file are written across DCs; therefore, the copies of the multiple files requested by the client may be stored in multiple different DCs.
In a possible implementation, the multiple files requested by the client come from different directories. For example, the multiple files requested by the client are target file 1 and target file 2, where the directory of target file 1 is "Desktop/Computer/Drive E/Folder 1/Target file 1" and the directory of target file 2 is "Computer/Drive D/Folder 2/Target file 2". It can be understood that the target data may be the data of target file 1 and target file 2.
In a specific implementation, the client may send the read data request to the resource management server, and the resource management server allocates a corresponding YARN application to execute the read data request so that the client can read the data of target file 1 and target file 2.
In addition, the read data request may include directory information of the target data, where the directory information of the target data indicates the access paths of the multiple files the client requests to access. For example, an application triggered by the user on the client requests access to the data of target file 1 and target file 2, and the client submits a read data request to the resource management server carrying the directory information of target file 1 and target file 2, for example: "Desktop/Computer/Drive E/Folder 1/Target file 1" and "Computer/Drive D/Folder 2/Target file 2".
Step 802: The resource management server determines the multiple data centers that store copies of the target data.
It should be noted that the copies of the target data are the copies of the multiple files the client requests to access. For example, the client requests access to the data of target file 1 and target file 2, and the copies of the target data may be the copies of target file 1 and the copies of target file 2.
In addition, in the embodiments of this application the copies of a file are placed across DCs, so the copies of a file are stored in at least two different data centers. Furthermore, the copy placement policy corresponding to a file determines the DCs in which the copies of the file are actually distributed. If the multiple files requested by the client have the same copy placement policy, the DCs in which their copies are actually distributed may be the same; if the multiple files requested by the client have different copy placement policies, the DCs in which their copies are actually distributed may be different.
In the embodiments of this application, the copy placement policy is directory-level, and files under the same directory have the same copy placement policy. In other words, when the multiple files requested by the client do not belong to the same directory, that is, the multiple files come from different directories, the copy placement policies corresponding to the multiple files are different. For example, the client requests access to target file 1 and target file 2, which come from different directories. The copy placement policy corresponding to target file 1 is: copies are placed in DC 1, DC 2, and DC 3; that is, the copies of all data blocks of target file 1 must be placed in DC 1, DC 2, and DC 3. The copy placement policy corresponding to target file 2 is: copies are placed in DC 2 and DC 3; that is, the copies of all data blocks of target file 2 must be placed in DC 2 and DC 3.
When the multiple files requested by the client belong to the same directory, the copy placement policies corresponding to the multiple target files are the same. For example, the client requests access to target file 1 and target file 2, where the copy placement policy corresponding to target file 1 is: copies are placed in DC 1, DC 2, and DC 3; that is, the copies of all data blocks of target file 1 must be placed in DC 1, DC 2, and DC 3. The copy placement policy corresponding to target file 2 is also: copies are placed in DC 1, DC 2, and DC 3; that is, the copies of all data blocks of target file 2 must be placed in DC 1, DC 2, and DC 3.
In a specific implementation, the data centers corresponding to the multiple target files the client requests to access, that is, the data centers corresponding to the target data requested by the client, can be determined from the directory information in the read data request; these data centers store copies of the target data.
In a possible implementation, a file's metadata records the DCs in which the file's copies are actually distributed, so querying the file's metadata can determine which data centers store copies of the file. For example, the multiple files requested by the client can be determined from the directory information carried in the read data request, the metadata of the multiple files can then be queried, and it can be determined which data centers store copies of the data of the multiple files (that is, the target data), namely, the DCs in which the copies of the target data are actually distributed.
In a possible implementation, the file's block list and the data node information corresponding to each block are queried to determine the data blocks included in the file requested by the client and the data node where each data block is located. The data center to which the data node holding a data block belongs is a data center that stores a copy of the file's data.
For example, the resource management server queries the metadata of target file 1 and can determine the multiple data centers that store copies of target file 1. For example, the block list of target file 1 includes data block 1a, data block 1b, and data block 1c. The data node information corresponding to data block 1a is "DN 1-1, DN 2-1", that is, data block 1a has two copies, stored on data node DN 1-1 of DC 1 and data node DN 2-1 of DC 2 respectively. The data node information corresponding to data block 1b is "DN 1-2, DN 2-2", that is, data block 1b has two copies, stored on data node DN 1-2 of DC 1 and data node DN 2-2 of DC 2 respectively. The data node information corresponding to data block 1c is "DN 1-3, DN 2-3", that is, data block 1c has two copies, stored on data node DN 1-3 of DC 1 and data node DN 2-3 of DC 2 respectively.
The block list of target file 2 includes data block 2a and data block 2b. The data node information corresponding to data block 2a is "DN 2-4, DN 3-1", that is, data block 2a has two copies, stored on data node DN 2-4 of DC 2 and data node DN 3-1 of DC 3 respectively. The data node information corresponding to data block 2b is "DN 2-5, DN 3-2", that is, data block 2b has two copies, stored on data node DN 2-5 of DC 2 and data node DN 3-2 of DC 3 respectively.
FIG. 9 shows the actual distribution of the copies of target file 1 and target file 2. DC 1 stores copies of data block 1a, data block 1b, and data block 1c, and DC 2 stores copies of data block 1a, data block 1b, and data block 1c. In other words, the data centers storing copies of target file 1 are DC 1 and DC 2, that is, the data centers in which the copies of target file 1 are actually distributed are DC 1 and DC 2.
DC 2 stores copies of data block 2a and data block 2b, and DC 3 stores copies of data block 2a and data block 2b. In other words, the data centers storing copies of target file 2 are DC 2 and DC 3, that is, the data centers in which the copies of target file 2 are actually distributed are DC 2 and DC 3.
In summary, the data centers corresponding to the multiple files the client requests to access are DC 1, DC 2, and DC 3, that is, the multiple data centers storing copies of the target data are DC 1, DC 2, and DC 3.
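One way to obtain such a block-to-DC mapping with the public HDFS API is to read each file's block locations and take the data-center component of each replica's network topology path. The sketch below assumes the cluster's topology encodes the DC as the first path component (for example, "/DC1/rack1/host"); that layout is an assumption of this illustration, not something HDFS enforces.

    import java.util.*;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.BlockLocation;
    import org.apache.hadoop.fs.FileStatus;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;

    public class ReplicaDcLookup {
        // Returns, for each block index of the file, the set of DCs holding a copy,
        // assuming topology paths of the form "/<dc>/<rack>/<host>".
        public static Map<Integer, Set<String>> dcsPerBlock(FileSystem fs, Path file) throws Exception {
            FileStatus status = fs.getFileStatus(file);
            BlockLocation[] blocks = fs.getFileBlockLocations(status, 0, status.getLen());
            Map<Integer, Set<String>> result = new HashMap<>();
            for (int i = 0; i < blocks.length; i++) {
                Set<String> dcs = new HashSet<>();
                for (String topoPath : blocks[i].getTopologyPaths()) {
                    String[] parts = topoPath.split("/");   // "", dc, rack, host
                    if (parts.length > 1) {
                        dcs.add(parts[1]);
                    }
                }
                result.put(i, dcs);
            }
            return result;
        }

        public static void main(String[] args) throws Exception {
            FileSystem fs = FileSystem.get(new Configuration());
            System.out.println(dcsPerBlock(fs, new Path("/folder1/targetFile1")));
        }
    }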
Step 803: The resource management server determines, among the multiple data centers, the first data center with the highest data locality.
Specifically, copies of the target data can be accessed in each of the multiple data centers, but not all of the target data can necessarily be accessed in a given data center. Suppose the copies accessible in one of the data centers are closest to the target data; that data center has the highest data locality and is the first data center described in the embodiments of this application.
In a possible implementation, the copy placement policies corresponding to the multiple files requested by the client are inconsistent, that is, the DCs in which the copies of the multiple files are actually distributed are different. In that case the data locality of different DCs differs, and the DC with the highest data locality among them is selected as the first data center described in the embodiments of this application.
For example, referring to FIG. 9, target file 1 includes data block 1a, data block 1b, and data block 1c; copies of data block 1a, data block 1b, and data block 1c can be accessed in DC 1, and copies of data block 1a, data block 1b, and data block 1c can be accessed in DC 2.
Target file 2 includes data block 2a and data block 2b; copies of data block 2a and data block 2b can be accessed in DC 2, and copies of data block 2a and data block 2b can be accessed in DC 3. In summary, the copies accessible in DC 2 are closest to the client's request, that is, all the data of target file 1 and target file 2 can be accessed in DC 2. Therefore, DC 2 is the data center with the highest data locality among DC 1, DC 2, and DC 3, and DC 2 is the first data center described in the embodiments of this application.
In a possible implementation, the multiple files requested by the client come from the same directory and the corresponding copy placement policies are consistent, that is, the DCs in which the copies of the target files are actually distributed are the same. In that case the data locality of the different DCs is the same, and any one of these DCs can be selected as the first data center described in the embodiments of this application.
For example, the target files the client requests to access are target file 1 and target file 2, where target file 1 includes data block 1a, data block 1b, and data block 1c, and the copy placement policy corresponding to target file 1 is: copies are placed in DC 1 and DC 2. Referring to FIG. 10, DC 1 stores copies of data block 1a, data block 1b, and data block 1c, and DC 2 stores copies of data block 1a, data block 1b, and data block 1c. In other words, the data centers storing copies of target file 1 are DC 1 and DC 2, that is, the data centers in which the copies of target file 1 are actually distributed are DC 1 and DC 2.
Target file 2 includes data block 2a and data block 2b, and the copy placement policy corresponding to target file 2 is the same as that of target file 1, that is, copies are placed in DC 1 and DC 2. Referring to FIG. 10, DC 1 stores copies of data block 2a and data block 2b, and DC 2 stores copies of data block 2a and data block 2b. In other words, the data centers storing copies of target file 2 are DC 1 and DC 2, that is, the data centers in which the copies of target file 2 are actually distributed are DC 1 and DC 2.
In summary, the data centers corresponding to the multiple files the client requests to access are DC 1 and DC 2. All the data of target file 1 and target file 2 can be accessed in both DC 1 and DC 2, the data locality of DC 1 and DC 2 is the same, and either DC 1 or DC 2 may be the first data center described in the embodiments of this application.
Step 804: The resource management server sends the copy of the target data read from the first data center to the client.
Specifically, the resource management server submits the read data request to run on the YARN resources of the DC with the highest locality (that is, the first data center described in the embodiments of this application); in other words, the resource management server allocates a YARN application to execute the client's read data request on that DC. For example, referring to FIG. 11, since DC 2 has the highest data locality, the resource management server allocates YARN application 2 to access, on DC 2, the copies of data block 1a, data block 1b, data block 1c, data block 2a, and data block 2b, and to return the accessed copies to the client.
It should be noted that the resource management server sends the copies of the target data read in the first data center to the client, for example, the copies of all the data blocks of target file 1 and target file 2.
Alternatively, if the resource management server cannot access all the copies of the target data in the first data center, it may return the copies of the target data accessed from the first data center to the client, and the client can also access the other copies of the target data on other DCs. For example, the multiple files requested by the client include data block 1, data block 2, and data block 3; the copies of data block 1 and data block 3 can be accessed on DC 1, which has the highest data locality, and the copy of data block 2 can be accessed on DC 3.
Step 805: When the first data center fails, the resource management server reads the copy of the target data in a second data center and sends the copy of the target data read from the second data center to the client.
The second data center is the data center with the highest data locality, other than the first data center, among the multiple data centers that store copies of the target files. When the first data center fails, the client's read data request is submitted to run on the YARN resources of the second data center, that is, the resource management server allocates a YARN application to execute the client's read data request on the second data center. This ensures that the client's read data request is responded to, and because the second data center has high data locality, most of the copies of the target data can be accessed locally in the second data center, minimizing cross-DC data access and saving bandwidth.
For example, referring to FIG. 12, DC 2 was the data center with the highest data locality; when DC 2 fails, the copies of data block 1a, data block 1b, and data block 1c can be accessed in DC 1, and the copies of data block 2a and data block 2b can be accessed in DC 3.
The client requests access to target file 1 and target file 2, which include data block 1a, data block 1b, data block 1c, data block 2a, and data block 2b. Compared with the copies accessible on DC 3, the copies accessible on DC 1 are closer to the data requested by the client, so the data locality of DC 1 is higher than that of DC 3, and DC 1 is the second data center described in the embodiments of this application.
Specifically, referring to FIG. 12, the resource management server may allocate YARN application 1 to access the copies of data block 1a, data block 1b, and data block 1c on DC 1 and return the accessed copies to the client. In addition, the client can also access the remaining copies across DCs, for example, occupy the communication bandwidth between DC 1 and DC 3 to access the copies of data block 2a and data block 2b in DC 3.
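Failover selection of this kind can reuse the same per-DC locality score, simply excluding the failed data centers before taking the maximum. A minimal sketch under the same assumptions as the earlier locality example (all names hypothetical):

    import java.util.*;

    public class FailoverDcSelector {
        // Picks the DC that can serve the most requested blocks locally,
        // excluding data centers that are currently failed.
        public static String selectDc(Map<String, Set<String>> blockToDcs,
                                      Collection<String> requestedBlocks,
                                      Set<String> failedDcs) {
            Map<String, Integer> score = new HashMap<>();
            for (String block : requestedBlocks) {
                for (String dc : blockToDcs.getOrDefault(block, Collections.emptySet())) {
                    if (!failedDcs.contains(dc)) {
                        score.merge(dc, 1, Integer::sum);
                    }
                }
            }
            return score.entrySet().stream()
                    .max(Map.Entry.comparingByValue())
                    .map(Map.Entry::getKey)
                    .orElse(null);
        }

        public static void main(String[] args) {
            Map<String, Set<String>> blockToDcs = new HashMap<>();
            blockToDcs.put("1a", new HashSet<>(Arrays.asList("DC1", "DC2")));
            blockToDcs.put("1b", new HashSet<>(Arrays.asList("DC1", "DC2")));
            blockToDcs.put("1c", new HashSet<>(Arrays.asList("DC1", "DC2")));
            blockToDcs.put("2a", new HashSet<>(Arrays.asList("DC2", "DC3")));
            blockToDcs.put("2b", new HashSet<>(Arrays.asList("DC2", "DC3")));
            // With DC 2 failed, DC 1 serves three of the five blocks and is selected.
            System.out.println(selectDc(blockToDcs,
                    Arrays.asList("1a", "1b", "1c", "2a", "2b"),
                    new HashSet<>(Collections.singletonList("DC2"))));
        }
    }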
Optionally, the method shown in FIG. 8 of the embodiments of this application may be implemented by a DC Scheduler plug-in in the YARN ResourceManager, or, of course, by another functional module in the YARN ResourceManager, which is not limited in the embodiments of this application.
In the method provided by the embodiments of this application, copies of a file are placed across DCs rather than being confined to a single DC. After a single DC fails, the file's data is not entirely lost. In addition, the client can access copies of the file on other DCs, ensuring as far as possible that the client's services are not affected by a DC outage and that the client's read data requests are responded to in a timely manner. Furthermore, the client's read data requests are always scheduled to the DC with the highest data locality, that is, most of the data of the target files is accessed within the same DC as far as possible, avoiding excessive use of inter-DC communication bandwidth for cross-DC data access, which would affect system performance.
When functional modules are divided according to the corresponding functions, FIG. 13 shows a possible structural diagram of the server involved in the above embodiments. The server shown in FIG. 13 may be the resource management server or the name management server described in the embodiments of this application, or a component in the resource management server or the name management server that implements the above methods. As shown in FIG. 13, the server includes a processing unit 1301 and a transceiver unit 1302. The processing unit may be one or more processors, and the transceiver unit 1302 may be a network interface.
The processing unit 1301 is configured to support the resource management server in performing step 802 and step 803, to support the name management server in performing step 402, and/or to perform other processes of the technology described herein.
The transceiver unit 1302 is configured to support, for example, the resource management server in performing step 801, step 804, and step 805, to support the name management server in performing step 401, and/or to perform other processes of the technology described herein.
In a possible implementation, the server shown in FIG. 13 may also be a chip applied in the resource management server or the name management server. The chip may be a system-on-a-chip (SOC).
The transceiver unit 1302 used for receiving/sending may be a network interface of the server, configured to receive signals from other servers.
Exemplarily, when an integrated unit is used, a schematic structural diagram of the server provided by the embodiments of this application is shown in FIG. 14. In FIG. 14, the server includes a processing module 1401 and a communication module 1402. The processing module 1401 is configured to control and manage the actions of the server, for example, to perform the steps performed by the above processing unit 1301 and/or other processes of the technology described herein. The communication module 1402 is configured to perform the steps performed by the above transceiver unit 1302 and to support interaction between the server and other devices, such as other terminal servers. As shown in FIG. 14, the server may further include a storage module 1403, which is configured to store the program code and data of the server.
When the processing module 1401 is a processor, the communication module 1402 is a network interface, and the storage module 1403 is a memory, the server is the server shown in FIG. 3.
This application provides a computer-readable storage medium storing computer instructions. The computer instructions instruct a resource management server to perform the above data reading method, or the computer instructions are used to implement the functional units included in the resource management server.
This application provides a computer-readable storage medium storing computer instructions. The computer instructions instruct a name management server to perform the above data writing method, or the computer instructions are used to implement the functional units included in the name management server.
This application provides a computer program product that includes computer instructions. The computer instructions instruct a resource management server to perform the above data reading method, or the computer instructions are used to implement the functional units included in the resource management server.
This application provides a computer program product that includes computer instructions. The computer instructions instruct a name management server to perform the above data writing method, or the computer instructions are used to implement the functional units included in the name management server.
From the description of the above implementations, a person skilled in the art can clearly understand that, for convenience and brevity of description, only the division of the above functional modules is used as an example for illustration. In practical applications, the above functions may be allocated to different functional modules as needed, that is, the internal structure of the database access apparatus may be divided into different functional modules to complete all or some of the functions described above.
In the several embodiments provided in this application, it should be understood that the disclosed database access apparatus and method may be implemented in other ways. For example, the embodiments of the database access apparatus described above are merely illustrative. For example, the division of the modules or units is only a logical function division; in actual implementation there may be other division methods, for example, multiple units or components may be combined or integrated into another apparatus, or some features may be omitted or not performed. In addition, the mutual coupling or direct coupling or communication connection shown or discussed may be indirect coupling or communication connection through some interfaces, database access apparatuses, or units, and may be in electrical, mechanical, or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may be one physical unit or multiple physical units, that is, they may be located in one place or distributed to multiple different places. Some or all of the units may be selected according to actual needs to achieve the objectives of the solutions of the embodiments.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist alone physically, or two or more units may be integrated into one unit. The above integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented in the form of a software functional unit and sold or used as an independent product, it may be stored in a readable storage medium. Based on this understanding, the technical solutions of the embodiments of this application essentially, or the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions for enabling a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to perform all or some of the steps of the methods described in the embodiments of this application. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc.
The above are only specific implementations of this application, but the protection scope of this application is not limited thereto. Any variation or replacement within the technical scope disclosed in this application shall fall within the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (18)

  1. A data reading method, comprising:
    receiving, by a resource management server, a read data request from a client, wherein the read data request is used to request multiple files;
    reading, by the resource management server, a copy of target data from a first data center, wherein the target data comprises data of different files among the multiple files; the first data center is the data center with the highest data locality among multiple data centers storing copies of the target data, and the data locality indicates how close the copy of the target data stored in a data center is to the target data; and
    sending, by the resource management server, the copy of the target data read from the first data center to the client.
  2. The method according to claim 1, wherein the method further comprises:
    determining, by the resource management server based on directory information of the multiple files, the multiple data centers storing copies of the target data, wherein the read data request carries the directory information of the target data; and
    computing, by the resource management server, how close the copies of the target data stored in each of the multiple data centers are to the target data, and determining the data center whose stored copy of the target data is closest to the target data as the first data center.
  3. The method according to claim 1 or 2, wherein the method further comprises:
    when the first data center fails, reading, by the resource management server, the copy of the target data from a second data center, wherein the second data center is the data center with the highest data locality among the multiple data centers other than the first data center; and
    sending, by the resource management server, the copy of the target data read from the second data center to the client.
  4. The method according to any one of claims 1 to 3, wherein the copy of the target data is a copy stored in the data center where the client is located.
  5. A data writing method, comprising:
    receiving, by a name management server, a write data request from a client, wherein the write data request carries target data; and
    writing, by the name management server, copies of the target data into multiple data centers according to the target data.
  6. The method according to claim 5, wherein the writing, by the name management server, copies of the target data into multiple data centers according to the target data comprises:
    writing, by the name management server, the first copy of the target data in the data center where the client is located.
  7. The method according to claim 5 or 6, wherein the method further comprises:
    when the data centers in which the copies of the target data are actually distributed are inconsistent with the multiple data centers indicated by a copy placement policy, adjusting, by the name management server, the copies of the target data to the multiple data centers indicated by the copy placement policy.
  8. A resource management server, comprising:
    a transceiver unit, configured to receive a read data request from a client, wherein the read data request is used to request multiple files; and
    a processing unit, configured to read a copy of target data from a first data center, wherein the target data comprises data of different files among the multiple files; the first data center is the data center with the highest data locality among multiple data centers storing copies of the target data, and the data locality indicates how close the copy of the target data stored in a data center is to the target data;
    wherein the transceiver unit is further configured to send the copy of the target data read from the first data center to the client.
  9. The resource management server according to claim 8, wherein the processing unit is further configured to:
    determine, based on directory information of the multiple files, the multiple data centers storing copies of the target data, wherein the read data request carries the directory information of the target data; and
    compute how close the copies of the target data stored in each of the multiple data centers are to the target data, and determine the data center whose stored copy of the target data is closest to the target data as the first data center.
  10. The resource management server according to claim 8 or 9, wherein the processing unit is further configured to: when the first data center fails, read the copy of the target data from a second data center, wherein the second data center is the data center with the highest data locality among the multiple data centers other than the first data center;
    and the transceiver unit is further configured to send the copy of the target data read from the second data center to the client.
  11. The resource management server according to any one of claims 8 to 10, wherein the copy of the target data is a copy stored in the data center where the client is located.
  12. A name management server, comprising:
    a transceiver unit, configured to receive a write data request from a client, wherein the write data request carries target data; and
    a processing unit, configured to write copies of the target data into multiple data centers according to the target data.
  13. The name management server according to claim 12, wherein the processing unit is configured to write the first copy of the target data in the data center where the client is located.
  14. The name management server according to claim 12 or 13, wherein the processing unit is further configured to:
    when the data centers in which the copies of the target data are actually distributed are inconsistent with the multiple data centers indicated by a copy placement policy, adjust the copies of the target data to the multiple data centers indicated by the copy placement policy.
  15. A resource management server, wherein the resource management server comprises a processor and a memory; and when the processor executes computer instructions in the memory, the resource management server performs the method according to any one of claims 1 to 4.
  16. A name management server, wherein the name management server comprises a processor and a memory; and when the processor executes computer instructions in the memory, the name management server performs the method according to any one of claims 5 to 7.
  17. A computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions, and the computer instructions instruct the resource management server to perform the method according to any one of claims 1 to 4.
  18. A computer-readable storage medium, wherein the computer-readable storage medium stores computer instructions, and the computer instructions instruct the name management server to perform the method according to any one of claims 5 to 7.
PCT/CN2020/096124 2019-09-27 2020-06-15 一种读数据方法、写数据方法及服务器 WO2021057108A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP20869429.9A EP3951607A4 (en) 2019-09-27 2020-06-15 DATA READING METHOD, DATA WRITING METHOD AND SERVER
US17/526,659 US12038879B2 (en) 2019-09-27 2021-11-15 Read and write access to data replicas stored in multiple data centers

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910927180.0 2019-09-27
CN201910927180.0A CN110825704B (zh) 2019-09-27 2019-09-27 一种读数据方法、写数据方法及服务器

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US17/526,659 Continuation US12038879B2 (en) 2019-09-27 2021-11-15 Read and write access to data replicas stored in multiple data centers

Publications (1)

Publication Number Publication Date
WO2021057108A1 true WO2021057108A1 (zh) 2021-04-01

Family

ID=69548339

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/096124 WO2021057108A1 (zh) 2019-09-27 2020-06-15 一种读数据方法、写数据方法及服务器

Country Status (4)

Country Link
US (1) US12038879B2 (zh)
EP (1) EP3951607A4 (zh)
CN (1) CN110825704B (zh)
WO (1) WO2021057108A1 (zh)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110825704B (zh) * 2019-09-27 2023-09-01 华为云计算技术有限公司 一种读数据方法、写数据方法及服务器
CN113282246B (zh) * 2021-06-15 2023-07-04 杭州海康威视数字技术股份有限公司 数据处理方法及装置
US11847352B1 (en) * 2021-09-22 2023-12-19 Ridgeline, Inc. Parent child request recovery to improve stability
CN116795066B (zh) * 2023-08-16 2023-10-27 南京德克威尔自动化有限公司 远程io模块的通信数据处理方法、系统、服务器及介质
CN117453153B (zh) * 2023-12-26 2024-04-09 柏科数据技术(深圳)股份有限公司 基于Crush规则的文件存储方法、装置、终端及介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109783002A (zh) * 2017-11-14 2019-05-21 华为技术有限公司 数据读写方法、管理设备、客户端和存储系统
CN109901949A (zh) * 2019-02-25 2019-06-18 中国工商银行股份有限公司 双活数据中心的应用灾备系统及方法
CN110022338A (zh) * 2018-01-09 2019-07-16 阿里巴巴集团控股有限公司 文件读取方法、系统、元数据服务器和用户设备
CN110825704A (zh) * 2019-09-27 2020-02-21 华为技术有限公司 一种读数据方法、写数据方法及服务器

Family Cites Families (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7634566B2 (en) * 2004-06-03 2009-12-15 Cisco Technology, Inc. Arrangement in a network for passing control of distributed data between network nodes for optimized client access based on locality
US9582221B2 (en) * 2012-08-24 2017-02-28 Vmware, Inc. Virtualization-aware data locality in distributed data processing
CN103678360A (zh) * 2012-09-13 2014-03-26 腾讯科技(深圳)有限公司 一种分布式文件系统的数据存储方法和装置
CN103793425B (zh) * 2012-10-31 2017-07-14 国际商业机器公司 用于分布式系统的数据处理方法及装置
US9268808B2 (en) * 2012-12-31 2016-02-23 Facebook, Inc. Placement policy
CN104615606B (zh) * 2013-11-05 2018-04-06 阿里巴巴集团控股有限公司 一种Hadoop分布式文件系统及其管理方法
CN104156381A (zh) * 2014-03-27 2014-11-19 深圳信息职业技术学院 Hadoop分布式文件系统的副本存取方法、装置和Hadoop分布式文件系统
CN104113597B (zh) * 2014-07-18 2016-06-08 西安交通大学 一种多数据中心的hdfs数据读写方法
CN105760391B (zh) * 2014-12-18 2019-12-13 华为技术有限公司 数据动态重分布的方法、数据节点、名字节点及系统
CN106095337A (zh) * 2016-06-07 2016-11-09 国云科技股份有限公司 一种基于san网络存储的云盘快速共享方法
US10437937B2 (en) * 2016-07-12 2019-10-08 Commvault Systems, Inc. Dynamic management of expandable cache storage for multiple network shares configured in a file server
US10803023B2 (en) * 2017-04-02 2020-10-13 Sas Institute Inc. Techniques for reading from and writing to distributed data stores
CN110019082A (zh) * 2017-07-31 2019-07-16 普天信息技术有限公司 文件数据的分布式多副本存储方法
US11146626B2 (en) * 2018-11-01 2021-10-12 EMC IP Holding Company LLC Cloud computing environment with replication system configured to reduce latency of data read access
US11625273B1 (en) * 2018-11-23 2023-04-11 Amazon Technologies, Inc. Changing throughput capacity to sustain throughput for accessing individual items in a database
CN110198346B (zh) * 2019-05-06 2020-10-27 北京三快在线科技有限公司 数据读取方法、装置、电子设备及可读存储介质

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109783002A (zh) * 2017-11-14 2019-05-21 华为技术有限公司 数据读写方法、管理设备、客户端和存储系统
CN110022338A (zh) * 2018-01-09 2019-07-16 阿里巴巴集团控股有限公司 文件读取方法、系统、元数据服务器和用户设备
CN109901949A (zh) * 2019-02-25 2019-06-18 中国工商银行股份有限公司 双活数据中心的应用灾备系统及方法
CN110825704A (zh) * 2019-09-27 2020-02-21 华为技术有限公司 一种读数据方法、写数据方法及服务器

Also Published As

Publication number Publication date
EP3951607A1 (en) 2022-02-09
EP3951607A4 (en) 2022-07-13
US12038879B2 (en) 2024-07-16
US20220075757A1 (en) 2022-03-10
CN110825704B (zh) 2023-09-01
CN110825704A (zh) 2020-02-21

Similar Documents

Publication Publication Date Title
US11153380B2 (en) Continuous backup of data in a distributed data store
US11120152B2 (en) Dynamic quorum membership changes
WO2021057108A1 (zh) 一种读数据方法、写数据方法及服务器
US11243922B2 (en) Method, apparatus, and storage medium for migrating data node in database cluster
US10474547B2 (en) Managing contingency capacity of pooled resources in multiple availability zones
US10579610B2 (en) Replicated database startup for common database storage
AU2017225086B2 (en) Fast crash recovery for distributed database systems
US10891267B2 (en) Versioning of database partition maps
US10331655B2 (en) System-wide checkpoint avoidance for distributed database systems
US9460185B2 (en) Storage device selection for database partition replicas
US8386540B1 (en) Scalable relational database service
US9699017B1 (en) Dynamic utilization of bandwidth for a quorum-based distributed storage system
US9280591B1 (en) Efficient replication of system transactions for read-only nodes of a distributed database
US9304815B1 (en) Dynamic replica failure detection and healing
US10747746B2 (en) Efficient read replicas
US11080253B1 (en) Dynamic splitting of contentious index data pages
US20100023564A1 (en) Synchronous replication for fault tolerance
US10885023B1 (en) Asynchronous processing for synchronous requests in a database
US10223184B1 (en) Individual write quorums for a log-structured distributed storage system
WO2023197404A1 (zh) 一种基于分布式数据库的对象存储方法及装置
US11914571B1 (en) Optimistic concurrency for a multi-writer database
US11709839B2 (en) Database system and query execution method
WO2020207078A1 (zh) 数据处理方法、装置和分布式数据库系统

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20869429

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2020869429

Country of ref document: EP

Effective date: 20211104

NENP Non-entry into the national phase

Ref country code: DE