WO2018176998A1 - Data storage method and apparatus - Google Patents

Data storage method and apparatus

Info

Publication number
WO2018176998A1
Authority
WO
WIPO (PCT)
Prior art keywords
node
tenant
copies
policy
data
Prior art date
Application number
PCT/CN2018/073315
Other languages
English (en)
French (fr)
Inventor
孙桂林
刘怀忠
查礼
辛现银
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority to EP18776326.3A priority Critical patent/EP3594798B1/en
Publication of WO2018176998A1 publication Critical patent/WO2018176998A1/zh
Priority to US16/586,074 priority patent/US10972542B2/en
Priority to US17/198,908 priority patent/US11575748B2/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/06 Digital input from, or digital output to, record carriers, e.g. RAID, emulated record carriers or networked record carriers
    • G06F3/0601 Interfaces specially adapted for storage systems
    • G06F3/0602 Interfaces specially adapted for storage systems specifically adapted to achieve a particular effect
    • G06F3/0604 Improving or facilitating administration, e.g. storage management
    • G06F3/0614 Improving the reliability of storage systems
    • G06F3/0628 Interfaces specially adapted for storage systems making use of a particular technique
    • G06F3/0646 Horizontal data movement in storage systems, i.e. moving data in between storage devices or systems
    • G06F3/065 Replication mechanisms
    • G06F3/0662 Virtualisation aspects
    • G06F3/0664 Virtualisation aspects at device level, e.g. emulation of a storage device or system
    • G06F3/0668 Interfaces specially adapted for storage systems adopting a particular infrastructure
    • G06F3/067 Distributed or networked storage systems, e.g. storage area networks [SAN], network attached storage [NAS]
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/10 Protocols in which an application is distributed across nodes in the network
    • H04L67/1097 Protocols in which an application is distributed across nodes in the network for distributed storage of data in networks, e.g. transport arrangements for network file system [NFS], storage area networks [SAN] or network attached storage [NAS]
    • H04L67/50 Network services
    • H04L67/60 Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources
    • H04L67/61 Scheduling or organising the servicing of application requests, e.g. requests for application data transmissions using the analysis and optimisation of the required network resources taking into account QoS or priority requirements

Definitions

  • the present application relates to the field of big data and, more particularly, to a data storage method and apparatus.
  • Multi-tenant technology is a software architecture technology that implements how to share the same system or program components in a multi-user environment and ensures data isolation between users.
  • With a single system architecture and service in a shared data center, multi-tenant technology provides the same or even customizable services to a large number of clients while still guaranteeing data isolation between tenants.
  • Many cloud computing services currently fall into this category, such as cloud database services, cloud servers, and so on.
  • A tenant can correspond to at least one node and is managed by the big data system. The at least one node is a resource owned by the tenant, and the tenant can use the at least one node to store data, run computing jobs, and satisfy similar demands.
  • The resource control node directly determines the distribution of the tenant's data among the nodes according to a pre-configured data distribution policy. For example, if the tenant requests to store 10 data copies and the nodes available to the tenant are node A, node B, and node C, the data distribution result that the resource control node determines for the tenant may fall into multiple possible situations.
  • the embodiment of the present application provides a data storage method and device, which can flexibly control the distribution of data that a tenant needs to store in a node through a combination of different data distribution policies, thereby reducing the complexity of policy deployment.
  • The first aspect provides a data storage method, including: receiving a data write request sent by a first tenant through a client, where the data write request is used to indicate that the first tenant requests to store N copies of data to be written, N being an integer greater than or equal to 1; determining, from a plurality of resource zones (RZs) and according to the data write request and the first tenant's storage permission for each of the plurality of RZs, at least one RZ that the first tenant can use; determining, according to the data write request and a first data distribution policy, a distribution of the N copies in the at least one RZ, where the first data distribution policy is used to indicate a distribution priority of the N copies in the at least one RZ; and storing the N copies into at least one node corresponding to the at least one RZ according to the distribution of the N copies in the at least one RZ and a second data distribution policy, where the second data distribution policy is used to indicate a distribution priority of the N copies among the plurality of nodes corresponding to each RZ.
  • the first tenant may send a data write request to the resource master control node by using the corresponding client, where the data write request is used to indicate that the first tenant requests to store the data to be written. N copies of the data.
  • the data write request may carry N copies of the data to be written, and may also carry one copy of the data to be written and the number N of copies to be stored, which is not limited in this embodiment of the present application.
  • The resource master control node receives the data write request and determines, from the plurality of RZs and according to the data write request and the first tenant's storage permission for each of the plurality of RZs, at least one RZ that the first tenant can use. Then, the resource master control node stores the N copies into at least one node corresponding to the at least one RZ according to the data write request, the first data distribution policy, and the second data distribution policy.
  • The first data distribution policy is used to indicate a distribution priority of the N copies in the at least one RZ, and the second data distribution policy is used to indicate a distribution priority of the N copies among the plurality of nodes corresponding to each RZ in the at least one RZ. Therefore, the data placement decision of the resource master node is divided into the following two stages:
  • The storage permission of the first tenant for each RZ in the plurality of resource zones RZ is determined according to the resource sharing policy of each RZ. For example, the resource sharing policy of RZ1 is used to indicate which tenants RZ1 can provide resources to, and a tenant who does not satisfy RZ1's resource sharing policy does not have storage permission for RZ1.
  • The resource sharing policy and the data distribution policy coordinate with and constrain each other. Since the at least one RZ that the first tenant can use may have different resource sharing policies, different data distribution policies can be adopted according to different needs of the tenant to store data and achieve different effects.
  • The foregoing two stages may independently apply different policies according to different needs of the tenant or different application scenarios of the tenant, and be combined to produce the expected data distribution result, without having to pre-configure a dedicated data distribution policy for the data distribution result corresponding to each application scenario.
  • The nodes that the tenant can use are divided into at least one resource zone RZ, and the first data distribution policy of the at least one RZ and the second data distribution policy of the nodes corresponding to the at least one RZ are configured separately, so that the resource master control node can make a two-stage decision when storing data. In the first stage, the distribution of the data copies in the at least one RZ is determined according to the first data distribution policy; in the second stage, based on the result of the first stage and combined with the second data distribution policy, the distribution of the data copies on specific nodes is determined.
  • In the data storage method of the embodiment of the present application, the nodes that a tenant can use are divided into at least one resource zone RZ, and the first data distribution policy of the at least one RZ and the second data distribution policy of the nodes corresponding to the at least one RZ are configured separately. When storing data, the resource control node can make a two-stage decision according to the first data distribution policy and the second data distribution policy. Since the policies of the two stages can be configured independently, the resource master control node can combine the data distribution policies of different stages to flexibly control, according to the different needs of the tenant and the scenario in which the tenant is located, the distribution of the data that the tenant needs to store among the nodes, which reduces the complexity of policy deployment, as sketched below.
  • The at least one RZ includes a first RZ and a second RZ, where the first RZ is a reserved resource zone RRZ that only the first tenant is allowed to use, and the second RZ is a shared resource zone SRZ that a plurality of tenants including the first tenant are allowed to use.
  • the at least one RZ that the first tenant can use may include a first RZ that only allows the first tenant to use and a second RZ that allows multiple tenants including the first tenant to use.
  • The first data distribution policy may be to store copies preferentially in the first RZ, or to store a specified number of copies in the second RZ, which is not limited in this embodiment of the present application; the second data distribution policy may be an equal-probability distribution policy, a policy that weights nodes by their remaining space, or another policy customized for a specific scenario, which is also not limited in this embodiment of the present application. Therefore, under combinations of different first data distribution policies and second data distribution policies, various expected effects can be achieved.
  • The first data distribution policy is to preferentially store the N copies in the first RZ. Determining, according to the data write request and the first data distribution policy, the distribution of the N copies in the at least one RZ includes: determining, according to the data write request, the first data distribution policy, and the space occupation status of the first RZ, that the first RZ can store P copies of the data to be written, where P is an integer greater than or equal to 1 and the space occupation status is used to indicate the occupied space or remaining space of the first RZ; if N is less than or equal to P, determining that the N copies are all distributed in the first RZ; and if N is greater than P, determining that P of the N copies are distributed in the first RZ and that the remaining N-P copies are distributed in the second RZ.
  • In this way, the tenant's data can be stored in the tenant's RRZ (that is, the first RZ) as much as possible, and the use of the SRZ (that is, the second RZ) is minimized. The RRZ is usually a prepaid resource of the tenant, while the SRZ is a pay-per-use post-paid resource, so less SRZ usage means less additional cost. In addition, since the space of the RRZ is reserved for the tenant, improving the utilization of the RRZ also means improving the utilization of the platform resources.
  • the foregoing space occupation status may be a space utilization ratio, a remaining space, and the like of the RZ, which is not limited by the embodiment of the present application.
  • the system may set a space utilization threshold or a remaining space threshold of the RRZ. After the space utilization of the RRZ reaches the threshold, the first tenant can use the storage resource of the SRZ. Therefore, the resource total control node may determine the number of copies of the to-be-written data that can be stored in the first RZ according to the data write request, the first data distribution policy, the space occupation state of the first RZ, and the space utilization threshold. This embodiment of the present application does not limit this.
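  • A minimal sketch of the RRZ-priority split with such a threshold is given below; the function name, the default threshold of 0.9, and counting capacity in copies are illustrative assumptions rather than details taken from the patent.

```python
# Hedged sketch of the "RRZ first" stage-1 policy with a space-utilization
# threshold. The threshold handling and field names are assumptions made for
# illustration; the patent only states that such a threshold may be set.
def split_copies_rrz_first(n, rrz_used, rrz_capacity, threshold=0.9):
    """Return (copies_in_rrz, copies_in_srz) for N copies of one block.

    The RRZ may only be filled up to `threshold` of its capacity (counted in
    copies here for simplicity); anything beyond that spills into the SRZ.
    """
    budget = int(rrz_capacity * threshold)
    p = max(budget - rrz_used, 0)      # copies the RRZ can still accept
    in_rrz = min(n, p)
    return in_rrz, n - in_rrz


# Example: N=3 copies, RRZ already holds 8 of 10 allowed copies at a 90% cap.
assert split_copies_rrz_first(3, rrz_used=8, rrz_capacity=10) == (1, 2)
```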
  • The first data distribution policy is to store Q copies of the N copies in the second RZ, where Q is an integer greater than or equal to 1 and less than or equal to N. Determining the distribution of the N copies in the at least one RZ according to the data write request and the first data distribution policy includes: determining, according to the data write request and the first data distribution policy, that Q of the N copies are distributed in the second RZ and that the remaining N-Q copies are distributed in the first RZ.
  • In this way, the tenant is allowed to specify the number of data copies placed in different RZs. This policy applies to different scenarios, for example: (1) to maximize aggregate data access bandwidth: if the data is often accessed by computations running in the SRZ (that is, the second RZ) but most of the copies are concentrated in the RRZ (that is, the first RZ), the data access bandwidth is limited by the number of RRZ nodes, which limits computational parallelism; in this case, always storing a certain number of copies in the SRZ, regardless of the remaining space of the RRZ, is the better choice; (2) data sharing between tenants, that is, the data will be shared with other tenants after it is generated.
  • Optionally, the method further includes: determining, according to the data write request, the first data distribution policy, and the space occupation status of the first RZ, that the first RZ can store P copies of the data to be written, where the space occupation status is used to indicate the occupied space or remaining space of the first RZ; if N-Q is less than or equal to P, determining that the N-Q copies are distributed in the first RZ; and if N-Q is greater than P, determining that P of the N-Q copies are distributed in the first RZ and that the remaining copies of the N-Q copies other than the P copies are distributed in the second RZ.
  • The first data distribution policy requires the remaining copies to be placed in the first RZ, but the storage space of the first RZ is limited, and there may be a situation in which the first RZ cannot hold them all. Therefore, the resource master node needs to determine the distribution of the remaining N-Q copies according to the space occupation status of the first RZ. Specifically, the resource master control node may first determine, according to the data write request, the first data distribution policy, and the space occupation status of the first RZ, that the first RZ can store P copies of the data to be written. If N-Q is less than or equal to P, the resource master node may store all of the N-Q copies in the first RZ; if N-Q is greater than P, the resource master node may store P copies in the first RZ and the remaining N-Q-P copies in the second RZ, as sketched below.
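  • A minimal sketch of this "Q copies pinned to the SRZ" split follows, under the same simplifying assumptions as the previous sketch; the function name and return shape are illustrative, not prescribed by the patent.

```python
# Hedged sketch of the stage-1 policy "store Q copies in the SRZ, the rest in
# the RRZ if it has room". Names and the return shape are illustrative only.
def split_copies_fixed_q(n, q, rrz_free):
    """Return (copies_in_rrz, copies_in_srz) when Q copies must go to the SRZ."""
    if not 1 <= q <= n:
        raise ValueError("Q must satisfy 1 <= Q <= N")
    remaining = n - q                  # copies intended for the RRZ
    in_rrz = min(remaining, rrz_free)  # P in the text: what the RRZ can still hold
    in_srz = q + (remaining - in_rrz)  # overflow from the RRZ also spills to the SRZ
    return in_rrz, in_srz


# Example: N=5 copies, Q=2 pinned to the SRZ, RRZ has room for only 1 more copy.
assert split_copies_fixed_q(5, 2, rrz_free=1) == (1, 4)
```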
  • Optionally, the method further includes: storing all or part of the copies in the second RZ into the first RZ according to the space occupation status of the first RZ, where the space occupation status is used to indicate the occupied space or remaining space of the first RZ; and deleting the all or part of the copies from the second RZ.
  • Specifically, the resource master node needs to determine, according to the space occupation status of the first RZ, the amount of data that can be relocated from the second RZ to the first RZ. For example, a space utilization threshold may be set; when the space utilization of the first RZ is less than the threshold, the resource master node may relocate copies in the second RZ to the first RZ. In this way, the utilization of the RRZ can be improved; since the RRZ is reserved exclusively for the tenant, improving the utilization of the RRZ improves the resource utilization of the big data system as a whole, as sketched below.
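  • A minimal sketch of this relocation step, again with hypothetical names and a copy-count model of RRZ capacity:

```python
# Hedged sketch of the relocation step: when RRZ utilization drops below a
# threshold, move copies that currently live in the SRZ back into the RRZ and
# delete them from the SRZ. All identifiers here are illustrative assumptions.
def relocate_to_rrz(srz_copies, rrz_used, rrz_capacity, threshold=0.9):
    """Return (copies_to_move, copies_left_in_srz) for one tenant."""
    budget = int(rrz_capacity * threshold) - rrz_used
    if budget <= 0:
        return [], list(srz_copies)
    to_move = list(srz_copies[:budget])
    return to_move, list(srz_copies[budget:])


moved, left = relocate_to_rrz(["blk1", "blk2", "blk3"], rrz_used=7, rrz_capacity=10)
assert moved == ["blk1", "blk2"] and left == ["blk3"]
```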
  • Optionally, before the receiving of the data write request sent by the first tenant through the client, the method further includes: receiving a resource zone creation request for requesting to create a third RZ in the at least one RZ for the first tenant; creating the third RZ according to the resource zone creation request and determining a plurality of first nodes corresponding to the third RZ; adding first tag information to each of the plurality of first nodes, where the first tag information is used to identify the third RZ; and adding a first resource sharing policy for the third RZ, where the first resource sharing policy is used to indicate that the third RZ can be accessed by at least one tenant including the first tenant.
  • The tag information is stored in the database of the operation and maintenance management (OM) software and is usually synchronized from the OM system to the storage system (for example, HDFS) itself. The tag information therefore forms different storage partitions in the storage system, corresponding to the RZs, so that the resource master control node can determine the specific placement node of a copy according to the foregoing data distribution policies.
  • Optionally, the method further includes: receiving a resource zone deletion request, where the resource zone deletion request is used to request deletion of a fourth RZ in the at least one RZ; deleting, according to the resource zone deletion request, the copies stored in the plurality of second nodes corresponding to the fourth RZ; deleting the second tag information of each of the plurality of second nodes, where the second tag information is used to identify the fourth RZ; and deleting the second resource sharing policy of the fourth RZ, where the second resource sharing policy is used to indicate that the fourth RZ can be accessed by at least one tenant including the first tenant.
  • Specifically, the resource master control node may receive the resource zone deletion request and determine to delete the fourth RZ in the at least one RZ. The resource master control node may delete the data copies stored in the plurality of second nodes corresponding to the fourth RZ, delete the second tag information of each of the plurality of second nodes, and delete the second resource sharing policy of the fourth RZ.
  • Optionally, the method further includes: receiving a resource zone expansion request, where the resource zone expansion request is used to request expansion of a fifth RZ in the at least one RZ; determining at least one third node according to the resource zone expansion request; and adding third tag information to each third node in the at least one third node, where the third tag information is used to identify the fifth RZ.
  • Optionally, the method further includes: receiving a resource zone reduction request, where the resource zone reduction request is used to request reduction of a sixth RZ in the at least one RZ; determining, according to the resource zone reduction request, at least one fourth node corresponding to the sixth RZ; and deleting the fourth tag information of each fourth node in the at least one fourth node, where the fourth tag information is used to identify the sixth RZ.
  • The management and maintenance of an RZ is completed by the OM system, which is usually operated by a platform administrator; a special case is the cloud scenario, where the tenant may perform the management itself. This embodiment of the present application does not limit this. A sketch of such RZ lifecycle management follows.
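  • The following sketch illustrates RZ lifecycle management through node tag information and per-RZ resource sharing policies; the RzManager class and its in-memory dictionaries merely stand in for the OM software's database and are not part of the patent.

```python
# Hedged sketch of RZ lifecycle management via node tag information and a
# per-RZ resource sharing policy. All names are assumptions for illustration.
class RzManager:
    def __init__(self):
        self.node_tags = {}        # node -> RZ name
        self.sharing_policy = {}   # RZ name -> set of tenants allowed to use it

    def create_rz(self, rz, nodes, tenants):
        for node in nodes:
            self.node_tags[node] = rz          # add tag information to each node
        self.sharing_policy[rz] = set(tenants) # add the RZ's resource sharing policy

    def delete_rz(self, rz):
        # copies stored on the RZ's nodes would be deleted first in a real system
        for node in [n for n, tag in self.node_tags.items() if tag == rz]:
            del self.node_tags[node]           # remove tag information
        self.sharing_policy.pop(rz, None)      # remove the sharing policy

    def expand_rz(self, rz, new_nodes):
        for node in new_nodes:
            self.node_tags[node] = rz

    def shrink_rz(self, rz, removed_nodes):
        for node in removed_nodes:
            if self.node_tags.get(node) == rz:
                del self.node_tags[node]


mgr = RzManager()
mgr.create_rz("RZ3", ["node1", "node2"], tenants=["tenant_foo"])
mgr.expand_rz("RZ3", ["node3"])
mgr.shrink_rz("RZ3", ["node1"])
assert sorted(n for n, t in mgr.node_tags.items() if t == "RZ3") == ["node2", "node3"]
```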
  • A second aspect provides a task allocation method, including: receiving a computing task allocation request sent by a first node, where the computing task allocation request is used to request allocation of a computing task to the first node; allocating, according to the request, a sharing policy of the first node, and a borrowing policy of at least one tenant, a first computing task to the first node from the computing tasks of the at least one tenant, where the sharing policy is used to indicate that the first node provides computing resources for the computing tasks of i tenants of the at least one tenant, the borrowing policy is used to indicate that a first tenant of the at least one tenant is allowed to use the computing resources of j nodes, and i and j are both integers greater than 0; and sending task indication information to the first node, where the task indication information is used to indicate the first computing task.
  • The above sharing policy is used to indicate which tenants a node can provide computing resources to, and the borrowing policy is used to indicate which other nodes' computing resources a tenant is willing to use when its own node resources are insufficient. These policies are usually configured in advance, generally by the system administrator and/or the tenant through the OM software, and are stored in the database of the big data system's operation and maintenance management (OM) software.
  • The first computing task may finally be selected arbitrarily from the remaining computing tasks, or the computing task with the highest priority may be selected as the first computing task according to the priority order of the remaining computing tasks; this embodiment does not limit this.
  • In this model, the node is the resource provider and the tenant is the resource user. The node's sharing policy only expresses how the resource provider shares its own resources and does not care about the specific resource users; the tenant's borrowing policy only expresses how the resource user borrows available shared resources and does not care about the specific resource providers. This decouples the resource sharing and borrowing mechanisms.
  • In the task allocation method of the embodiment of the present application, the resource master control node flexibly matches computing nodes with the computing tasks submitted by tenants, according to the computing nodes' sharing policies for computing resources in the big data system and the tenants' borrowing policies for computing resources. The computing node is thus allocated a computing task that satisfies the policies, and the resource sharing and borrowing mechanisms are decoupled, which is simple, easy to implement, and improves the user experience.
  • Specifically, the computing tasks of m tenants that do not satisfy the sharing policy and the borrowing policy are filtered out from the computing tasks of the at least one tenant, where m is an integer greater than or equal to 1, and the first computing task is determined from the remaining computing tasks other than those of the m tenants.
  • Specifically, the resource master control node may match at least one computing task in the system with the first node according to the foregoing sharing policy and borrowing policy, filter out the computing tasks that do not satisfy the sharing policy and the borrowing policy, and thereby determine the first computing task to be allocated to the first node.
  • Optionally, the computing task allocation request includes the identification information of the first node, and filtering out, from the computing tasks of the at least one tenant, the computing tasks of the m tenants that do not satisfy the sharing policy and the borrowing policy includes: filtering out the computing tasks of p first tenants according to the identification information of the first node and the sharing policy, where the p first tenants do not belong to the i tenants and p is an integer greater than or equal to 0; and filtering out, from the computing tasks of the remaining tenants other than the p first tenants and according to the identification information of the first node and the borrowing policy, the computing tasks of m-p second tenants, where the first node does not belong to the j nodes of those tenants.
  • Optionally, filtering out, from the computing tasks of the at least one tenant, the computing tasks of the m tenants that do not satisfy the sharing policy and the borrowing policy includes: filtering out the computing tasks of m-p second tenants according to the identification information of the first node and the borrowing policy, where the borrowing policies of the m-p second tenants indicate that they are not allowed to use the computing resources of the first node and p is an integer greater than or equal to 0; and filtering out, from the computing tasks of the remaining tenants other than the m-p second tenants and according to the identification information of the first node and the sharing policy, the computing tasks of p first tenants, where the p first tenants do not belong to the i tenants.
  • Optionally, the at least one tenant is M tenants, where M is an integer greater than 0, and filtering out, from the computing tasks of the at least one tenant, the computing tasks of the m tenants that do not satisfy the sharing policy and the borrowing policy includes: filtering out the computing tasks of p tenants from the computing tasks of the M tenants according to the identification information of the first node and the sharing policy; filtering out the computing tasks of q tenants from the computing tasks of the M tenants according to the identification information of the first node and the borrowing policy; and taking the intersection of the computing tasks of the remaining M-p tenants and the computing tasks of the remaining M-q tenants.
  • The foregoing steps of filtering by the sharing policy and filtering by the borrowing policy have no fixed order and may even be performed at the same time, which is not limited by this embodiment of the present application. The p tenants and the q tenants may include the same tenant, but this does not affect the final filtering result. In other words, when filtering out the computing tasks that do not satisfy the foregoing sharing policy and borrowing policy, the resource control node may adopt different filtering orders: filter by the sharing policy first and then by the borrowing policy, filter by the borrowing policy first and then by the sharing policy, or filter by the two policies separately and finally take the intersection of the two filtering results; this is not limited in this embodiment of the present application. A sketch of the matching follows.
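  • A hedged sketch of the two-filter matching described above; the data structures (dicts keyed by node and tenant) and the priority-ordered task list are assumptions made for illustration, not the patent's data model.

```python
# Hedged sketch of the two filters described above: a node-side sharing policy
# (which tenants the node serves) and a tenant-side borrowing policy (which
# nodes the tenant is willing to use).
def assign_task(node, tasks, node_shares_with, tenant_may_borrow):
    """Pick the first queued task whose tenant passes both filters.

    node_shares_with:  dict node -> set of tenant ids the node provides resources to
    tenant_may_borrow: dict tenant -> set of nodes the tenant is allowed to use
    tasks:             list of (tenant_id, task_id), assumed ordered by priority
    """
    for tenant, task in tasks:
        if tenant not in node_shares_with.get(node, set()):
            continue                        # filtered out by the sharing policy
        if node not in tenant_may_borrow.get(tenant, set()):
            continue                        # filtered out by the borrowing policy
        return task
    return None                             # no assignable task for this node


tasks = [("t1", "job-a"), ("t2", "job-b")]
shares = {"node1": {"t2"}}
borrow = {"t1": {"node1"}, "t2": {"node1", "node2"}}
assert assign_task("node1", tasks, shares, borrow) == "job-b"
```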
  • Optionally, the first node is a node in a first resource zone RZ, the nodes included in the first resource zone have the same sharing policy, and this same sharing policy is the sharing policy of the first resource zone.
  • Specifically, the nodes in the system may be divided into a plurality of resource zones RZ, and the plurality of RZs include reserved resource zones RRZ and a shared resource zone SRZ. The sharing policy of an RZ is the sharing policy of every node in the RZ; in this case, the resource provider is the RZ and the resource users are the tenants and their computing tasks.
  • An RRZ belongs to a specific tenant; from this perspective, a tenant may be both a resource provider and a resource borrower. An RZ should only include nodes with the same sharing policy, and this same sharing policy is the sharing policy of the RZ. According to the sharing policy of the RZ, the tenants that have usage rights on the RZ can be determined. The usage rights may include the use of both storage resources and computing resources, thereby integrating the storage system and the computing system, that is, considering storage resources and computing resources together.
  • Optionally, the sharing policy is any one of the following: a strict reservation policy, an idle-time sharing policy, and a fair sharing policy. The strict reservation policy is used to indicate that only the computing tasks of the i tenants are allowed to use the computing resources of the first node; the idle-time sharing policy is used to indicate that tenants other than the i tenants are allowed to use the computing resources of the first node only when the first node is idle; and the fair sharing policy is used to indicate that the at least one tenant is allowed to use the computing resources of the first node fairly.
  • The strict reservation policy, the idle-time sharing policy, and the fair sharing policy may be the sharing policy of a node, or may be the sharing policy of an RZ. The resource master control node distinguishes the RZs that a tenant can use, in particular the RRZ and the SRZ, according to the sharing policy of each RZ.
  • The strict reservation policy strictly reserves resources: under this policy, the resources in the RZ are only allowed to be used by the tenant to which the RZ belongs, and even when idle they may not be used by other tenants. Under the idle-time sharing policy, the RZ is reserved for the tenant to which it belongs, but its resources are allowed to be temporarily borrowed when idle and are preempted with the highest priority when the tenant to which the RZ belongs needs them; that is, the tenant to which the RZ belongs holds 100% of the weight of the RZ's resources. The fair sharing policy means that multiple tenants share the resources: the RZ allows multiple tenants to use its resources fairly with agreed weights. Based on these different policies, RZs of different natures can be generated; for example, an RZ with a fair sharing policy is an SRZ, and an RZ with a strict reservation policy is an RRZ. A sketch of the three policies follows.
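  • For illustration, the three sharing policies could be modeled as follows; the SharingPolicy enum and the eligibility rule are one possible interpretation made for this sketch, not the patent's normative definition (in particular, fair-share weights are omitted).

```python
# Hedged sketch of the three RZ sharing policies named above (strict
# reservation, idle-time sharing, fair sharing).
from enum import Enum


class SharingPolicy(Enum):
    STRICT_RESERVATION = 1   # only the RZ's own tenants may ever use it
    IDLE_TIME_SHARING = 2    # others may borrow only while the RZ is idle
    FAIR_SHARING = 3         # all tenants share with agreed weights


def may_use_rz(tenant, rz_tenants, policy, rz_is_idle):
    if policy is SharingPolicy.FAIR_SHARING:
        return True
    if tenant in rz_tenants:
        return True
    return policy is SharingPolicy.IDLE_TIME_SHARING and rz_is_idle


assert may_use_rz("t9", {"t1"}, SharingPolicy.STRICT_RESERVATION, rz_is_idle=True) is False
assert may_use_rz("t9", {"t1"}, SharingPolicy.IDLE_TIME_SHARING, rz_is_idle=True) is True
```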
  • Optionally, a third tenant of the at least one tenant is initially configured with at least one third node, and the borrowing policy of the third tenant includes: when the number of usable nodes in the at least one third node is less than a first threshold, the third tenant is allowed to borrow the computing resources of the first node; and/or when the number of nodes that the third tenant has already borrowed is greater than a second threshold, the third tenant is not allowed to borrow the computing resources of the first node; where the at least one third node does not include the first node.
  • the tenant's borrowing policy can be configured by the tenant and stored in the database.
  • the tenant generally has its own node resources. That is, the system initially configures a part of the node resources to provide services for the tenant.
  • For example, tenant A corresponds to a first RRZ, and tenant A can use the resources in the first RRZ. If the resources of the first RRZ are not enough, tenant A needs to borrow resources; in this case, tenant A can set its own borrowing policy. The borrowing policy may be that borrowing is allowed when the resources available to tenant A are fewer than the first threshold; for example, if the first threshold is set small enough, tenant A's borrowing policy amounts to never borrowing shared resources, and if the threshold is set large enough, it amounts to always borrowing shared resources. The borrowing policy may also be that tenant A is no longer allowed to borrow resources when the resources already borrowed by tenant A exceed the second threshold, or it may be another policy.
  • Optionally, the borrowing policy further includes: the third tenant preferentially uses a fourth node, where the fourth node stores data corresponding to a computing task of the third tenant and the fourth node belongs to the node resources of the third tenant. By setting the tenant's borrowing policy in this way, the computing location of a computing task can be optimized, that is, the computing task is more likely to be scheduled onto the node that stores the data corresponding to the computing task, which can improve system performance and data security, as sketched below.
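  • A minimal sketch of a borrowing policy combining the two thresholds with the data-locality preference; the threshold semantics and function names are illustrative assumptions, not the patent's prescribed implementation.

```python
# Hedged sketch of a tenant borrowing policy: borrow only when own resources
# run low and the borrow quota is not exceeded, and prefer nodes that already
# hold the task's data. Thresholds and field names are illustrative.
def should_borrow(own_free_nodes, borrowed_nodes, first_threshold, second_threshold):
    """Borrow only when own free nodes run low and the borrow quota is not exceeded."""
    if own_free_nodes >= first_threshold:
        return False            # enough own resources, no need to borrow
    if borrowed_nodes > second_threshold:
        return False            # already borrowing too much
    return True


def rank_candidate_nodes(candidates, nodes_holding_task_data):
    """Prefer nodes that already store the task's data (data locality)."""
    return sorted(candidates, key=lambda n: n not in nodes_holding_task_data)


assert should_borrow(own_free_nodes=1, borrowed_nodes=0, first_threshold=2, second_threshold=3)
assert rank_candidate_nodes(["n1", "n2"], {"n2"}) == ["n2", "n1"]
```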
  • A third aspect provides a data storage apparatus for performing the method of the first aspect or any possible implementation of the first aspect. Specifically, the apparatus comprises units for performing the method of the first aspect or any possible implementation of the first aspect.
  • A fourth aspect provides a task allocation apparatus for performing the method of the second aspect or any possible implementation of the second aspect. Specifically, the apparatus comprises units for performing the method of the second aspect or any possible implementation of the second aspect.
  • A fifth aspect provides a data storage device comprising a transceiver, a memory, and a processor. The transceiver, the memory, and the processor communicate with each other via an internal connection path; the memory is configured to store instructions, and the processor is configured to execute the instructions stored in the memory to control the transceiver to receive and send signals. When the processor executes the instructions stored in the memory, the execution causes the processor to perform the method of the first aspect or any possible implementation of the first aspect.
  • A sixth aspect provides a task allocation apparatus comprising a transceiver, a memory, and a processor. The transceiver, the memory, and the processor communicate with each other via an internal connection path; the memory is configured to store instructions, and the processor is configured to execute the instructions stored in the memory to control the transceiver to receive and send signals. When the processor executes the instructions stored in the memory, the execution causes the processor to perform the method of the second aspect or any possible implementation of the second aspect.
  • A seventh aspect provides a computer readable medium for storing a computer program, the computer program comprising instructions for performing the method of the first aspect or any possible implementation of the first aspect.
  • An eighth aspect provides a computer readable medium for storing a computer program, the computer program comprising instructions for performing the method of the second aspect or any possible implementation of the second aspect.
  • FIG. 1 is a schematic diagram of an application scenario of an embodiment of the present application.
  • FIG. 2 is a schematic diagram of a system architecture provided by an embodiment of the present application.
  • FIG. 3 is a schematic flowchart of a data storage method provided by an embodiment of the present application.
  • FIG. 4 is a schematic structural diagram of another system according to an embodiment of the present application.
  • FIG. 5 is a schematic flowchart of a task allocation method provided by an embodiment of the present application.
  • FIG. 6 is a schematic flowchart of another task allocation method according to an embodiment of the present application.
  • FIG. 7 is a schematic structural diagram of another system provided by an embodiment of the present application.
  • FIG. 8 is a schematic block diagram of a data storage device according to an embodiment of the present application.
  • FIG. 9 is a schematic block diagram of a task allocation apparatus according to an embodiment of the present application.
  • FIG. 10 is a schematic block diagram of another data storage apparatus according to an embodiment of the present application.
  • FIG. 11 is a schematic block diagram of another task allocation apparatus according to an embodiment of the present application.
  • Big data refers to large data sets collected in many forms and from many sources, often in real time. In business-to-business sales, for example, this data may come from social networks, e-commerce sites, customer visit records, and many other sources. From a technical point of view, the relationship between big data and cloud computing is as inseparable as the two sides of a coin. Big data cannot be processed by a single computer; a distributed computing architecture must be used. Therefore, big data is characterized by the mining of massive data, and it must rely on the distributed processing, distributed databases, cloud storage, and/or virtualization technologies of cloud computing.
  • Multi-tenant technology is a software architecture technology that implements how to share the same system or program components in a multi-user environment and ensures data isolation between users.
  • implementing a multi-tenant technology requires a resource pool, or a job pool.
  • Each resource pool has a certain amount of resources (configured by the administrator).
  • Each tenant belongs to a resource pool, and the submitted jobs can use the resources in the resource pool to implement storage data and run computing operations.
  • A big data system is also called a multi-node cluster. The cluster includes multiple cluster nodes; the more cluster nodes there are, the larger the cluster size and the stronger the data processing capability of the big data system. Since the cluster contains many nodes, unified operation and management (OM) software is required to achieve unified management. Therefore, a tenant can correspond to at least one node and be managed by the OM software of the big data system.
  • FIG. 1 is a schematic diagram of an application scenario 100 provided by an embodiment of the present application.
  • the application scenario 100 includes a client 110, a resource master node 120, and a data node/computation node 130.
  • The client 110 corresponds to a first tenant; the first tenant can send data to be stored to the resource master control node 120 and/or submit a computing job through the client 110, and request the resource master control node 120 to allocate corresponding resources to store the data and/or run the computation.
  • The resource master node 120 is a management node for managing all the data nodes/computing nodes in the cluster. The OM software can be installed in the resource master node 120 so that the software can achieve unified management of the nodes in the big data system.
  • The data node/computing node 130 is any node in the cluster of the big data system used to store tenant data and/or run computing jobs. It should be understood that a node in the cluster may be a data node for storing the tenant's data or a computing node for completing the tenant's computing tasks. Therefore, a node may include storage resources and/or computing resources. The storage resources include all storage-capable resources in the node, such as disk, flash memory, and memory, and may be used to store tenant data; the computing resources are used to complete the various computing tasks submitted by the tenant through the client 110.
  • It should be understood that FIG. 1 exemplarily shows only one client and one data node/computing node; the application scenario 100 may further include multiple data nodes/computing nodes and multiple clients respectively corresponding to multiple tenants, which is not limited in this embodiment of the present application.
  • FIG. 2 shows a schematic diagram of a system architecture 200 provided by an embodiment of the present application.
  • the system architecture 200 includes three tenants (tenant A, tenant B, and tenant C), three resource zones (RZs), and nodes corresponding to the three RZs.
  • Each of the above three RZs has its own resource sharing policy for indicating which tenants the respective node resources can be used by.
  • The foregoing three RZs may include a first RZ, a second RZ, and a third RZ, and each tenant has different usage rights under the preset resource sharing policies. For example, the first RZ can be used by all tenants, the second RZ can only be used by tenant B, and the third RZ can only be used by tenant C; or, the first RZ can be used by tenant A and tenant B, the second RZ can be used by tenant B and tenant C, and the third RZ can only be used by tenant B. This embodiment of the present application does not limit this.
  • Optionally, the foregoing three RZs include a first reserved resource zone (RRZ), a second RRZ, and a shared resource zone (SRZ). The difference between an RRZ and the SRZ lies in their resource sharing policies: the SRZ can be used by all tenants, while the first RRZ can only be used by tenant A and the second RRZ can only be used by tenant B. Therefore, the foregoing RZs correspond to a public storage resource pool and private storage resource pools, respectively.
  • the system architecture 200 reflects the correspondence between the resource area and the tenant of the storage system in the big data platform, and is used to implement data storage of the tenant.
  • A node corresponding to an RZ may be a physical machine, a virtual machine, or a container, which is not limited in this embodiment of the present application.
  • It should be understood that FIG. 2 only exemplarily shows three tenants and three RZs; the system architecture 200 may further include multiple tenants and multiple RZs respectively corresponding to the multiple tenants, which is not limited in this embodiment. In general, there is only one SRZ.
  • a big data cluster can serve multiple tenants at the same time.
  • Generally, tenants can be divided into two categories. One category is large-scale, stable tenants whose business types are relatively determined and whose development has a stable expectation; such tenants can open up RRZ space according to their daily stable resource demands, and the RRZ reserves isolated resources for the tenant without paying any runtime performance cost.
  • The other category is small or uncertain tenants whose business demands are unstable and whose resource demands are difficult to fix; such tenants need not open up an RRZ and can meet their resource demands through long-term use of the SRZ.
  • Unlike the isolation of a tenant-exclusive cluster, an RRZ has good elastic scalability. On the one hand, when a tenant's stable resource demands change, it is easy to split resources from the SRZ into the RRZ, or to return resources from the RRZ to the SRZ; the tenant does not need to wait for a long server procurement process, and the waste of idle resources is avoided, which reflects the RZ's own scalability. On the other hand, when RRZ resources are insufficient, the tenant can temporarily borrow the resources of the SRZ to cope with sudden resource demands and expected resource demand spikes, avoiding the idleness and waste of resources that over-provisioning the RRZ would bring.
  • FIG. 3 is a schematic flowchart of a data storage method 300 provided by an embodiment of the present application.
  • the method 300 can be applied to the application scenario 100 shown in FIG. 1 and the system architecture 200 shown in FIG. 2, but the embodiment of the present application is not limited thereto.
  • S310 Receive a data write request sent by the first tenant through the client, where the data write request is used to indicate that the first tenant requests to store N copies of data to be written, where N is an integer greater than or equal to 1;
  • S330 Determine, according to the data write request and the first data distribution policy, a distribution of the N copies in the at least one RZ, where the first data distribution policy is used to indicate a distribution priority of the N copies in the at least one RZ;
  • Specifically, the cluster in the big data system is divided into a plurality of resource zones (RZs), each of the plurality of RZs includes at least one node, and each of the plurality of RZs has a resource sharing policy for indicating the tenants' storage permissions for each RZ in the big data system. Based on the resource sharing policy of each of the plurality of RZs, storage resources must not be allocated to a tenant in an RZ for which the tenant has no storage permission; the resource control node must therefore allocate different storage resources (that is, RZs) to different tenants according to the resource sharing policies, to ensure the normal operation of the big data system.
  • the resource sharing policy of the RZ may be pre-configured, and may be described in multiple manners, which is not limited in this embodiment of the present application.
  • For example, the system may formulate RZ resource sharing policies by using a correspondence between RZ identifiers and tenant identifiers, as shown in the following correspondence:
  • RZ1 allows all tenants to store data;
  • RZ2 only allows tenant 1 to store data;
  • RZ3 allows tenant 3 and tenant 4 to store data;
  • RZ4 allows tenants whose identifier begins with the three letters "foo" to store data.
  • The RZ identifier and/or the tenant identifier may be represented by other characters of any length, as long as they can identify the RZ and/or the tenant.
  • It should be noted that the storage permission here is only reflected in data placement; it does not include restrictions on accessing the data already in a resource zone. For example, tenant 1 does not have data storage permission for RZ3, but whether tenant 1 can access the data on RZ3 depends on the access control list (ACL) settings of HDFS (Hadoop distributed file system). A sketch of the permission lookup follows.
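  • A small sketch of the storage-permission lookup implied by the correspondence above, including the "foo" prefix rule for RZ4; the rule encoding and the tenant identifiers are assumptions made for illustration, not a data model from the patent.

```python
# Hedged sketch of the storage-permission lookup implied by the policy list
# above. Explicit tenant sets plus an optional prefix rule are an assumption.
RZ_SHARING_POLICY = {
    "RZ1": {"tenants": "*"},                 # all tenants may store data
    "RZ2": {"tenants": {"tenant1"}},         # only tenant 1
    "RZ3": {"tenants": {"tenant3", "tenant4"}},
    "RZ4": {"prefix": "foo"},                # tenant ids beginning with "foo"
}


def writable_rzs(tenant_id, policy=RZ_SHARING_POLICY):
    result = []
    for rz, rule in policy.items():
        if rule.get("tenants") == "*" or tenant_id in rule.get("tenants", set()):
            result.append(rz)
        elif "prefix" in rule and tenant_id.startswith(rule["prefix"]):
            result.append(rz)
    return result


assert writable_rzs("tenant1") == ["RZ1", "RZ2"]
assert writable_rzs("foobar") == ["RZ1", "RZ4"]
```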
  • When the first tenant needs to store data, the first tenant may send a data write request to the resource master control node through the corresponding client, where the data write request is used to indicate that the first tenant requests to store N copies of the data to be written.
  • the data write request may carry N copies of the data to be written, and may also carry one copy of the data to be written and the number N of copies to be stored, which is not limited in this embodiment of the present application.
  • The resource master control node receives the data write request and determines, from the plurality of RZs and according to the data write request and the first tenant's storage permission for each of the plurality of RZs, at least one RZ that the first tenant can use; in the above example, if the identifier of the first tenant is 1, the at least one RZ consists of RZ1 and RZ2. Then, the resource master control node stores the N copies into at least one node corresponding to the at least one RZ according to the data write request, the first data distribution policy, and the second data distribution policy.
  • the foregoing first data distribution policy and the second data distribution policy may be pre-configured to determine a distribution of the N replicas.
  • The first data distribution policy is used to indicate a distribution priority of the N copies in the at least one RZ, and the second data distribution policy is used to indicate a distribution priority of the N copies among the plurality of nodes corresponding to each RZ in the at least one RZ. Therefore, the data placement decision of the resource master node is divided into the following two stages:
  • The foregoing two stages may independently apply different policies according to different needs of the tenant or different application scenarios of the tenant, and be combined to produce the expected data distribution result, without having to pre-configure a dedicated data distribution policy for the data distribution result corresponding to each application scenario.
  • The nodes that the tenant can use are divided into at least one resource zone RZ, and the first data distribution policy of the at least one RZ and the second data distribution policy of the nodes corresponding to the at least one RZ are configured separately, so that the resource master control node can make a two-stage decision. In the first stage, the distribution of the data copies in the at least one RZ is determined according to the first data distribution policy; in the second stage, based on the result of the first stage and combined with the second data distribution policy, the distribution of the data copies on specific nodes is determined.
  • In the data storage method of the embodiment of the present application, the nodes that a tenant can use are divided into at least one resource zone RZ, and the first data distribution policy of the at least one RZ and the second data distribution policy of the nodes corresponding to the at least one RZ are configured separately. When storing data, the resource control node can make a two-stage decision according to the first data distribution policy and the second data distribution policy, and the policies of the two stages can be configured independently, so the resource master control node can combine the data distribution policies of different stages to flexibly control, according to the different needs of the tenant and the scenario in which the tenant is located, the distribution of the data that the tenant needs to store among the nodes, which reduces the complexity of policy deployment.
  • It should be understood that the resource sharing policy and the data distribution policy coordinate with and constrain each other. Since the at least one RZ that the first tenant can use may have different resource sharing policies, different data distribution policies may be adopted according to the different needs of the tenant to store data and achieve different effects. The advantage of introducing the two-stage decision is that the two stages can independently apply different policies and be combined to produce the desired effect; otherwise, a specific policy implementation would be required for each combination.
  • The at least one RZ includes a first RZ and a second RZ, where the first RZ is a reserved resource zone RRZ that only the first tenant is allowed to use, and the second RZ is a shared resource zone SRZ that a plurality of tenants including the first tenant are allowed to use.
  • the at least one RZ that the first tenant can use may include a first RZ that allows only the first tenant to be used and a second RZ that is allowed to be used by the plurality of tenants including the first tenant.
  • In the above example, the first RZ is RZ2 and the second RZ is RZ1, and the N copies can be placed in the nodes corresponding to RZ1 and RZ2, respectively. For example, with N = 3, in the first stage the resource master node determines to place 2 copies in RZ2 and the remaining 1 copy in RZ1; in the second stage, the resource master node may preferentially select two nodes with more remaining space in RZ2 to place the above two copies, and preferentially select one node with more remaining space in RZ1 to place the above one copy.
  • The first data distribution policy may be to store copies preferentially in the first RZ, or to store a specified number of copies in the second RZ, which is not limited in this embodiment of the present application; the second data distribution policy may be an equal-probability distribution policy, a policy that weights nodes by their remaining space, or another policy customized for a specific scenario, which is also not limited in this embodiment of the present application. Therefore, under combinations of different first data distribution policies and second data distribution policies, various expected effects can be achieved, as sketched below.
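  • For illustration, two possible stage-2 (within-RZ) node-selection policies mentioned above are sketched in Python; the weighting-by-remaining-space scheme shown is an assumption made for this sketch, not the patent's prescribed method.

```python
# Hedged sketch of two possible stage-2 node selection policies: equal-
# probability choice and a choice weighted by remaining space.
import random


def pick_nodes_equal(nodes, count, rng=random):
    """Equal-probability policy: every node is equally likely to be chosen."""
    return rng.sample(nodes, count)


def pick_nodes_by_free_space(free_space, count, rng=random):
    """Weighted policy: nodes with more remaining space are more likely chosen."""
    chosen = []
    pool = dict(free_space)                      # node -> remaining bytes
    for _ in range(count):
        nodes, weights = zip(*pool.items())
        node = rng.choices(nodes, weights=weights, k=1)[0]
        chosen.append(node)
        del pool[node]                           # at most one copy per node
    return chosen


free = {"nodeA": 500, "nodeB": 100, "nodeC": 400}
print(pick_nodes_by_free_space(free, 2))         # e.g. ['nodeA', 'nodeC']
```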
  • The first data distribution policy is to preferentially store the N copies in the first RZ, and determining, according to the data write request and the first data distribution policy, the distribution of the N copies in the at least one resource zone RZ includes: determining, according to the data write request, the first data distribution policy, and the space occupation status of the first RZ, that the first RZ can store P copies of the data to be written, where P is an integer greater than or equal to 1 and the space occupation status is used to indicate the occupied space or remaining space of the first RZ; if N is less than or equal to P, determining that the N copies are all distributed in the first RZ; and if N is greater than P, determining that P of the N copies are distributed in the first RZ and that the remaining N-P copies are distributed in the second RZ.
  • Specifically, the resource master control node may determine, according to the data write request, the first data distribution policy, and the space occupation status of the first RZ, that P copies of the data to be written can be stored in the first RZ. If N is less than or equal to P, the resource master node may store the N copies in the first RZ, thereby achieving the purpose of storing in the first RZ preferentially. If N is greater than P, the resource master node may store P copies of the data to be written in the first RZ and store the remaining N-P copies in the second RZ.
  • In this embodiment of the present application, a tenant's data can be stored in the tenant's RRZ (that is, the first RZ) as much as possible, minimizing the use of the SRZ (that is, the second RZ). From the tenant's cost perspective, the RRZ is usually a prepaid resource of the tenant while the SRZ is a pay-per-use, post-paid resource, so less SRZ usage means fewer additional costs. From the platform operation perspective, the RRZ space is usually reserved exclusively for the tenant, so improving RRZ utilization also improves the utilization of platform resources.
  • It should be understood that the foregoing space occupation status may be the space utilization ratio of the RZ, its remaining space, or the like, which is not limited in this embodiment of the present application.
  • Optionally, under this policy, the system may set a space utilization threshold or a remaining space threshold for the RRZ; only after the space utilization of the RRZ reaches the threshold can the first tenant use the storage resources of the SRZ. Therefore, the resource master control node may determine the number of copies of the to-be-written data that can be stored in the first RZ according to the data write request, the first data distribution policy, the space occupation status of the first RZ, and the space utilization threshold, as the sketch below illustrates; this is not limited in this embodiment of the present application.
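  • A minimal sketch of how the number P of copies that still fit in the RRZ could be derived from such a threshold, assuming a hypothetical uniform copy size and the "RRZ first" policy described above; the threshold value, copy size and helper names are illustrative only.
```python
def copies_that_fit_in_rrz(rrz_capacity_gb, rrz_used_gb, copy_size_gb,
                           utilization_threshold=0.9):
    """Return P: how many more copies the RRZ may take before it crosses
    the configured utilization threshold (assumed threshold-based policy)."""
    budget_gb = rrz_capacity_gb * utilization_threshold - rrz_used_gb
    return max(0, int(budget_gb // copy_size_gb))

def distribute_rrz_first(n_copies, p_fit):
    """RRZ-first policy: put as many copies as possible in the RRZ (first RZ),
    spill the remainder into the SRZ (second RZ)."""
    in_rrz = min(n_copies, p_fit)
    return in_rrz, n_copies - in_rrz

p = copies_that_fit_in_rrz(rrz_capacity_gb=1000, rrz_used_gb=880, copy_size_gb=10)
print(distribute_rrz_first(3, p))  # with these numbers: (2, 1)
```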
  • In a case where the first data distribution policy is to store Q of the N copies in the second RZ, where Q is an integer greater than or equal to 1 and Q is less than or equal to N, the determining, according to the data write request and the first data distribution policy, the distribution of the N copies in the at least one resource zone RZ includes: determining, according to the data write request and the first data distribution policy, that Q of the N copies are distributed in the second RZ and that the remaining N-Q copies of the N copies other than the Q copies are distributed in the first RZ.
  • Specifically, the resource master control node may determine, according to the data write request and the first data distribution policy, to store Q copies of the to-be-written data in the second RZ and to store the remaining N-Q copies in the first RZ.
  • In this embodiment of the present application, the tenant is allowed to specify how many data copies are stored in different RZs. This policy applies to different scenarios, for example: (1) maximizing the aggregate data access bandwidth. The data is often accessed by computation running in the SRZ (that is, the second RZ); if most of the data copies are concentrated in the RRZ (that is, the first RZ), the data access bandwidth is limited by the number of RRZ nodes, which limits the parallelism of the computation. In this case, always keeping a certain number of data copies in the SRZ, regardless of the remaining space of the RRZ, is the better choice. (2) Data sharing between tenants, that is, the data will be shared with other tenants after it is generated. If the data copies are concentrated in the RRZ, accesses by other tenants will also occupy the input/output (I/O) resources of the RRZ and interfere with the performance of the tenant's own applications in the RRZ; in this case, placing some of the data copies in the SRZ avoids the interference with RRZ performance.
  • For example, tenant A requests a data write in which the number of copies of the to-be-written data is 3, the expected data distribution policy is RRZ-first, and the policy is configured so that the SRZ is used once the space utilization of the RRZ reaches 90%. The data write request is sent from tenant A's client to the NameNode on the server side, where the NameNode is the resource master control node described above. The NameNode selects 3 nodes for the tenant to store the different copies. Because the RRZ space utilization is below 90%, the NameNode selects three nodes in the RRZ, Node A, Node B, and Node C, and informs the client, and the client sends the data write to these three nodes. After this write completes, the client continues by requesting to write another 3 copies; the new data write request is sent to the NameNode, which finds that the space utilization of the RRZ has reached 90%, so it selects three nodes in the SRZ, Node X, Node Y, and Node C, and determines to store the subsequent copies on them.
  • The determining, according to the data write request and the first data distribution policy, that the remaining N-Q copies of the N copies other than the Q copies are distributed in the first RZ includes: determining, according to the data write request, the first data distribution policy, and the space occupation status of the first RZ, that the first RZ can store P copies of the to-be-written data, where P is an integer greater than or equal to 1 and the space occupation status indicates the size of the occupied space or of the remaining space of the first RZ; in a case where N-Q is less than or equal to P, determining that the N-Q copies are distributed in the first RZ; and in a case where N-Q is greater than P, determining that P of the N-Q copies are distributed in the first RZ and that the remaining copies of the N-Q copies other than the P copies are distributed in the second RZ.
  • Specifically, in the case where the first data distribution policy is to store Q of the N copies in the second RZ, the remaining N-Q copies need to be placed in the first RZ according to that policy. However, the storage of the first RZ is limited and may not be able to hold them all, so the resource master control node needs to determine the distribution of the remaining N-Q copies according to the space occupation status of the first RZ.
  • The resource master control node may first determine, according to the data write request, the first data distribution policy, and the space occupation status of the first RZ, that the first RZ can store P copies of the to-be-written data. If N-Q is less than or equal to P, the resource master control node may store all of the N-Q copies in the first RZ; if N-Q is greater than P, the resource master control node may store P copies of the to-be-written data in the first RZ and store the remaining N-Q-P copies in the second RZ, as the sketch below illustrates.
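  • The following sketch combines the "always keep Q copies in the SRZ" policy with the capacity check P described above; the values of Q and P and the helper names are assumptions used only to illustrate the order of the decision.
```python
def distribute_fixed_srz_quota(n_copies, q_in_srz, p_fit_in_rrz):
    """Policy: Q copies always go to the SRZ; the remaining N-Q copies go to
    the RRZ if it has room (P), otherwise they spill back into the SRZ."""
    assert 1 <= q_in_srz <= n_copies
    remaining = n_copies - q_in_srz
    in_rrz = min(remaining, p_fit_in_rrz)
    return {"RRZ": in_rrz, "SRZ": q_in_srz + (remaining - in_rrz)}

print(distribute_fixed_srz_quota(n_copies=3, q_in_srz=1, p_fit_in_rrz=5))
# {'RRZ': 2, 'SRZ': 1}  -> enough RRZ space for the remaining copies
print(distribute_fixed_srz_quota(n_copies=5, q_in_srz=2, p_fit_in_rrz=1))
# {'RRZ': 1, 'SRZ': 4}  -> RRZ can only take one, the rest stay in the SRZ
```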
  • The method further includes: storing all or some of the copies in the second RZ into the first RZ according to the space occupation status of the first RZ, where the space occupation status is used to indicate the size of the occupied space or of the remaining space of the first RZ; and deleting those copies from the second RZ.
  • Specifically, the resource master control node may also delete stored data according to the tenant's needs; when that happens, the free space of the first RZ increases, and the resource master control node can relocate copies in the second RZ back to the first RZ. It should be understood that the sizes of copies of different data are generally different, so the resource master control node needs to determine, according to the space occupation status of the first RZ, the amount of data that can be relocated from the second RZ to the first RZ.
  • Optionally, a space utilization threshold may be set; when the space utilization of the first RZ is below this threshold, the resource master control node may relocate copies from the second RZ to the first RZ, as the sketch below illustrates.
  • In this way, the utilization of the RRZ can be improved. Because the RRZ is reserved exclusively for the tenant, improving RRZ utilization improves the resource utilization of the big data system as a whole.
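  • A sketch of the relocation step: when RRZ utilization falls below a threshold (for example after the tenant deletes data), copies are selected to be moved back from the SRZ and subsequently removed from the SRZ. The copy sizes, the threshold, and the greedy selection are all assumptions for illustration.
```python
def plan_relocation(srz_copies, rrz_capacity_gb, rrz_used_gb,
                    utilization_threshold=0.9):
    """Pick which copies can be moved from the SRZ back into the RRZ without
    pushing RRZ utilization above the threshold (greedy, size-aware sketch)."""
    budget_gb = rrz_capacity_gb * utilization_threshold - rrz_used_gb
    to_move = []
    for copy in sorted(srz_copies, key=lambda c: c["size_gb"]):
        if copy["size_gb"] <= budget_gb:
            to_move.append(copy)
            budget_gb -= copy["size_gb"]
    return to_move

srz_copies = [{"id": "blk-1", "size_gb": 40}, {"id": "blk-2", "size_gb": 25},
              {"id": "blk-3", "size_gb": 60}]
moved = plan_relocation(srz_copies, rrz_capacity_gb=1000, rrz_used_gb=830)
print([c["id"] for c in moved])  # ['blk-2', 'blk-1'] -> then delete them from the SRZ
```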
  • Before the receiving of the data write request sent by the first tenant through the client, the method further includes: receiving a resource zone creation request, where the resource zone creation request is used to request creation of a third RZ in the at least one RZ for the first tenant; creating the third RZ according to the resource zone creation request and determining a plurality of first nodes corresponding to the third RZ; adding first label information to each of the plurality of first nodes, where the first label information is used to identify the third RZ; and adding a first resource sharing policy for the third RZ, where the first resource sharing policy is used to indicate that the third RZ can be accessed by at least one tenant including the first tenant.
  • Specifically, the resource master control node may receive the resource zone creation request and thereby create the third RZ for the first tenant, the third RZ belonging to the at least one RZ that the first tenant can use. The resource master control node needs to determine the plurality of first nodes corresponding to the third RZ and add, to each of these first nodes, first label information that identifies the third RZ. The resource master control node also needs to add a first resource sharing policy for the third RZ, which indicates that the third RZ can be accessed by at least one tenant including the first tenant.
  • It should be understood that the foregoing label information is stored in the database of the operation and maintenance (OM) software. To avoid the storage system depending on the OM during use, the label information is usually synchronized from the OM system into the storage system (for example, HDFS) itself, so the label information forms different storage partitions in the storage system, corresponding to the RZs. Based on this label information, the resource master control node can determine the specific placement nodes of the copies according to the foregoing data distribution policies.
  • The method further includes: receiving a resource zone deletion request, where the resource zone deletion request is used to request deletion of a fourth RZ in the at least one RZ; deleting, according to the resource zone deletion request, the copies stored in a plurality of second nodes corresponding to the fourth RZ; deleting second label information of each of the plurality of second nodes, where the second label information is used to identify the fourth RZ; and deleting a second resource sharing policy of the fourth RZ, where the second resource sharing policy is used to indicate that the fourth RZ can be accessed by at least one tenant including the first tenant.
  • Specifically, the resource master control node may receive the resource zone deletion request and determine to delete the fourth RZ in the at least one RZ. The resource master control node may delete the data copies stored in the plurality of second nodes corresponding to the fourth RZ, and then delete the second label information of each of those second nodes as well as the second resource sharing policy of the fourth RZ.
  • The method further includes: receiving a resource zone expansion request, where the resource zone expansion request is used to request expansion of a fifth RZ in the at least one RZ; determining at least one third node according to the resource zone expansion request; and adding third label information to each of the at least one third node, where the third label information is used to identify the fifth RZ.
  • The method further includes: receiving a resource zone reduction request, where the resource zone reduction request is used to request reduction of a sixth RZ in the at least one RZ; determining, according to the resource zone reduction request, at least one fourth node corresponding to the sixth RZ; and deleting fourth label information of each of the at least one fourth node, where the fourth label information is used to identify the sixth RZ.
  • It should be understood that management operations on the foregoing resource zones, such as creation, deletion, expansion, and reduction, can be completed by the OM system, which is usually operated by a platform administrator. A special case is the cloud scenario, in which the tenant itself (possibly the tenant's own administrator) performs the management and maintenance of its RZs in a self-service manner through the OM system, as the sketch below illustrates; this is not limited in this embodiment of the present application.
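  • The RZ management operations above (create, delete, expand, reduce) all come down to maintaining node label information and a per-RZ sharing policy. The following sketch keeps both in plain dictionaries; in a real deployment this state would live in the OM database and be synchronized to the storage system, and every name here is hypothetical.
```python
class ResourceZoneRegistry:
    """Toy registry for RZ label information and sharing policies (sketch only)."""

    def __init__(self):
        self.node_labels = {}     # node name -> RZ label
        self.sharing_policy = {}  # RZ label  -> policy descriptor

    def create_rz(self, rz_label, nodes, policy):
        for node in nodes:                      # add label info to each node
            self.node_labels[node] = rz_label
        self.sharing_policy[rz_label] = policy  # attach the resource sharing policy

    def expand_rz(self, rz_label, new_nodes):
        for node in new_nodes:
            self.node_labels[node] = rz_label

    def shrink_rz(self, rz_label, removed_nodes):
        for node in removed_nodes:
            if self.node_labels.get(node) == rz_label:
                del self.node_labels[node]      # drop the label from the node

    def delete_rz(self, rz_label):
        # In the method above, the copies stored in the RZ are deleted first.
        self.node_labels = {n: l for n, l in self.node_labels.items() if l != rz_label}
        self.sharing_policy.pop(rz_label, None)

registry = ResourceZoneRegistry()
registry.create_rz("RZ3", ["node-1", "node-2"], {"tenants": ["tenant-A"]})
registry.expand_rz("RZ3", ["node-3"])
registry.shrink_rz("RZ3", ["node-1"])
print(registry.node_labels)  # {'node-2': 'RZ3', 'node-3': 'RZ3'}
```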
  • FIG. 4 is a schematic diagram of another system architecture 400 provided by an embodiment of the present application.
  • the system architecture 400 includes three tenants (tenant A, tenant B, and tenant C), three resource zones (RZ), and nodes corresponding to the three RZs.
  • Each of the above three RZs has its own resource sharing policy for indicating which tenants the respective node resources can be used by.
  • the foregoing three RZs may include a first RZ, a second RZ, and a third RZ.
  • Each tenant has different usage rights under a preset resource sharing policy. For example, the first RZ can be used by all tenants, the second RZ can only be used by tenant B, and the third RZ can only be used by tenant C; or, as another example, the first RZ can be used by tenant A and tenant B, the second RZ can be used by tenant B and tenant C, and the third RZ can only be used by tenant B.
  • This embodiment of the present application does not limit this.
  • The foregoing three RZs include a first reserved resource zone (RRZ), a second RRZ, and a shared resource zone (SRZ), where the SRZ can be used by all tenants, the first RRZ can only be used by tenant A, and the second RRZ can only be used by tenant B. Each tenant can therefore run its computing tasks in the RZs for which it has computing authority.
  • the system architecture 400 reflects the correspondence between the resource area and the tenant of the computing system in the big data platform, and is used to implement various types of computing operations of the tenant.
  • node corresponding to the RZ may be a physical machine, and may be a virtual machine or a container, which is not limited in this embodiment of the present application.
  • FIG. 4 exemplarily shows three tenants and three RZs.
  • the system architecture 400 may further include multiple tenants and multiple RZs corresponding to the multiple tenants respectively. The embodiment does not limit this, and in general, there is only one SRZ.
  • FIG. 5 is a schematic flowchart of a task assignment method provided by an embodiment of the present application.
  • the task assignment method 500 can be applied to the application scenario 100 shown in FIG. 1 and the system architecture 400 shown in FIG. 4, but the embodiment of the present application is not limited thereto.
  • S510 Receive a computing task allocation request sent by the first node, where the computing task allocation request is used to request to allocate a computing task to the first node.
  • the method 500 can be performed by the resource master control node 120 in the application scenario 100, and the first node can be the compute node 130 in the application scenario 100.
  • Specifically, the first node sends a computing task allocation request to the resource master control node, that is, it requests a task from the resource master control node. The sharing policy mentioned above is used to indicate to which tenants a node can provide computing resources, and the borrowing policy is used to indicate which other nodes' computing resources a tenant is willing to use when its own node resources are insufficient. These policies are usually configured in advance and stored in the database of the big data system operation and maintenance (OM) software; they are generally configured by the system administrator and/or the tenants through the OM software.
  • In this embodiment of the present application, a node is a resource provider and a tenant is a resource user. A node's sharing policy only expresses how the resource provider shares its own resources and does not care about the specific resource users; a tenant's borrowing policy only expresses how the resource user borrows the available shared resources and does not care about the specific resource providers. This decouples the resource sharing and borrowing mechanisms.
  • The first benefit of such decoupling is that resource providers and users do not need to build a global view of resource planning; each only needs to describe its own sharing or borrowing policy. Compared with current mainstream practice, there is no need for someone to plan resources comprehensively in order to set resource proportions that meet expectations, which matters especially when the number of tenants is large. The second benefit is that, from the perspective of responsibility and authority, the decoupled representation is more convenient for tenants' self-service configuration; for example, a resource provider can unilaterally adjust its sharing policy, and a resource user its borrowing policy, without requiring any settings from the other side.
  • Therefore, in this embodiment, the resource master control node flexibly matches the computing nodes with the computing tasks submitted by the tenants, according to the computing nodes' sharing policies for computing resources and the tenants' borrowing policies for computing resources in the big data system, and thereby assigns each computing node a computing task that satisfies the policies (see the sketch after this paragraph). This decouples the resource sharing and borrowing mechanisms, is simple to implement, and improves the user experience.
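  • A minimal sketch of the decoupled matching described above: the node only declares which tenants it will serve (sharing policy), each tenant only declares which nodes it is willing to borrow (borrowing policy), and the master control node simply intersects the two when a node asks for work. All structures and names are illustrative assumptions.
```python
def eligible_tasks(node, tasks, sharing_policy, borrowing_policy):
    """Return the tasks whose tenant is allowed by the node's sharing policy
    AND whose borrowing policy allows this node."""
    allowed_tenants = sharing_policy[node]              # node -> tenants it serves
    return [t for t in tasks
            if t["tenant"] in allowed_tenants
            and node in borrowing_policy[t["tenant"]]]  # tenant -> nodes it may borrow

tasks = [{"id": "job-1", "tenant": "A"}, {"id": "job-2", "tenant": "B"},
         {"id": "job-3", "tenant": "C"}]
sharing_policy = {"node-7": {"A", "B"}}                 # node-7 serves tenants A and B
borrowing_policy = {"A": {"node-7"}, "B": set(), "C": {"node-7"}}

print([t["id"] for t in eligible_tasks("node-7", tasks, sharing_policy, borrowing_policy)])
# ['job-1'] -> job-2 is excluded by B's borrowing policy, job-3 by the sharing policy
```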
  • In a possible implementation, the allocating, from the computing tasks of the at least one tenant, a first computing task to the first node according to the computing task allocation request, the sharing policy of the first node, and the borrowing policy of each of the at least one tenant includes: matching, according to the computing task allocation request, the computing tasks of the at least one tenant against the sharing policy and the borrowing policies; filtering out, from the computing tasks of the at least one tenant, the computing tasks of m tenants that do not satisfy the sharing policy and the borrowing policies, where m is an integer greater than or equal to 1; and determining the first computing task from the remaining computing tasks other than the computing tasks of the m tenants.
  • Specifically, the resource master control node may match the at least one computing task in the system against the first node according to the foregoing sharing policy and borrowing policies, filter out the computing tasks that do not satisfy them, and thereby determine the first computing task to be assigned to the first node.
  • In a possible implementation, the computing task allocation request includes identification information of the first node, and the filtering out, from the computing tasks of the at least one tenant, of the computing tasks of the m tenants that do not satisfy the sharing policy and the borrowing policies includes: filtering out the computing tasks of p first tenants according to the identification information of the first node and the sharing policy, where the p first tenants do not belong to the i tenants and p is an integer greater than or equal to 0; and filtering out, according to the identification information of the first node and the borrowing policies, the computing tasks of m-p second tenants from the computing tasks of the remaining tenants other than the p first tenants, where the first node does not belong to the j nodes of those second tenants.
  • Optionally, the at least one tenant is M tenants, M being an integer greater than 0, and the filtering out, from the computing tasks of the at least one tenant, of the computing tasks of the m tenants that do not satisfy the sharing policy and the borrowing policies includes: filtering out the computing tasks of p tenants from the computing tasks of the M tenants according to the identification information of the first node and the sharing policy; filtering out the computing tasks of q tenants from the computing tasks of the M tenants according to the identification information of the first node and the borrowing policies; and taking the intersection of the computing tasks of the remaining M-p tenants and the computing tasks of the remaining M-q tenants.
  • It should be understood that the foregoing steps of filtering by the sharing policy and filtering by the borrowing policies have no required order and may even be performed at the same time, which is not limited in this embodiment of the present application. In addition, the p tenants and the q tenants may include the same tenant, but this does not affect the final filtering result.
  • For example, suppose that, based on the identification information of the first node and the foregoing sharing policy, the computing tasks of tenant 1 and tenant 2 are filtered out, leaving the computing tasks of tenant 3, tenant 4, and tenant 5; and that, based on the identification information and the foregoing borrowing policies, the computing tasks of tenant 2 and tenant 3 are filtered out, leaving the computing tasks of tenant 1, tenant 4, and tenant 5. Taking the intersection of the two remaining groups yields the computing tasks of tenant 4 and tenant 5.
  • It should be understood that the resource master control node may filter the computing tasks that do not satisfy the foregoing sharing policy and borrowing policies in different orders: it may filter by the sharing policy first and then by the borrowing policies, filter by the borrowing policies first and then by the sharing policy, or filter by both separately and finally take the intersection of the two filtering results, as the sketch below illustrates; this is not limited in this embodiment of the present application.
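  • The filtering order does not change the outcome, as the following sketch shows using the tenant-1..5 example above; the helper names and the "blocked" sets are hypothetical.
```python
def filter_by_sharing(tasks, blocked_by_sharing):
    return [t for t in tasks if t not in blocked_by_sharing]

def filter_by_borrowing(tasks, blocked_by_borrowing):
    return [t for t in tasks if t not in blocked_by_borrowing]

tasks = ["tenant-1", "tenant-2", "tenant-3", "tenant-4", "tenant-5"]
blocked_by_sharing = {"tenant-1", "tenant-2"}    # filtered by the node's sharing policy
blocked_by_borrowing = {"tenant-2", "tenant-3"}  # filtered by the tenants' borrowing policies

sequential = filter_by_borrowing(filter_by_sharing(tasks, blocked_by_sharing),
                                 blocked_by_borrowing)
intersect = sorted(set(filter_by_sharing(tasks, blocked_by_sharing))
                   & set(filter_by_borrowing(tasks, blocked_by_borrowing)))
print(sequential)  # ['tenant-4', 'tenant-5']
print(intersect)   # ['tenant-4', 'tenant-5'] -- same result either way
```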
  • In a possible implementation, the first node is a node in a first resource zone RZ, the nodes included in the first resource zone have the same sharing policy, and this common sharing policy is the sharing policy of the first resource zone.
  • Specifically, the nodes in the system may be divided into a plurality of resource zones RZ, including reserved resource zones RRZ and a shared resource zone SRZ. The first node may correspond to the first RZ, and the first RZ may be any one of the first RRZ, the second RRZ, and the SRZ in the system architecture 400.
  • In this case, the sharing policy of an RZ is the sharing policy of every node in that RZ; the resource provider is the RZ, and the resource users are the tenants and their computing tasks. For an RRZ, the RRZ belongs to a specific tenant, so from this perspective a tenant may be both a resource provider and a resource borrower. An RZ should include only nodes with the same sharing policy, and this common sharing policy is the sharing policy of the RZ. According to the sharing policy of an RZ, the tenants that have usage rights on that RZ can be determined. The usage rights may cover both storage resources and computing resources, thereby integrating the storage system and the computing system, that is, storage resources and computing resources are considered together.
  • In a possible implementation, the sharing policy is any one of the following: a strict reservation policy, an idle-time sharing policy, and a fair sharing policy. The strict reservation policy is used to indicate that only the computing tasks of the i tenants are allowed to use the computing resources of the first node; the idle-time sharing policy is used to indicate that tenants other than the i tenants are allowed to use the computing resources of the first node only when the first node is idle; and the fair sharing policy is used to indicate that the at least one tenant is allowed to use the computing resources of the first node fairly.
  • Specifically, the strict reservation policy, the idle-time sharing policy, and the fair sharing policy may be sharing policies of a node, or may be sharing policies of an RZ. The resource master control node distinguishes the RZs that a tenant can use, in particular the RRZs and the SRZ, according to the sharing policy of each RZ.
  • The strict reservation policy strictly reserves resources: under this policy, the resources in the RZ are only allowed to be used by the tenant to which the RZ belongs, and other tenants are not allowed to use them even when they are idle. Under the idle-time sharing policy, the resources of the RZ are reserved for the tenant to which the RZ belongs but are allowed to be borrowed temporarily when they are idle; the owning tenant can preempt them with the highest priority, and 100% of the weight of the RZ's resources is guaranteed to the owning tenant. The fair sharing policy is multi-tenant resource sharing: under this policy, the RZ allows multiple tenants to use its resources fairly with agreed weights. Based on these different policies, RZs of different natures can be generated; for example, an RZ with the fair sharing policy is an SRZ, and an RZ with the strict reservation policy is an RRZ. A schematic reading of the three policies is sketched below.
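  • The three sharing policies can be thought of as one predicate per RZ that answers whether a tenant may use the RZ's resources at a given moment. The sketch below is only one possible reading; the idleness signal, the owner set and the names are assumptions, and fair-share weights are left to the scheduler.
```python
from enum import Enum

class SharingPolicy(Enum):
    STRICT_RESERVATION = 1   # only the owning tenants, even when the RZ is idle
    SHARE_WHEN_IDLE = 2      # owners always; other tenants only while the RZ is idle
    FAIR_SHARE = 3           # all tenants, with agreed weights

def may_use(policy, tenant, owners, rz_is_idle):
    if policy is SharingPolicy.STRICT_RESERVATION:
        return tenant in owners
    if policy is SharingPolicy.SHARE_WHEN_IDLE:
        return tenant in owners or rz_is_idle
    return True  # FAIR_SHARE: allowed; the scheduler enforces the weights

print(may_use(SharingPolicy.STRICT_RESERVATION, "B", owners={"A"}, rz_is_idle=True))  # False
print(may_use(SharingPolicy.SHARE_WHEN_IDLE, "B", owners={"A"}, rz_is_idle=True))     # True
```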
  • In a possible implementation, a third tenant of the at least one tenant is initially configured with at least one third node, and the borrowing policy of the third tenant includes: in a case where the number of nodes usable by the third tenant among the at least one third node is less than a first threshold, the third tenant is allowed to borrow the computing resources of the first node; and/or, in a case where the number of nodes that the third tenant has already borrowed is greater than a second threshold, the third tenant is not allowed to borrow the computing resources of the first node; where the at least one third node does not include the first node.
  • Specifically, a tenant's borrowing policy can be configured by the tenant and stored in the database. A tenant generally has its own node resources, that is, the system initially configures some node resources to serve that tenant. For example, tenant A in the system architecture 400 corresponds to the first RRZ, and tenant A can use the resources in the first RRZ. If the resources of the first RRZ are not enough, tenant A needs to borrow resources, and in this case tenant A can set its own borrowing policy. The borrowing policy may be that borrowing is allowed when the resources available to tenant A are less than a first threshold; for example, with a first threshold of 0 tenant A never borrows shared resources, and with a sufficiently large first threshold tenant A always borrows shared resources. The borrowing policy may also be that tenant A is no longer allowed to borrow resources once the resources it has borrowed exceed a second threshold, or it may be another policy.
  • For example, suppose job A submitted by tenant A is running, and its expected policy is RRZ-first, falling back to SRZ resources when the RRZ cannot allocate resources within 1 minute. The first 100 tasks of the job, Task1 to Task100, run in the RRZ while Task101 waits for scheduling; after 1 minute the RRZ still has no free resources to run Task101, so Task101 is scheduled to run on the SRZ.
  • In a possible implementation, the borrowing policy further includes: the third tenant preferentially uses a fourth node, where the fourth node stores data corresponding to a computing task of the third tenant and belongs to the node resources of the third tenant. In other words, the computing location of a computing task can be optimized by setting the tenant's borrowing policy, that is, the computing task is more likely to be scheduled onto the node that stores the data corresponding to that task, which can improve system performance and data security. A sketch combining the thresholds and the data-locality preference follows.
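  • The following is one possible reading of such a borrowing policy: two thresholds gate whether borrowing happens at all, and a locality check prefers a node that already holds the task's data. The thresholds, the locality test and all names are assumptions for illustration.
```python
def may_borrow(free_own_nodes, already_borrowed, first_threshold, second_threshold):
    """Borrow only when the tenant's own free nodes drop below the first threshold
    and it has not yet borrowed more than the second threshold."""
    return free_own_nodes < first_threshold and already_borrowed <= second_threshold

def pick_node(candidates, data_nodes):
    """Prefer a candidate node that already stores the task's data (data locality)."""
    local = [n for n in candidates if n in data_nodes]
    return (local or candidates)[0]

print(may_borrow(free_own_nodes=0, already_borrowed=2,
                 first_threshold=1, second_threshold=5))       # True
print(pick_node(["node-3", "node-8"], data_nodes={"node-8"}))  # node-8 (data-local)
```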
  • FIG. 6 is a schematic flowchart of another task assignment method 600 provided by an embodiment of the present application.
  • The method 600 can also be applied to the application scenarios and system architectures described above, but the embodiment of the present application is not limited thereto.
  • In the method 600, the first node sends a heartbeat packet to the resource master control node to request a computing task; the resource master control node receives the heartbeat packet and sorts all computing tasks in the system according to the priority of their services; a preset execution condition is used to filter out computing tasks whose execution time is too long; and a first computing task is then determined among the remaining computing tasks and assigned to the first node.
  • The method 600 takes the Hadoop resource management (YARN) system as an example and embodies the process in which the resource master control node allocates tasks to the computing nodes in the system. In native YARN, task execution itself has no concept of priority and follows a first-in-first-out strategy. However, each task in fact has a different priority because of the service it belongs to, and task execution times in a Hadoop cluster can be long, which affects other tasks, especially higher-priority ones. Therefore, task execution in the system needs to be scheduled.
  • In the method 600, the two filtering steps S640 and S650 are added: by applying the first node's sharing policy for its computing resources and the system tenants' borrowing policies for computing resources, the first node is allocated a computing task that satisfies these policies. This decouples the resource sharing and borrowing mechanisms, is simple to implement, and improves the user experience.
  • In addition, the first computing task may finally be chosen arbitrarily among the remaining computing tasks, or the computing task with the highest priority may be selected as the first computing task according to the priority order of the remaining tasks, as the sketch below illustrates; this is not limited in this embodiment of the present application.
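  • Putting the steps of the method 600 together, a heartbeat-driven assignment loop might look like the sketch below. This is not YARN code: the priority sort, the execution-time filter, and the S640/S650 policy filters only mirror the flow as described here, and every name, field and threshold is hypothetical.
```python
def assign_task_on_heartbeat(node, pending_tasks, sharing_policy, borrowing_policy,
                             max_expected_runtime_s=3600):
    # Sort all pending tasks by the priority of their services (highest first).
    ordered = sorted(pending_tasks, key=lambda t: t["priority"], reverse=True)
    # Filter out tasks whose expected execution time is too long.
    ordered = [t for t in ordered if t["expected_runtime_s"] <= max_expected_runtime_s]
    # Filter by the node's sharing policy, then by the tenants' borrowing policies.
    ordered = [t for t in ordered if t["tenant"] in sharing_policy[node]]
    ordered = [t for t in ordered if node in borrowing_policy[t["tenant"]]]
    # Assign the highest-priority remaining task, if any.
    return ordered[0] if ordered else None

pending = [
    {"id": "job-A", "tenant": "A", "priority": 5, "expected_runtime_s": 600},
    {"id": "job-B", "tenant": "B", "priority": 9, "expected_runtime_s": 7200},
]
print(assign_task_on_heartbeat("node-1", pending,
                               sharing_policy={"node-1": {"A", "B"}},
                               borrowing_policy={"A": {"node-1"}, "B": {"node-1"}}))
# job-B is filtered out by the runtime limit; job-A is assigned.
```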
  • FIG. 7 is a schematic diagram of another system architecture 700 provided by an embodiment of the present application.
  • the data storage method 300, the task assignment method 500, and the task assignment method 600 can be applied to the system architecture 700, but the embodiment of the present application does not limit this.
  • the system architecture 700 includes three tenants (tenant A, tenant B, and tenant C), three resource zones (RZ), and nodes corresponding to the three RZs.
  • Each of the above three RZs has its own resource sharing policy for indicating which tenants the respective node resources can be used by.
  • the foregoing three RZs may include a first RZ, a second RZ, and a third RZ.
  • Each tenant has different usage rights under a preset resource sharing policy. For example, the first RZ can be used by all tenants, the second RZ can only be used by tenant B, and the third RZ can only be used by tenant C; or, as another example, the first RZ can be used by tenant A and tenant B, the second RZ can be used by tenant B and tenant C, and the third RZ can only be used by tenant B.
  • This embodiment of the present application does not limit this.
  • the foregoing three RZs include a first reserved resource zone (RRZ), a second RRZ, and a shared resource zone (SRZ), wherein the SRZ can be used by all tenants, and the first RRZ It can only be used by tenant A, and the second RRZ can only be used by tenant B.
  • the above resource area includes storage resources and computing resources, so the tenants can store data and/or run computing tasks in the corresponding RZ respectively.
  • An RZ can be divided into a computing RZ and a storage RZ. The computing RZ is responsible for computing resource scheduling, such as tenants' computing tasks and resident services; the storage RZ is responsible for storage resource scheduling, that is, the placement of tenants' data. The system architecture 200 embodies the storage RZ and the system architecture 400 embodies the computing RZ, but normally the computing RZ and the storage RZ need to overlap, that is, be allocated on the same group of nodes, to improve system performance and security.
  • The system architecture 700 shows the case where the computing RZ and the storage RZ are placed in an overlapping manner. In this way, the distribution of computing resources and storage resources can be considered jointly across the different systems, which enhances the flexibility of resource deployment.
  • node corresponding to the RZ may be a physical machine, and may be a virtual machine or a container, which is not limited in this embodiment of the present application.
  • FIG. 7 exemplarily shows three tenants and three RZs. The system architecture 700 may further include more tenants and more RZs corresponding to those tenants, which is not limited in this embodiment; in general, however, there is only one SRZ.
  • FIG. 8 is a schematic block diagram of a data storage device 800 provided by an embodiment of the present application.
  • the device 800 includes:
  • the receiving unit 810 is configured to receive a data write request sent by the first tenant through the client, where the data write request is used to indicate that the first tenant requests to store N copies of data to be written, where N is greater than or equal to An integer of 1;
  • a determining unit 820, configured to determine, according to the data write request and the first tenant's storage permission for each RZ of the plurality of resource zones RZ, at least one RZ that the first tenant can use from the plurality of RZs;
  • the determining unit 820 is further configured to: determine, according to the data writing request and the first data distribution policy, a distribution of the N copies in the at least one RZ, where the first data distribution policy is used to indicate the a distribution priority of N copies in the at least one RZ;
  • the storage unit 830 is configured to store the N copies into at least one node corresponding to the at least one RZ according to the distribution of the N copies in the at least one RZ and the second data distribution policy, where The second data distribution policy is used to indicate a distribution priority of the N copies among the plurality of nodes corresponding to each RZ in the at least one RZ.
  • The data storage apparatus of this embodiment of the present application divides the nodes that a tenant can use into at least one resource zone RZ and separately configures the first data distribution policy of the at least one RZ and the second data distribution policy of the nodes corresponding to the at least one RZ. When storing data, the resource master control node can make a two-stage decision according to the first data distribution policy and the second data distribution policy. Because the policies of the two stages can be configured independently, the resource master control node can combine the data distribution policies of the different stages and, according to the tenants' different needs and the scenarios they are in, flexibly control how the data that tenants need to store is distributed over the nodes, which reduces the complexity of policy deployment.
  • the at least one RZ includes a first RZ and a second RZ, where the first RZ is a reserved resource area RRZ that is only allowed to be used by the first tenant, and the second RZ is allowed to include the first The shared resource area SRZ used by multiple tenants of the tenant.
  • Optionally, the first data distribution policy is to preferentially store the N copies in the first RZ, and the determining unit 820 is specifically configured to: determine, according to the data write request, the first data distribution policy, and the space occupation status of the first RZ, that the first RZ can store P copies of the to-be-written data, where P is an integer greater than or equal to 1 and the space occupation status is used to indicate the size of the occupied space or of the remaining space of the first RZ; in a case where N is less than or equal to P, determine that the N copies are distributed in the first RZ; and in a case where N is greater than P, determine that P of the N copies are distributed in the first RZ and that the remaining copies of the N copies other than the P copies are distributed in the second RZ.
  • Optionally, the first data distribution policy is to store Q of the N copies in the second RZ, where Q is an integer greater than or equal to 1 and Q is less than or equal to N, and the determining unit 820 is specifically configured to: determine, according to the data write request and the first data distribution policy, that Q of the N copies are distributed in the second RZ and that the remaining N-Q copies of the N copies other than the Q copies are distributed in the first RZ.
  • Optionally, the determining unit 820 is specifically configured to: determine, according to the data write request, the first data distribution policy, and the space occupation status of the first RZ, that the first RZ can store P copies of the to-be-written data, where P is an integer greater than or equal to 1 and the space occupation status is used to indicate the size of the occupied space or of the remaining space of the first RZ; in a case where N-Q is less than or equal to P, determine that the N-Q copies are distributed in the first RZ; and in a case where N-Q is greater than P, determine that P of the N-Q copies are distributed in the first RZ and that the remaining copies of the N-Q copies other than the P copies are distributed in the second RZ.
  • Optionally, the storage unit 830 is further configured to store all or some of the copies in the second RZ into the first RZ according to the space occupation status of the first RZ, where the space occupation status is used to indicate the size of the occupied space or of the remaining space of the first RZ; and the apparatus further includes a deleting unit, configured to delete those copies from the second RZ.
  • the apparatus 800 herein is embodied in the form of a functional unit.
  • the term "unit” as used herein may refer to an application specific integrated circuit (ASIC), an electronic circuit, a processor (eg, a shared processor, a proprietary processor, or a group) for executing one or more software or firmware programs. Processors, etc.) and memory, merge logic, and/or other suitable components that support the described functionality.
  • The apparatus 800 may specifically be the resource master control node in the foregoing embodiment 300, and the apparatus 800 may be used to execute the processes and/or steps corresponding to the resource master control node in the foregoing method embodiment 300; to avoid repetition, details are not described here again.
  • FIG. 9 is a schematic block diagram of a task distribution apparatus 900 provided by an embodiment of the present application.
  • the apparatus 900 includes:
  • the receiving unit 910 is configured to receive a computing task allocation request sent by the first node, where the computing task allocation request is used to request to allocate a computing task to the first node;
  • the allocating unit 920 is configured to allocate, according to the computing task allocation request, the sharing policy of the first node, and the borrowing policy of at least one tenant, a first computing task to the first node from the computing tasks of the at least one tenant, where the sharing policy is used to indicate that the first node provides computing resources for the computing tasks of i tenants of the at least one tenant, the borrowing policy is used to indicate that a first tenant of the at least one tenant is allowed to use the computing resources of j nodes, and i and j are both integers greater than 0;
  • the sending unit 930 is configured to send task indication information to the first node, where the task indication information is used to indicate the first computing task.
  • The task allocation apparatus of this embodiment flexibly matches the computing nodes with the computing tasks submitted by the tenants, according to the computing nodes' sharing policies for computing resources and the tenants' borrowing policies for computing resources in the big data system, and thereby assigns each computing node a computing task that satisfies the policies. This decouples the resource sharing and borrowing mechanisms, is simple to implement, and improves the user experience.
  • Optionally, the apparatus further includes: a matching unit, configured to match, according to the computing task allocation request, the computing tasks of the at least one tenant against the sharing policy and the borrowing policies; a filtering unit, configured to filter out, from the computing tasks of the at least one tenant, the computing tasks of m tenants that do not satisfy the sharing policy and the borrowing policies, where m is an integer greater than or equal to 1; and a determining unit, configured to determine the first computing task from the remaining computing tasks other than the computing tasks of the m tenants.
  • Optionally, the computing task allocation request includes identification information of the first node, and the filtering unit is specifically configured to: filter out the computing tasks of p first tenants according to the identification information of the first node and the sharing policy, where the p first tenants do not belong to the i tenants and p is an integer greater than or equal to 0; and filter out, according to the identification information of the first node and the borrowing policies, the computing tasks of m-p second tenants from the computing tasks of the remaining tenants other than the p first tenants, where the first node does not belong to the j nodes of those second tenants.
  • Optionally, the first node is a node in a first resource zone RZ, the nodes included in the first resource zone have the same sharing policy, and this common sharing policy is the sharing policy of the first resource zone.
  • Optionally, the sharing policy is any one of the following: a strict reservation policy, an idle-time sharing policy, and a fair sharing policy, where the strict reservation policy is used to indicate that only the computing tasks of the i tenants are allowed to use the computing resources of the first node, the idle-time sharing policy is used to indicate that tenants other than the i tenants are allowed to use the computing resources of the first node only when the first node is idle, and the fair sharing policy is used to indicate that the at least one tenant is allowed to use the computing resources of the first node fairly.
  • Optionally, a third tenant of the at least one tenant is initially configured with at least one third node, and the borrowing policy of the third tenant includes: in a case where the number of nodes usable by the third tenant among the at least one third node is less than a first threshold, the third tenant is allowed to borrow the computing resources of the first node; and/or, in a case where the number of nodes that the third tenant has already borrowed is greater than a second threshold, the third tenant is not allowed to borrow the computing resources of the first node; where the at least one third node does not include the first node.
  • the borrowing policy further includes: the third tenant preferentially uses a fourth node, where the fourth node stores data corresponding to a computing task of the third tenant, and the fourth node belongs to the The node resource of the third tenant.
  • the apparatus 900 herein is embodied in the form of a functional unit.
  • the term "unit” as used herein may refer to an application specific integrated circuit (ASIC), an electronic circuit, a processor (eg, a shared processor, a proprietary processor, or a group) for executing one or more software or firmware programs. Processors, etc.) and memory, merge logic, and/or other suitable components that support the described functionality.
  • The apparatus 900 may specifically be the resource master control node in the foregoing embodiment 500 or 600, and the apparatus 900 may be used to execute the processes and/or steps corresponding to the resource master control node in the foregoing method embodiment 500 or 600; to avoid repetition, details are not described here again.
  • FIG. 10 is a schematic block diagram of another data storage device 1000 provided by an embodiment of the present application.
  • the apparatus 1000 includes a processor 1010, a transceiver 1020, and a memory 1030.
  • the processor 1010, the transceiver 1020, and the memory 1030 communicate with each other through an internal connection path.
  • the memory 1030 is configured to store instructions, and the processor 1010 is configured to execute instructions stored by the memory 1030 to control the transceiver 1020 to send signals and / or receive signals.
  • The transceiver 1020 is configured to receive a data write request sent by a first tenant through a client, where the data write request is used to indicate that the first tenant requests to store N copies of to-be-written data, N being an integer greater than or equal to 1. The processor 1010 is configured to: determine, according to the data write request and the first tenant's storage permission for each RZ of a plurality of resource zones RZ, at least one RZ that the first tenant can use from the plurality of RZs; determine, according to the data write request and a first data distribution policy, a distribution of the N copies in the at least one RZ, where the first data distribution policy is used to indicate a distribution priority of the N copies among the at least one RZ; and store, according to the distribution of the N copies in the at least one RZ and a second data distribution policy, the N copies into at least one node corresponding to the at least one RZ, where the second data distribution policy is used to indicate a distribution priority of the N copies among the plurality of nodes corresponding to each RZ in the at least one RZ.
  • the device 1000 may be specifically the resource master node in the foregoing embodiment 300, and may be used to perform various steps and/or processes corresponding to the resource master node in the foregoing method embodiment 300.
  • the memory 1030 can include read only memory and random access memory and provides instructions and data to the processor. A portion of the memory may also include a non-volatile random access memory.
  • the memory can also store information of the device type.
  • The processor 1010 can be configured to execute the instructions stored in the memory, and when the processor 1010 executes those instructions, the processor 1010 is configured to perform the steps and/or processes corresponding to the resource master control node of the foregoing embodiment 300.
  • FIG. 11 is a schematic block diagram of another task distribution apparatus 1100 provided by an embodiment of the present application.
  • the apparatus 1100 includes a processor 1110, a transceiver 1120, and a memory 1130.
  • the processor 1110, the transceiver 1120, and the memory 1130 communicate with each other through an internal connection path.
  • the memory 1130 is configured to store an instruction
  • the processor 1110 is configured to execute an instruction stored by the memory 1130 to control the transceiver 1120 to send a signal and / or receive signals.
  • The transceiver 1120 is configured to receive a computing task allocation request sent by the first node, where the computing task allocation request is used to request that a computing task be allocated to the first node. The processor 1110 is configured to allocate, according to the computing task allocation request, the sharing policy of the first node, and the borrowing policy of at least one tenant, a first computing task to the first node from the computing tasks of the at least one tenant, where the sharing policy is used to indicate that the first node provides computing resources for the computing tasks of i tenants of the at least one tenant, the borrowing policy is used to indicate that a first tenant of the at least one tenant is allowed to use the computing resources of j nodes, and i and j are both integers greater than 0. The transceiver 1120 is further configured to send task indication information to the first node, where the task indication information is used to indicate the first computing task.
  • the device 1100 may be specifically the resource master node in the foregoing embodiment 500 or 600, and may be used to perform various steps and/or processes corresponding to the resource master node of the foregoing method embodiment 500 or 600.
  • the memory 1130 can include read only memory and random access memory and provides instructions and data to the processor. A portion of the memory may also include a non-volatile random access memory.
  • the memory can also store information of the device type.
  • The processor 1110 can be configured to execute the instructions stored in the memory, and when the processor 1110 executes those instructions, the processor 1110 is configured to perform the steps and/or processes corresponding to the resource master control node of the foregoing embodiment 500 or 600.
  • It should be understood that the resource master control node may be any device that has the foregoing data storage function and/or task allocation function; that is, the resource master control node may be used only to perform the foregoing data storage method, only to perform the foregoing task allocation method, or to perform both the foregoing data storage method and the foregoing task allocation method, which is not limited in this embodiment of the present application.
  • It should also be understood that the processor of the foregoing apparatus may be a central processing unit (CPU), or may be another general-purpose processor, a digital signal processor (DSP), or the like; the general-purpose processor may be a microprocessor, or the processor may be any conventional processor or the like.
  • each step of the above method may be completed by an integrated logic circuit of hardware in a processor or an instruction in a form of software.
  • the steps of the method disclosed in the embodiments of the present application may be directly implemented as a hardware processor, or may be performed by a combination of hardware and software units in the processor.
  • the software unit can be located in a conventional storage medium such as random access memory, flash memory, read only memory, programmable read only memory or electrically erasable programmable memory, registers, and the like.
  • the storage medium is located in a memory, and the processor executes instructions in the memory, in combination with hardware to perform the steps of the above method. To avoid repetition, it will not be described in detail here.
  • system and “network” are used interchangeably herein.
  • the term “and/or” in this context is merely an association describing the associated object, indicating that there may be three relationships, for example, A and / or B, which may indicate that A exists separately, and both A and B exist, respectively. B these three situations.
  • the character "/" in this article generally indicates that the contextual object is an "or" relationship.
  • B corresponding to A means that B is associated with A, and B can be determined according to A.
  • determining B from A does not mean that B is only determined based on A, and that B can also be determined based on A and/or other information.
  • the disclosed systems, devices, and methods may be implemented in other manners.
  • The apparatus embodiments described above are merely illustrative. For example, the division into units is only a logical function division, and multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or unit, or an electrical, mechanical or other form of connection.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the objectives of the embodiments of the present application.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each unit may exist physically separately, or two or more units may be integrated into one unit.
  • the above integrated unit can be implemented in the form of hardware or in the form of a software functional unit.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer readable storage medium.
  • The technical solutions of the present application, in essence or in the part contributing to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product stored in a storage medium, which includes a number of instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the various embodiments of the present application. The foregoing storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Human Computer Interaction (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

Embodiments of the present application provide a data storage method and apparatus. The method includes: receiving a data write request sent by a first tenant through a client; determining, from a plurality of resource zones (RZ) according to the data write request and the first tenant's storage permission for each of the plurality of RZs, at least one RZ that the first tenant can use; determining, according to the data write request and a first data distribution policy, a distribution of the N copies in the at least one RZ; and storing, according to the distribution of the N copies in the at least one RZ and a second data distribution policy, the N copies into at least one node corresponding to the at least one RZ. The data storage method and apparatus of the embodiments of the present application can, through combinations of different data distribution policies, flexibly control how the data that a tenant needs to store is distributed over the nodes, reducing the complexity of policy deployment.

Description

数据存储方法及装置
本申请要求于2017年03月29日提交中国专利局、申请号为201710198809.3、申请名称为“数据存储方法及装置”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请涉及大数据领域,并且更具体地,涉及一种数据存储方法及装置。
背景技术
多租户技术或称多重租赁技术,是一种软件架构技术,是实现如何在多用户环境下共用相同的系统或程序组件,并且可确保各用户间数据的隔离性。在当下云计算时代,多租户技术在共用的数据中心以单一系统架构与服务提供多数客户端相同甚至可定制化的服务,并且可以保障租户的数据隔离。目前各种各样的云计算服务就是这类技术范畴,例如阿里云数据库服务、阿里云服务器等等。
在多租户场景下,一个租户可以对应至少一个节点,由大数据系统统一进行管理,该至少一个节点即为该租户所拥有的资源,该租户可以利用该至少一个节点实现存储数据、运行计算作业等诉求。现有技术中,租户在需要存储数据时,资源控制节点直接根据预配置的数据分布策略,确定该租户的数据在各个节点的分布,例如,该租户请求存储10个数据副本,且该租户可用的节点为节点A、节点B以及节点C,资源控制节点为该租户确定的数据分布结果可以存在多种可能的情况,可以是5个副本分布于节点A,3个副本分布于节点2,2个副本分布于节点3,也可以是7个副本分布于节点2,3个副本分布于节点3。由于租户可能存在不同的需求,或租户处于不同的应用场景,需要不同的数据分布结果,因此,现有技术需要预先为每一个应用场景对应的数据分布结果配置数据分布策略,复杂度较高。
发明内容
有鉴于此,本申请实施例提供一种数据存储方法及装置,能够通过不同数据分布策略的组合,灵活控制租户需要存储的数据在节点的分布,降低了策略部署的复杂度。
第一方面,提供了一种数据存储方法,包括:接收第一租户通过客户端发送的数据写入请求,所述数据写入请求用于表示所述第一租户请求存储待写入数据的N个副本,N为大于或等于1的整数;根据所述数据写入请求以及所述第一租户对多个资源区RZ中的每个RZ的存储权限,从所述多个RZ中确定所述第一租户能够使用的至少一个RZ;根据所述数据写入请求以及第一数据分布策略,确定所述N个副本在所述至少一个RZ的分布,所述第一数据分布策略用于表示所述N个副本在所述至少一个RZ中的分布优先级;根据所述N个副本在所述至少一个RZ的分布以及第二数据分布策略,将所述N个副本分别存 储至所述至少一个RZ对应的至少一个节点中,所述第二数据分布策略用于表示所述N个副本在所述至少一个RZ中每个RZ对应的多个节点中的分布优先级。
具体地,当第一租户需要存储数据时,该第一租户可以通过对应的客户端向资源总控节点发送数据写入请求,该数据写入请求用于表示该第一租户请求存储待写入数据的N个副本。其中,该数据写入请求可以携带待写入数据的N个副本,也可以携带待写入数据的1个副本和请求存储的副本个数N,本申请实施例对此不作限定。该资源总控节点接收该数据写入请求,根据该数据写入请求以及该第一租户对多个RZ中的每个RZ的存储权限,从该多个RZ中确定出该第一租户能够使用的至少一个RZ。然后,该资源总控节点根据该数据写入请求、第一数据分布策略以及第二数据分布策略,将该N个副本分别存储至该至少一个RZ对应的至少一个节点中。
上述第一数据分布策略用于表示该N个副本在该至少一个RZ中的分布优先级,第二数据分布策略用于表示该N个副本在该至少一个RZ中每个RZ对应的多个节点中的分布优先级。因此,该资源总控节点的数据放置决策分为如下两个阶段:
(1)根据该数据写入请求以及第一数据分布策略,确定该N个副本在该至少一个RZ的分布;
应理解,这里的分布是指该N个副本与至少一个RZ的对应关系,例如,N=5,并且第一租户可以使用的至少一个RZ为RZ1和RZ2,那么根据第一数据分布策略,这5个副本在这两个RZ的分布可以为2个副本分布于RZ1,3个副本分布于RZ2。
(2)根据上述N个副本在至少一个RZ的分布以及第二数据分布策略,确定该N个副本分别分布于该至少一个RZ对应的至少一个节点中。
应理解,上述第一租户对多个资源区RZ中的每个RZ的存储权限是根据每个RZ的资源共享策略确定的,以RZ1为例,RZ1的资源共享策略用于表示该RZ1能够为哪些租户提供资源,不满足RZ1的资源共享策略的租户是没有RZ1的存储权限的。
还应理解,资源共享策略与数据分布策略是相互协调,相互制约的,由于该第一租户能够使用的至少一个RZ具有不同的资源共享策略,因此,可以根据租户的不同需求,采用不同的数据分布策略来存储数据,从而获得不同的效果。
在本申请实施例中,上述两个阶段可以分别根据租户的不同需求或租户所处的不同应用场景,独立应用不同的策略,并组合产生出预期的数据分布结果,不需要预先为每一个应用场景对应的数据分布结果配置数据分布策略。首先,将租户能够使用的节点划分成至少一个资源区RZ,并分别配置该至少一个RZ的第一数据分布策略以及该至少一个RZ对应的节点的第二数据分布策略,资源总控节点在对数据进行存储时,可以进行两阶段决策,在第一阶段根据第一数据分布策略确定数据副本在至少一个RZ的分布,第二阶段在第一阶段的基础之上再结合第二数据分布策略确定数据副本在具体节点的分布。
本申请实施例的数据存储方法,通过将租户能够使用的节点划分成至少一个资源区RZ,并分别配置该至少一个RZ的第一数据分布策略以及该至少一个RZ对应的节点的第二数据分布策略,资源总控节点在对数据进行存储时,可以根据该第一数据分布策略和第二数据分布策略进行两阶段决策,由于两个阶段的策略可以独立配置,使得资源总控节点能够对不同阶段的数据分布策略进行组合,根据租户的不同需求以及租户所处的场景,灵活控制租户需要存储的数据在节点的分布,降低了策略部署的复杂度。
在第一方面的第一种可能的实现方式中,所述至少一个RZ包括第一RZ和第二RZ,所述第一RZ为仅允许所述第一租户使用的保留资源区RRZ,所述第二RZ为允许包括所述第一租户的多个租户使用的共享资源区SRZ。
具体地,该第一租户能够使用的至少一个RZ可以包括仅允许第一租户使用的第一RZ和允许包括该第一租户的多个租户使用的第二RZ。在这种情况下,基于上述的数据分布策略,可以将该N个副本分别放置于第一RZ和第二RZ对应的节点中。例如,若N=3,即该待写入数据的副本个数为3,在第一阶段,该资源总控节点确定其中的2个副本放置于RZ2中,剩下的1个副本放置于RZ1中,在第二阶段,该资源总控节点可以在RZ2中优先选择剩余空间多的两个节点放置上述2个副本,在RZ1中优先选择剩余空间多的1个节点放置上述1个副本。
应理解,第一数据分布策略可以为在第一RZ中优先存储,也可以为总是在第二RZ中存储部分副本,本申请实施例对此不作限定;第二数据分布策略可以为等概率分布策略,也可以为考虑节点剩余空间进行概率分布的不同策略,还可以为一些根据特定场景定制的其它策略,本申请实施例对此也不作限定。因此,在不同的第一数据分布策略和第二数据分布策略的组合下,可以实现多种预期的效果。
结合第一方面的上述可能的实现方式,在第一方面的第二种可能的实现方式中,所述第一数据分布策略为将所述N个副本优先存储至所述第一RZ中,所述根据所述数据写入请求以及第一数据分布策略,确定所述N个副本在至少一个RZ的分布,包括:根据所述数据写入请求、所述第一数据分布策略以及所述第一RZ的空间占用状态,确定所述第一RZ中能够存储所述待写入数据的P个副本,P为大于或等于1的整数,所述空间占用状态用于表示所述第一RZ已被占用的空间大小或剩余的空间大小;在N小于或等于P的情况下,确定所述N个副本分布于所述第一RZ中;在N大于P的情况下,确定所述N个副本中的P个副本分布于所述第一RZ中,所述N个副本中除所述P个副本外的剩余副本分布于所述第二RZ中。
在本申请实施例中,租户的数据可以被尽可能存储在该租户的RRZ(即第一RZ)中,尽量减少对SRZ(即第二RZ)的使用。从租户成本角度而言,RRZ通常属于租户的预付费资源,SRZ属于按量付费的后付费资源,更少的SRZ使用意味着更少的额外费用产生。从平台运营角度而言,RRZ的空间通常为租户预留独占,RRZ的利用率提升也意味着平台资源利用率的提升。
应理解,上述空间占用状态可以为RZ的空间利用率、剩余空间等等,本申请实施例对此不作限定。可选地,在这种策略下,系统可以设置RRZ的空间利用阈值或剩余空间阈值,当该RRZ的空间利用率达到该阈值后,该第一租户才可以使用SRZ的存储资源。因此,该资源总控节点可以根据数据写入请求、第一数据分布策略、第一RZ的空间占用状态以及空间利用阈值,确定该第一RZ中能够存储的该待写入数据的副本个数,本申请实施例对此不作限定。
结合第一方面的上述可能的实现方式,在第一方面的第三种可能的实现方式中,所述第一数据分布策略为将所述N个副本中的Q个副本存储至所述第二RZ中,Q为大于或等于1的整数,且Q小于或等于N,所述根据所述数据写入请求以及第一数据分布策略,确定所述N个副本在至少一个RZ的分布,包括:根据所述数据写入请求以及所述第一数据 分布策略,确定所述N个副本中的Q个副本分布于所述第二RZ中,所述N个副本中除所述Q个副本外的剩余N-Q个副本分布于所述第一RZ中。
在本申请实施例中,允许租户指定在不同的RZ里的数据副本的存储个数。这个策略适用于不同的场景,例如,(1)出于最大数据访问叠加带宽的目的,该数据经常会被运行在SRZ(即第二RZ)的计算所访问,但若大部分数据的副本都集中在RRZ(即第一RZ)中,那么数据访问带宽会受到RRZ的节点个数的限制,从而限制了计算的并行能力,此时,不考虑RRZ的剩余空间,始终在SRZ中存放一定个数的数据副本是更好的选择;(2)租户间的数据共享,即该数据产生后会共享给其它的租户,如果数据副本集中在RRZ中,那么其它租户访问时也会占据该RRZ的输入输出I/O资源,从而对租户在RRZ中的自身应用造成性能干扰,此时,选择将部分数据副本放置在SRZ中,可以避免对RRZ性能的干扰。
结合第一方面的上述可能的实现方式,在第一方面的第四种可能的实现方式中,所述根据所述数据写入请求以及所述第一数据分布策略,确定所述N个副本中除所述Q个副本外的剩余N-Q个副本分布于所述第一RZ中,包括:根据所述数据写入请求、所述第一数据分布策略以及所述第一RZ的空间占用状态,确定所述第一RZ中能够存储所述待写入数据的P个副本,P为大于或等于1的整数,所述空间占用状态用于表示所述第一RZ已被占用的空间大小或剩余的空间大小;在N-Q小于或等于P的情况下,确定所述N-Q个副本分布于所述第一RZ中;在N-Q大于P的情况下,确定所述N-Q个副本中的P个副本分布于所述第一RZ中,所述N-Q个副本中除所述P个副本外的剩余副本分布于所述第二RZ中。
具体地,在上述第一数据分布策略为将该N个副本中的Q个副本存储至该第二RZ中的情况下,对于剩余的N-Q个副本,按照第一数据分布策略需要放置在第一RZ中,但是,该第一RZ的存储空间有限,可能存在放不下的情况,因此,该资源总控节点需要根据该第一RZ的空间占用状态,确定剩余的N-Q个副本的分布。该资源总控节点可以先根据数据写入请求、第一数据分布策略以及第一RZ的空间占用状态,确定该第一RZ中能够存储该待写入数据的P个副本,若N-Q小于或等于P,那么该资源总控节点可以确定将该N-Q个副本全部存储至该第一RZ中;若N-Q大于P,那么该资源总控节点可以将该待写入数据的P个副本存储至该第一RZ中,将剩余的N-Q-P个副本存储至第二RZ中。
结合第一方面的上述可能的实现方式,在第一方面的第五种可能的实现方式中,所述方法还包括:根据所述第一RZ的空间占用状态,将所述第二RZ中的全部或部分副本存储至所述第一RZ中,所述空间占用状态用于表示所述第一RZ已被占用的空间大小或剩余的空间大小;删除所述第二RZ中的所述全部或部分副本。
应理解,对于不同的数据,副本的大小一般是不相同的,该资源总控节点需要根据该第一RZ的空间占用状态,确定能够从该第二RZ搬迁至该第一RZ的数据量。可选地,可以设置空间利用阈值,当该第一RZ的空间利用率小于该空间利用阈值时,该资源总控节点可以将该第二RZ中的副本搬迁至该第一RZ中。
这样,能够提高RRZ的利用率,由于RRZ为租户预留独占,因此,从整体上来说,提高RRZ的利用率即提高了该大数据系统的资源利用率。
结合第一方面的上述可能的实现方式,在第一方面的第六种可能的实现方式中,在所 述接收第一租户通过客户端发送的数据写入请求之前,所述方法还包括:接收资源区创建请求,所述资源区创建请求用于请求为所述第一租户创建所述至少一个RZ中的第三RZ;根据所述资源区创建请求,创建所述第三RZ,并确定与所述第三RZ对应的多个第一节点;为所述多个第一节点中的每个第一节点添加第一标签信息,所述第一标签信息用于标识所述第三RZ;为所述第三RZ添加第一资源共享策略,所述第一资源共享策略用于表示所述第三RZ能够被包括所述第一租户的至少一个租户访问。
应理解,上述标签信息存储于运维管理OM软件的数据库中,为避免存储系统在使用过程中对于OM的依赖,该标签信息通常会从OM系统同步到存储系统(例如HDFS)自身中去,因此,该标签信息在存储系统中形成了不同的存储分区,与RZ对应。基于该标签信息,资源总控节点可以根据上述数据分布策略确定副本的具体放置节点。
结合第一方面的上述可能的实现方式,在第一方面的第七种可能的实现方式中,所述方法还包括:接收资源区删除请求,所述资源区删除请求用于请求删除所述至少一个RZ中的第四RZ;根据所述资源区删除请求,删除与所述第四RZ对应的多个第二节点中存储的副本;删除所述多个第二节点中每个第二节点的第二标签信息,所述第二标签信息用于标识所述第四RZ;删除所述第四RZ的第二资源共享策略,所述第二资源共享策略用于表示所述第四RZ能够被包括所述第一租户的至少一个租户访问。
具体地,该资源总控节点可以接收资源区删除请求,确定删除该至少一个RZ中的第四RZ,该资源总控节点可以删除该第四RZ对应的多个第二节点中存储的数据的副本,再删除该多个第二节点中每个第二节点的第二标签信息以及该第四RZ的第二资源共享策略。
结合第一方面的上述可能的实现方式,在第一方面的第八种可能的实现方式中,所述方法还包括:接收资源区扩容请求,所述资源区扩容请求用于请求为所述至少一个RZ中的第五RZ进行扩容;根据所述资源区扩容请求,确定至少一个第三节点;为所述至少一个第三节点中的每个第三节点添加第三标签信息,所述第三标签信息用于标识所述第五RZ。
结合第一方面的上述可能的实现方式,在第一方面的第九种可能的实现方式中,所述方法还包括:接收资源区缩容请求,所述资源区缩容请求用于请求为所述至少一个RZ中的第六RZ进行缩容;根据所述资源区缩容请求,确定与所述第六RZ对应的至少一个第四节点;删除所述至少一个第四节点中每个第四节点的第四标签信息,所述第四标签信息用于标识所述第六RZ。
应理解,上述资源区的创建、删除、扩容以及缩容等管理操作可以由OM系统完成,OM通常由平台管理员进行操作,比较特殊的是云端的场景,租户自己(可能是租户自己的管理员)会通过OM系统自助完成RZ的管理维护,本申请实施例对此不作限定。
第二方面,提供了一种任务分配方法,包括:接收第一节点发送的计算任务分配请求,所述计算任务分配请求用于请求为所述第一节点分配计算任务;根据所述计算任务分配请求、所述第一节点的共享策略以及至少一个租户的借用策略,从所述至少一个租户的计算任务中,为所述第一节点分配第一计算任务,其中,所述共享策略用于表示所述第一节点为所述至少一个租户中i个租户的计算任务提供计算资源,所述借用策略用于表示所述至少一个租户中第一租户允许使用j个节点的计算资源,i和j均为大于0的整数;向所述第一节点发送任务指示信息,所述任务指示信息用于指示所述第一计算任务。
应理解,上述共享策略用于表示第一节点能够为哪些租户提供计算资源,上述借用策略用于表示租户在自身的节点资源不足的情况下愿意使用其他哪些节点的计算资源。这些策略通常都是提前配置好的,存储在大数据系统运维管理OM软件的数据库中,一般由系统管理员和/或租户通过OM软件进行配置。
此外,上述第一计算任务的最终确定可以是在剩余的计算任务中任意选取,也可以是按照剩余的计算任务的优先级顺序,选择优先级最高的计算任务作为第一计算任务,本申请实施例对此不作限定。
在本申请实施例中,节点为资源提供者,租户为资源使用者。节点的共享策略仅仅用来表达资源提供者如何共享自己的资源,并不关心具体的资源使用者;而租户的借用策略仅仅用来表达资源使用者如何借用可用的共享资源,并不关心具体的资源提供者,从而可以实现资源共享和借用机制的解耦。
因此,本申请实施例的任务分配方法,通过资源总控节点根据大数据系统中计算节点对计算资源的共享策略以及租户对计算资源的借用策略,对计算节点和租户提交的计算任务进行灵活匹配,从而为计算节点分配满足策略的计算任务,解耦了资源共享和借用机制,简单易行,提高了用户体验。
在第二方面的第一种可能的实现方式中,所述根据所述计算任务分配请求、所述第一节点的共享策略以及至少一个租户中每个租户的借用策略,从所述至少一个租户的计算任务中,为所述第一节点分配第一计算任务,包括:根据所述计算任务分配请求,将所述至少一个租户的计算任务与所述共享策略以及所述借用策略进行匹配;从所述至少一个租户的计算任务中过滤掉不满足所述共享策略以及所述借用策略的m个租户的计算任务,m为大于或等于1的整数;从除所述m个租户的计算任务外的剩余的计算任务中确定所述第一计算任务。
具体地,该资源总控节点可以根据上述共享策略和借用策略,将系统中的至少一个计算任务与该第一节点进行匹配,将不满足该共享策略和该借用策略的计算任务过滤掉,从而确定为该第一节点分配的第一计算任务。
结合第二方面的上述可能的实现方式,在第二方面的第二种可能的实现方式中,所述计算任务分配请求包括所述第一节点的标识信息,所述从所述至少一个租户的计算任务中过滤掉不满足所述共享策略以及所述借用策略的m个租户的计算任务,包括:根据所述第一节点的标识信息和所述共享策略,过滤掉p个第一租户的计算任务,所述p个第一租户不属于所述i个租户,p为大于或等于0的整数;根据所述第一节点的标识信息和所述借用策略,在除所述p个第一租户的计算任务外的剩余租户的计算任务中过滤掉m-p个第二租户的计算任务,所述第一节点不属于所述j个节点。
结合第二方面的上述可能的实现方式,在第二方面的第三种可能的实现方式中,所述从所述至少一个租户的计算任务中过滤掉不满足所述共享策略以及所述借用策略的m个租户的计算任务,包括:根据所述第一节点的标识信息和所述借用策略,过滤掉m-p个第二租户的计算任务,所述m-p个第二租户的借用策略表示不允许使用第一节点的计算资源,p为大于或等于0的整数;根据所述第一节点的标识信息和所述共享策略,在除所述m-p个第二租户的计算任务外的剩余租户的计算任务中过滤掉p个第一租户的计算任务, 所述p个第一租户不属于所述i个租户。
可选地,所述至少一个租户为M个租户,M为大于0的整数,所述从所述至少一个租户的计算任务中过滤掉不满足所述共享策略以及所述借用策略的m个租户的计算任务,包括:根据所述第一节点的标识信息和所述共享策略,从所述M个租户的计算任务中过滤掉p个租户的计算任务;根据所述第一节点的标识信息和所述借用策略,从所述M个租户的计算任务中过滤掉q个租户的计算任务;将剩余的M-p个租户的计算任务和剩余的M-q个租户的计算任务取交集。
具体地,上述采用共享策略过滤与采用借用策略过滤两个步骤并没有先后顺序,可以同时进行,本申请实施例对此不作限定。在这种过滤方式下,p个租户与q个租户中可能包括相同的租户,但这并不会对最终的过滤结果造成影响。
应理解,该资源总控节点过滤掉不满足上述共享策略和借用策略的计算任务可以采用不同的过滤顺序,即可以先根据共享策略过滤,再根据借用策略过滤,也可以先根据借用策略过滤,再根据共享策略过滤,还可以分别按照共享策略和借用策略过滤,最后取两个过滤结果的交集,本申请实施例对此不作限定。
结合第二方面的上述可能的实现方式,在第二方面的第四种可能的实现方式中,所述第一节点是第一资源区RZ中的节点,所述第一资源区中包括的节点具有相同的共享策略,所述相同的共享策略为所述第一资源区的共享策略。
具体地,系统中的节点可以被划分为多个资源区RZ,该多个RZ中包括保留资源区RRZ和共享资源区SRZ。在这种情况下,RZ的共享策略就是该RZ中每个节点的共享策略,资源提供者就是RZ,而资源使用者就是租户以及租户的计算任务。对于RRZ而言,RRZ会归属到具体的租户,从这个角度而言,租户可能同时具备资源提供者和资源借用者的双重身份。
应理解,一个RZ应该只包括相同共享策略的节点,这个相同的共享策略即为RZ的共享策略。根据RZ的共享策略,可以确定在该RZ上具有使用权限的租户。可选地,这个使用权限可以包括对存储资源和计算资源的使用,从而实现存储系统与计算系统的一体化,即将存储资源与计算资源拉通考虑。此外,从部署方面讲,就不需要为每个节点都设置共享策略了,为一个RZ设置共享策略即可,有利于节省设置的复杂度。
结合第二方面的上述可能的实现方式,在第二方面的第五种可能的实现方式中,所述共享策略为下列策略中的任意一个:严格保留策略、空闲时共享策略以及公平共享策略,其中,所述严格保留策略用于表示仅允许所述i个租户的计算任务使用所述第一节点的计算资源,所述空闲时共享策略用于表示仅在所述第一节点空闲时允许除所述i个租户之外的其他租户使用所述第一节点的计算资源,所述公平共享策略用于表示允许所述至少一个租户公平地使用所述第一节点的计算资源。
具体地,上述严格保留策略、空闲时共享策略以及公平共享策略可以为节点的共享策略,也可以为RZ的共享策略。换句话说,上述资源总控节点就是根据每个RZ的共享策略来具体区分租户能够使用的RZ的,特别是RRZ和SRZ。严格保留策略即严格保留资源,在严格保留策略下,RZ中的资源仅允许该RZ所属租户使用,即使空闲也不允许其它租户使用;在空闲时共享策略下,该RZ为该RZ所属租户保留资源,但在资源空闲时允许其它租户暂时借用,并在该RZ所属租户需要时以最高优先级抢占,保证该RZ所属 租户对于该RZ资源的100%权重;公平共享策略即多租户共享资源,在公平共享策略下,该RZ允许多个租户以约定的权重公平地使用其资源。基于上述不同的策略,可以产生不同性质的RZ,例如,具有公平共享策略的RZ为SRZ,具有严格保留策略的RZ为RRZ。
应理解,本申请实施例仅仅以上述三个共享策略为例进行说明,但系统管理员或租户还可以为节点或RZ设置其他不同的共享策略,本申请实施例对此不作限定。
结合第二方面的上述可能的实现方式,在第二方面的第六种可能的实现方式中,所述至少一个租户中的第三租户被初始配置至少一个第三节点,所述第三租户的借用策略包括:在所述至少一个第三节点中能够使用的节点的数量小于第一阈值的情况下,所述第三租户允许借用所述第一节点的计算资源;和/或在所述第三租户已借用的节点的数量大于第二阈值的情况下,所述第三租户不允许借用所述第一节点的计算资源;其中,所述至少一个第三节点不包括所述第一节点。
具体地,租户的借用策略可以由租户配置并存储在数据库中,租户一般都拥有自身的节点资源,即系统会初始配置一部分节点资源为租户提供服务,例如,租户A对应第一RRZ,该租户A可以使用第一RRZ中的资源。若第一RRZ的资源不够用了,该租户A就需要借用资源,在这种情况下,租户A可以设置自身的借用策略。该借用策略可以是在租户A可用的资源小于第一阈值时允许借用资源,那么,在上述第一阈值为0的情况下,租户A的借用策略为永远不可以借用共享资源;在上述第一阈值足够大的情况下,租户A的借用策略为永远可以借用共享资源。此外,该借用策略可以是在租户A借用的资源大于第二阈值时不再允许该租户A借用资源,还可以是其他的策略,本申请实施例对此不作限定。
结合第二方面的上述可能的实现方式,在第二方面的第七种可能的实现方式中,所述借用策略还包括:所述第三租户优先使用第四节点,所述第四节点存储有与所述第三租户的计算任务对应的数据,所述第四节点属于所述第三租户的节点资源。
具体地,可以通过设置租户的借用策略对计算任务的计算位置进行优化,即更倾向于在与计算任务对应的数据的存储节点上调度该计算任务,这样能够提高系统性能以及数据的安全性。
第三方面,提供了一种数据存储装置,用于执行上述第一方面或第一方面的任意可能的实现方式中的方法。具体地,该装置包括用于执行上述第一方面或第一方面的任意可能的实现方式中的方法的单元。
第四方面,提供了一种任务分配装置,用于执行上述第二方面或第二方面的任意可能的实现方式中的方法。具体地,该装置包括用于执行上述第二方面或第二方面的任意可能的实现方式中的方法的单元。
第五方面,提供了一种数据存储装置,该装置包括:收发器、存储器和处理器。其中,该收发器、该存储器和该处理器通过内部连接通路互相通信,该存储器用于存储指令,该处理器用于执行该存储器存储的指令,以控制接收器接收信号,并控制发送器发送信号,并且当该处理器执行该存储器存储的指令时,该执行使得该处理器执行第一方面或第一方面的任一种可能的实现方式中的方法。
第六方面,提供了一种任务分配装置,该装置包括:收发器、存储器和处理器。其中,该收发器、该存储器和该处理器通过内部连接通路互相通信,该存储器用于存储指令,该 处理器用于执行该存储器存储的指令,以控制接收器接收信号,并控制发送器发送信号,并且当该处理器执行该存储器存储的指令时,该执行使得该处理器执行第二方面或第二方面的任一种可能的实现方式中的方法。
第七方面,提供了一种计算机可读介质,用于存储计算机程序,该计算机程序包括用于执行第一方面或第一方面的任意可能的实现方式中的方法的指令。
第八方面,提供了一种计算机可读介质,用于存储计算机程序,该计算机程序包括用于执行第二方面或第二方面的任意可能的实现方式中的方法的指令。
附图说明
图1是本申请实施例的应用场景示意图。
图2是本申请实施例提供的系统架构示意图。
图3是本申请实施例提供的数据存储方法的示意性流程图。
图4为本申请实施例提供的另一系统架构示意图。
图5是本申请实施例提供的任务分配方法的示意性流程图。
图6为本申请实施例提供的另一任务分配方法的示意性流程图。
图7为本申请实施例提供的另一系统架构示意图。
图8为本申请实施例提供的数据存储装置的示意性框图。
图9为本申请实施例提供的任务分配装置的示意性框图。
图10为本申请实施例提供的另一数据存储装置的示意性框图。
图11为本申请实施例提供的另一任务分配装置的示意性框图。
具体实施方式
下面将结合本申请实施例中的附图,对本申请实施例中的技术方案进行描述。
首先介绍一下本申请实施例所涉及的大数据系统以及多租户技术。
“大数据”是指以多元形式、自许多来源搜集而来的庞大数据组,往往具有实时性。在企业对企业销售的情况下,这些数据可能得自社交网络、电子商务网站、顾客来访纪录以及许多其他来源。从技术上看,大数据与云计算的关系就像一枚硬币的正反面一样密不可分。大数据必然无法用单台的计算机进行处理,必须采用分布式计算架构。因此,大数据的特色在于对海量数据的挖掘,但它必须依托云计算的分布式处理、分布式数据库、云存储和/或虚拟化技术等等。
多租户技术或称多重租赁技术,是一种软件架构技术,是实现如何在多用户环境下共用相同的系统或程序组件,并且可确保各用户之间数据的隔离性。在大数据环境下,实现多租户技术需要资源池,或者作业池。每个资源池里有一定量的资源(由管理员配置),每个租户属于某个资源池,其提交的作业可使用这个资源池中的资源,从而实现存储数据、运行计算作业等诉求。
从技术实现上看,大数据系统也称为多节点集群,集群中包括多个集群节点,集群节点数越多,集群规模越大,大数据系统的数据处理能力就越强。对于多租户共用一个集群的情况,需要统一的运维管理(operations management,OM)软件实现统一化管理。因此,一个租户可以对应至少一个节点,由大数据系统的OM软件统一进行管理。
图1示出了本申请实施例提供应用场景100的示意图。该应用场景100包括客户端110、资源总控节点120以及数据节点/计算节点130。
具体地,客户端110对应第一租户,该第一租户可以通过该客户端110向资源总控节点120发送待存储的数据和/或提交计算作业,请求资源总控节点120为其分配对应的资源,从而实现数据存储和/或运行计算作业。
该资源总控节点120是一个管理节点,用于管理集群中的所有数据节点/计算节点;在一种可能的实现方式中,该资源总控节点120中可以安装上述OM软件,以便于通过软件实现对该大数据系统中节点的统一管理。
该数据节点/计算节点130为该大数据系统的集群中任一节点,用于实现租户数据的存储和/或计算作业的运行。应理解,集群中的一个节点可以是数据节点,用于存储租户的数据,也可以是计算节点,用于完成租户的计算任务。因此,一个节点可以包括存储资源和/或计算资源,存储资源包括该节点中所有具备存储能力的资源,例如磁盘、闪存Flash、内存等等,可以用于存储租户的数据;计算资源用于完成租户通过客户端110提交的各类计算任务。
还应理解,图1仅仅示例性地示出了一个客户端和一个数据节点/计算节点,可选地,该应用场景100还可以包括多个数据节点/计算节点以及分别与多个租户对应的多个客户端,本申请实施例对此不做限定。
图2示出了本申请实施例提供的系统架构200的示意图。具体地,系统架构200包括三个租户(租户A、租户B和租户C)、三个资源区(resource zone,RZ)以及该三个RZ所对应的节点。上述三个RZ中的每个RZ都具有各自的资源共享策略,用于表示各自的节点资源能够被哪些租户使用。具体地,上述三个RZ可以包括第一RZ、第二RZ以及第三RZ,在预设的资源共享策略下,每个租户具有不同的使用权限,例如,第一RZ能够被所有租户使用,第二RZ仅能够被租户B使用,第三RZ仅能够被租户C使用;又例如,第一RZ能够被租户A和租户B使用,第二RZ能够被租户B和租户C使用,第三RZ仅能够被租户B使用。本申请实施例对此不作限定。
可选地,上述三个RZ包括第一保留资源区(reserved resource zone,RRZ)、第二RRZ和共享资源区(shared resource zone,SRZ),应理解,RRZ和SRZ的区别在于其资源共享策略的不同。其中,SRZ能够被所有租户使用,而第一RRZ仅能够被租户A使用,第二RRZ仅能够被租户B使用。因此,上述RZ分别对应公有存储资源池和私有存储资源池。该系统架构200体现出了大数据平台中的存储系统的资源区与租户之间的对应关系,用于实现租户的数据存储。
应理解,上述与RZ对应的节点可以为物理机,可以为虚拟机,也可以为容器,本申请实施例对此不作限定。
还应理解,图2仅仅示例性地示出了三个租户和三个RZ,可选地,该系统架构200还可以包括多个租户以及分别与该多个租户对应的多个RZ,本申请实施例对此不做限定,在一般情况下,仅存在一个SRZ。
对于多租户的大数据系统,大数据集群可以同时服务于多个租户,通常可以把租户分为两类:一类是大型的、业务成熟较稳定的租户,其业务类型较确定,业务规模的发展有稳定的预期,在本申请实施例中,这类租户可以根据自己的日常稳定资源诉求开辟RRZ空间,RRZ为租户预留的隔离资源,而且无需付出任何运行期的性能代价;第二类租户是小型的、成长期的、不确定性较大的租户,其业务诉求不稳定,对资源的诉求也很难固定下来,这类租户可以不开辟RRZ,通过长期使用SRZ来实现资源诉求。
与租户独占集群的隔离方式不同,RRZ具备很好的弹性伸缩能力。一方面,对于租户稳定资源诉求的变化,可以很容易地从SRZ中拆分资源到RRZ中,或将资源从RRZ中归还到SRZ,对于租户来说,不用等待漫长的服务器采购过程,也避免了闲置资源的浪费,因此,这里体现出了RZ自身的伸缩能力。另一方面,当RRZ资源不足时,租户可以临时借用SRZ的资源,以应对突发性资源诉求和有预期的资源诉求尖峰,避免RRZ带来的资源的闲置和浪费。
图3示出了本申请实施例提供的数据存储方法300的示意性流程图。该方法300可以应用于图1所示的应用场景100以及图2所示的系统架构200中,但本申请实施例不限于此。
S310,接收第一租户通过客户端发送的数据写入请求,所述数据写入请求用于表示所述第一租户请求存储待写入数据的N个副本,N为大于或等于1的整数;
S320,根据所述数据写入请求以及所述第一租户对多个资源区RZ中的每个RZ的存储权限,从所述多个RZ中确定所述第一租户能够使用的至少一个RZ;
S330,根据所述数据写入请求以及第一数据分布策略,确定所述N个副本在所述至少一个RZ的分布,所述第一数据分布策略用于表示所述N个副本在所述至少一个RZ中的分布优先级;
S340,根据所述N个副本在所述至少一个RZ的分布以及第二数据分布策略,将所述N个副本分别存储至所述至少一个RZ对应的至少一个节点中,所述第二数据分布策略用于表示所述N个副本在所述至少一个RZ中每个RZ对应的多个节点中的分布优先级。
具体地,在本申请实施例中,大数据系统中的集群被划分为多个资源区(resource zone,RZ),该多个RZ中的每个RZ中包括至少一个节点,并且该多个RZ中的每个RZ都具有资源共享策略,该资源共享策略用于表示该大数据系统中的租户对上述每个RZ的存储权限。基于该多个RZ中每个RZ的资源共享策略,在无存储权限的RZ中为租户分配存储资源是非法的,该资源总控节点必须依照上述资源共享策略为不同的租户分配不同的存储资源(即RZ),从而保证该大数据系统的正常运行。
应理解,RZ的资源共享策略可以是预配置的,具体可以通过多种方式来描述,本申请实施例对此不作限定。在一种可能的实现方式中,系统可以通过RZ标识和租户标识之间的对应关系来制定RZ的资源共享策略,如下表所示,
RZ标识 租户标识
1 *
2 1
3 3,4
4 foo_*
其中,*是通配符。上述资源共享策略为RZ1可以允许被所有租户存储数据,RZ2只能够允许租户1存储数据,RZ3可以允许租户3和租户4存储数据,RZ4可以允许租户标识前三个字母为foo的租户存储数据。应理解,上述RZ标识和/或租户标识还可以采用其 他任意长度的字符来表示,只要能对RZ和/或租户起到标识的作用即可,本申请实施例对此不作限定。
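为便于理解上述基于通配符的资源共享策略,下面给出一个判断租户对RZ是否具有存储权限、进而确定租户能够使用的至少一个RZ的简化示意(仅为说明目的的草图,其中的函数名与数据结构均为假设,并非对本申请实施例的限定):

```python
import fnmatch

# 假设的资源共享策略表:RZ标识 -> 允许存储数据的租户标识模式(*为通配符)
RZ_SHARE_POLICY = {
    "1": ["*"],        # RZ1允许所有租户存储数据
    "2": ["1"],        # RZ2仅允许租户1存储数据
    "3": ["3", "4"],   # RZ3允许租户3和租户4存储数据
    "4": ["foo_*"],    # RZ4允许租户标识前缀为foo_的租户存储数据
}

def has_storage_permission(tenant_id, rz_id):
    """判断租户对指定RZ是否具有存储权限(示意)。"""
    return any(fnmatch.fnmatch(tenant_id, pattern)
               for pattern in RZ_SHARE_POLICY.get(rz_id, []))

def usable_rzs(tenant_id):
    """从多个RZ中确定租户能够使用的至少一个RZ(示意)。"""
    return [rz for rz in RZ_SHARE_POLICY if has_storage_permission(tenant_id, rz)]

# 例如:usable_rzs("1") 返回 ["1", "2"]
```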
需要注意的是,这里的存储权限仅仅体现在数据放置上,该存储权限不包括对该资源区本身的数据的访问的限制。以Hadoop的分布式文件系统(hadoop distributed file system,HDFS)为例,租户1没有RZ3的数据存储权限,但租户1是否可以访问RZ3上的数据,则取决于HDFS上的访问控制列表(access control list,ACL)设置。
在本申请实施例的数据存储方法中,当第一租户需要存储数据时,该第一租户可以通过对应的客户端向资源总控节点发送数据写入请求,该数据写入请求用于表示该第一租户请求存储待写入数据的N个副本。其中,该数据写入请求可以携带待写入数据的N个副本,也可以携带待写入数据的1个副本和请求存储的副本个数N,本申请实施例对此不作限定。该资源总控节点接收该数据写入请求,根据该数据写入请求以及该第一租户对多个RZ中的每个RZ的存储权限,从该多个RZ中确定出该第一租户能够使用的至少一个RZ,在上述例子中,若该第一租户的标识为1,该至少一个RZ即为RZ1和RZ2。然后,该资源总控节点根据该数据写入请求、第一数据分布策略以及第二数据分布策略,将该N个副本分别存储至该至少一个RZ对应的至少一个节点中。
应理解,上述第一数据分布策略以及第二数据分布策略可以是预配置的,用于确定上述N个副本的分布,具体地,上述第一数据分布策略用于表示该N个副本在该至少一个RZ中的分布优先级,第二数据分布策略用于表示该N个副本在该至少一个RZ中每个RZ对应的多个节点中的分布优先级。因此,该资源总控节点的数据放置决策分为如下两个阶段:
(1)根据该数据写入请求以及第一数据分布策略,确定该N个副本在该至少一个RZ的分布;
应理解,这里的分布是指该N个副本与至少一个RZ的对应关系,例如,N=5,并且第一租户可以使用的至少一个RZ为RZ1和RZ2,那么根据第一数据分布策略,这5个副本在这两个RZ的分布可以为2个副本分布于RZ1,3个副本分布于RZ2。
(2)根据上述N个副本在至少一个RZ的分布以及第二数据分布策略,确定该N个副本分别分布于该至少一个RZ对应的至少一个节点中。
在本申请实施例中,上述两个阶段可以分别根据租户的不同需求或租户所处的不同应用场景,独立应用不同的策略,并组合产生出预期的数据分布结果,不需要预先为每一个应用场景对应的数据分布结果配置数据分布策略。首先,将租户能够使用的节点划分成至少一个资源区RZ,并分别配置该至少一个RZ的第一数据分布策略以及该至少一个RZ对应的节点的第二数据分布策略,资源总控节点在对数据进行存储时,可以进行两阶段决策,在第一阶段根据第一数据分布策略确定数据副本在至少一个RZ的分布,第二阶段在第一阶段的基础之上再结合第二数据分布策略确定数据副本在具体节点的分布。
因此,本申请实施例的数据存储方法,通过将租户能够使用的节点划分成至少一个资源区RZ,并分别配置该至少一个RZ的第一数据分布策略以及该至少一个RZ对应的节点的第二数据分布策略,资源总控节点在对数据进行存储时,可以根据该第一数据分布策略和第二数据分布策略进行两阶段决策,由于两个阶段的策略可以独立配置,使得资源总控节点能够对不同阶段的数据分布策略进行组合,根据租户的不同需求以及租户所处的场景,灵活控制租户需要存储的数据在节点的分布,降低了策略部署的复杂度。
应理解,资源共享策略与数据分布策略是相互协调,相互制约的,由于该第一租户能够使用的至少一个RZ具有不同的资源共享策略,因此,可以根据租户的不同需求,采用不同的数据分布策略来存储数据,从而获得不同的效果。引入两阶段决策的优点是两个阶段可以独立应用不同的策略,并组合产生出预期的效果,否则,对于每一种组合可能,都需要一个特定的策略实现。
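下面以代码形式给出上述两阶段决策的一个简化骨架,第一阶段与第二阶段的策略以可独立替换的函数传入;其中的接口与数据结构均为说明目的而假设,并非对本申请实施例的限定:

```python
def place_replicas(n, rz_list, first_policy, second_policy):
    """两阶段数据放置决策的示意骨架。

    first_policy(n, rz_list) -> {rz: 副本个数},对应第一数据分布策略;
    second_policy(rz, k)     -> 长度为k的节点列表,对应第二数据分布策略。
    """
    # 第一阶段:确定N个副本在至少一个RZ中的分布
    rz_distribution = first_policy(n, rz_list)

    # 第二阶段:在每个RZ内部按第二数据分布策略选择具体节点
    target_nodes = []
    for rz, count in rz_distribution.items():
        target_nodes.extend(second_policy(rz, count))
    return target_nodes

# 第二数据分布策略的一个示例:优先选择剩余空间多的节点(假设rz.nodes与node.free_space可用)
def most_free_space_first(rz, count):
    return sorted(rz.nodes, key=lambda node: node.free_space, reverse=True)[:count]
```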
还应理解,上述方法300可以由应用场景100中的资源总控节点120执行,但本申请实施例对此不作限定。
作为一个可选的实施例,所述至少一个RZ包括第一RZ和第二RZ,所述第一RZ为仅允许所述第一租户使用的保留资源区RRZ,所述第二RZ为允许包括所述第一租户的多个租户使用的共享资源区SRZ。
具体地,该第一租户能够使用的至少一个RZ可以包括仅允许第一租户使用的第一RZ和允许包括该第一租户的多个租户使用的第二RZ,在上述例子中,第一RZ即为RZ2,第二RZ为RZ1。在这种情况下,基于上述的数据分布策略,可以将该N个副本分别放置于RZ1和RZ2对应的节点中。
例如,若N=3,即该待写入数据的副本个数为3,在第一阶段,该资源总控节点确定将其中的2个副本放置于RZ2中,剩下的1个副本放置于RZ1中,在第二阶段,该资源总控节点可以在RZ2中优先选择剩余空间多的两个节点放置上述2个副本,在RZ1中优先选择剩余空间多的1个节点放置上述1个副本。
应理解,第一数据分布策略可以为在第一RZ中优先存储,也可以为总是在第二RZ中存储部分副本,本申请实施例对此不作限定;第二数据分布策略可以为等概率分布策略,也可以为考虑节点剩余空间进行概率分布的不同策略,还可以为一些根据特定场景定制的其它策略,本申请实施例对此也不作限定。因此,在不同的第一数据分布策略和第二数据分布策略的组合下,可以实现多种预期的效果。
作为一个可选的实施例,所述第一数据分布策略为将所述N个副本优先存储至所述第一RZ中,所述根据所述数据写入请求以及第一数据分布策略,确定所述N个副本在至少一个资源区RZ的分布,包括:根据所述数据写入请求、所述第一数据分布策略以及所述第一RZ的空间占用状态,确定所述第一RZ中能够存储所述待写入数据的P个副本,P为大于或等于1的整数,所述空间占用状态用于表示所述第一RZ已被占用的空间大小或剩余的空间大小;在N小于或等于P的情况下,确定所述N个副本分布于所述第一RZ中;在N大于P的情况下,确定所述N个副本中的P个副本分布于所述第一RZ中,所述N个副本中除所述P个副本外的剩余副本分布于所述第二RZ中。
具体地,在上述第一数据分布策略为将该N个副本优先存储至该第一RZ中的情况下,该资源总控节点可以根据该数据写入请求、该第一数据分布策略以及该第一RZ的空间占用状态,确定该第一RZ中能够存储该待写入数据的P个副本。若N小于或等于P,那么该资源总控节点可以将该N个副本全部存储至该第一RZ中,从而实现第一RZ优先存储的目的。若N大于P,那么该资源总控节点可以将该待写入数据的P个副本存储至该第一RZ中,将剩余的N-P个副本存储至第二RZ中。
在本申请实施例中,租户的数据可以被尽可能存储在该租户的RRZ(即第一RZ)中,尽量减少对SRZ(即第二RZ)的使用。从租户成本角度而言,RRZ通常属于租户的预付费资源,SRZ属于按量付费的后付费资源,更少的SRZ使用意味着更少的额外费用产生。从平台运营角度而言,RRZ的空间通常为租户预留独占,RRZ的利用率提升也意味着平台资源利用率的提升。
应理解,上述空间占用状态可以为RZ的空间利用率、剩余空间等等,本申请实施例对此不作限定。可选地,在这种策略下,系统可以设置RRZ的空间利用阈值或剩余空间阈值,当该RRZ的空间利用率达到该阈值后,该第一租户才可以使用SRZ的存储资源。因此,该资源总控节点可以根据数据写入请求、第一数据分布策略、第一RZ的空间占用状态以及空间利用阈值,确定该第一RZ中能够存储的该待写入数据的副本个数,本申请实施例对此不作限定。
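作为第一阶段策略的一个示例,“将N个副本优先存储至第一RZ(RRZ)、达到空间利用阈值后再使用第二RZ(SRZ)”可以示意如下,其中的阈值取值以及容量估计接口均为假设:

```python
def rrz_first_policy(n, replica_size, rrz, srz, util_threshold=0.9):
    """第一数据分布策略示例:将N个副本优先存储至第一RZ(RRZ)。

    rrz.utilization()       假设返回第一RZ当前的空间利用率;
    rrz.capacity_for(size)  假设返回第一RZ在阈值内还能容纳的副本个数P。
    """
    if rrz.utilization() >= util_threshold:
        p = 0                            # 已达到空间利用阈值,不再向第一RZ放置副本
    else:
        p = rrz.capacity_for(replica_size)
    if n <= p:
        return {rrz: n}                  # N小于或等于P:N个副本全部分布于第一RZ
    return {rrz: p, srz: n - p}          # N大于P:P个在第一RZ,剩余N-P个在第二RZ
```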
作为一个可选的实施例,所述第一数据分布策略为将所述N个副本中的Q个副本存储至所述第二RZ中,Q为大于或等于1的整数,且Q小于或等于N,所述根据所述数据写入请求以及第一数据分布策略,确定所述N个副本在至少一个资源区RZ的分布,包括:根据所述数据写入请求以及所述第一数据分布策略,确定所述N个副本中的Q个副本分布于所述第二RZ中,所述N个副本中除所述Q个副本外的剩余N-Q个副本分布于所述第一RZ中。
具体地,在上述第一数据分布策略为将该N个副本中的Q个副本存储至该第二RZ中的情况下,该资源总控节点可以根据该数据写入请求以及该第一数据分布策略,确定将该待写入数据的Q个副本存储至该第二RZ中,将剩余的N-Q个副本存储至该第一RZ中。
在本申请实施例中,允许租户指定在不同的RZ里的数据副本的存储个数。这个策略适用于不同的场景,例如,(1)出于最大数据访问叠加带宽的目的,如果该数据经常会被运行在SRZ(即第二RZ)的计算所访问,如果大部分数据的副本都集中在RRZ(即第一RZ)中,那么数据访问带宽会受到RRZ的节点个数的限制,从而限制了计算的并行能力,此时,不考虑RRZ的剩余空间,始终在SRZ中存放一定个数的数据副本是更好的选择;(2)租户间的数据共享,即该数据产生后会共享给其它的租户,如果数据副本集中在RRZ中,那么其它租户访问时也会占据该RRZ的输入输出I/O资源,从而对租户在RRZ中的自身应用造成性能干扰,此时,选择将部分数据副本放置在SRZ中可以避免对RRZ性能的干扰。
在一种可能的实现方式中,租户A请求数据写入,且待写入数据的副本个数为3,其期望的数据分布策略为RRZ优先,并设置当RRZ的空间利用率达到90%时使用SRZ的空间。数据写入请求从租户A的客户端发给服务器端的NameNode节点,这里NameNode节点即为上述的资源总控节点。NameNode节点为该租户选择3个节点来存储不同的副本。此时,RRZ空间利用率低于90%,NameNode节点选择了RRZ内的3个节点Node A、Node B和Node C并告知客户端,客户端向上述3个节点发送数据写入请求。数据副本写完后,客户端继续请求写入3个副本,新的数据写入请求被发给NameNode节点,NameNode节点发现此时RRZ的空间利用率已达到了90%,于是选择了SRZ中的3个节点Node X、Node Y和Node Z,确定将后续的副本存储至Node X、Node Y和Node Z。
作为一个可选的实施例,所述根据所述数据写入请求以及所述第一数据分布策略,确定所述N个副本中除所述Q个副本外的剩余N-Q个副本分布于所述第一RZ中,包括:根据所述数据写入请求、所述第一数据分布策略以及所述第一RZ的空间占用状态,确定 所述第一RZ中能够存储所述待写入数据的P个副本,P为大于或等于1的整数,所述空间占用状态用于表示所述第一RZ已被占用的空间大小或剩余的空间大小;在N-Q小于或等于P的情况下,确定所述N-Q个副本分布于所述第一RZ中;在N-Q大于P的情况下,确定所述N-Q个副本中的P个副本分布于所述第一RZ中,所述N-Q个副本中除所述P个副本外的剩余副本分布于所述第二RZ中。
具体地,在上述第一数据分布策略为将该N个副本中的Q个副本存储至该第二RZ中的情况下,对于剩余的N-Q个副本,按照第一数据分布策略需要放置在第一RZ中,但是,该第一RZ的存储空间有限,可能存在放不下的情况,因此,该资源总控节点需要根据该第一RZ的空间占用状态,确定剩余的N-Q个副本的分布。该资源总控节点可以先根据数据写入请求、第一数据分布策略以及第一RZ的空间占用状态,确定该第一RZ中能够存储该待写入数据的P个副本,若N-Q小于或等于P,那么该资源总控节点可以确定将该N-Q个副本全部存储至该第一RZ中;若N-Q大于P,那么该资源总控节点可以将该待写入数据的P个副本存储至该第一RZ中,将剩余的N-Q-P个副本存储至第二RZ中。
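与之对应,“总是在第二RZ(SRZ)中存放Q个副本”的第一阶段策略可以示意如下,第一RZ放不下的部分同样落入第二RZ;接口均沿用上文的假设:

```python
def fixed_srz_policy(n, q, replica_size, rrz, srz):
    """第一数据分布策略示例:将N个副本中的Q个副本存储至第二RZ(SRZ)。"""
    p = rrz.capacity_for(replica_size)   # 第一RZ还能存储的副本个数P(假设接口)
    remaining = n - q                    # 按策略应放入第一RZ的N-Q个副本
    if remaining <= p:
        return {srz: q, rrz: remaining}
    # 第一RZ放不下时,剩余的N-Q-P个副本也分布于第二RZ
    return {srz: q + (remaining - p), rrz: p}
```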
作为一个可选的实施例,所述方法还包括:根据所述第一RZ的空间占用状态,将所述第二RZ中的全部或部分副本存储至所述第一RZ中,所述空间占用状态用于表示所述第一RZ已被占用的空间大小或剩余的空间大小;删除所述第二RZ中的所述全部或部分副本。
可选地,在将上述待写入数据的N个副本存储至第一RZ和第二RZ中之后,该资源总控节点还可以根据租户的需求对已存储的副本进行删除。在删除了第一RZ中待写入数据的M个副本之后,该第一RZ的空间变大,该资源总控节点可以将第二RZ中的副本搬迁至该第一RZ。应理解,对于不同的数据,副本的大小一般是不相同的,该资源总控节点需要根据该第一RZ的空间占用状态,确定能够从该第二RZ搬迁至该第一RZ的数据量。
可选地,可以设置空间利用阈值,当该第一RZ的空间利用率小于该空间利用阈值时,该资源总控节点可以将该第二RZ中的副本搬迁至该第一RZ中。
这样,能够提高RRZ的利用率,由于RRZ为租户预留独占,因此,从整体上来说,提高RRZ的利用率即提高了该大数据系统的资源利用率。
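将第二RZ中的副本搬迁回第一RZ的过程可以示意如下,搬迁量由第一RZ的空间占用状态决定,先存储至第一RZ、再删除第二RZ中的对应副本;其中的阈值与接口均为假设:

```python
def migrate_back_to_rrz(rrz, srz, util_threshold=0.9):
    """当第一RZ的空间利用率低于阈值时,将第二RZ中的全部或部分副本搬迁至第一RZ(示意)。"""
    for replica in list(srz.replicas):            # 假设srz.replicas为第二RZ中的副本集合
        if rrz.utilization() >= util_threshold:
            break                                 # 第一RZ空间不足,停止搬迁
        if rrz.free_space() >= replica.size:
            rrz.store(replica)                    # 先将副本存储至第一RZ
            srz.delete(replica)                   # 再删除第二RZ中的该副本
```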
作为一个可选的实施例,在所述接收第一租户通过客户端发送的数据写入请求之前,所述方法还包括:接收资源区创建请求,所述资源区创建请求用于请求为所述第一租户创建所述至少一个RZ中的第三RZ;根据所述资源区创建请求,创建所述第三RZ,并确定与所述第三RZ对应的多个第一节点;为所述多个第一节点中的每个第一节点添加第一标签信息,所述第一标签信息用于标识所述第三RZ;为所述第三RZ添加第一资源共享策略,所述第一资源共享策略用于表示所述第三RZ能够被包括所述第一租户的至少一个租户访问。
具体地,该资源总控节点可以接收资源区创建请求,从而为该第一租户创建第三RZ,该第三RZ属于上述该第一租户能够使用的至少一个RZ。在创建该第三RZ时,该资源总控节点需要确定与该第三RZ对应的多个第一节点,并为该多个第一节点中的每个第一节点添加第一标签信息,该第一标签信息用于标识该第三RZ。此外,该资源总控节点还需要为该第三RZ添加第一资源共享策略,该第一资源共享策略用于表示该第三RZ能够被 包括该第一租户的至少一个租户访问。
应理解,上述标签信息存储于OM的数据库中,为避免存储系统在使用过程中对于OM的访问依赖,从而影响存储系统的处理性能,该标签信息通常会从OM系统同步到存储系统(例如HDFS)自身中去,因此,该标签信息在存储系统中形成了不同的存储分区,与RZ对应。基于该标签信息,资源总控节点可以根据上述数据分布策略确定副本的具体放置节点。
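资源区的创建过程可以示意如下:为对应节点添加标识该RZ的标签信息,为该RZ添加资源共享策略,并将标签从OM数据库同步到存储系统;其中的OM与存储系统接口均为假设:

```python
def create_resource_zone(rz_id, nodes, share_policy, om_db, storage_system):
    """创建RZ的简化示意:打标签、配置资源共享策略、同步到存储系统。"""
    for node in nodes:
        om_db.add_label(node, rz_id)                 # 第一标签信息,用于标识该RZ
    om_db.set_share_policy(rz_id, share_policy)      # 第一资源共享策略
    # 将标签从OM数据库同步到存储系统(例如HDFS),在存储系统中形成与RZ对应的存储分区
    storage_system.sync_labels(om_db.labels())
```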
作为一个可选的实施例,所述方法还包括:接收资源区删除请求,所述资源区删除请求用于请求删除所述至少一个RZ中的第四RZ;根据所述资源区删除请求,删除与所述第四RZ对应的多个第二节点中存储的副本;删除所述多个第二节点中每个第二节点的第二标签信息,所述第二标签信息用于标识所述第四RZ;删除所述第四RZ的第二资源共享策略,所述第二资源共享策略用于表示所述第四RZ能够被包括所述第一租户的至少一个租户访问。
具体地,该资源总控节点可以接收资源区删除请求,确定删除该至少一个RZ中的第四RZ,该资源总控节点可以删除该第四RZ对应的多个第二节点中存储的数据的副本,再删除该多个第二节点中每个第二节点的第二标签信息以及该第四RZ的第二资源共享策略。
作为一个可选的实施例,所述方法还包括:接收资源区扩容请求,所述资源区扩容请求用于请求为所述至少一个RZ中的第五RZ进行扩容;根据所述资源区扩容请求,确定至少一个第三节点;为所述至少一个第三节点中的每个第三节点添加第三标签信息,所述第三标签信息用于标识所述第五RZ。
作为一个可选的实施例,所述方法还包括:接收资源区缩容请求,所述资源区缩容请求用于请求为所述至少一个RZ中的第六RZ进行缩容;根据所述资源区缩容请求,确定与所述第六RZ对应的至少一个第四节点;删除所述至少一个第四节点中每个第四节点的第四标签信息,所述第四标签信息用于标识所述第六RZ。
应理解,上述资源区的创建、删除、扩容以及缩容等管理操作可以由OM系统完成,OM通常由平台管理员进行操作,比较特殊的是云端的场景,租户自己(可能是租户自己的管理员)会通过OM系统自助完成RZ的管理维护,本申请实施例对此不作限定。
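资源区的扩容与缩容可以归结为对节点标签信息的增删(删除整个RZ时还需先删除对应节点中存储的副本与资源共享策略),下面给出一个仅作说明用途的示意:

```python
def expand_resource_zone(rz_id, new_nodes, om_db):
    """扩容:为确定的至少一个第三节点添加标识该RZ的标签信息(示意)。"""
    for node in new_nodes:
        om_db.add_label(node, rz_id)

def shrink_resource_zone(rz_id, removed_nodes, om_db):
    """缩容:删除对应节点上标识该RZ的标签信息(示意)。"""
    for node in removed_nodes:
        om_db.remove_label(node, rz_id)
```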
应理解,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
图4示出了本申请实施例提供的另一系统架构400示意图。具体地,系统架构400包括三个租户(租户A、租户B和租户C)、三个资源区(resource zone,RZ)以及该三个RZ所对应的节点。上述三个RZ中的每个RZ都具有各自的资源共享策略,用于表示各自的节点资源能够被哪些租户使用。具体地,上述三个RZ可以包括第一RZ、第二RZ以及第三RZ,在预设的资源共享策略下,每个租户具有不同的使用权限,例如,第一RZ能够被所有租户使用,第二RZ仅能够被租户B使用,第三RZ仅能够被租户C使用;又例如,第一RZ能够被租户A和租户B使用,第二RZ能够被租户B和租户C使用,第三RZ仅能够被租户B使用。本申请实施例对此不作限定。
可选地,上述三个RZ包括第一保留资源区(reserved resource zone,RRZ)、第二RRZ和共享资源区(shared resource zone,SRZ),根据不同的资源共享策略,SRZ能够被所有租户使用,而第一RRZ仅能够被租户A使用,第二RRZ仅能够被租户B使用。因此,租户可以分别在具有计算权限的RZ中运行计算任务。该系统架构400体现出了大数据平台中的计算系统的资源区与租户之间的对应关系,用于实现租户的各类计算作业。
应理解,上述与RZ对应的节点可以为物理机,可以为虚拟机,也可以为容器,本申请实施例对此不作限定。
还应理解,图4仅仅示例性地示出了三个租户和三个RZ,可选地,该系统架构400还可以包括多个租户以及分别与该多个租户对应的多个RZ,本申请实施例对此不做限定,在一般情况下,仅存在一个SRZ。
图5示出了本申请实施例提供的任务分配方法的示意性流程图。该任务分配方法500可以应用于图1所示的应用场景100以及图4所示的系统架构400中,但本申请实施例不限于此。
S510,接收第一节点发送的计算任务分配请求,所述计算任务分配请求用于请求为所述第一节点分配计算任务;
S520,根据所述计算任务分配请求、所述第一节点的共享策略以及至少一个租户的借用策略,从所述至少一个租户的计算任务中,为所述第一节点分配第一计算任务,其中,所述共享策略用于表示所述第一节点为所述至少一个租户中i个租户的计算任务提供计算资源,所述借用策略用于表示所述至少一个租户中第一租户允许使用j个节点的计算资源,i和j均为大于0的整数;
S530,向所述第一节点发送任务指示信息,所述任务指示信息用于指示所述第一计算任务。
具体地,该方法500可以由应用场景100中的资源总控节点120执行,且该第一节点可以为应用场景100中的计算节点130。该第一节点向资源总控节点发送计算任务分配请求,即向资源总控节点索要任务。资源总控节点接收该第一节点发送的计算任务分配请求,并根据第一节点的共享策略以及该大数据系统中租户的借用策略,确定将第一计算任务分配给该第一节点,向该第一节点发送任务指示信息。
应理解,上述共享策略用于表示第一节点能够为哪些租户提供计算资源,上述借用策略用于表示租户在自身的节点资源不足的情况下愿意使用其他哪些节点的计算资源。这些策略通常都是提前配置好的,存储在大数据系统运维管理OM软件的数据库中,一般由系统管理员和/或租户通过OM软件进行配置。
在本申请实施例中,节点为资源提供者,租户为资源使用者。节点的共享策略仅仅用来表达资源提供者如何共享自己的资源,并不关心具体的资源使用者;而租户的借用策略仅仅用来表达资源使用者如何借用可用的共享资源,并不关心具体的资源提供者,从而可以实现资源共享和借用机制的解耦。在多租户场景下,这样解耦的第一个意义是,资源的提供者和消费者无需建立资源规划的全局视图,只需要描述自己的共享和借用策略。与当前的主流做法相比,不需要由人对资源进行全面规划来设置符合预期的资源比例,尤其是在租户数量较多的情况下,简单便捷。第二个意义是,从职责和权限的角度,解耦后的表述方式更便于租户自助完成配置,例如,资源提供者可以单方面调整借用策略,无需涉及资源使用者的任何设置。
因此,本申请实施例的任务分配方法,通过资源总控节点根据大数据系统中计算节点对计算资源的共享策略以及租户对计算资源的借用策略,对计算节点和租户提交的计算任务进行灵活匹配,从而为计算节点分配满足策略的计算任务,解耦了资源共享和借用机制,简单易行,提高了用户体验。
作为一个可选的实施例,所述根据所述计算任务分配请求、所述第一节点的共享策略以及至少一个租户中每个租户的借用策略,从所述至少一个租户的计算任务中,为所述第一节点分配第一计算任务,包括:根据所述计算任务分配请求,将所述至少一个租户的计算任务与所述共享策略以及所述借用策略进行匹配;从所述至少一个租户的计算任务中过滤掉不满足所述共享策略以及所述借用策略的m个租户的计算任务,m为大于或等于1的整数;从除所述m个租户的计算任务外的剩余的计算任务中确定所述第一计算任务。
具体地,该资源总控节点可以根据上述共享策略和借用策略,将系统中的至少一个计算任务与该第一节点进行匹配,将不满足该共享策略和该借用策略的计算任务过滤掉,从而确定为该第一节点分配的第一计算任务。
作为一个可选的实施例,所述计算任务分配请求包括所述第一节点的标识信息,所述从所述至少一个租户的计算任务中过滤掉不满足所述共享策略以及所述借用策略的m个租户的计算任务,包括:根据所述第一节点的标识信息和所述共享策略,过滤掉p个第一租户的计算任务,所述p个第一租户不属于所述i个租户,p为大于或等于0的整数;根据所述第一节点的标识信息和所述借用策略,在除所述p个第一租户的计算任务外的剩余租户的计算任务中过滤掉m-p个第二租户的计算任务,所述第一节点不属于所述j个节点。
作为一个可选的实施例,所述从所述至少一个租户的计算任务中过滤掉不满足所述共享策略以及所述借用策略的m个租户的计算任务,包括:根据所述第一节点的标识信息和所述借用策略,过滤掉m-p个第二租户的计算任务,所述m-p个第二租户的借用策略表示不允许使用第一节点的计算资源,p为大于或等于0的整数;根据所述第一节点的标识信息和所述共享策略,在除所述m-p个第二租户的计算任务外的剩余租户的计算任务中过滤掉p个第一租户的计算任务,所述p个第一租户不属于所述i个租户。
可选地,所述至少一个租户为M个租户,M为大于0的整数,所述从所述至少一个租户的计算任务中过滤掉不满足所述共享策略以及所述借用策略的m个租户的计算任务,包括:根据所述第一节点的标识信息和所述共享策略,从所述M个租户的计算任务中过滤掉p个租户的计算任务;根据所述第一节点的标识信息和所述借用策略,从所述M个租户的计算任务中过滤掉q个租户的计算任务;将剩余的M-p个租户的计算任务和剩余的M-q个租户的计算任务取交集。
具体地,上述采用共享策略过滤与采用借用策略过滤两个步骤并没有先后顺序,可以同时进行,本申请实施例对此不作限定。在这种过滤方式下,p个租户与q个租户中可能包括相同的租户,但这并不会对最终的过滤结果造成影响。
在一种具体的实现方式中,例如,M=5,系统中存在租户1的计算任务、租户2的计算任务、租户3的计算任务、租户4的计算任务以及租户5的计算任务,根据第一节点的标识信息以及上述共享策略,过滤掉了租户1的计算任务以及租户2的计算任务,剩余租户3的计算任务、租户4的计算任务以及租户5的计算任务;根据第一节点的标识信息以及上述借用策略,过滤掉了租户2的计算任务以及租户3的计算任务,剩余租户1的计算任务、租户4的计算任务以及租户5的计算任务,最后,将两组剩余的计算任务取交集,得到租户4的计算任务以及租户5的计算任务。
应理解,该资源总控节点过滤掉不满足上述共享策略和借用策略的计算任务可以采用不同的过滤顺序,即可以先根据共享策略过滤,再根据借用策略过滤,也可以先根据借用策略过滤,再根据共享策略过滤,还可以分别按照共享策略和借用策略过滤,最后取两个过滤结果的交集,本申请实施例对此不作限定。
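上述按共享策略与借用策略分别过滤再取交集的过程可以示意如下,其中任务、策略的表示方式均为说明目的而假设:

```python
def allocate_task(node, pending_tasks, share_policy, borrow_policies):
    """为第一节点分配第一计算任务的简化示意:两次过滤后取交集。"""
    # 按节点的共享策略过滤:保留该节点愿意为其提供计算资源的租户的任务
    by_share = [t for t in pending_tasks
                if t.tenant in share_policy.allowed_tenants(node)]
    # 按租户的借用策略过滤:保留允许使用该节点计算资源的租户的任务
    by_borrow = [t for t in pending_tasks
                 if node in borrow_policies[t.tenant].allowed_nodes()]
    # 取交集,再从剩余的计算任务中确定第一计算任务(此处简单取第一个)
    candidates = [t for t in by_share if t in by_borrow]
    return candidates[0] if candidates else None
```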
作为一个可选的实施例,所述第一节点是第一资源区RZ中的节点,所述第一资源区中包括的节点具有相同的共享策略,所述相同的共享策略为所述第一资源区的共享策略。
具体地,对于系统架构400,系统中的节点可以被划分为多个资源区RZ,该多个RZ中包括保留资源区RRZ和共享资源区SRZ。上述第一节点可以对应第一RZ,该第一RZ可以是上述系统架构400中的第一RRZ、第二RRZ和SRZ中的任意一个。在这种情况下,RZ的共享策略就是该RZ中每个节点的共享策略,资源提供者就是RZ,而资源使用者就是租户以及租户的计算任务。对于RRZ而言,RRZ会归属到具体的租户,从这个角度而言,租户可能同时具备资源提供者和资源借用者的双重身份。
应理解,一个RZ应该只包括相同共享策略的节点,这个相同的共享策略即为RZ的共享策略。根据RZ的共享策略,可以确定在该RZ上具有使用权限的租户。可选地,这个使用权限可以包括对存储资源和计算资源的使用,从而实现存储系统与计算系统的一体化,即将存储资源与计算资源拉通考虑。此外,从部署方面讲,就不需要为每个节点都设置共享策略了,为一个RZ设置共享策略即可,有利于节省设置的复杂度。
作为一个可选的实施例,所述共享策略为下列策略中的任意一个:严格保留策略、空闲时共享策略以及公平共享策略,其中,所述严格保留策略用于表示仅允许所述i个租户的计算任务使用所述第一节点的计算资源,所述空闲时共享策略用于表示仅在所述第一节点空闲时允许除所述i个租户之外的其他租户使用所述第一节点的计算资源,所述公平共享策略用于表示允许所述至少一个租户公平地使用所述第一节点的计算资源。
具体地,上述严格保留策略、空闲时共享策略以及公平共享策略可以为节点的共享策略,也可以为RZ的共享策略。换句话说,上述资源总控节点就是根据每个RZ的共享策略来具体区分租户能够使用的RZ的,特别是RRZ和SRZ。严格保留策略即严格保留资源,在严格保留策略下,RZ中的资源仅允许该RZ所属租户使用,即使空闲也不允许其它租户使用;在空闲时共享策略下,该RZ为该RZ所属租户保留资源,但在资源空闲时允许其它租户暂时借用,可选地,该空闲时共享策略可以在该RZ所属租户需要时以最高优先级抢占,保证该RZ所属租户对于该RZ资源的100%权重;公平共享策略即多租户共享资源,在公平共享策略下,该RZ允许多个租户以约定的权重公平地使用其资源。基于上述不同的策略,可以产生不同性质的RZ,例如,具有公平共享策略的RZ为SRZ,具有严格保留策略的RZ为RRZ。
应理解,本申请实施例仅仅以上述三个共享策略为例进行说明,但系统管理员或租户还可以为节点或RZ设置其他不同的共享策略,本申请实施例对此不作限定。
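上述三种共享策略对“某一租户此刻能否使用该节点(或RZ)的计算资源”的判断可以示意如下,其中空闲状态的获取方式为假设,公平共享下的权重分配此处从略:

```python
STRICT_RESERVE = "strict_reserve"     # 严格保留策略
SHARE_WHEN_IDLE = "share_when_idle"   # 空闲时共享策略
FAIR_SHARE = "fair_share"             # 公平共享策略

def can_use_compute(tenant, policy, owners, is_idle):
    """判断租户能否使用具有该共享策略的节点/RZ的计算资源(示意),owners为其所属的i个租户。"""
    if policy == STRICT_RESERVE:
        return tenant in owners               # 即使空闲也仅允许所属租户使用
    if policy == SHARE_WHEN_IDLE:
        return tenant in owners or is_idle    # 空闲时允许其它租户暂时借用
    if policy == FAIR_SHARE:
        return True                           # 多租户按约定的权重公平使用
    return False
```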
作为一个可选的实施例,所述至少一个租户中的第三租户被初始配置至少一个第三节点,所述第三租户的借用策略包括:在所述至少一个第三节点中能够使用的节点的数量小于第一阈值的情况下,所述第三租户允许借用所述第一节点的计算资源;和/或在所述第三租户已借用的节点的数量大于第二阈值的情况下,所述第三租户不允许借用所述第一节 点的计算资源;其中,所述至少一个第三节点不包括所述第一节点。
具体地,租户的借用策略可以由租户配置并存储在数据库中,租户一般都拥有自身的节点资源,即系统会初始配置一部分节点资源为租户提供服务,例如,系统架构400中的租户A对应第一RRZ,该租户A可以使用第一RRZ中的资源。若第一RRZ的资源不够用了,该租户A就需要借用资源,在这种情况下,租户A可以设置自身的借用策略。该借用策略可以是在租户A可用的资源小于第一阈值时允许借用资源,那么,在上述第一阈值为0的情况下,租户A的借用策略为永远不可以借用共享资源;在上述第一阈值足够大的情况下,租户A的借用策略为永远可以借用共享资源。此外,该借用策略可以是在租户A借用的资源大于第二阈值时不再允许该租户A借用资源,还可以是其他的策略,本申请实施例对此不作限定。
例如,租户A提交的作业A正在运行,其期望的策略为RRZ优先,并设置当RRZ连续1分钟无法分配资源时使用SRZ的资源。作业的前100个任务Task1到Task100都运行在RRZ内,Task101在等待调度,1分钟之后,RRZ无空闲资源来运行Task101,Task101会被调度到SRZ上运行。
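租户借用策略的判断逻辑可以示意如下:当自身可用节点数小于第一阈值时才允许借用,已借用节点数大于第二阈值后不再借用;两个阈值均由租户自行配置,此处的取值仅为举例:

```python
def may_borrow(own_usable_nodes, borrowed_nodes, first_threshold, second_threshold):
    """第三租户的借用策略示意:是否允许借用第一节点的计算资源。"""
    if len(borrowed_nodes) > second_threshold:
        return False                                   # 已借用节点数大于第二阈值,不再借用
    return len(own_usable_nodes) < first_threshold     # 自身可用节点不足时才允许借用

# 例如:first_threshold=0 表示永远不借用共享资源;first_threshold取足够大则表示总是可以借用
```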
作为一个可选的实施例,所述借用策略还包括:所述第三租户优先使用第四节点,所述第四节点存储有与所述第三租户的计算任务对应的数据,所述第四节点属于所述第三租户的节点资源。
具体地,可以通过设置租户的借用策略对计算任务的计算位置进行优化,即更倾向于在与计算任务对应的数据的存储节点上调度该计算任务,这样能够提高系统性能以及数据的安全性。
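数据本地性偏好可以作为借用策略中的一个排序因素来示意:在满足策略的候选任务中,优先选择其对应数据恰好存储在该节点上的任务(t.data_nodes为假设的属性):

```python
def prefer_data_local(candidate_tasks, node):
    """优先调度其输入数据存储在该节点上的计算任务(示意)。"""
    local = [t for t in candidate_tasks if node in t.data_nodes]
    return (local or candidate_tasks)[0] if candidate_tasks else None
```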
应理解,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
图6示出了本申请实施例提供的另一任务分配方法600的示意性流程图。该方法600同样可以应用于上述系统架构500,但本申请实施例不限于此。
在S610中,第一节点向资源总控节点发送心跳包,用于索要计算任务;
在S620中,该资源总控节点接收该心跳包,对系统中的所有计算任务按照业务的优先级进行排序;
在S630中,采用预设的限制条件,过滤掉执行时间较长的计算任务;
在S640中,根据该第一节点的共享策略,过滤掉不满足该共享策略的租户的计算任务;
在S650中,根据系统中的至少一个租户的借用策略,过滤掉不满足该借用策略的租户的计算任务;
在S660中,在剩下的计算任务中确定第一计算任务,并将该第一计算任务分配给该第一节点。
具体地,方法600以Hadoop的资源管理(yet another resource negotiator,YARN)系统为例,体现出了系统中资源总控节点给计算节点分配任务的流程。在Hadoop集群中,任务执行本身不存在优先级的概念,任务执行采用先进先出的策略。但是每个任务由于对应的业务不同,存在优先级的高低,而Hadoop集群中任务执行时间有可能会很久,这样就会影响其他任务运行,特别是优先级更高的任务运行。因此需要对系统中的任务执行进行调度。在本申请实施例中,添加了S640和S650两个过滤步骤,通过采用第一节点对计算资源的共享策略以及系统的租户对计算资源的借用策略,为该第一节点分配满足上述策略的计算任务,解耦了资源共享和借用机制,简单易行,提高了用户体验。
应理解,在S660中,第一计算任务的最终确定可以是在剩余的计算任务中任意选取,也可以是按照剩余任务的优先级顺序,选择优先级最高的计算任务作为第一计算任务,本申请实施例对此不作限定。
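方法600的整体流程可以串联为如下示意:按业务优先级排序、按预设限制条件过滤执行时间较长的任务、再依次按共享策略与借用策略过滤,最后在剩余任务中选取优先级最高者;其中任务属性与策略接口均沿用前文的假设:

```python
def handle_heartbeat(node, all_tasks, share_policy, borrow_policies, max_runtime):
    """资源总控节点处理第一节点心跳并为其分配第一计算任务的流程示意。"""
    tasks = sorted(all_tasks, key=lambda t: t.priority, reverse=True)    # S620 按业务优先级排序
    tasks = [t for t in tasks if t.expected_runtime <= max_runtime]      # S630 过滤执行时间较长的任务
    tasks = [t for t in tasks                                            # S640 按共享策略过滤
             if t.tenant in share_policy.allowed_tenants(node)]
    tasks = [t for t in tasks                                            # S650 按借用策略过滤
             if node in borrow_policies[t.tenant].allowed_nodes()]
    return tasks[0] if tasks else None                                   # S660 确定并分配第一计算任务
```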
应理解,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
图7示出了本申请实施例提供的另一系统架构700示意图。上述数据存储方法300、任务分配方法500以及任务分配方法600均可以应用于该系统架构700中,但本申请实施例对此不作限定。
具体地,系统架构700包括三个租户(租户A、租户B和租户C)、三个资源区(resource zone,RZ)以及该三个RZ所对应的节点。上述三个RZ中的每个RZ都具有各自的资源共享策略,用于表示各自的节点资源能够被哪些租户使用。具体地,上述三个RZ可以包括第一RZ、第二RZ以及第三RZ,在预设的资源共享策略下,每个租户具有不同的使用权限,例如,第一RZ能够被所有租户使用,第二RZ仅能够被租户B使用,第三RZ仅能够被租户C使用;又例如,第一RZ能够被租户A和租户B使用,第二RZ能够被租户B和租户C使用,第三RZ仅能够被租户B使用。本申请实施例对此不作限定。
可选地,上述三个RZ包括第一保留资源区(reserved resource zone,RRZ)、第二RRZ和共享资源区(shared resource zone,SRZ),其中,SRZ能够被所有租户使用,而第一RRZ仅能够被租户A使用,第二RRZ仅能够被租户B使用。上述资源区包括了存储资源和计算资源,因此,租户可以分别在对应的RZ中存储数据和/或运行计算任务。
从计算资源与存储资源的角度,RZ可以分为计算RZ和存储RZ,其中计算RZ负责计算资源的调度,例如租户的计算任务、常驻服务等等;存储RZ负责存储资源的调度,即租户数据的放置。因此,系统架构200体现的是存储RZ,系统架构400体现的是计算RZ,但是在通常情况下,计算RZ与存储RZ需要重叠放置,即分配在同一组节点上,以提升系统性能和安全性,系统架构700即示出了计算RZ和存储RZ重叠放置的情况。这样,可以在不同的系统间同步考虑计算资源和存储资源的分布,从而增强资源部署的灵活性。
应理解,上述与RZ对应的节点可以为物理机,可以为虚拟机,也可以为容器,本申请实施例对此不作限定。
还应理解,图7仅仅示例性地示出了三个租户和三个RZ,可选地,该系统架构700还可以包括多个租户以及分别与该多个租户对应的多个RZ,本申请实施例对此不做限定,在一般情况下,仅存在一个SRZ。
上文中结合图1至图7,详细描述了根据本申请实施例的方法,下面将结合图8至图11,详细描述根据本申请实施例的装置。
图8示出了本申请实施例提供的数据存储装置800的示意性框图,该装置800包括:
接收单元810,用于接收第一租户通过客户端发送的数据写入请求,所述数据写入请求用于表示所述第一租户请求存储待写入数据的N个副本,N为大于或等于1的整数;
确定单元820,用于根据所述数据写入请求以及所述第一租户对多个资源区RZ中的每个RZ的存储权限,从所述多个RZ中确定所述第一租户能够使用的至少一个RZ;
所述确定单元820还用于:根据所述数据写入请求以及第一数据分布策略,确定所述N个副本在所述至少一个RZ的分布,所述第一数据分布策略用于表示所述N个副本在所述至少一个RZ中的分布优先级;
存储单元830,用于根据所述N个副本在所述至少一个RZ的分布以及第二数据分布策略,将所述N个副本分别存储至所述至少一个RZ对应的至少一个节点中,所述第二数据分布策略用于表示所述N个副本在所述至少一个RZ中每个RZ对应的多个节点中的分布优先级。
本申请实施例的数据存储装置,通过将租户能够使用的节点划分成至少一个资源区RZ,并分别配置该至少一个RZ的第一数据分布策略以及该至少一个RZ对应的节点的第二数据分布策略,资源总控节点在对数据进行存储时,可以根据该第一数据分布策略和第二数据分布策略进行两阶段决策,由于两个阶段的策略可以独立配置,使得资源总控节点能够对不同阶段的数据分布策略进行组合,根据租户的不同需求以及租户所处的场景,灵活控制租户需要存储的数据在节点的分布,降低了策略部署的复杂度。
可选地,所述至少一个RZ包括第一RZ和第二RZ,所述第一RZ为仅允许所述第一租户使用的保留资源区RRZ,所述第二RZ为允许包括所述第一租户的多个租户使用的共享资源区SRZ。
可选地,所述第一数据分布策略为将所述N个副本优先存储至所述第一RZ中,所述确定单元820具体用于:根据所述数据写入请求、所述第一数据分布策略以及所述第一RZ的空间占用状态,确定所述第一RZ中能够存储所述待写入数据的P个副本,P为大于或等于1的整数,所述空间占用状态用于表示所述第一RZ已被占用的空间大小或剩余的空间大小;在N小于或等于P的情况下,确定所述N个副本分布于所述第一RZ中;在N大于P的情况下,确定所述N个副本中的P个副本分布于所述第一RZ中,所述N个副本中除所述P个副本外的剩余副本分布于所述第二RZ中。
可选地,所述第一数据分布策略为将所述N个副本中的Q个副本存储至所述第二RZ中,Q为大于或等于1的整数,且Q小于或等于N,所述确定单元820具体用于:根据所述数据写入请求以及所述第一数据分布策略,确定所述N个副本中的Q个副本分布于所述第二RZ中,所述N个副本中除所述Q个副本外的剩余N-Q个副本分布于所述第一RZ中。
可选地,所述确定单元820具体用于:根据所述数据写入请求、所述第一数据分布策略以及所述第一RZ的空间占用状态,确定所述第一RZ中能够存储所述待写入数据的P个副本,P为大于或等于1的整数,所述空间占用状态用于表示所述第一RZ已被占用的空间大小或剩余的空间大小;在N-Q小于或等于P的情况下,确定所述N-Q个副本分布于所述第一RZ中;在N-Q大于P的情况下,确定所述N-Q个副本中的P个副本分布于所述第一RZ中,所述N-Q个副本中除所述P个副本外的剩余副本分布于所述第二RZ中。
可选地,所述存储单元830还用于:根据所述第一RZ的空间占用状态,将所述第二RZ中的全部或部分副本存储至所述第一RZ中,所述空间占用状态用于表示所述第一RZ已被占用的空间大小或剩余的空间大小;所述装置还包括:删除单元,用于删除所述第二RZ中的所述全部或部分副本。
应理解,这里的装置800以功能单元的形式体现。这里的术语“单元”可以指应用特有集成电路(application specific integrated circuit,ASIC)、电子电路、用于执行一个或多个软件或固件程序的处理器(例如共享处理器、专有处理器或组处理器等)和存储器、合并逻辑电路和/或其它支持所描述的功能的合适组件。在一个可选例子中,本领域技术人员可以理解,装置800可以具体为上述实施例300中的资源总控节点,装置800可以用于与执行上述方法实施例300的资源总控节点对应的各个流程和/或步骤,为避免重复,在此不再赘述。
图9示出了本申请实施例提供的任务分配装置900的示意性框图,该装置900包括:
接收单元910,用于接收第一节点发送的计算任务分配请求,所述计算任务分配请求用于请求为所述第一节点分配计算任务;
分配单元920,用于根据所述计算任务分配请求、所述第一节点的共享策略以及至少一个租户的借用策略,从所述至少一个租户的计算任务中,为所述第一节点分配第一计算任务,其中,所述共享策略用于表示所述第一节点为所述至少一个租户中i个租户的计算任务提供计算资源,所述借用策略用于表示所述至少一个租户中第一租户允许使用j个节点的计算资源,i和j均为大于0的整数;
发送单元930,用于向所述第一节点发送任务指示信息,所述任务指示信息用于指示所述第一计算任务。
本申请实施例的任务分配装置,通过资源总控节点根据大数据系统中计算节点对计算资源的共享策略以及租户对计算资源的借用策略,对计算节点和租户提交的计算任务进行灵活匹配,从而为计算节点分配满足策略的计算任务,解耦了资源共享和借用机制,简单易行,提高了用户体验。
可选地,所述装置还包括:匹配单元,用于根据所述计算任务分配请求,将所述至少一个租户的计算任务与所述共享策略以及所述借用策略进行匹配;过滤单元,用于从所述至少一个租户的计算任务中过滤掉不满足所述共享策略以及所述借用策略的m个租户的计算任务,m为大于或等于1的整数;确定单元,用于从除所述m个租户的计算任务外的剩余的计算任务中确定所述第一计算任务。
可选地,所述计算任务分配请求包括所述第一节点的标识信息,所述过滤单元具体用于:根据所述第一节点的标识信息和所述共享策略,过滤掉p个第一租户的计算任务,所述p个第一租户不属于所述i个租户,p为大于或等于0的整数;根据所述第一节点的标识信息和所述借用策略,在除所述p个第一租户的计算任务外的剩余租户的计算任务中过滤掉m-p个第二租户的计算任务,所述第一节点不属于所述j个节点。
可选地,所述第一节点是第一资源区RZ中的节点,所述第一资源区中包括的节点具有相同的共享策略,所述相同的共享策略为所述第一资源区的共享策略。
可选地,所述共享策略为下列策略中的任意一个:严格保留策略、空闲时共享策略以及公平共享策略,其中,所述严格保留策略用于表示仅允许所述i个租户的计算任务使用所述第一节点的计算资源,所述空闲时共享策略用于表示仅在所述第一节点空闲时允许除所述i个租户之外的其他租户使用所述第一节点的计算资源,所述公平共享策略用于表示允许所述至少一个租户公平地使用所述第一节点的计算资源。
可选地,所述至少一个租户中的第三租户被初始配置至少一个第三节点,所述第三租户的借用策略包括:在所述至少一个第三节点中能够使用的节点的数量小于第一阈值的情况下,所述第三租户允许借用所述第一节点的计算资源;和/或在所述第三租户已借用的节点的数量大于第二阈值的情况下,所述第三租户不允许借用所述第一节点的计算资源;其中,所述至少一个第三节点不包括所述第一节点。
可选地,所述借用策略还包括:所述第三租户优先使用第四节点,所述第四节点存储有与所述第三租户的计算任务对应的数据,所述第四节点属于所述第三租户的节点资源。
应理解,这里的装置900以功能单元的形式体现。这里的术语“单元”可以指应用特有集成电路(application specific integrated circuit,ASIC)、电子电路、用于执行一个或多个软件或固件程序的处理器(例如共享处理器、专有处理器或组处理器等)和存储器、合并逻辑电路和/或其它支持所描述的功能的合适组件。在一个可选例子中,本领域技术人员可以理解,装置900可以具体为上述实施例500或600中的资源总控节点,装置900可以用于与执行上述方法实施例500或600的资源总控节点对应的各个流程和/或步骤,为避免重复,在此不再赘述。
图10示出了本申请实施例提供的另一数据存储装置1000的示意性框图。该装置1000包括处理器1010、收发器1020和存储器1030。其中,处理器1010、收发器1020和存储器1030通过内部连接通路互相通信,该存储器1030用于存储指令,该处理器1010用于执行该存储器1030存储的指令,以控制该收发器1020发送信号和/或接收信号。
其中,该收发器1020用于:接收第一租户通过客户端发送的数据写入请求,所述数据写入请求用于表示所述第一租户请求存储待写入数据的N个副本,N为大于或等于1的整数;该处理器1010用于:根据所述数据写入请求以及所述第一租户对多个资源区RZ中的每个RZ的存储权限,从所述多个RZ中确定所述第一租户能够使用的至少一个RZ;根据所述数据写入请求以及第一数据分布策略,确定所述N个副本在所述至少一个RZ的分布,所述第一数据分布策略用于表示所述N个副本在所述至少一个RZ中的分布优先级;根据所述N个副本在所述至少一个RZ的分布以及第二数据分布策略,将所述N个副本分别存储至所述至少一个RZ对应的至少一个节点中,所述第二数据分布策略用于表示所述N个副本在所述至少一个RZ中每个RZ对应的多个节点中的分布优先级。
应理解,装置1000可以具体为上述实施例300中的资源总控节点,并且可以用于执行上述方法实施例300中与资源总控节点对应的各个步骤和/或流程。可选地,该存储器1030可以包括只读存储器和随机存取存储器,并向处理器提供指令和数据。存储器的一部分还可以包括非易失性随机存取存储器。例如,存储器还可以存储设备类型的信息。该处理器1010可以用于执行存储器中存储的指令,并且当该处理器1010执行存储器中存储的指令时,该处理器1010用于执行与上述实施例300的资源总控节点对应的各个步骤和/或流程。
图11示出了本申请实施例提供的另一任务分配装置1100的示意性框图。该装置1100包括处理器1110、收发器1120和存储器1130。其中,处理器1110、收发器1120和存储器1130通过内部连接通路互相通信,该存储器1130用于存储指令,该处理器1110用于执行该存储器1130存储的指令,以控制该收发器1120发送信号和/或接收信号。
其中,该收发器1120用于接收第一节点发送的计算任务分配请求,所述计算任务分 配请求用于请求为所述第一节点分配计算任务;该处理器1110用于根据所述计算任务分配请求、所述第一节点的共享策略以及至少一个租户的借用策略,从所述至少一个租户的计算任务中,为所述第一节点分配第一计算任务,其中,所述共享策略用于表示所述第一节点为所述至少一个租户中i个租户的计算任务提供计算资源,所述借用策略用于表示所述至少一个租户中第一租户允许使用j个节点的计算资源,i和j均为大于0的整数;该收发器1120用于向所述第一节点发送任务指示信息,所述任务指示信息用于指示所述第一计算任务。
应理解,装置1100可以具体为上述实施例500或600中的资源总控节点,并且可以用于执行上述方法实施例500或600的资源总控节点对应的各个步骤和/或流程。可选地,该存储器1130可以包括只读存储器和随机存取存储器,并向处理器提供指令和数据。存储器的一部分还可以包括非易失性随机存取存储器。例如,存储器还可以存储设备类型的信息。该处理器1110可以用于执行存储器中存储的指令,并且当该处理器1110执行存储器中存储的指令时,该处理器1110用于执行与上述实施例500或600的资源总控节点对应的各个步骤和/或流程。
在本申请实施例中,资源总控节点可以是具备上述数据存储功能和/或任务分配功能的任意装置,即资源总控节点可以仅用于执行上述数据存储方法,可以仅用于执行上述任务分配方法,还可以既用于执行上述数据存储方法,又用于执行上述任务分配方法,本申请实施例对此不作限定。
应理解,在本申请实施例中,上述装置的处理器可以是中央处理单元(central processing unit,CPU),该处理器还可以是其他通用处理器、数字信号处理器(digital signal processing,DSP)、专用集成电路(application specific integrated circuit,ASIC)、现场可编程门阵列(field-programmable gate array,FPGA)或者其他可编程逻辑器件、分立门或者晶体管逻辑器件、分立硬件组件等。通用处理器可以是微处理器或者该处理器也可以是任何常规的处理器等。
在实现过程中,上述方法的各步骤可以通过处理器中的硬件的集成逻辑电路或者软件形式的指令完成。结合本申请实施例所公开的方法的步骤可以直接体现为硬件处理器执行完成,或者用处理器中的硬件及软件单元组合执行完成。软件单元可以位于随机存储器,闪存、只读存储器,可编程只读存储器或者电可擦写可编程存储器、寄存器等本领域成熟的存储介质中。该存储介质位于存储器,处理器执行存储器中的指令,结合其硬件完成上述方法的步骤。为避免重复,这里不再详细描述。
应理解,说明书通篇中提到的“一个实施例”或“一实施例”意味着与实施例有关的特定特征、结构或特性包括在本申请的至少一个实施例中。因此,在整个说明书各处出现的“在一个实施例中”或“在一实施例中”未必一定指相同的实施例。此外,这些特定的特征、结构或特性可以任意适合的方式结合在一个或多个实施例中。应理解,在本申请的各种实施例中,上述各过程的序号的大小并不意味着执行顺序的先后,各过程的执行顺序应以其功能和内在逻辑确定,而不应对本申请实施例的实施过程构成任何限定。
另外,本文中术语“系统”和“网络”在本文中常被可互换使用。本文中术语“和/或”,仅仅是一种描述关联对象的关联关系,表示可以存在三种关系,例如,A和/或B,可以表示:单独存在A,同时存在A和B,单独存在B这三种情况。另外,本文中字符 “/”,一般表示前后关联对象是一种“或”的关系。
应理解,在本申请实施例中,“与A相应的B”表示B与A相关联,根据A可以确定B。但还应理解,根据A确定B并不意味着仅仅根据A确定B,还可以根据A和/或其它信息确定B。
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、计算机软件或者二者的结合来实现,为了清楚地说明硬件和软件的可互换性,在上述说明中已经按照功能一般性地描述了各示例的组成及步骤。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。
所属领域的技术人员可以清楚地了解到,为了描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另外,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口、装置或单元的间接耦合或通信连接,也可以是电的,机械的或其它的形式连接。
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本申请实施例方案的目的。
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以是两个或两个以上单元集成在一个单元中。上述集成的单元既可以采用硬件的形式实现,也可以采用软件功能单元的形式实现。
所述集成的单元如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分,或者该技术方案的全部或部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到各种等效的修改或替换,这些修改或替换都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以权利要求的保护范围为准。

Claims (41)

  1. 一种数据存储方法,其特征在于,包括:
    接收第一租户通过客户端发送的数据写入请求,所述数据写入请求用于表示所述第一租户请求存储待写入数据的N个副本,N为大于或等于1的整数;
    根据所述数据写入请求以及所述第一租户对多个资源区RZ中的每个RZ的存储权限,从所述多个RZ中确定所述第一租户能够使用的至少一个RZ;
    根据所述数据写入请求以及第一数据分布策略,确定所述N个副本在所述至少一个RZ的分布,所述第一数据分布策略用于表示所述N个副本在所述至少一个RZ中的分布优先级;
    根据所述N个副本在所述至少一个RZ的分布以及第二数据分布策略,将所述N个副本分别存储至所述至少一个RZ对应的至少一个节点中,所述第二数据分布策略用于表示所述N个副本在所述至少一个RZ中每个RZ对应的多个节点中的分布优先级。
  2. 根据权利要求1所述的方法,其特征在于,所述至少一个RZ包括第一RZ和第二RZ,所述第一RZ为仅允许所述第一租户使用的保留资源区RRZ,所述第二RZ为允许包括所述第一租户的多个租户使用的共享资源区SRZ。
  3. 根据权利要求2所述的方法,其特征在于,所述第一数据分布策略为将所述N个副本优先存储至所述第一RZ中,
    所述根据所述数据写入请求以及第一数据分布策略,确定所述N个副本在至少一个RZ的分布,包括:
    根据所述数据写入请求、所述第一数据分布策略以及所述第一RZ的空间占用状态,确定所述第一RZ中能够存储所述待写入数据的P个副本,P为大于或等于1的整数,所述空间占用状态用于表示所述第一RZ已被占用的空间大小或剩余的空间大小;
    在N小于或等于P的情况下,确定所述N个副本分布于所述第一RZ中;
    在N大于P的情况下,确定所述N个副本中的P个副本分布于所述第一RZ中,所述N个副本中除所述P个副本外的剩余副本分布于所述第二RZ中。
  4. 根据权利要求2所述的方法,其特征在于,所述第一数据分布策略为将所述N个副本中的Q个副本存储至所述第二RZ中,Q为大于或等于1的整数,且Q小于或等于N,
    所述根据所述数据写入请求以及第一数据分布策略,确定所述N个副本在至少一个RZ的分布,包括:
    根据所述数据写入请求以及所述第一数据分布策略,确定所述N个副本中的Q个副本分布于所述第二RZ中,所述N个副本中除所述Q个副本外的剩余N-Q个副本分布于所述第一RZ中。
  5. 根据权利要求4所述的方法,其特征在于,所述根据所述数据写入请求以及所述第一数据分布策略,确定所述N个副本中除所述Q个副本外的剩余N-Q个副本分布于所述第一RZ中,包括:
    根据所述数据写入请求、所述第一数据分布策略以及所述第一RZ的空间占用状态,确定所述第一RZ中能够存储所述待写入数据的P个副本,P为大于或等于1的整数,所述空间占用状态用于表示所述第一RZ已被占用的空间大小或剩余的空间大小;
    在N-Q小于或等于P的情况下,确定所述N-Q个副本分布于所述第一RZ中;
    在N-Q大于P的情况下,确定所述N-Q个副本中的P个副本分布于所述第一RZ中,所述N-Q个副本中除所述P个副本外的剩余副本分布于所述第二RZ中。
  6. 根据权利要求2至5中任一项所述的方法,其特征在于,所述方法还包括:
    根据所述第一RZ的空间占用状态,将所述第二RZ中的全部或部分副本存储至所述第一RZ中,所述空间占用状态用于表示所述第一RZ已被占用的空间大小或剩余的空间大小;
    删除所述第二RZ中的所述全部或部分副本。
  7. 一种任务分配方法,其特征在于,包括:
    接收第一节点发送的计算任务分配请求,所述计算任务分配请求用于请求为所述第一节点分配计算任务;
    根据所述计算任务分配请求、所述第一节点的共享策略以及至少一个租户的借用策略,从所述至少一个租户的计算任务中,为所述第一节点分配第一计算任务,其中,所述共享策略用于表示所述第一节点为所述至少一个租户中i个租户的计算任务提供计算资源,所述借用策略用于表示所述至少一个租户中第一租户允许使用j个节点的计算资源,i和j均为大于0的整数;
    向所述第一节点发送任务指示信息,所述任务指示信息用于指示所述第一计算任务。
  8. 根据权利要求7所述的方法,其特征在于,所述根据所述计算任务分配请求、所述第一节点的共享策略以及至少一个租户中每个租户的借用策略,从所述至少一个租户的计算任务中,为所述第一节点分配第一计算任务,包括:
    根据所述计算任务分配请求,将所述至少一个租户的计算任务与所述共享策略以及所述借用策略进行匹配;
    从所述至少一个租户的计算任务中过滤掉不满足所述共享策略以及所述借用策略的m个租户的计算任务,m为大于或等于1的整数;
    从除所述m个租户的计算任务外的剩余的计算任务中确定所述第一计算任务。
  9. 根据权利要求8所述的方法,其特征在于,所述计算任务分配请求包括所述第一节点的标识信息,所述从所述至少一个租户的计算任务中过滤掉不满足所述共享策略以及所述借用策略的m个租户的计算任务,包括:
    根据所述第一节点的标识信息和所述共享策略,过滤掉p个第一租户的计算任务,所述p个第一租户不属于所述i个租户,p为大于或等于0的整数;
    根据所述第一节点的标识信息和所述借用策略,在除所述p个第一租户的计算任务外的剩余租户的计算任务中过滤掉m-p个第二租户的计算任务,所述第一节点不属于所述j个节点。
  10. 根据权利要求7至9中任一项所述的方法,其特征在于,所述第一节点是第一资源区RZ中的节点,所述第一资源区中包括的节点具有相同的共享策略,所述相同的共享策略为所述第一资源区的共享策略。
  11. 根据权利要求7至10中任一项所述的方法,其特征在于,所述共享策略为下列策略中的任意一个:
    严格保留策略、空闲时共享策略以及公平共享策略,
    其中,所述严格保留策略用于表示仅允许所述i个租户的计算任务使用所述第一节点的计算资源,所述空闲时共享策略用于表示仅在所述第一节点空闲时允许除所述i个租户之外的其他租户使用所述第一节点的计算资源,所述公平共享策略用于表示允许所述至少一个租户公平地使用所述第一节点的计算资源。
  12. 根据权利要求7至11中任一项所述的方法,其特征在于,所述至少一个租户中的第三租户被初始配置至少一个第三节点,所述第三租户的借用策略包括:
    在所述至少一个第三节点中能够使用的节点的数量小于第一阈值的情况下,所述第三租户允许借用所述第一节点的计算资源;和/或
    在所述第三租户已借用的节点的数量大于第二阈值的情况下,所述第三租户不允许借用所述第一节点的计算资源;
    其中,所述至少一个第三节点不包括所述第一节点。
  13. 根据权利要求12所述的方法,其特征在于,所述借用策略还包括:
    所述第三租户优先使用第四节点,所述第四节点存储有与所述第三租户的计算任务对应的数据,所述第四节点属于所述第三租户的节点资源。
  14. 一种数据存储装置,其特征在于,包括:
    接收单元,用于接收第一租户通过客户端发送的数据写入请求,所述数据写入请求用于表示所述第一租户请求存储待写入数据的N个副本,N为大于或等于1的整数;
    确定单元,用于根据所述数据写入请求以及所述第一租户对多个资源区RZ中的每个RZ的存储权限,从所述多个RZ中确定所述第一租户能够使用的至少一个RZ;
    所述确定单元还用于:
    根据所述数据写入请求以及第一数据分布策略,确定所述N个副本在所述至少一个RZ的分布,所述第一数据分布策略用于表示所述N个副本在所述至少一个RZ中的分布优先级;
    存储单元,用于根据所述N个副本在所述至少一个RZ的分布以及第二数据分布策略,将所述N个副本分别存储至所述至少一个RZ对应的至少一个节点中,所述第二数据分布策略用于表示所述N个副本在所述至少一个RZ中每个RZ对应的多个节点中的分布优先级。
  15. 根据权利要求14所述的装置,其特征在于,所述至少一个RZ包括第一RZ和第二RZ,所述第一RZ为仅允许所述第一租户使用的保留资源区RRZ,所述第二RZ为允许包括所述第一租户的多个租户使用的共享资源区SRZ。
  16. 根据权利要求15所述的装置,其特征在于,所述第一数据分布策略为将所述N个副本优先存储至所述第一RZ中,
    所述确定单元具体用于:
    根据所述数据写入请求、所述第一数据分布策略以及所述第一RZ的空间占用状态,确定所述第一RZ中能够存储所述待写入数据的P个副本,P为大于或等于1的整数,所述空间占用状态用于表示所述第一RZ已被占用的空间大小或剩余的空间大小;
    在N小于或等于P的情况下,确定所述N个副本分布于所述第一RZ中;
    在N大于P的情况下,确定所述N个副本中的P个副本分布于所述第一RZ中,所述N个副本中除所述P个副本外的剩余副本分布于所述第二RZ中。
  17. 根据权利要求15所述的装置,其特征在于,所述第一数据分布策略为将所述N个副本中的Q个副本存储至所述第二RZ中,Q为大于或等于1的整数,且Q小于或等于N,
    所述确定单元具体用于:
    根据所述数据写入请求以及所述第一数据分布策略,确定所述N个副本中的Q个副本分布于所述第二RZ中,所述N个副本中除所述Q个副本外的剩余N-Q个副本分布于所述第一RZ中。
  18. 根据权利要求17所述的装置,其特征在于,所述确定单元具体用于:
    根据所述数据写入请求、所述第一数据分布策略以及所述第一RZ的空间占用状态,确定所述第一RZ中能够存储所述待写入数据的P个副本,P为大于或等于1的整数,所述空间占用状态用于表示所述第一RZ已被占用的空间大小或剩余的空间大小;
    在N-Q小于或等于P的情况下,确定所述N-Q个副本分布于所述第一RZ中;
    在N-Q大于P的情况下,确定所述N-Q个副本中的P个副本分布于所述第一RZ中,所述N-Q个副本中除所述P个副本外的剩余副本分布于所述第二RZ中。
  19. 根据权利要求15至18中任一项所述的装置,其特征在于,所述存储单元还用于:
    根据所述第一RZ的空间占用状态,将所述第二RZ中的全部或部分副本存储至所述第一RZ中,所述空间占用状态用于表示所述第一RZ已被占用的空间大小或剩余的空间大小;
    所述装置还包括:
    删除单元,用于删除所述第二RZ中的所述全部或部分副本。
  20. 一种任务分配装置,其特征在于,包括:
    接收单元,用于接收第一节点发送的计算任务分配请求,所述计算任务分配请求用于请求为所述第一节点分配计算任务;
    分配单元,用于根据所述计算任务分配请求、所述第一节点的共享策略以及至少一个租户的借用策略,从所述至少一个租户的计算任务中,为所述第一节点分配第一计算任务,其中,所述共享策略用于表示所述第一节点为所述至少一个租户中i个租户的计算任务提供计算资源,所述借用策略用于表示所述至少一个租户中第一租户允许使用j个节点的计算资源,i和j均为大于0的整数;
    发送单元,用于向所述第一节点发送任务指示信息,所述任务指示信息用于指示所述第一计算任务。
  21. 根据权利要求20所述的装置,其特征在于,所述装置还包括:
    匹配单元,用于根据所述计算任务分配请求,将所述至少一个租户的计算任务与所述共享策略以及所述借用策略进行匹配;
    过滤单元,用于从所述至少一个租户的计算任务中过滤掉不满足所述共享策略以及所述借用策略的m个租户的计算任务,m为大于或等于1的整数;
    确定单元,用于从除所述m个租户的计算任务外的剩余的计算任务中确定所述第一计算任务。
  22. 根据权利要求21所述的装置,其特征在于,所述计算任务分配请求包括所述第一节点的标识信息,所述过滤单元具体用于:
    根据所述第一节点的标识信息和所述共享策略,过滤掉p个第一租户的计算任务,所述p个第一租户不属于所述i个租户,p为大于或等于0的整数;
    根据所述第一节点的标识信息和所述借用策略,在除所述p个第一租户的计算任务外的剩余租户的计算任务中过滤掉m-p个第二租户的计算任务,所述第一节点不属于所述j个节点。
  23. 根据权利要求20至22中任一项所述的装置,其特征在于,所述第一节点是第一资源区RZ中的节点,所述第一资源区中包括的节点具有相同的共享策略,所述相同的共享策略为所述第一资源区的共享策略。
  24. 根据权利要求20至23中任一项所述的装置,其特征在于,所述共享策略为下列策略中的任意一个:
    严格保留策略、空闲时共享策略以及公平共享策略,
    其中,所述严格保留策略用于表示仅允许所述i个租户的计算任务使用所述第一节点的计算资源,所述空闲时共享策略用于表示仅在所述第一节点空闲时允许除所述i个租户之外的其他租户使用所述第一节点的计算资源,所述公平共享策略用于表示允许所述至少一个租户公平地使用所述第一节点的计算资源。
  25. 根据权利要求20至24中任一项所述的装置,其特征在于,所述至少一个租户中的第三租户被初始配置至少一个第三节点,所述第三租户的借用策略包括:
    在所述至少一个第三节点中能够使用的节点的数量小于第一阈值的情况下,所述第三租户允许借用所述第一节点的计算资源;和/或
    在所述第三租户已借用的节点的数量大于第二阈值的情况下,所述第三租户不允许借用所述第一节点的计算资源;
    其中,所述至少一个第三节点不包括所述第一节点。
  26. 根据权利要求25所述的装置,其特征在于,所述借用策略还包括:
    所述第三租户优先使用第四节点,所述第四节点存储有与所述第三租户的计算任务对应的数据,所述第四节点属于所述第三租户的节点资源。
  27. 一种数据存储装置,其特征在于,包括:收发器、存储器以及处理器,其中,所述存储器用于存储指令,所述处理器与所述存储器和所述收发器相连,用于执行所述存储器存储的所述指令,以在执行所述指令时执行如下步骤:
    通过所述收发器接收第一租户通过客户端发送的数据写入请求,所述数据写入请求用于表示所述第一租户请求存储待写入数据的N个副本,N为大于或等于1的整数;
    根据所述数据写入请求以及所述第一租户对多个资源区RZ中的每个RZ的存储权限,从所述多个RZ中确定所述第一租户能够使用的至少一个RZ;
    根据所述数据写入请求以及第一数据分布策略,确定所述N个副本在所述至少一个RZ的分布,所述第一数据分布策略用于表示所述N个副本在所述至少一个RZ中的分布优先级;
    根据所述N个副本在所述至少一个RZ的分布以及第二数据分布策略,将所述N个副本分别存储至所述至少一个RZ对应的至少一个节点中,所述第二数据分布策略用于表示所述N个副本在所述至少一个RZ中每个RZ对应的多个节点中的分布优先级。
  28. 根据权利要求27所述的装置,其特征在于,所述至少一个RZ包括第一RZ和第二RZ,所述第一RZ为仅允许所述第一租户使用的保留资源区RRZ,所述第二RZ为允许包括所述第一租户的多个租户使用的共享资源区SRZ。
  29. 根据权利要求28所述的装置,其特征在于,所述第一数据分布策略为将所述N个副本优先存储至所述第一RZ中,
    所述处理器具体用于:
    根据所述数据写入请求、所述第一数据分布策略以及所述第一RZ的空间占用状态,确定所述第一RZ中能够存储所述待写入数据的P个副本,P为大于或等于1的整数,所述空间占用状态用于表示所述第一RZ已被占用的空间大小或剩余的空间大小;
    在N小于或等于P的情况下,确定所述N个副本分布于所述第一RZ中;
    在N大于P的情况下,确定所述N个副本中的P个副本分布于所述第一RZ中,所述N个副本中除所述P个副本外的剩余副本分布于所述第二RZ中。
  30. 根据权利要求28所述的装置,其特征在于,所述第一数据分布策略为将所述N个副本中的Q个副本存储至所述第二RZ中,Q为大于或等于1的整数,且Q小于或等于N,
    所述处理器具体用于:
    根据所述数据写入请求以及所述第一数据分布策略,确定所述N个副本中的Q个副本分布于所述第二RZ中,所述N个副本中除所述Q个副本外的剩余N-Q个副本分布于所述第一RZ中。
  31. 根据权利要求30所述的装置,其特征在于,所述处理器具体用于:
    根据所述数据写入请求、所述第一数据分布策略以及所述第一RZ的空间占用状态,确定所述第一RZ中能够存储所述待写入数据的P个副本,P为大于或等于1的整数,所述空间占用状态用于表示所述第一RZ已被占用的空间大小或剩余的空间大小;
    在N-Q小于或等于P的情况下,确定所述N-Q个副本分布于所述第一RZ中;
    在N-Q大于P的情况下,确定所述N-Q个副本中的P个副本分布于所述第一RZ中,所述N-Q个副本中除所述P个副本外的剩余副本分布于所述第二RZ中。
  32. 根据权利要求28至31中任一项所述的装置,其特征在于,所述处理器还用于:
    根据所述第一RZ的空间占用状态,将所述第二RZ中的全部或部分副本存储至所述第一RZ中,所述空间占用状态用于表示所述第一RZ已被占用的空间大小或剩余的空间大小;
    删除所述第二RZ中的所述全部或部分副本。
  33. 一种任务分配装置,其特征在于,包括:收发器、存储器以及处理器,其中,所述存储器用于存储指令,所述处理器与所述存储器和所述收发器相连,用于执行所述存储器存储的所述指令,以在执行所述指令时执行如下步骤:
    通过所述收发器接收第一节点发送的计算任务分配请求,所述计算任务分配请求用于请求为所述第一节点分配计算任务;
    根据所述计算任务分配请求、所述第一节点的共享策略以及至少一个租户的借用策略,从所述至少一个租户的计算任务中,为所述第一节点分配第一计算任务,其中,所述共享策略用于表示所述第一节点为所述至少一个租户中i个租户的计算任务提供计算资源,所述借用策略用于表示所述至少一个租户中第一租户允许使用j个节点的计算资源,i和j均为大于0的整数;
    通过所述收发器向所述第一节点发送任务指示信息,所述任务指示信息用于指示所述第一计算任务。
  34. 根据权利要求33所述的装置,其特征在于,所述处理器具体用于:
    根据所述计算任务分配请求,将所述至少一个租户的计算任务与所述共享策略以及所述借用策略进行匹配;
    从所述至少一个租户的计算任务中过滤掉不满足所述共享策略以及所述借用策略的m个租户的计算任务,m为大于或等于1的整数;
    从除所述m个租户的计算任务外的剩余的计算任务中确定所述第一计算任务。
  35. 根据权利要求34所述的装置,其特征在于,所述计算任务分配请求包括所述第一节点的标识信息,所述处理器具体用于:
    根据所述第一节点的标识信息和所述共享策略,过滤掉p个第一租户的计算任务,所述p个第一租户不属于所述i个租户,p为大于或等于0的整数;
    根据所述第一节点的标识信息和所述借用策略,在除所述p个第一租户的计算任务外的剩余租户的计算任务中过滤掉m-p个第二租户的计算任务,所述第一节点不属于所述j个节点。
  36. 根据权利要求33至35中任一项所述的装置,其特征在于,所述第一节点是第一资源区RZ中的节点,所述第一资源区中包括的节点具有相同的共享策略,所述相同的共享策略为所述第一资源区的共享策略。
  37. 根据权利要求33至36中任一项所述的装置,其特征在于,所述共享策略为下列策略中的任意一个:
    严格保留策略、空闲时共享策略以及公平共享策略,
    其中,所述严格保留策略用于表示仅允许所述i个租户的计算任务使用所述第一节点的计算资源,所述空闲时共享策略用于表示仅在所述第一节点空闲时允许除所述i个租户之外的其他租户使用所述第一节点的计算资源,所述公平共享策略用于表示允许所述至少一个租户公平地使用所述第一节点的计算资源。
  38. 根据权利要求33至37中任一项所述的装置,其特征在于,所述至少一个租户中的第三租户被初始配置至少一个第三节点,所述第三租户的借用策略包括:
    在所述至少一个第三节点中能够使用的节点的数量小于第一阈值的情况下,所述第三租户允许借用所述第一节点的计算资源;和/或
    在所述第三租户已借用的节点的数量大于第二阈值的情况下,所述第三租户不允许借用所述第一节点的计算资源;
    其中,所述至少一个第三节点不包括所述第一节点。
  39. 根据权利要求38所述的装置,其特征在于,所述借用策略还包括:
    所述第三租户优先使用第四节点,所述第四节点存储有与所述第三租户的计算任务对应的数据,所述第四节点属于所述第三租户的节点资源。
  40. 一种计算机可读存储介质,包括指令,当所述指令在计算机上运行时,使得所述计算机执行如权利要求1至6中任一项所述的方法。
  41. 一种计算机可读存储介质,包括指令,当所述指令在计算机上运行时,使得所述计算机执行如权利要求7至13中任一项所述的方法。
PCT/CN2018/073315 2017-03-29 2018-01-19 数据存储方法及装置 WO2018176998A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
EP18776326.3A EP3594798B1 (en) 2017-03-29 2018-01-19 Data storage method and device
US16/586,074 US10972542B2 (en) 2017-03-29 2019-09-27 Data storage method and apparatus
US17/198,908 US11575748B2 (en) 2017-03-29 2021-03-11 Data storage method and apparatus for combining different data distribution policies

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201710198809.3 2017-03-29
CN201710198809.3A CN108667867B (zh) 2017-03-29 2017-03-29 数据存储方法及装置

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/586,074 Continuation US10972542B2 (en) 2017-03-29 2019-09-27 Data storage method and apparatus

Publications (1)

Publication Number Publication Date
WO2018176998A1 true WO2018176998A1 (zh) 2018-10-04

Family

ID=63675218

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/073315 WO2018176998A1 (zh) 2017-03-29 2018-01-19 数据存储方法及装置

Country Status (4)

Country Link
US (2) US10972542B2 (zh)
EP (1) EP3594798B1 (zh)
CN (1) CN108667867B (zh)
WO (1) WO2018176998A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111580755A (zh) * 2020-05-09 2020-08-25 杭州海康威视系统技术有限公司 分布式数据处理系统、分布式数据处理方法

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11029999B1 (en) * 2018-09-06 2021-06-08 Amazon Technologies, Inc. Lottery-based resource allocation with capacity guarantees
CN109960587A (zh) * 2019-02-27 2019-07-02 厦门市世纪网通网络服务有限公司 超融合云计算系统的存储资源分配方法和装置
US20200364093A1 (en) * 2019-05-14 2020-11-19 Pricewaterhousecoopers Llp System and methods for generating secure ephemeral cloud-based computing resources for data operations
US11157186B2 (en) * 2019-06-24 2021-10-26 Western Digital Technologies, Inc. Distributed object storage system with dynamic spreading
CN112578992B (zh) * 2019-09-27 2022-07-22 西安华为技术有限公司 一种数据存储方法和数据存储装置
US11082487B1 (en) * 2020-09-22 2021-08-03 Vignet Incorporated Data sharing across decentralized clinical trials using customized data access policies
US11875046B2 (en) * 2021-02-05 2024-01-16 Samsung Electronics Co., Ltd. Systems and methods for storage device resource management
CN113448726B (zh) * 2021-05-28 2022-11-04 山东英信计算机技术有限公司 一种资源调度方法和装置
US11687492B2 (en) * 2021-06-21 2023-06-27 International Business Machines Corporation Selective data deduplication in a multitenant environment
JP7412405B2 (ja) * 2021-12-23 2024-01-12 株式会社日立製作所 情報処理システム、情報処理方法
US11790107B1 (en) 2022-11-03 2023-10-17 Vignet Incorporated Data sharing platform for researchers conducting clinical trials
CN115994019B (zh) * 2023-01-10 2023-06-06 杭州比智科技有限公司 基于大数据集群下多租户资源动态计算的策略方法及系统

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102096602A (zh) * 2009-12-15 2011-06-15 中国移动通信集团公司 一种任务调度方法及其系统和设备
US20130166207A1 (en) * 2011-12-21 2013-06-27 Telenav, Inc. Navigation system with point of interest harvesting mechanism and method of operation thereof
CN103384550A (zh) * 2012-12-28 2013-11-06 华为技术有限公司 储存数据的方法及装置
CN105630418A (zh) * 2015-12-24 2016-06-01 曙光信息产业(北京)有限公司 一种数据存储方法及装置
CN106095586A (zh) * 2016-06-23 2016-11-09 东软集团股份有限公司 一种任务分配方法、装置及系统
CN106201338A (zh) * 2016-06-28 2016-12-07 华为技术有限公司 数据存储方法及装置

Family Cites Families (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2004110367A (ja) 2002-09-18 2004-04-08 Hitachi Ltd 記憶装置システムの制御方法、記憶制御装置、および記憶装置システム
CN100385860C (zh) * 2005-04-29 2008-04-30 北京邦诺存储科技有限公司 一种保障存储网络数据安全的方法及装置
CN101499061A (zh) * 2008-01-30 2009-08-05 国际商业机器公司 面向多租户的数据库引擎及其数据访问方法
US8560639B2 (en) * 2009-04-24 2013-10-15 Microsoft Corporation Dynamic placement of replica data
US8566520B1 (en) * 2009-10-05 2013-10-22 Marvell International Ltd. Storage space allocation for logical disk creation
US8161077B2 (en) * 2009-10-21 2012-04-17 Delphix Corp. Datacenter workflow automation scenarios using virtual databases
CN102110060A (zh) * 2009-12-25 2011-06-29 联想(北京)有限公司 一种管理并访问多存储区域的方法和终端
KR101544485B1 (ko) * 2011-04-25 2015-08-17 주식회사 케이티 클라우드 스토리지 시스템에서 복수개의 복제본을 분산 저장하는 방법 및 장치
CN102664923A (zh) 2012-03-30 2012-09-12 浪潮电子信息产业股份有限公司 一种利用Linux全局文件系统实现共享存储池的方法
CN102663096B (zh) * 2012-04-11 2015-12-16 北京像素软件科技股份有限公司 一种基于数据缓存技术读取数据的方法
US10169090B2 (en) * 2012-09-12 2019-01-01 Salesforce.Com, Inc. Facilitating tiered service model-based fair allocation of resources for application servers in multi-tenant environments
CN102946429A (zh) * 2012-11-07 2013-02-27 浪潮电子信息产业股份有限公司 一种基于云存储的高效资源动态调度方法
CN103873507A (zh) * 2012-12-12 2014-06-18 鸿富锦精密工业(深圳)有限公司 数据分块上传与存储系统及方法
US9571567B2 (en) 2013-03-14 2017-02-14 Vmware, Inc. Methods and systems to manage computer resources in elastic multi-tenant cloud computing systems
US20140280595A1 (en) * 2013-03-15 2014-09-18 Polycom, Inc. Cloud Based Elastic Load Allocation for Multi-media Conferencing
US9794135B2 (en) * 2013-11-11 2017-10-17 Amazon Technologies, Inc. Managed service for acquisition, storage and consumption of large-scale data streams
EP3072263B1 (en) * 2013-11-18 2017-10-25 Telefonaktiebolaget LM Ericsson (publ) Multi-tenant isolation in a cloud environment using software defined networking
US10372685B2 (en) * 2014-03-31 2019-08-06 Amazon Technologies, Inc. Scalable file storage service
US9471803B2 (en) 2014-08-07 2016-10-18 Emc Corporation System and method for secure multi-tenancy in an operating system of a storage system
CN104881749A (zh) * 2015-06-01 2015-09-02 北京圆通慧达管理软件开发有限公司 面向多租户的数据管理方法和数据存储系统
US10642783B2 (en) * 2018-01-12 2020-05-05 Vmware, Inc. System and method of using in-memory replicated object to support file services wherein file server converts request to block I/O command of file handle, replicating said block I/O command across plural distributed storage module and performing said block I/O command by local storage module

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102096602A (zh) * 2009-12-15 2011-06-15 中国移动通信集团公司 一种任务调度方法及其系统和设备
US20130166207A1 (en) * 2011-12-21 2013-06-27 Telenav, Inc. Navigation system with point of interest harvesting mechanism and method of operation thereof
CN103384550A (zh) * 2012-12-28 2013-11-06 华为技术有限公司 储存数据的方法及装置
CN105630418A (zh) * 2015-12-24 2016-06-01 曙光信息产业(北京)有限公司 一种数据存储方法及装置
CN106095586A (zh) * 2016-06-23 2016-11-09 东软集团股份有限公司 一种任务分配方法、装置及系统
CN106201338A (zh) * 2016-06-28 2016-12-07 华为技术有限公司 数据存储方法及装置

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3594798A4

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111580755A (zh) * 2020-05-09 2020-08-25 杭州海康威视系统技术有限公司 分布式数据处理系统、分布式数据处理方法
CN111580755B (zh) * 2020-05-09 2022-07-05 杭州海康威视系统技术有限公司 分布式数据处理系统、分布式数据处理方法

Also Published As

Publication number Publication date
US11575748B2 (en) 2023-02-07
US10972542B2 (en) 2021-04-06
US20210203723A1 (en) 2021-07-01
CN108667867A (zh) 2018-10-16
CN108667867B (zh) 2021-05-18
EP3594798A1 (en) 2020-01-15
EP3594798A4 (en) 2020-04-01
EP3594798B1 (en) 2024-03-06
US20200028911A1 (en) 2020-01-23

Similar Documents

Publication Publication Date Title
WO2018176998A1 (zh) 数据存储方法及装置
US11704144B2 (en) Creating virtual machine groups based on request
US10846140B2 (en) Off-site backup of workloads for multi-tenant cloud computing system
US9491313B2 (en) Optimizing storage between mobile devices and cloud storage providers
US8914469B2 (en) Negotiating agreements within a cloud computing environment
CN103442049B (zh) 一种面向构件的混合型云操作系统体系结构及其通信方法
US9514318B2 (en) Dynamic access control for documents in electronic communications within a networked computing environment
CN102971724B (zh) 与数据中心环境内的基于单元式虚拟资源的管理有关的方法和装置
US20080263553A1 (en) Dynamic Service Level Manager for Image Pools
JP2017519308A (ja) マルチテナントアプリケーションサーバ環境におけるワークマネージャを提供するためのシステムおよび方法
US9679119B2 (en) Software utilization privilege brokering in a networked computing environment
WO2011144560A1 (en) Message broadcasting in a clustered computing environment
CN118012572A (zh) 用于自动配置用于容器应用的最小云服务访问权限的技术
US20230110628A1 (en) QUANTUM COMPUTING SERVICE WITH QUALITY OF SERVICE (QoS) ENFORCEMENT VIA OUT-OF-BAND PRIORITIZATION OF QUANTUM TASKS
CN110661842A (zh) 一种资源的调度管理方法、电子设备和存储介质
Ma et al. vLocality: Revisiting data locality for MapReduce in virtualized clouds
US9417997B1 (en) Automated policy based scheduling and placement of storage resources
WO2017054533A1 (zh) 云互通的外部资源管理方法、装置及系统
WO2014000554A1 (zh) 构建基于角色的访问控制系统的方法及云服务器
Kandi et al. An integer linear-programming based resource allocation method for SQL-like queries in the cloud
Thu Dynamic replication management scheme for effective cloud storage
Umar et al. JOINED HETEROGENEOUS CLOUDS RESOURCES MANAGEMENT: AN ALGORITHM DESIGN
Lilhore et al. A novel performance improvement model for cloud computing
WO2023274014A1 (zh) 容器集群的存储资源管理方法、装置及系统
Shan et al. Heterogeneous MacroTasking (HeMT) for Parallel Processing in the Public Cloud

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 18776326; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
ENP Entry into the national phase (Ref document number: 2018776326; Country of ref document: EP; Effective date: 20191007)