CN108234551B

CN108234551B - Data processing method and device

Info

Publication number: CN108234551B
Application number: CN201611159311.8A
Authority: CN
Inventors: 朱虹; 罗朝亮; 胡林红; 李小宁
Original assignee: Tencent Technology Shenzhen Co Ltd
Current assignee: Tencent Technology Shenzhen Co Ltd
Priority date: 2016-12-15
Filing date: 2016-12-15
Publication date: 2021-06-25
Anticipated expiration: 2036-12-15
Also published as: CN108234551A

Abstract

The embodiment of the invention discloses a data processing method and a device, wherein the method comprises the following steps: acquiring the storage performance grade of first data needing to be stored currently; acquiring a first cluster identifier corresponding to the storage performance grade of the first data according to a pre-established corresponding relation between the storage performance grade and the cluster identifier of the storage cluster; and scheduling storage resources in the storage cluster corresponding to the first cluster identifier, wherein the storage resources are used for storing the first data, and the resource amount of the storage resources is matched with the data amount of the first data. By adopting the embodiment of the invention, the requirements of different data on the storage performance can be met, and the reliability of data storage is improved.

Description

Data processing method and device

Technical Field

The present invention relates to the field of internet technologies, and in particular, to a data processing method and apparatus.

Background

The cloud storage is a cloud computing system taking data storage and management as a core, and specifically can be a system which integrates storage clusters in a network through application software to cooperatively work through functions of cluster application, network technology or a distributed file system and the like, and provides data storage and service access functions to the outside. However, one cloud computing system generally manages multiple data, performance requirements of different data on storage resources are different, and a traditional cloud storage system does not divide a storage cluster according to different storage performances, so that the performance requirements of different data on the storage resources cannot be met, and the storage reliability is reduced.

Disclosure of Invention

The technical problem to be solved by the embodiments of the present invention is to provide a data processing method and apparatus, which can meet the requirements of different data on storage performance and improve the reliability of data storage.

In order to solve the above technical problem, an embodiment of the present invention provides a data processing method, where the method includes:

acquiring the storage performance grade of first data needing to be stored currently;

acquiring a first cluster identifier corresponding to the storage performance grade of the first data according to a pre-established corresponding relation between the storage performance grade and the cluster identifier of the storage cluster;

and scheduling storage resources in the storage cluster corresponding to the first cluster identifier, wherein the storage resources are used for storing the first data, and the resource amount of the storage resources is matched with the data amount of the first data.

Correspondingly, an embodiment of the present invention further provides a data processing apparatus, where the apparatus includes:

a performance level obtaining unit, configured to obtain the storage performance level of first data that currently needs to be stored;

a cluster identifier obtaining unit, configured to obtain, according to a correspondence between a storage performance level and a cluster identifier of a storage cluster that are established in advance, a first cluster identifier corresponding to the storage performance level of the first data;

and the storage resource scheduling unit is used for scheduling storage resources in the storage cluster corresponding to the first cluster identifier, wherein the storage resources are used for storing the first data, and the resource amount of the storage resources is matched with the data amount of the first data.

By implementing the embodiment of the invention, the storage performance grade of the first data needing to be stored currently is obtained, the first cluster identifier corresponding to the storage performance grade of the first data is obtained according to the pre-established corresponding relationship between the storage performance grade and the cluster identifier of the storage cluster, and the storage resource is scheduled in the storage cluster corresponding to the first cluster identifier, and is used for storing the first data, so that the requirements of different data on the storage performance can be met, and the reliability of data storage is improved.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a block diagram of a data processing system according to an embodiment of the present invention;

FIG. 2 is a flow chart of a data processing method provided in an embodiment of the present invention;

FIG. 3 is a schematic structural diagram of a data processing apparatus provided in an embodiment of the present invention;

FIG. 4 is a schematic structural diagram of a controller provided in an embodiment of the present invention;

FIG. 5 is a schematic diagram of a configuration file provided in an embodiment of the present invention;

FIG. 6 is a schematic diagram of a configuration file provided in another embodiment of the present invention;

FIG. 7 is a diagram illustrating details of a configuration storage backend provided in an embodiment of the present invention;

FIG. 8 is a schematic diagram of a configuration volume type provided in an embodiment of the present invention;

FIG. 9a is a schematic illustration of one embodiment of the present invention;

FIG. 9b is a schematic illustration provided in another embodiment of the present invention;

FIG. 10 is a schematic diagram of creating a volume provided in an embodiment of the invention;

FIG. 11 is a schematic diagram of mounting a volume to a virtual machine according to an embodiment of the present invention.

Detailed Description

The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The embodiment of the invention provides a data processing method, wherein a controller can acquire the storage performance grade of first data needing to be stored currently, acquire a first cluster identifier corresponding to the storage performance grade of the first data according to a pre-established corresponding relation between the storage performance grade and a cluster identifier of a storage cluster, and schedule storage resources in the storage cluster corresponding to the first cluster identifier, wherein the storage resources are used for storing the first data, so that the requirements of different data on the storage performance can be met, and the reliability of data storage is improved.

For example, the controller may be a cloud controller or a cluster controller in an OpenStack (open source cloud computing management platform) base cloud, and the like, and is not limited in this embodiment of the present application.

Based on the above principle, the embodiment of the present invention discloses an architecture diagram of a data processing system shown in fig. 1. As shown in fig. 1, the data processing system may include at least one controller, a Ceph (extensible storage space) cluster 1, a Ceph cluster 2, and a Kernel-based Virtual Machine (KVM) cluster, where the controller may establish a communication connection with the Ceph cluster 1, the Ceph cluster 2, and the KVM cluster, respectively, the Ceph cluster 1 may establish a communication connection with the KVM cluster, and the Ceph cluster 2 may establish a communication connection with the KVM cluster.

One Region of an OpenStack may include one KVM cluster and at least two Ceph clusters. A Ceph cluster may include at least one storage node and a KVM cluster may include at least one virtual machine. The controller may manage the one KVM cluster and the at least two Ceph clusters. The controller may respectively create cloud storage volumes in the Ceph cluster 1 and the Ceph cluster 2, where a volume refers to a storage resource whose resource amount is a specified resource amount, and the volume created in the Ceph cluster 1 and the volume created in the Ceph cluster 2 may be mounted to the same virtual machine in the KVM cluster at the same time or different virtual machines in the KVM cluster.

Based on the above principle, the embodiment of the present invention discloses the flow chart of a data processing method shown in fig. 2. As shown in fig. 2, the data processing method may include at least the following steps:

S201, obtaining the storage performance level of the first data which needs to be stored currently.

The controller may obtain the storage performance level of first data that currently needs to be stored. For example, when the first data is audio and video data or instant messaging data, which have a high real-time requirement, the first data requires high-performance storage, so its storage performance level is higher. If the first data is web page data, which has a low real-time requirement, the first data may be stored normally, so its storage performance level is lower. Illustratively, the controller may be the controller shown in fig. 1.

Wherein the storage performance levels include at least: a first storage performance level for storing data at a first storage speed, and a second storage performance level for storing data at a second storage speed. It should be noted that, in the embodiment of the present invention, the storage performance levels include, but are not limited to, two storage performance levels, for example, the controller configures the data types of the data into a first data type, a second data type, and a third data type based on the real-time requirement of the data, where the real-time requirement of the data of the first data type is higher, and then the storage performance level of the data of the first data type may be the first storage performance level for storing the data at the first storage speed; the data of the second data type has a lower real-time requirement than the data of the first data type, and the storage performance level of the data of the second data type may be a second storage performance level for storing the data at a second storage speed; the data of the third data type may have a lower real-time requirement than the data of the second data type, and the storage performance level of the data of the third data type may be a third storage performance level for storing the data at a third storage speed, wherein the first storage speed may be greater than the second storage speed, and the second storage speed may be greater than the third storage speed.
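The classification described above can be sketched as follows. This is only an illustrative sketch: the enum name, the data type labels, and the choice to place web page data at the second level are assumptions made for the example and are not specified by the embodiment.

```python
from enum import IntEnum


class StoragePerformanceLevel(IntEnum):
    """Levels ordered by storage speed: FIRST is the fastest, THIRD the slowest."""
    FIRST = 1
    SECOND = 2
    THIRD = 3


# Hypothetical data type labels, based on the examples given in the text.
DATA_TYPE_TO_LEVEL = {
    "audio_video": StoragePerformanceLevel.FIRST,        # high real-time requirement
    "instant_messaging": StoragePerformanceLevel.FIRST,  # high real-time requirement
    "web_page": StoragePerformanceLevel.SECOND,          # low real-time requirement
}


def level_for(data_type: str) -> StoragePerformanceLevel:
    """Return the storage performance level configured for a data type."""
    return DATA_TYPE_TO_LEVEL[data_type]
```

A controller following this sketch would call level_for on the data type of the first data before looking up a storage cluster.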

Optionally, the controller may receive a storage resource acquisition request sent by the virtual machine, where the storage resource acquisition request carries a storage performance level of the first data, acquire a first cluster identifier corresponding to the storage performance level of the first data according to a correspondence between a pre-established storage performance level and a cluster identifier of the storage cluster, schedule the storage resource in the storage cluster corresponding to the first cluster identifier, and send the storage resource to the virtual machine, so that the virtual machine uses the storage resource to store the first data. Illustratively, the storage cluster may be a Ceph cluster in fig. 1, such as Ceph cluster 1 or Ceph cluster 2.

In a specific implementation, the controller may pre-establish a correspondence between the storage performance level and the cluster identifier of the storage cluster, for example, if the controller configures two storage clusters, which are a first storage cluster and a second storage cluster, respectively, where the storage performance of the first storage cluster is higher than that of the second storage cluster, the controller may establish a correspondence between the first storage performance level and the cluster identifier of the first storage cluster, and establish a correspondence between the second storage performance level and the cluster identifier of the second storage cluster. The first storage performance level is used for storing data at a first storage speed, the second storage performance level is used for storing data at a second storage speed, and the first storage performance level is higher than the second storage performance level if the first storage speed is higher than the second storage speed. It should be noted that the controller may update the control node included in each storage cluster according to the need of an application scenario, so as to change the storage performance level of each storage cluster, which is not limited by the embodiment of the present invention specifically. The cluster identifier of the storage cluster may be used to uniquely identify the storage cluster, and the cluster identifier may include a cluster name of the storage cluster or a cluster number configured for the storage cluster, for example, the cluster identifier of the first storage cluster is Ceph cluster 1, and the cluster identifier of the second storage cluster is Ceph cluster 2.
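One possible way to pre-establish this correspondence is sketched below. The function name and the numeric speed figures are hypothetical; the embodiment only states that the cluster with higher storage performance corresponds to the higher storage performance level.

```python
from typing import Dict


def build_correspondence(cluster_speeds: Dict[str, float]) -> Dict[int, str]:
    """Rank clusters by storage speed, fastest first, and assign levels 1, 2, ...

    Level 1 stands for the first storage performance level, level 2 for the second.
    """
    ranked = sorted(cluster_speeds, key=cluster_speeds.get, reverse=True)
    return {level: cluster_id for level, cluster_id in enumerate(ranked, start=1)}


# Hypothetical speed figures; only their relative order matters here.
correspondence = build_correspondence({
    "Ceph cluster 2": 120.0,
    "Ceph cluster 1": 500.0,
})
```

If the performance of a cluster changes, rebuilding the table with the new figures updates the correspondence without touching the rest of the flow.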

In addition, when first data needing to be stored exists in one KVM in the KVM cluster, the KVM may acquire a storage performance level of the first data, and generate a storage resource acquisition request about the first data, where the storage resource acquisition request carries the storage performance level of the first data, and the KVM may send the storage resource acquisition request to the controller. The controller acquires a first cluster identifier corresponding to the storage performance grade of the first data according to a pre-established corresponding relationship between the storage performance grade and the cluster identifier of the storage cluster, schedules storage resources in the storage cluster corresponding to the first cluster identifier, and sends the storage resources to the virtual machine, so that the virtual machine stores the first data by using the storage resources obtained by scheduling.
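The request flow between the virtual machine and the controller might look like the sketch below. The class names, field names, and the use of gigabytes as the unit are assumptions for illustration; the embodiment only requires that the request carries the storage performance level of the first data.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class StorageResourceRequest:
    performance_level: int  # storage performance level of the first data
    data_amount_gb: int     # data amount of the first data (hypothetical unit)


@dataclass
class StorageResource:
    cluster_id: str
    amount_gb: int


class Controller:
    def __init__(self, level_to_cluster: Dict[int, str]) -> None:
        self.level_to_cluster = dict(level_to_cluster)

    def handle(self, request: StorageResourceRequest) -> StorageResource:
        # S202: look up the cluster identifier for the requested level.
        cluster_id = self.level_to_cluster[request.performance_level]
        # S203: schedule a resource whose amount matches the data amount.
        return StorageResource(cluster_id, request.data_amount_gb)


controller = Controller({1: "Ceph cluster 1", 2: "Ceph cluster 2"})
resource = controller.handle(
    StorageResourceRequest(performance_level=1, data_amount_gb=20)
)
```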

S202, acquiring a first cluster identifier corresponding to the storage performance level of the first data according to the pre-established corresponding relationship between the storage performance level and the cluster identifier of the storage cluster.

Specifically, after the controller obtains the storage performance level of the first data that needs to be stored currently, the controller may obtain the first cluster identifier corresponding to the storage performance level of the first data according to a pre-established correspondence between the storage performance level and the cluster identifier of the storage cluster. For example, the first storage performance level corresponds to a cluster identifier of a first storage cluster, and the second storage performance level corresponds to a cluster identifier of a second storage cluster, so that after the controller acquires that the storage performance level of the first data is the first storage performance level, the controller may acquire that the first cluster identifier corresponding to the storage performance level of the first data is the cluster identifier of the first storage cluster; or after the controller acquires that the storage performance level of the first data is the second storage performance level, the controller may acquire that the first cluster identifier corresponding to the storage performance level of the first data is the cluster identifier of the second storage cluster.

Optionally, before the controller obtains the first cluster identifier corresponding to the storage performance level of the first data according to the pre-established correspondence between the storage performance level and the cluster identifier of the storage cluster, a configuration file of each storage cluster may be generated, where the configuration file may include the cluster identifier of the storage cluster and attribute information thereof, and the attribute information includes control authority of each user to the storage cluster, device information of storage devices included in the storage cluster, and a sum of storage resources included in the storage cluster.

For example, when there are two storage clusters (e.g., Ceph cluster 1 and Ceph cluster 2), the controller may configure a directory ceph1 for Ceph cluster 1, which is used to hold the configuration file of Ceph cluster 1. The directory ceph1 may be as follows:

[root@DEVSSTEST-CON1 ~]# cd /etc/ceph

ceph1/ cephbak/

In addition, the configuration file of Ceph cluster 1 may be as shown in fig. 5, where cinder, glance, or root may represent a user, and -rw- or -rw-r- may represent the permissions of different users. For example, the permission corresponding to the user root is -rw-, indicating that the configuration file of Ceph cluster 1 is readable and writable for the user root, that is, the user root can read and modify the configuration file of Ceph cluster 1. For another example, the permission corresponding to the user cinder is -rw-r-, indicating that the configuration file of Ceph cluster 1 is readable but not writable for the user cinder, that is, the user cinder can read the configuration file of Ceph cluster 1 but cannot modify it. For another example, the permission corresponding to the user glance is -rw-r-, indicating that the configuration file of Ceph cluster 1 is readable but not writable for the user glance, that is, the user glance can read the configuration file of Ceph cluster 1 but cannot modify it. In the embodiment of the present invention, the controller needs to ensure that the users and their permissions on the configuration file are correct, so that Ceph cluster 1 can be operated correctly. For example, mon_host 10.125.224.82, 10.125.224.83, 10.125.224.84 may indicate that the IP address of Ceph cluster 1 is 10.125.224.82, 10.125.224.83, or 10.125.224.84. fsid 26a79278-9104-4e72-9343-9a369395a7d5 may represent the cluster identifier of Ceph cluster 1.
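As an illustration of how a controller could read the cluster identifier and addresses from such a configuration file, consider the sketch below. The fsid and addresses are the values quoted in the text; the [global] section header, the "key = value" layout, and the function name are assumptions, since fig. 5 is not reproduced here.

```python
import configparser

# Assumed file layout; only the fsid and address values come from the text.
CEPH1_CONF = """
[global]
fsid = 26a79278-9104-4e72-9343-9a369395a7d5
mon_host = 10.125.224.82, 10.125.224.83, 10.125.224.84
"""


def read_cluster_conf(text: str):
    """Return (cluster identifier, list of addresses) from configuration text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["global"]
    hosts = [host.strip() for host in section["mon_host"].split(",")]
    return section["fsid"], hosts


cluster_id, hosts = read_cluster_conf(CEPH1_CONF)
```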

Further, the controller may also configure a directory ceph2 for Ceph cluster 2, which is used to hold the configuration file of Ceph cluster 2. The directory ceph2 may be as follows:

[root@DEVSSTEST-CON1 ~]# cd /etc/ceph

ceph2/ cephbak/

In addition, the configuration file of Ceph cluster 2 may be as shown in fig. 6, where cinder, glance, or root may represent a user, and -rw- or -rw-r- may represent the permissions of different users. For example, the permission corresponding to the user root is -rw-, indicating that the configuration file of Ceph cluster 2 is readable and writable for the user root, that is, the user root can read and modify the configuration file of Ceph cluster 2. For another example, the permission corresponding to the user cinder is -rw-r-, indicating that the configuration file of Ceph cluster 2 is readable but not writable for the user cinder, that is, the user cinder can read the configuration file of Ceph cluster 2 but cannot modify it. For another example, the permission corresponding to the user glance is -rw-r-, indicating that the configuration file of Ceph cluster 2 is readable but not writable for the user glance, that is, the user glance can read the configuration file of Ceph cluster 2 but cannot modify it. In the embodiment of the present invention, the controller needs to ensure that the users and their permissions on the configuration file are correct, so that Ceph cluster 2 can be operated correctly. For example, mon_host 10.125.224.13, 10.125.224.19, 10.125.224.21 may indicate that the IP address of Ceph cluster 2 is 10.125.224.13, 10.125.224.19, or 10.125.224.21. fsid 6260a31d-137f-4e54-bb51-a67065510d8f may represent the cluster identifier of Ceph cluster 2.

Optionally, the controller may schedule a first storage resource with a resource amount of a first preset resource amount in the first cluster, schedule a second storage resource with a resource amount of a second preset resource amount in the second cluster, determine that a resource type of the first storage resource matches with a cluster identifier of the first cluster, determine that a resource type of the second storage resource matches with a cluster identifier of the second cluster, send the first storage resource and the second storage resource to the virtual machine, so that the virtual machine obtains a storage performance level of the first data, obtain, by the virtual machine, the first cluster identifier corresponding to the storage performance level of the first data according to a pre-established correspondence between the storage performance level and the cluster identifier of the storage cluster, determine, by the virtual machine, the resource type matching with the first cluster identifier, and store the first data using the storage resource indicated by the resource type.

For example, the controller may configure a cinder.conf file, which is used to configure the storage devices corresponding to the storage clusters, for example by setting enabled_backends to cephliver-1 and cephliver-2. The storage device may be a storage backend, that is, the physical device corresponding to the storage cluster. The controller may configure detailed information of each storage device, such as a device identifier of the storage device, the number of storage nodes included in the storage device, or a node identifier of each storage node, for example through the following options: volume_driver, volume_backend_name, rbd_pool, rbd_ceph_conf, rbd_flatten_volume_from_snapshot, rbd_max_clone_depth, rbd_store_chunk_size, rados_connect_timeout, glance_api_version, rbd_user, rbd_secret_uuid. Details of the storage devices configured by the controller may be as shown in fig. 7.
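A multi-backend configuration of this kind could be generated as in the sketch below. The backend names cephliver-1 and cephliver-2 are taken from the text; the pool name, the user value, and the ceph.conf paths are hypothetical placeholders, and only a subset of the options is shown. The actual values are those of fig. 7, which is not reproduced here.

```python
import configparser
import io
from typing import Dict


def render_cinder_conf(backends: Dict[str, str]) -> str:
    """Render one section per storage backend plus the list of enabled backends.

    `backends` maps a backend name to the path of its Ceph configuration file.
    """
    parser = configparser.ConfigParser()
    parser["DEFAULT"] = {"enabled_backends": ",".join(backends)}
    for name, ceph_conf_path in backends.items():
        parser[name] = {
            "volume_driver": "cinder.volume.drivers.rbd.RBDDriver",
            "volume_backend_name": name,
            "rbd_pool": "volumes",          # hypothetical pool name
            "rbd_ceph_conf": ceph_conf_path,
            "rbd_user": "cinder",           # hypothetical user value
        }
    buffer = io.StringIO()
    parser.write(buffer)
    return buffer.getvalue()


# Hypothetical file paths under the ceph1 and ceph2 directories.
conf_text = render_cinder_conf({
    "cephliver-1": "/etc/ceph/ceph1/ceph.conf",
    "cephliver-2": "/etc/ceph/ceph2/ceph.conf",
})
```

Keeping one section per backend means each storage cluster points at its own configuration file, which is what allows two Ceph clusters to be managed side by side.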

In addition, the controller may declare the types of the volumes in block storage. As shown in fig. 8, the controller declares the volume type as ssd in the first storage cluster and as all in the second storage cluster. Since the storage performance of the first storage cluster is higher than that of the second storage cluster, the storage performance of a volume of type ssd is higher than that of a volume of type all.

In addition, the controller may also generate a correspondence between the type of the volume and the device identifier of the storage device, and this correspondence may be as shown in fig. 9a and fig. 9b.

Further, the controller may schedule a first storage resource with a first predetermined amount of resources in the first cluster, and schedule a second storage resource with a second predetermined amount of resources in the second cluster. Taking the schematic shown in fig. 10 as an example, the controller may create a first volume in the Ceph cluster 1 and a second volume in the Ceph cluster 2, where the types of the first volume and the second volume are different. Taking the schematic shown in fig. 11 as an example, the controller may mount the first volume and the second volume created as described above to the same KVM, so that the KVM may use different types of volumes to store data with different storage performance requirements, so as to meet the requirements of the data with different storage performance requirements on the storage resources.
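The creation and mounting of two volumes of different types can be modelled in memory as in the sketch below. The volume type names ssd and all come from the text; the class names, volume names, sizes, and the virtual machine name are hypothetical, and no real cloud API is called.

```python
from dataclasses import dataclass, field
from typing import Dict, List

# Volume type to cluster, following the declaration described in the text.
VOLUME_TYPE_TO_CLUSTER: Dict[str, str] = {
    "ssd": "Ceph cluster 1",
    "all": "Ceph cluster 2",
}


@dataclass
class Volume:
    name: str
    volume_type: str
    size_gb: int
    cluster_id: str


@dataclass
class VirtualMachine:
    name: str
    volumes: List[Volume] = field(default_factory=list)

    def mount(self, volume: Volume) -> None:
        self.volumes.append(volume)

    def volume_of_type(self, volume_type: str) -> Volume:
        """Pick the mounted volume whose type matches the data's requirement."""
        return next(v for v in self.volumes if v.volume_type == volume_type)


def create_volume(name: str, volume_type: str, size_gb: int) -> Volume:
    """Create a volume in the cluster that corresponds to its type."""
    return Volume(name, volume_type, size_gb, VOLUME_TYPE_TO_CLUSTER[volume_type])


vm = VirtualMachine("kvm-1")                        # hypothetical VM name
vm.mount(create_volume("fast-volume", "ssd", 10))   # first volume, Ceph cluster 1
vm.mount(create_volume("normal-volume", "all", 50)) # second volume, Ceph cluster 2
```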

S203, scheduling storage resources in the storage cluster corresponding to the first cluster identifier.

After the controller obtains the first cluster identifier corresponding to the storage performance level of the first data according to the pre-established correspondence between the storage performance level and the cluster identifier of the storage cluster, a storage resource may be scheduled in the storage cluster corresponding to the first cluster identifier, where the storage resource may be used to store the first data, and a resource amount of the storage resource may be matched with a data amount of the first data, for example, the resource amount of the storage resource may be greater than or equal to the data amount of the first data, so as to completely store the first data.
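A minimal sketch of matching the resource amount to the data amount follows. Allocating in whole GiB units with a minimum of one unit is an assumption made for the example; the embodiment only requires that the resource amount be greater than or equal to the data amount.

```python
GIB = 1024 ** 3  # bytes per allocation unit (assumed granularity)


def resource_amount_gib(data_bytes: int) -> int:
    """Return the smallest whole number of GiB that can hold the data."""
    if data_bytes < 0:
        raise ValueError("data amount must not be negative")
    # Integer ceiling division avoids floating-point rounding.
    return max(1, (data_bytes + GIB - 1) // GIB)
```

Because the result is rounded up, the scheduled resource is never smaller than the first data, so the data can be stored completely.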

Optionally, the controller may further obtain a storage performance level of second data that needs to be read currently, obtain, according to a correspondence between the storage performance level and a cluster identifier of the storage cluster that is established in advance, a second cluster identifier corresponding to the storage performance level of the second data, and read the second data in the storage cluster corresponding to the second cluster identifier. For example, when the controller needs to read the second data, the controller may determine the storage cluster storing the second data according to the storage performance level of the second data, that is, the controller may obtain the second cluster identifier corresponding to the storage performance level of the second data according to the pre-established correspondence between the storage performance level and the cluster identifier of the storage cluster, and read the second data in the storage cluster corresponding to the second cluster identifier.
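The read path can reuse the same correspondence as the write path, as the in-memory sketch below suggests. The class name, the key-based interface, and the dictionaries standing in for storage clusters are all hypothetical.

```python
from typing import Dict


class StorageController:
    def __init__(self, level_to_cluster: Dict[int, str]) -> None:
        self.level_to_cluster = dict(level_to_cluster)
        # One in-memory dictionary stands in for each storage cluster.
        self.clusters: Dict[str, Dict[str, bytes]] = {
            cluster_id: {} for cluster_id in self.level_to_cluster.values()
        }

    def store(self, key: str, data: bytes, level: int) -> str:
        """Store data in the cluster matching its level; return the cluster id."""
        cluster_id = self.level_to_cluster[level]
        self.clusters[cluster_id][key] = data
        return cluster_id

    def read(self, key: str, level: int) -> bytes:
        """Locate the cluster from the data's level, then read the data there."""
        cluster_id = self.level_to_cluster[level]
        return self.clusters[cluster_id][key]


storage = StorageController({1: "Ceph cluster 1", 2: "Ceph cluster 2"})
storage.store("page.html", b"<html></html>", level=2)
second_data = storage.read("page.html", level=2)
```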

In the embodiment of the invention, the storage performance grade of the first data needing to be stored currently is obtained, the first cluster identifier corresponding to the storage performance grade of the first data is obtained according to the pre-established corresponding relation between the storage performance grade and the cluster identifier of the storage cluster, the storage resource is scheduled in the storage cluster corresponding to the first cluster identifier, and the storage resource is used for storing the first data, so that the requirements of different data on the storage performance can be met, and the reliability of data storage is improved.

Referring to fig. 3, fig. 3 is a schematic structural diagram of a data processing apparatus provided in an embodiment of the present invention, where the data processing apparatus may be configured to implement part or all of the steps in the method embodiment shown in fig. 2, and as shown in the figure, the data processing apparatus in this embodiment may at least include a performance level obtaining unit 301, a cluster identifier obtaining unit 302, and a storage resource scheduling unit 303, where:

a performance level obtaining unit 301, configured to obtain a storage performance level of the first data that needs to be stored currently.

A cluster identifier obtaining unit 302, configured to obtain, according to a correspondence between a storage performance level and a cluster identifier of a storage cluster, a first cluster identifier corresponding to the storage performance level of the first data.

A storage resource scheduling unit 303, configured to schedule a storage resource in the storage cluster corresponding to the first cluster identifier, where the storage resource is used to store the first data, and a resource amount of the storage resource is matched with a data amount of the first data.

Optionally, the storage performance level at least includes: a first storage performance level for storing data at a first storage speed, and a second storage performance level for storing data at a second storage speed.

Optionally, the performance level obtaining unit 301 is further configured to obtain a storage performance level of the second data that needs to be currently read.

The cluster identifier obtaining unit 302 is further configured to obtain a second cluster identifier corresponding to the storage performance level of the second data according to a pre-established correspondence between the storage performance level and the cluster identifier of the storage cluster.

Further, the data processing apparatus in the embodiment of the present invention may further include:

a data reading unit 304, configured to read the second data in the storage cluster corresponding to the second cluster identifier.

Optionally, the data processing apparatus in the embodiment of the present invention may further include:

a configuration file generating unit 305, configured to generate a configuration file of each storage cluster before the cluster identifier obtaining unit 302 obtains the first cluster identifier corresponding to the storage performance level of the first data according to a pre-established correspondence between the storage performance level and the cluster identifier of the storage cluster, where the configuration file includes the cluster identifier of the storage cluster and attribute information thereof, and the attribute information includes control authority of each user to the storage cluster, device information of storage devices included in the storage cluster, and a total sum of storage resources included in the storage cluster.

Optionally, the performance level obtaining unit 301 is configured to receive a storage resource obtaining request sent by a virtual machine, where the storage resource obtaining request carries the storage performance level of the first data.

The data processing apparatus may further include a storage resource sending unit 306, configured to send the storage resource to the virtual machine after the storage resource scheduling unit 303 schedules the storage resource in the storage cluster corresponding to the first cluster identifier, so that the virtual machine uses the storage resource to store the first data.

Optionally, the storage resource scheduling unit 303 is further configured to schedule a first storage resource with a resource amount being a first preset resource amount in the first cluster, and schedule a second storage resource with a resource amount being a second preset resource amount in the second cluster.

The data processing apparatus may further include a determining unit 307, configured to determine that the resource type of the first storage resource matches the cluster identifier of the first cluster, and that the resource type of the second storage resource matches the cluster identifier of the second cluster.

A storage resource sending unit 306, configured to send the first storage resource and the second storage resource to a virtual machine, so that the virtual machine obtains a storage performance level of the first data, the virtual machine obtains a first cluster identifier corresponding to the storage performance level of the first data according to a pre-established correspondence between the storage performance level and a cluster identifier of a storage cluster, the virtual machine determines a resource type matched with the first cluster identifier, and the virtual machine stores the first data using the storage resource indicated by the resource type.

In the embodiment of the present invention, the performance level obtaining unit 301 obtains the storage performance level of the first data that needs to be stored currently, the cluster identifier obtaining unit 302 obtains the first cluster identifier corresponding to the storage performance level of the first data according to the pre-established correspondence between the storage performance level and the cluster identifier of the storage cluster, and the storage resource scheduling unit 303 schedules the storage resource in the storage cluster corresponding to the first cluster identifier, so as to meet the requirements of different data on the storage performance and improve the reliability of data storage.

Refer to fig. 4, which is a schematic structural diagram of a controller according to an embodiment of the present invention. The controller may be used to implement the method shown in fig. 2. For convenience of description, only the parts related to the embodiment of the present invention are shown; for specific technical details that are not disclosed here, refer to the embodiment of the present invention shown in fig. 2.

As shown in fig. 4, the controller includes at least one processor 401 (for example, a CPU), at least one input device 403, at least one output device 404, a memory 405, and at least one communication bus 402. The communication bus 402 is used to implement connection and communication between these components. The input device 403 may specifically be a network interface or the like, and is used for interacting with the storage cluster. The output device 404 may likewise be a network interface or the like, and is used for interacting with the storage cluster. The memory 405 may include a high-speed RAM, and may also include a non-volatile memory, such as at least one disk memory; it is specifically used for storing the correspondence between storage performance levels and cluster identifiers of storage clusters. Optionally, the memory 405 may also include at least one storage device located remotely from the aforementioned processor 401. A set of program code is stored in the memory 405, and the processor 401, the input device 403, and the output device 404 call the program code stored in the memory 405 to perform the following operations:

the processor 401 obtains the storage performance level of the first data that needs to be stored currently.

The processor 401 obtains a first cluster identifier corresponding to the storage performance level of the first data according to a pre-established correspondence between the storage performance level and the cluster identifier of the storage cluster.

The processor 401 schedules a storage resource in the storage cluster corresponding to the first cluster identifier, where the storage resource is used to store the first data, and a resource amount of the storage resource matches with a data amount of the first data.
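The three operations above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation of the embodiment: the level names (`high`, `standard`), the cluster identifiers, the `StorageCluster` class, and the byte-based accounting are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical correspondence between storage performance levels and
# cluster identifiers; the embodiment only requires that such a mapping
# is established in advance.
LEVEL_TO_CLUSTER = {"high": "cluster-ssd", "standard": "cluster-hdd"}


@dataclass
class StorageCluster:
    cluster_id: str
    free_bytes: int

    def allocate(self, size: int) -> dict:
        """Reserve `size` bytes in this cluster and describe the reservation."""
        if size > self.free_bytes:
            raise RuntimeError(f"{self.cluster_id}: insufficient free capacity")
        self.free_bytes -= size
        return {"cluster_id": self.cluster_id, "size": size}


def schedule_storage(level: str, data_size: int, clusters: dict) -> dict:
    """Map the level to a cluster identifier, then schedule a resource
    whose amount matches the amount of data to be stored."""
    cluster_id = LEVEL_TO_CLUSTER[level]
    return clusters[cluster_id].allocate(data_size)


clusters = {
    "cluster-ssd": StorageCluster("cluster-ssd", 1_000),
    "cluster-hdd": StorageCluster("cluster-hdd", 10_000),
}
# The first data has level "high" and occupies 256 bytes.
resource = schedule_storage("high", 256, clusters)
```

In this sketch the storage performance level is passed in directly; in the embodiment it may instead arrive in a request sent by a virtual machine.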

Optionally, the processor 401 may further perform the following operations:

the processor 401 obtains the storage performance level of the second data that needs to be read currently.

The processor 401 obtains a second cluster identifier corresponding to the storage performance level of the second data according to a pre-established correspondence between the storage performance level and the cluster identifier of the storage cluster.

The processor 401 reads the second data in the storage cluster corresponding to the second cluster identifier.
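The read path reuses the same correspondence in the opposite direction of data flow. The sketch below is illustrative only: the in-memory dictionary stands in for real storage clusters, and the keys, level names, and cluster identifiers are assumptions.

```python
# Hypothetical correspondence between storage performance levels and
# cluster identifiers.
LEVEL_TO_CLUSTER = {"high": "cluster-ssd", "standard": "cluster-hdd"}

# In-memory stand-in for the storage clusters: cluster id -> {key: bytes}.
cluster_contents = {
    "cluster-ssd": {"orders.db": b"hot data"},
    "cluster-hdd": {"archive.tar": b"cold data"},
}


def read_data(key: str, level: str) -> bytes:
    """Resolve the cluster from the data's performance level, then read
    the data from that cluster only."""
    cluster_id = LEVEL_TO_CLUSTER[level]
    return cluster_contents[cluster_id][key]


hot = read_data("orders.db", "high")
cold = read_data("archive.tar", "standard")
```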

Optionally, before the processor 401 obtains the first cluster identifier corresponding to the storage performance level of the first data according to the pre-established correspondence between the storage performance level and the cluster identifier of the storage cluster, the following operations may also be performed:

generating a configuration file of each storage cluster, where the configuration file includes the cluster identifier of the storage cluster and attribute information thereof, and the attribute information includes control authority of each user to the storage cluster, device information of storage devices included in the storage cluster, and a total sum of storage resources included in the storage cluster.
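One possible shape for such a configuration file is sketched below. The source does not specify a file format or field names, so the JSON layout, the keys (`cluster_id`, `permissions`, `devices`, `total_capacity_gb`), and the sample users and hosts are all assumptions made for illustration.

```python
import json


def build_cluster_config(cluster_id: str, permissions: dict, devices: list) -> dict:
    """Assemble the cluster identifier and its attribute information:
    per-user control authority, device information, and total resources."""
    return {
        "cluster_id": cluster_id,
        "attributes": {
            "permissions": permissions,
            "devices": devices,
            # Total storage resources are derived from the device list.
            "total_capacity_gb": sum(d["capacity_gb"] for d in devices),
        },
    }


config = build_cluster_config(
    "cluster-ssd",
    permissions={"alice": ["read", "write"], "bob": ["read"]},
    devices=[
        {"host": "node-1", "capacity_gb": 512},
        {"host": "node-2", "capacity_gb": 512},
    ],
)
# Serialized form, one file per storage cluster.
config_text = json.dumps(config, indent=2)
```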

Optionally, the obtaining, by the processor 401, the storage performance level of the first data that needs to be stored currently may specifically be:

the input device 403 receives a storage resource acquisition request sent by a virtual machine, where the storage resource acquisition request carries the storage performance level of the first data.

After the processor 401 schedules the storage resource in the storage cluster corresponding to the first cluster identifier, the following operations may also be performed:

the output device 404 sends the storage resource to the virtual machine, so that the virtual machine stores the first data using the storage resource.

Optionally, the processor 401 may further perform the following operations:

the processor 401 schedules a first storage resource with a first preset resource amount in the first cluster, and schedules a second storage resource with a second preset resource amount in the second cluster.

The processor 401 determines that the resource type of the first storage resource matches the cluster identifier of the first cluster, and that the resource type of the second storage resource matches the cluster identifier of the second cluster.

The output device 404 sends the first storage resource and the second storage resource to a virtual machine, so that the virtual machine obtains the storage performance level of the first data, the virtual machine obtains a first cluster identifier corresponding to the storage performance level of the first data according to a pre-established correspondence between the storage performance level and a cluster identifier of a storage cluster, the virtual machine determines a resource type matched with the first cluster identifier, and the virtual machine stores the first data by using the storage resource indicated by the resource type.
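The pre-scheduling variant above moves the lookup to the virtual machine: the controller reserves preset amounts in both clusters up front, and the virtual machine later picks the resource whose type matches the cluster for the data's level. The following sketch assumes a hypothetical naming rule (`type-<cluster id>`) for resource types, plus illustrative level names and sizes; the source does not define how a resource type is encoded.

```python
# Hypothetical correspondence between storage performance levels and
# cluster identifiers, known to the virtual machine.
LEVEL_TO_CLUSTER = {"high": "cluster-1", "standard": "cluster-2"}


def preallocate(preset_amounts: dict) -> list:
    """Controller side: reserve a preset amount in each cluster and tag each
    resource with a resource type that matches the cluster identifier."""
    return [
        {"resource_type": f"type-{cluster_id}", "cluster_id": cluster_id, "size_gb": size}
        for cluster_id, size in preset_amounts.items()
    ]


def select_resource(resources: list, level: str) -> dict:
    """Virtual machine side: level -> cluster identifier -> resource type,
    then return the received resource of that type."""
    cluster_id = LEVEL_TO_CLUSTER[level]
    wanted_type = f"type-{cluster_id}"
    for resource in resources:
        if resource["resource_type"] == wanted_type:
            return resource
    raise LookupError(f"no resource of type {wanted_type}")


# First and second preset resource amounts, in GB.
resources = preallocate({"cluster-1": 100, "cluster-2": 500})
chosen = select_resource(resources, "standard")
```

Compared with the request-driven flow, this arrangement lets the virtual machine choose a resource without contacting the controller for each piece of data, at the cost of reserving capacity before it is needed.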

Specifically, the controller described in the embodiment of the present invention may be used to implement part or all of the process in the embodiment of the method described in conjunction with fig. 2.

It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program, which can be stored in a computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.

The above discloses only preferred embodiments of the present invention, which are not intended to limit the scope of the claims of the present invention.

Claims

1. A method of data processing, the method comprising:

scheduling storage resources in a storage cluster corresponding to the first cluster identifier, where the storage resources are used for storing the first data, and a resource amount of the storage resources is matched with a data amount of the first data;

scheduling a first storage resource with a resource amount of a first preset resource amount in a first cluster, and scheduling a second storage resource with a resource amount of a second preset resource amount in a second cluster;

determining that the resource type of the first storage resource matches the cluster identifier of the first cluster, and the resource type of the second storage resource matches the cluster identifier of the second cluster;

sending the first storage resource and the second storage resource to a virtual machine, so that the virtual machine obtains a storage performance level of the first data, the virtual machine obtains a first cluster identifier corresponding to the storage performance level of the first data according to a pre-established correspondence between the storage performance level and a cluster identifier of a storage cluster, the virtual machine determines a resource type matched with the first cluster identifier, and the virtual machine stores the first data by using the storage resource indicated by the resource type.

2. The method of claim 1, wherein the storage performance level comprises at least: a first storage performance level for storing data at a first storage speed, and a second storage performance level for storing data at a second storage speed.

3. The method of claim 1, wherein the method further comprises:

acquiring the storage performance level of second data which needs to be read currently;

acquiring a second cluster identifier corresponding to the storage performance level of the second data according to a pre-established correspondence between the storage performance level and the cluster identifier of the storage cluster;

and reading the second data in the storage cluster corresponding to the second cluster identifier.

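4. The method of claim 1, wherein before obtaining the first cluster identifier corresponding to the storage performance level of the first data according to the pre-established correspondence between the storage performance level and the cluster identifier of the storage cluster, the method further comprises:

generating a configuration file of each storage cluster, wherein the configuration file includes the cluster identifier of the storage cluster and attribute information thereof, and the attribute information includes control authority of each user to the storage cluster, device information of storage devices included in the storage cluster, and a total sum of storage resources included in the storage cluster.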

5. The method of claim 1, wherein obtaining a storage performance level of the first data currently required to be stored comprises:

receiving a storage resource acquisition request sent by a virtual machine, wherein the storage resource acquisition request carries the storage performance level of the first data;

after scheduling the storage resources in the storage cluster corresponding to the first cluster identifier, the method further includes:

sending the storage resource to the virtual machine so that the virtual machine stores the first data by using the storage resource.

6. A data processing apparatus, characterized in that the apparatus comprises:

a storage resource scheduling unit, configured to schedule a storage resource in a storage cluster corresponding to the first cluster identifier, where the storage resource is used to store the first data, and a resource amount of the storage resource is matched with a data amount of the first data;

the storage resource scheduling unit is further configured to schedule a first storage resource with a first preset resource amount in the first cluster, and schedule a second storage resource with a second preset resource amount in the second cluster;

a determining unit, configured to determine that a resource type of the first storage resource matches a cluster identifier of the first cluster, and that a resource type of the second storage resource matches a cluster identifier of the second cluster;

a storage resource sending unit, configured to send the first storage resource and the second storage resource to a virtual machine, so that the virtual machine obtains a storage performance level of the first data, the virtual machine obtains a first cluster identifier corresponding to the storage performance level of the first data according to a pre-established correspondence between the storage performance level and a cluster identifier of a storage cluster, the virtual machine determines a resource type matched with the first cluster identifier, and the virtual machine stores the first data using the storage resource indicated by the resource type.

7. The apparatus of claim 6, wherein the storage performance level comprises at least: a first storage performance level for storing data at a first storage speed, and a second storage performance level for storing data at a second storage speed.

8. The apparatus of claim 6,

the performance level obtaining unit is further configured to obtain a storage performance level of second data that needs to be read currently;

the cluster identifier obtaining unit is further configured to obtain a second cluster identifier corresponding to the storage performance level of the second data according to a pre-established correspondence between the storage performance level and the cluster identifier of the storage cluster;

the apparatus further comprises:

a data reading unit, configured to read the second data in the storage cluster corresponding to the second cluster identifier.

9. The apparatus of claim 6, wherein the apparatus further comprises:

a configuration file generating unit, configured to generate a configuration file of each storage cluster before the cluster identifier obtaining unit obtains the first cluster identifier corresponding to the storage performance level of the first data according to a pre-established correspondence between the storage performance level and the cluster identifier of the storage cluster, where the configuration file includes the cluster identifier of the storage cluster and attribute information thereof, and the attribute information includes control authority of each user to the storage cluster, device information of storage devices included in the storage cluster, and a total sum of storage resources included in the storage cluster.

10. The apparatus of claim 6,

the performance level obtaining unit is configured to receive a storage resource obtaining request sent by a virtual machine, where the storage resource obtaining request carries a storage performance level of the first data;

the apparatus further comprises:

a storage resource sending unit, configured to send the storage resource to the virtual machine after the storage resource scheduling unit schedules the storage resource in the storage cluster corresponding to the first cluster identifier, so that the virtual machine uses the storage resource to store the first data.

11. A controller, characterized in that the controller comprises:

a memory for storing program code;

a processor for calling the program code stored in the memory to execute the data processing method of any one of claims 1 to 5.

12. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which, when executed by a controller, causes the controller to execute the data processing method according to any one of claims 1 to 5.