CN113806066A - Big data resource scheduling method, system and storage medium

Big data resource scheduling method, system and storage medium

Info

Publication number
CN113806066A
Authority
CN
China
Prior art keywords
service
data
cluster node
room
computing resource
Prior art date
Legal status
Pending
Application number
CN202110369275.2A
Other languages
Chinese (zh)
Inventor
杨泽森
Current Assignee
Jingdong Technology Holding Co Ltd
Original Assignee
Jingdong Technology Holding Co Ltd
Priority date
Filing date
Publication date
Application filed by Jingdong Technology Holding Co Ltd filed Critical Jingdong Technology Holding Co Ltd
Priority to CN202110369275.2A
Publication of CN113806066A
Legal status: Pending

Classifications

    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005 - Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027 - Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G - PHYSICS
    • G06 - COMPUTING; CALCULATING OR COUNTING
    • G06F - ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 - Arrangements for program control, e.g. control units
    • G06F9/06 - Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 - Multiprogramming arrangements
    • G06F9/50 - Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 - Partitioning or combining of resources
    • G06F9/5072 - Grid computing

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application provides a big data resource scheduling method, system, and storage medium. The method includes: obtaining the container operating states and computing resource usage information of a plurality of cluster nodes in each service machine room; establishing a mapping relation between each cluster node's address information and its container operating state and computing resource usage information, and storing the mapping relation; in response to a received computing resource allocation request sent by a first service machine room server, determining a target cluster node from the cluster nodes in the other service machine rooms according to the request and the container operating states and computing resource usage information of those nodes; and obtaining the target cluster node's address information from the stored mapping relation and sending a resource scheduling request to the target cluster node according to that address information to perform the scheduling operation. The method and system thereby implement cross-cluster computing resource scheduling across service machine rooms.

Description

Big data resource scheduling method, system and storage medium
Technical Field
The present application relates to the field of big data technologies, and in particular, to a big data resource scheduling method, system, and storage medium.
Background
With the arrival of the big data era, very large groups and enterprises build different big data platforms or data middle platform environments to support big data computation and storage, data statistical analysis, and data mining, and to provide decision support for digital marketing and digital operations. Because different groups or companies run different businesses and invest different research and development resources and capabilities, the big data platforms of the sub-groups, subsidiaries, and branch organizations within the same enterprise can differ greatly in functionality.
Disclosure of Invention
The application provides a big data resource scheduling method, a big data resource scheduling system and a storage medium.
According to a first aspect of the present application, a big data resource scheduling method is provided, including:
obtaining container operating states and computing resource usage information of a plurality of cluster nodes in different service machine rooms;
establishing a mapping relation between the address information of each cluster node and its container operating state and computing resource usage information, and storing the mapping relation;
in response to a received computing resource allocation request sent by a first service machine room server, determining a target cluster node from the plurality of cluster nodes in other service machine rooms according to the computing resource allocation request and the container operating states and computing resource usage information of the cluster nodes in those machine rooms; the other service machine rooms are the service machine rooms other than the first service machine room among the different service machine rooms;
and obtaining the address information of the target cluster node according to the stored mapping relation, and sending a resource scheduling request to the target cluster node according to that address information to perform the scheduling operation.
According to a second aspect of the present application, there is provided a big data resource scheduling system, including:
the cross-machine-room resource scheduling module is used for acquiring the container running states and the computing resource use information of a plurality of cluster nodes in each service machine room, establishing a mapping relation between the address information of each cluster node and the container running states and the computing resource use information, and storing the mapping relation;
the cross-machine-room resource scheduling module is further configured to, in response to a received computing resource allocation request sent by the first service machine room server, determine a target cluster node from the plurality of cluster nodes in the other service machine rooms according to the computing resource allocation request and the container operating states and computing resource usage information of those cluster nodes; the other service machine rooms are the service machine rooms other than the first service machine room;
the cross-machine-room resource scheduling module is further configured to obtain address information of the target cluster node according to the stored mapping relationship, and send a resource scheduling request to the target cluster node according to the address information of the target cluster node to perform scheduling operation.
According to a third aspect of the present application, there is provided a computer-readable storage medium, on which a computer program is stored, which, when executed by a processor, implements the big data resource scheduling method provided by the foregoing first aspect of the present application.
According to the technical solutions of the embodiments of the present application, the container operating state and computing resource usage information of the cluster nodes under each service machine room can be obtained. When a computing resource allocation request sent by a certain service machine room server is received, a target cluster node that can satisfy the request can be determined from the cluster nodes under the other service machine rooms according to their container operating states and computing resource usage information, and the scheduling operation is performed on that node: the computation on the data to be processed is carried out on the target cluster node, and the result is returned to the first service machine room server to complete the scheduling of computing resources. This implements cross-cluster computing resource scheduling across service machine rooms, ensures effective utilization of the cluster node resources under each service machine room, and improves task execution timeliness while making maximum use of computing resources.
It should be understood that the statements in this section do not necessarily identify key or critical features of the embodiments of the present application, nor do they limit the scope of the present application. Other features of the present application will become apparent from the following description.
Drawings
The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
fig. 1 is a schematic flowchart of a big data resource scheduling method according to an embodiment of the present application;
fig. 2 is a block diagram illustrating a structure of a big data resource scheduling system according to an embodiment of the present disclosure;
FIG. 3 is a functional architecture diagram of a big data resource scheduling system according to an embodiment of the present application;
fig. 4 is a diagram illustrating a technical architecture of a big data resource scheduling system according to an embodiment of the present application.
Detailed Description
Reference will now be made in detail to embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.
It should be noted that a service machine room in the embodiments of the present application may be a machine room that hosts the information and data services of a large enterprise or group. For example, a large enterprise may have a plurality of different groups, each group having a corresponding service machine room in which its business data is managed.
With the arrival of the big data era, very large groups and enterprises build different big data platforms or data middle platform environments to support big data computation and storage, data statistical analysis, and data mining, and to provide decision support for digital marketing and digital operations. Because different groups or companies run different businesses and invest different research and development resources and capabilities, the big data platforms of the sub-groups, subsidiaries, and branch organizations within the same enterprise can differ greatly in functionality. Meanwhile, in many cases, decision analysis in an enterprise requires not only its own business data but also the data of sibling companies, sub-groups, branches, and other organizations closely related to its business, so that data sharing, data fusion, and data association computation can be performed. In this case, data sharing, data exchange, and data computation need to be carried out between the big data platform environments (or data middle platforms) within a group or enterprise.
In the related art, each group operates its own big data platform, which wastes big data resources: the platforms are under-utilized and have idle resources, and when data sharing is needed a copy of the data (generally stored as 3 replicas) is copied to the requesting cluster, which further aggravates the waste of storage resources and is not conducive to effective resource utilization. Therefore, the present application provides a big data resource scheduling method, system, and storage medium that work across service machine rooms.
The following describes a big data resource scheduling method, system, and storage medium according to an embodiment of the present application with reference to the drawings.
Fig. 1 is a flowchart illustrating a big data resource scheduling method according to an embodiment of the present application. It should be noted that the big data resource scheduling method of the embodiments of the present application is applicable to scenarios spanning different service machine rooms. There may be multiple service machine rooms; for example, a large enterprise may include multiple groups, each group having a corresponding service machine room, and each service machine room having corresponding cluster nodes.
It should be further noted that the big data resource scheduling method according to the embodiment of the present application may be applied to the big data resource scheduling system according to the embodiment of the present application, where the big data resource scheduling system may include a cross-machine-room resource scheduling module, so as to implement a function of computing resource scheduling of a cross-service machine room.
As shown in fig. 1, the big data resource scheduling method may include the following steps.
In step 101, container operation states and computing resource usage information of a plurality of cluster nodes in each service room are obtained.
It should be noted that, in the embodiments of the present application, a Yarn (Yet Another Resource Negotiator) Router may be used to implement the routing of computing resource scheduling across machine rooms, and a plurality of Yarn logical clusters spanning the service machine rooms (each RM (Resource Manager) may be regarded as one Yarn logical cluster) may be implemented through Yarn Federation.
In the embodiments of the present application, the resource manager RM may be deployed on a cluster node in a service machine room. The RM is a global resource manager that manages and allocates the resources of the whole system. The RM mainly consists of two components: a resource Scheduler and an applications manager (ASM). The Scheduler is responsible for allocating resources to the various running applications, subject to constraints such as capacity and queues. It performs scheduling according to the resource requirements of the applications, based on the abstraction of a resource Container, which bundles factors such as memory, CPU, disk, and network. The applications manager is responsible for accepting job submission requests, allocating the first container in which an application runs its ApplicationMaster (an instance of a specific computing framework), and restarting the ApplicationMaster's container upon failure.
In the embodiments of the present application, NodeManagers (NM, the per-node resource and task managers) may also be deployed on the cluster nodes in the service machine rooms. An NM reports the node's resource usage and the operating state of each of its containers to the RM, and also receives and processes various container start and stop requests from the ApplicationMaster.
That is, the NM on each cluster node under each service machine room reports the node's resource usage and the operating status of each of its containers to the RM in a timely manner. Upon receiving these reports, the RM in the service machine room forwards the resource usage and container operating status of each node to the cross-machine-room resource scheduling module, so that the module obtains the container operating states and computing resource usage information of the plurality of cluster nodes in each service machine room.
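As a purely illustrative sketch of the reporting flow just described, the following Python snippet shows one possible shape of the per-node information the cross-machine-room resource scheduling module might aggregate; the field names, classes, and values are assumptions made for this example, not structures defined by the present application.

    from dataclasses import dataclass, field
    from typing import Dict, List

    @dataclass
    class ContainerState:
        """Operating state of one container on a cluster node (illustrative fields)."""
        container_id: str
        state: str            # e.g. "RUNNING" or "COMPLETED"
        memory_mb: int
        vcores: int

    @dataclass
    class NodeReport:
        """What an RM could forward to the cross-machine-room scheduling module for one node."""
        room: str             # service machine room the node belongs to
        address: str          # host:port of the cluster node
        total_memory_mb: int
        used_memory_mb: int
        total_vcores: int
        used_vcores: int
        containers: List[ContainerState] = field(default_factory=list)

    # The scheduling module keeps the latest report per node, keyed by node address.
    latest_reports: Dict[str, NodeReport] = {}

    def on_report(report: NodeReport) -> None:
        latest_reports[report.address] = report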
In step 102, a mapping relation between each cluster node's address information and its container operating state and computing resource usage information is established, and the mapping relation is stored.
In the embodiments of the present application, the address information of each cluster node in each service machine room can be obtained, a mapping relation between each node's address information and its container operating state and computing resource usage information can be established, and the mapping relation can be stored. Each cluster node may report its address information to the cross-machine-room resource scheduling module, so that the module obtains the address information of every cluster node.
For example, suppose there are a group-A service machine room and a group-B service machine room, the group-A machine room has cluster nodes a1 and a2, and the group-B machine room has cluster nodes b1 and b2. The cross-machine-room resource scheduling module may obtain the address information of nodes a1, a2, b1, and b2, and obtain the container operating state and computing resource usage information of each of these nodes. For each cluster node, a mapping relation between the node's address information and its container operating state and computing resource usage information can then be constructed and stored.
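A minimal sketch of this example, assuming the mapping is kept as a dictionary keyed by node name; the addresses, field names, and numbers are hypothetical and only illustrate how the stored mapping ties a node's address to its state and usage information.

    from typing import Dict

    # node -> {address, container operating state, computing resource usage}; values are hypothetical
    node_map: Dict[str, Dict] = {
        "a1": {"address": "a1.room-a:8041", "running_containers": 5, "used_mem_mb": 6144, "total_mem_mb": 16384},
        "a2": {"address": "a2.room-a:8041", "running_containers": 2, "used_mem_mb": 2048, "total_mem_mb": 16384},
        "b1": {"address": "b1.room-b:8041", "running_containers": 1, "used_mem_mb": 1024, "total_mem_mb": 32768},
        "b2": {"address": "b2.room-b:8041", "running_containers": 7, "used_mem_mb": 30720, "total_mem_mb": 32768},
    }

    # Once a node has been chosen as the scheduling target, its address is read back from the mapping.
    target = "b1"
    print(node_map[target]["address"])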
It can be understood that the purpose of establishing the mapping relation between a cluster node's address information and its container operating state and computing resource usage information is to be able to precisely locate the cluster node that can provide the resources to be scheduled.
In step 103, in response to the received computing resource allocation request sent by the first service machine room server, a target cluster node is determined from the plurality of cluster nodes in the other service machine rooms according to the computing resource allocation request and the container operating states and computing resource usage information of those cluster nodes. The other service machine rooms are the service machine rooms other than the first service machine room.
In the embodiments of the present application, when the first service machine room server determines that the computing resources of the cluster nodes in its own machine room are insufficient, it may send a computing resource allocation request to the cross-machine-room resource scheduling module to request that the computation be scheduled to other service machine rooms. Upon receiving the computing resource allocation request sent by the first service machine room server, the cross-machine-room resource scheduling module may determine a target cluster node from the plurality of cluster nodes in the other service machine rooms according to the request and the container operating states and computing resource usage information of those nodes.
It should be noted that there are many possible ways to determine the target cluster node from among the cluster nodes in the other service machine rooms. Two examples are given below:
as an example, the plurality of cluster nodes in the other service room may be sorted according to the container operating states and the computing resource usage information of the plurality of cluster nodes in the other service room, and the target cluster node may be determined from the plurality of cluster nodes in the other service room according to the sorting result and the computing resource allocation request.
For example, the cluster nodes in the other service machine rooms may be sorted by their remaining resources according to their container operating states and computing resource usage information: nodes with more remaining resources and fewer running containers are ranked first, and nodes with fewer remaining resources and more running containers are ranked last. The target cluster node can then be determined from the sorted nodes according to the computing resource allocation request.
As another example, the computing resources to be scheduled may be determined according to the computing resource allocation request, and the target cluster node may be determined from the cluster nodes in the other service machine rooms according to the computing resources to be scheduled and the container operating states and computing resource usage information of those nodes. That is to say, based on the container operating states and computing resource usage information of the cluster nodes in the other service machine rooms, a cluster node that can satisfy the computing resources to be scheduled is identified and taken as the target cluster node.
It is to be understood that the above two examples are provided only to show those skilled in the art how the target cluster node may be determined and are not specific limitations of the present application; the target cluster node may be determined in other ways, and the present application is not limited in this respect.
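For illustration only, the following Python sketch expresses the two example strategies above, assuming each candidate node in the other service machine rooms is summarized by its address, remaining memory and vcores, and running container count; the field names, sample values, and the simple fit test are assumptions for this sketch rather than the implementation prescribed by the present application.

    from typing import Dict, List, Optional

    # Illustrative summaries of candidate nodes in the other service machine rooms.
    nodes: List[Dict] = [
        {"address": "b1.room-b:8041", "free_mem_mb": 8192, "free_vcores": 4, "running_containers": 3},
        {"address": "b2.room-b:8041", "free_mem_mb": 2048, "free_vcores": 1, "running_containers": 9},
    ]

    def pick_by_ranking(candidates: List[Dict]) -> Optional[Dict]:
        """Example 1: rank nodes so that more remaining resources and fewer
        running containers come first, then take the top-ranked node."""
        ranked = sorted(
            candidates,
            key=lambda n: (n["free_mem_mb"], n["free_vcores"], -n["running_containers"]),
            reverse=True,
        )
        return ranked[0] if ranked else None

    def pick_by_fit(candidates: List[Dict], need_mem_mb: int, need_vcores: int) -> Optional[Dict]:
        """Example 2: derive the required resources from the allocation request and
        return a node whose remaining resources can satisfy them."""
        for n in candidates:
            if n["free_mem_mb"] >= need_mem_mb and n["free_vcores"] >= need_vcores:
                return n
        return None

    target = pick_by_fit(nodes, need_mem_mb=4096, need_vcores=2) or pick_by_ranking(nodes)
    print(target["address"] if target else "no node available")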
In step 104, the address information of the target cluster node is obtained according to the stored mapping relationship, and a resource scheduling request is sent to the target cluster node according to the address information of the target cluster node to perform scheduling operation.
In the embodiments of the present application, once the target cluster node is determined, its address information may be obtained according to the pre-stored mapping relation, and a resource scheduling request may be sent to the target cluster node according to that address information. The resource scheduling request is used to instruct the target cluster node to obtain the data to be processed and perform the computation on it.
That is, the cross-machine-room resource scheduling module sends a resource scheduling request to the target cluster node according to the address information of the target cluster node. When the target cluster node receives the resource scheduling request sent by the cross-machine-room resource scheduling module, it can obtain the data to be processed and perform the computation on the data to be processed.
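A minimal sketch of the dispatch step, assuming for illustration that the request is delivered as an HTTP POST to a hypothetical /schedule endpoint at the target node's address; the transport, endpoint, and payload fields are assumptions made for this example, not an interface defined by the present application.

    import json
    import urllib.request

    def send_scheduling_request(target_address: str, job_id: str, data_path: str) -> None:
        """Send a resource scheduling request to the target cluster node, telling it
        which shared data to read and compute on (illustrative payload)."""
        payload = json.dumps({"job_id": job_id, "data_path": data_path}).encode("utf-8")
        req = urllib.request.Request(
            f"http://{target_address}/schedule",  # hypothetical endpoint on the target node
            data=payload,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            print("target cluster node replied with status", resp.status)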
It should be noted that, in order to implement cross-service-machine-room data storage and data sharing, and at the same time to provide off-site disaster tolerance for the data, the shared data sent by each service machine room server may be sent, in the form of a replica file, to cluster nodes under other service machine rooms for storage. Optionally, in some embodiments of the present application, when a data sharing request sent by the first service machine room server is received, a replica file of the data to be shared is obtained, and the replica file is sent to the storage sharing areas of a plurality of cluster nodes in the other service machine rooms for storage.
For example, when a cluster node in the first service machine room receives data to be stored, it may store the data locally and may also treat the data as shared data to be stored remotely in other service machine rooms. When a data sharing request sent by the first service machine room server is received, a replica file of the data to be shared can be obtained and sent to the storage sharing areas of a plurality of cluster nodes in the other service machine rooms for storage.
In order to save computing resources and improve scheduling efficiency, optionally, in the embodiments of the present application, the target cluster node may obtain the data to be processed as follows: the data to be processed is read from the replica file stored in the storage sharing area of the target cluster node. That is to say, when the target cluster node receives the resource scheduling request sent by the cross-machine-room resource scheduling module, the data shared by the first service machine room server is already stored in the target node's storage sharing area, so the data to be processed can be read directly from that replica file and the computation can be performed on it.
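On the target node side, the read step can be as simple as the following sketch, which assumes a hypothetical local mount point for the node's storage sharing area; the path and function name are illustrative assumptions only.

    from pathlib import Path

    SHARED_AREA = Path("/data/shared")  # hypothetical mount point of the node's storage sharing area

    def load_pending_data(data_path: str) -> bytes:
        """Read the data to be processed directly from the replica file in the local
        storage sharing area instead of fetching it across machine rooms."""
        return (SHARED_AREA / data_path).read_bytes()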
It should be noted that a NameNode (NN), which manages the file system namespace, and DataNodes (DN), which provide the storage service for the actual file data, may be deployed on the nodes of each cluster in the service machine rooms. The NN maintains the file system tree and all of the files and directories in the tree. This information is stored persistently on the local disk in two files: the namespace image file and the edit log file. The NN also records the data nodes on which each block of each file is located, but it does not persistently store the block location information, because this information is reconstructed from the data nodes at system startup. There may be multiple DNs. HDFS (Hadoop Distributed File System) divides a file into blocks, and a block may be stored on one DN or on multiple DNs. The DNs are responsible for the actual reading and writing of files at the bottom layer: if a client program initiates a command to read a file on HDFS, the NN tells the client which DNs hold the actual data of the blocks of that file, and the client then interacts with those DNs directly to read the file data.
In the embodiments of the present application, when a cluster node receives the replica file of shared data sent from another service machine room, it may divide the replica file into a plurality of data blocks and store the blocks on one or more DNs. The NN of the cluster may record which data nodes hold each block of the replica file, so that the storage locations of the corresponding blocks can be determined when the replica file is accessed. For example, when the cluster node reads the replica file from the shared area to obtain the data to be processed, it may first obtain from the NN the data node information for each block of the replica file, that is, determine on which DNs each block is stored, and then interact with those DNs directly to read the data of the replica file.
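The following standalone sketch mirrors, in plain Python, the bookkeeping described above: a block-to-DataNode map that is consulted before the blocks are read. It is a conceptual illustration with hypothetical block ids and addresses, not the actual HDFS client API.

    from typing import Dict, List

    # Conceptual block map kept by the NameNode: block id -> addresses of DataNodes holding it.
    block_map: Dict[str, List[str]] = {
        "replica_file_blk_0": ["dn1.room-b:9866", "dn2.room-b:9866"],
        "replica_file_blk_1": ["dn2.room-b:9866"],
    }

    def locate_blocks(file_blocks: List[str]) -> Dict[str, List[str]]:
        """Ask the (conceptual) NameNode where each block of the replica file lives,
        so the reader can then fetch the data directly from those DataNodes."""
        return {blk: block_map.get(blk, []) for blk in file_blocks}

    print(locate_blocks(["replica_file_blk_0", "replica_file_blk_1"]))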
To further enable cross-machine-room data storage, optionally, in some embodiments of the present application, a meta storage module is provided, through which the metadata of the data warehouses in the service machine rooms is stored. For example, a Hive (data warehouse tool) cluster can be deployed across the machine rooms, and the platform users in all service machine rooms share one Hive metastore (the service that manages the metadata stored in the database). This unifies the data warehouse metadata across machine rooms, provides a basis for unified data authority management, avoids the metadata resource waste caused by maintaining multiple metastores, and realizes a unified Hive metastore in the multi-user scenario across service machine rooms.
According to the big data resource scheduling method described above, the container operating states and computing resource usage information of the cluster nodes in each service machine room can be obtained, a mapping relation between each cluster node's address information and its container operating state and computing resource usage information can be established, and the mapping relation can be stored. When a computing resource allocation request sent by the first service machine room server is received, a target cluster node is determined from the cluster nodes in the other service machine rooms according to the request and the container operating states and computing resource usage information of those nodes; the address information of the target cluster node is then obtained from the stored mapping relation, and a resource scheduling request is sent to the target cluster node according to that address information to perform the scheduling operation. In this way, when a computing resource allocation request from a certain service machine room server is received, a target cluster node that can satisfy the request can be determined from the cluster nodes under the other service machine rooms, the computation on the data to be processed is performed on the target cluster node, and the result is returned to the first service machine room server to complete the scheduling of computing resources. This implements cross-cluster computing resource scheduling across service machine rooms, ensures effective utilization of the cluster node resources under each service machine room, and improves task execution timeliness while making maximum use of computing resources.
In order to implement the above embodiments, the present application further provides a big data resource scheduling system.
Fig. 2 is a block diagram of a big data resource scheduling system according to an embodiment of the present disclosure. As shown in fig. 2, the big data resource scheduling system 200 may include: and a cross-machine room resource scheduling module 201.
Specifically, the cross-machine-room resource scheduling module 201 is configured to obtain the container operating states and the computing resource usage information of the multiple cluster nodes in each service machine room, establish a mapping relationship between the address information of each cluster node and the container operating states and the computing resource usage information, and store the mapping relationship.
In this embodiment of the application, the cross-room resource scheduling module 201 is further configured to determine, in response to a received computing resource allocation request sent by the first service room server, a target cluster node from among the plurality of cluster nodes in other service rooms according to the computing resource allocation request, the container operating states of the plurality of cluster nodes in other service rooms, and the computing resource usage information.
As an example, a specific implementation process by which the cross-machine-room resource scheduling module 201 determines the target cluster node from the plurality of cluster nodes in the other service machine rooms according to the computing resource allocation request and the container operating states and computing resource usage information of those nodes may be as follows: sorting the cluster nodes in the other service machine rooms according to their container operating states and computing resource usage information; and determining the target cluster node from them according to the sorting result and the computing resource allocation request.
As another example, a specific implementation process by which the cross-machine-room resource scheduling module 201 determines the target cluster node may be as follows: determining the computing resources to be scheduled according to the computing resource allocation request; and determining the target cluster node from the plurality of cluster nodes in the other service machine rooms according to the computing resources to be scheduled and the container operating states and computing resource usage information of those nodes.
In this embodiment of the application, the cross-machine-room resource scheduling module 201 is further configured to obtain address information of the target cluster node according to the stored mapping relationship, and send a resource scheduling request to the target cluster node according to the address information of the target cluster node to perform scheduling operation. As an example, the cross-machine room resource scheduling module 201 may send a resource scheduling request to the target cluster node according to the address information of the target cluster node. When receiving the resource scheduling request, the target cluster node may obtain data to be processed and perform a calculation operation on the data to be processed.
It should be noted that, in order to implement cross-service-machine-room data storage and data sharing, and at the same time to provide off-site disaster tolerance for the data, the shared data sent by each service machine room server may be sent, in the form of a replica file, to cluster nodes under other service machine rooms for storage. Optionally, in some embodiments of the present application, as shown in fig. 2, the big data resource scheduling system 200 may further include a cross-machine-room data storage module 202. The cross-machine-room data storage module 202 is configured to, in response to a received data sharing request sent by the first service machine room server, obtain a replica file of the data to be shared and send the replica file to the storage sharing areas of a plurality of cluster nodes in the other service machine rooms for storage.
For example, when a cluster node in the first service machine room receives data to be stored, it may store the data locally and may also treat the data as shared data to be stored remotely in other service machine rooms. When the cross-machine-room data storage module 202 receives a data sharing request sent by the first service machine room server, it may obtain a replica file of the data to be shared and send the replica file to the storage sharing areas of a plurality of cluster nodes in the other service machine rooms for storage.
To further enable cross-machine-room data storage, optionally, in some embodiments of the present application, as shown in fig. 2, the big data resource scheduling system 200 may further include a meta storage module 203. The meta storage module 203 is configured to store the metadata of the data warehouses in the service machine rooms, the metadata of a data warehouse being stored in a database. For example, a Hive (data warehouse tool) cluster can be deployed across the machine rooms, and the platform users in all service machine rooms share one Hive metastore (the service that manages the metadata stored in the database). This unifies the data warehouse metadata across machine rooms, provides a basis for unified data authority management, avoids the metadata resource waste caused by maintaining multiple metastores, and realizes a unified Hive metastore in the multi-user scenario across service machine rooms.
In order to prevent data from flowing into other groups' service machine rooms and to avoid data governance violations, optionally, in some embodiments of the present application, as shown in fig. 2, the big data resource scheduling system 200 may further include a data acquisition module 204. The data acquisition module 204 is deployed on the service machine room servers and is used to extract data from the business production systems of a service machine room into the corresponding service machine room data warehouse. For example, each group uses the same data acquisition tool: a data acquisition Agent is deployed in each group's service machine room, and during data acquisition the data is extracted from each group's business production systems directly into that group's service machine room data warehouse environment, so that data is prevented from flowing into other groups' service machine rooms and data governance violations are avoided.
For ease of use, optionally, in some embodiments of the present application, as shown in fig. 2, the big data resource scheduling system 200 may further include an interactive interface module 205. The interactive interface module 205 provides the user with an interactive interface for customizing periodically running big data tasks. For example, the user may define a periodically running big data task through the interactive interface module 205, and the result data of the task may be displayed through the interface it provides, supporting capabilities such as data extraction, data statistics, data mining, and data pushing.
In order to make the present application more clearly understood by those skilled in the art, a scheme for scheduling big data resources across service machine rooms according to an embodiment of the present application is described below with reference to fig. 3 and fig. 4.
For example, assume that an enterprise has a group-A service machine room and a group-B service machine room. As shown in fig. 3 and fig. 4, in a big data environment (including a big data platform, data middle platform, or data warehouse environment), the scheduling of big data resources across service machine rooms is divided into the following core functions:
cross-machine room data storage: the cross-service computer room data storage routing can be realized by adopting an HDFS Router, and a plurality of logic clusters (storage under each NN can be regarded as a storage logic cluster) are realized through HDFS Federation. The purpose of cross-service computer room storage is to realize data sharing (2 copies or 1 copy), and simultaneously, the function of remote data disaster tolerance is also achieved.
Cross-machine-room resource scheduling: a Yarn Router can be used to implement cross-machine-room computing resource scheduling routing, and a plurality of Yarn logical clusters spanning the service machine rooms (each RM can be regarded as one Yarn logical cluster) are implemented through Yarn Federation. The goal of cross-service-machine-room computing resource routing is to make maximum use of computing resources while improving task execution timeliness.
Unified metastore (e.g., the meta storage module): in the functional architecture shown in fig. 3, a Hive cluster can be deployed across the service machine rooms, and all platform users use the same Hive metastore. This unifies Hive metadata across service machine rooms, provides a basis for unified authority management in the data middle platforms, and avoids the metadata resource waste caused by multiple metastores.
Unified authority management: a data authority management mechanism is encapsulated based on the Hive metastore combined with the HDFS permission mechanism, realizing authority management for Hive tables and HDFS directories. The goal is to achieve cross-business data management and authorization.
Unified data acquisition: each group uses the same data acquisition tool. A data acquisition Agent is deployed in each group's service machine room, and during data acquisition the data is extracted from each group's business production systems directly into that group's service machine room data warehouse environment, so that data is prevented from flowing into other groups' service machine rooms and data governance violations are avoided.
Unified data development: each group's business can use the same data development tool. The data development tool is the test environment in which data R&D personnel develop data models and perform data queries and data statistical analysis. Through unified basic resource management and control (unified resource scheduling and a unified metastore), the data development tool obtains computing resources, data storage, and data authority verification.
Unified task scheduling: each group's business can use the same task scheduling tool. The task scheduling tool is the formal production environment for each group's big data jobs; each group's data R&D personnel can define periodically running big data tasks in the tool, with support for capabilities such as data extraction, data statistics, data mining, and data pushing.
Unified metadata management: each group's business uses the same metadata management tool. Based on the unified Hive metastore management mechanism, the metadata management tool integrates the metadata of each group, and in the metadata management system users can see the business data asset distribution of all groups and use capabilities such as data asset lineage, data asset query, and data asset maintenance.
That is to say, in the present application, unified data development, unified task scheduling, and unified metadata management can be implemented in the data middle platform tools, so that a user interacts with the big data resource scheduling system through the interactive interface module these tools provide.
According to the big data resource scheduling system provided by the embodiments of the present application, the container operating states and computing resource usage information of the plurality of cluster nodes in each service machine room can be obtained, a mapping relation between each cluster node's address information and its container operating state and computing resource usage information can be established, and the mapping relation can be stored. When a computing resource allocation request sent by the first service machine room server is received, a target cluster node is determined from the cluster nodes in the other service machine rooms according to the request and the container operating states and computing resource usage information of those nodes; the address information of the target cluster node is then obtained from the stored mapping relation, and a resource scheduling request is sent to the target cluster node according to that address information to perform the scheduling operation. In this way, when a computing resource allocation request from a certain service machine room server is received, a target cluster node that can satisfy the request can be determined from the cluster nodes under the other service machine rooms, the computation on the data to be processed is performed on the target cluster node, and the result is returned to the first service machine room server to complete the scheduling of computing resources. This implements cross-cluster computing resource scheduling across service machine rooms, ensures effective utilization of the cluster node resources under each service machine room, and improves task execution timeliness while making maximum use of computing resources.
In order to implement the above embodiments, the present application also proposes a computer-readable storage medium. The computer readable storage medium may store thereon a computer program, and the computer program, when executed by a processor, implements the big data resource scheduling method according to any of the above embodiments of the present application.
In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined and combined by one skilled in the art without contradiction.
Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.
Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware related to instructions of a program, which may be stored in a computer readable storage medium, and when the program is executed, the program includes one or a combination of the steps of the method embodiments.
In addition, functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a stand-alone product, may also be stored in a computer readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims (15)

1. A big data resource scheduling method is characterized by comprising the following steps:
acquiring container operation states and computing resource use information of a plurality of cluster nodes in each service room;
establishing a mapping relation between the address information of each cluster node and its container operation state and computing resource use information, and storing the mapping relation;
in response to a received computing resource allocation request sent by a first service room server, determining a target cluster node from a plurality of cluster nodes in other service rooms according to the computing resource allocation request, container operation states of the cluster nodes in other service rooms and computing resource use information;
and acquiring the address information of the target cluster node according to the stored mapping relation, and sending a resource scheduling request to the target cluster node according to the address information of the target cluster node to perform scheduling operation.
2. The method of claim 1, wherein the determining a target cluster node from the plurality of cluster nodes in other service rooms according to the computing resource allocation request, the container operating status of the plurality of cluster nodes in other service rooms, and the computing resource usage information comprises:
according to the container running states and the computing resource use information of the cluster nodes in the other service rooms, sequencing the cluster nodes in the other service rooms;
and determining a target cluster node from a plurality of cluster nodes in the other service rooms according to the sequencing result and the computing resource allocation request.
3. The method of claim 1, wherein the determining a target cluster node from the plurality of cluster nodes in other service rooms according to the computing resource allocation request, the container operating status of the plurality of cluster nodes in other service rooms, and the computing resource usage information comprises:
determining the computing resources required to be scheduled according to the computing resource allocation request;
and determining a target cluster node from the plurality of cluster nodes in other service rooms according to the computing resources to be scheduled, the container operation states of the plurality of cluster nodes in other service rooms and the computing resource use information.
4. The method of claim 1, wherein the sending a resource scheduling request to the target cluster node for scheduling operation according to the address information of the target cluster node comprises:
sending a resource scheduling request to the target cluster node according to the address information of the target cluster node; the resource scheduling request is used for instructing the target cluster node to acquire data to be processed and perform a calculation operation on the data to be processed.
5. The method of claim 4, further comprising:
when a data sharing request sent by the first service machine room server is received, acquiring a copy file of data to be shared;
and sending the duplicate file to storage sharing areas of a plurality of cluster nodes in other service rooms for storage.
6. The method of claim 5, wherein the target cluster node obtaining the data to be processed comprises:
and acquiring the data to be processed from the copy file stored in the storage sharing area of the target cluster node.
7. A big data resource scheduling system, comprising:
the cross-machine-room resource scheduling module is used for acquiring the container running states and the computing resource use information of a plurality of cluster nodes in different service machine rooms, establishing a mapping relation between the address information of each cluster node and the container running states and the computing resource use information, and storing the mapping relation;
the cross-machine-room resource scheduling module is further configured to determine a target cluster node from the plurality of cluster nodes in the other service room according to the received computing resource allocation request, container operation states of the plurality of cluster nodes in the other service room and computing resource use information in response to the received computing resource allocation request sent by the first service room server;
the cross-machine-room resource scheduling module is further configured to obtain address information of the target cluster node according to the stored mapping relationship, and send a resource scheduling request to the target cluster node according to the address information of the target cluster node to perform scheduling operation.
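
Taken together, the three duties recited for the cross-machine-room resource scheduling module could be arranged as a thin component like the one below; the class and method names are assumptions, not taken from the patent.

# Assumed skeleton of the cross-machine-room resource scheduling module:
# collect node status, keep the address mapping, pick a target, dispatch.
class CrossRoomScheduler:
    def __init__(self):
        # node address -> container running state / computing resource usage
        self.registry = {}

    def report_status(self, address, status):
        """Store the mapping between a node's address and its reported status."""
        self.registry[address] = status

    def handle_allocation_request(self, local_room, cpu_needed, mem_needed_gb):
        """Pick a target node in another service room and dispatch to it."""
        candidates = sorted(
            (kv for kv in self.registry.items() if kv[1]["room"] != local_room),
            key=lambda kv: kv[1]["cpu_idle"],
            reverse=True,
        )
        for address, status in candidates:
            if (status["cpu_idle"] >= cpu_needed
                    and status["mem_idle_gb"] >= mem_needed_gb):
                return self.dispatch(address)
        return None

    def dispatch(self, address):
        """Send the resource scheduling request to the node at this address (stub)."""
        return address
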
8. The system of claim 7, wherein the cross-room resource scheduling module is specifically configured to:
sorting the plurality of cluster nodes in the other service rooms according to the container running states and the computing resource use information of the plurality of cluster nodes in the other service rooms;
and determining a target cluster node from the plurality of cluster nodes in the other service rooms according to the sorting result and the computing resource allocation request.
9. The system of claim 7, wherein the cross-room resource scheduling module is specifically configured to:
determining the computing resources required to be scheduled according to the computing resource allocation request;
and determining a target cluster node from the plurality of cluster nodes in other service rooms according to the computing resources to be scheduled, the container operation states of the plurality of cluster nodes in other service rooms and the computing resource use information.
10. The system of claim 7, wherein the cross-room resource scheduling module is specifically configured to:
sending a resource scheduling request to the target cluster node according to the address information of the target cluster node; the resource scheduling request is used for instructing the target cluster node to acquire data to be processed and perform a computing operation on the data to be processed.
11. The system of claim 7, further comprising:
and the cross-machine-room data storage module is used for responding to a received data sharing request sent by the first service machine room server, acquiring a copy file of the data to be shared, and sending the copy file to the storage sharing areas of the plurality of cluster nodes in the other service rooms for storage.
12. The system of any one of claims 7 to 11, further comprising:
the metadata storage module is used for storing metadata of a data warehouse in the service machine room; wherein the metadata of the data warehouse is stored in a database.
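
For the metadata storage module of claim 12, the data warehouse metadata could be kept in an ordinary relational table; the schema below is invented for illustration and the claims do not name a particular database.

# Illustrative metadata store for per-room data warehouses, using SQLite.
# The table layout (room, db_name, table_name, location) is an assumption.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE warehouse_metadata (
        room       TEXT NOT NULL,   -- service room owning the warehouse
        db_name    TEXT NOT NULL,
        table_name TEXT NOT NULL,
        location   TEXT NOT NULL    -- storage path of the table's data files
    )
""")
conn.execute(
    "INSERT INTO warehouse_metadata VALUES (?, ?, ?, ?)",
    ("room-a", "dw", "orders", "/warehouse/room-a/dw/orders"),
)
for row in conn.execute("SELECT * FROM warehouse_metadata WHERE room = 'room-a'"):
    print(row)
conn.close()
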
13. The system of claim 12, further comprising: a data acquisition module, wherein,
the data acquisition module is deployed on a service room server and is used for extracting data from the service production system of the service room into the corresponding service room data warehouse.
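
The data acquisition module of claim 13 amounts to a per-room extract-and-load step from the service production system into that room's data warehouse. The toy sketch below assumes the production side is reachable as a SQL database and that the warehouse lands flat files on disk; both are assumptions.

# Toy extract step: read rows from a production database and land them as a
# CSV file in the room's data warehouse directory. Paths and the source
# query are illustrative assumptions.
import csv
import sqlite3
from pathlib import Path

def extract_to_warehouse(prod_db: str, query: str, warehouse_dir: Path, name: str) -> Path:
    out_path = warehouse_dir / f"{name}.csv"
    warehouse_dir.mkdir(parents=True, exist_ok=True)
    with sqlite3.connect(prod_db) as conn, open(out_path, "w", newline="") as f:
        cursor = conn.execute(query)
        writer = csv.writer(f)
        writer.writerow([col[0] for col in cursor.description])  # header row
        writer.writerows(cursor)
    return out_path
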
14. The system of any one of claims 7 to 11, further comprising:
and the interactive interface module is used for providing an interactive interface for a user to customize big data tasks that run periodically.
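
For the interactive interface of claim 14, whatever form the interface takes, the task definition it produces might reduce to a record like the one below; the field names and the cron convention are assumptions.

# Assumed shape of a user-defined, periodically running big data task.
from dataclasses import dataclass

@dataclass
class PeriodicTask:
    name: str          # task name shown in the interactive interface
    cron: str          # schedule, e.g. "0 2 * * *" for 02:00 every day
    sql: str           # the big data computation the task should run
    target_room: str   # service room whose warehouse the task writes to

nightly_report = PeriodicTask(
    name="nightly_orders_report",
    cron="0 2 * * *",
    sql="SELECT dt, COUNT(*) FROM dw.orders GROUP BY dt",
    target_room="room-a",
)
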
15. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, implements the big data resource scheduling method according to any of claims 1 to 6.
CN202110369275.2A 2021-04-06 2021-04-06 Big data resource scheduling method, system and storage medium Pending CN113806066A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110369275.2A CN113806066A (en) 2021-04-06 2021-04-06 Big data resource scheduling method, system and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110369275.2A CN113806066A (en) 2021-04-06 2021-04-06 Big data resource scheduling method, system and storage medium

Publications (1)

Publication Number Publication Date
CN113806066A true CN113806066A (en) 2021-12-17

Family

ID=78892974

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110369275.2A Pending CN113806066A (en) 2021-04-06 2021-04-06 Big data resource scheduling method, system and storage medium

Country Status (1)

Country Link
CN (1) CN113806066A (en)


Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10142255B1 (en) * 2016-09-08 2018-11-27 Amazon Technologies, Inc. Allocating dynamic resources to service clusters
CN107491343A (en) * 2017-09-08 2017-12-19 中国电子科技集团公司第二十八研究所 A kind of across cluster resource scheduling system based on cloud computing
CN108737270A (en) * 2018-05-07 2018-11-02 北京京东尚科信息技术有限公司 A kind of method for managing resource and device of server cluster
CN110120979A (en) * 2019-05-20 2019-08-13 华为技术有限公司 A kind of dispatching method, device and relevant device
WO2020253347A1 (en) * 2019-06-17 2020-12-24 深圳前海微众银行股份有限公司 Container cluster management method, device and system
US20210089350A1 (en) * 2019-09-23 2021-03-25 Hiveio Inc. Virtual computing cluster resource scheduler
CN112199193A (en) * 2020-09-30 2021-01-08 北京达佳互联信息技术有限公司 Resource scheduling method and device, electronic equipment and storage medium

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114390050A (en) * 2021-12-29 2022-04-22 中国电信股份有限公司 Cross-machine-room cluster control method and device
CN114138500A (en) * 2022-01-29 2022-03-04 阿里云计算有限公司 Resource scheduling system and method
CN114138500B (en) * 2022-01-29 2022-07-08 阿里云计算有限公司 Resource scheduling system and method
CN114610719A (en) * 2022-03-15 2022-06-10 云粒智慧科技有限公司 Cross-cluster data processing method and device, electronic equipment and storage medium
CN115086341A (en) * 2022-06-22 2022-09-20 中国工商银行股份有限公司 Resource scheduling method and device, computer readable storage medium and electronic equipment
CN115277864A (en) * 2022-07-27 2022-11-01 海通证券股份有限公司 Route determining method and device, computer readable storage medium and terminal
CN115277864B (en) * 2022-07-27 2024-01-26 海通证券股份有限公司 Route determining method and device, computer readable storage medium and terminal
WO2024099246A1 (en) * 2022-11-07 2024-05-16 International Business Machines Corporation Container cross-cluster capacity scaling
CN115904663A (en) * 2022-12-02 2023-04-04 滨州心若网络科技有限公司 Information disaster tolerance method and system based on database and cloud platform
CN115904663B (en) * 2022-12-02 2024-01-05 中雄世纪征信有限公司 Information disaster recovery method and system based on database and cloud platform
CN115794423A (en) * 2023-02-09 2023-03-14 深圳市华创智能工程技术有限公司 Management method and device of intelligent machine room, electronic equipment and storage medium
CN116028232A (en) * 2023-02-27 2023-04-28 浪潮电子信息产业股份有限公司 Cross-cabinet server memory pooling method, device, equipment, server and medium

Similar Documents

Publication Publication Date Title
CN113806066A (en) Big data resource scheduling method, system and storage medium
CN107066319B (en) Multi-dimensional scheduling system for heterogeneous resources
AU2014346369B2 (en) Managed service for acquisition, storage and consumption of large-scale data streams
AU2014346366B2 (en) Partition-based data stream processing framework
US8874811B2 (en) System and method for providing a flexible buffer management interface in a distributed data grid
US9858322B2 (en) Data stream ingestion and persistence techniques
US7444395B2 (en) Method and apparatus for event handling in an enterprise
US8051170B2 (en) Distributed computing based on multiple nodes with determined capacity selectively joining resource groups having resource requirements
US9462056B1 (en) Policy-based meta-data driven co-location of computation and datasets in the cloud
CN109656879B (en) Big data resource management method, device, equipment and storage medium
US20120078915A1 (en) Systems and methods for cloud-based directory system based on hashed values of parent and child storage locations
US11182217B2 (en) Multilayered resource scheduling
CN110134338B (en) Distributed storage system and data redundancy protection method and related equipment thereof
CN103473365A (en) File storage method and device based on HDFS (Hadoop Distributed File System) and distributed file system
US11182406B2 (en) Increased data availability during replication
CA3030504A1 (en) Blockchain network and task scheduling method therefor
CN109992373B (en) Resource scheduling method, information management method and device and task deployment system
US6671688B1 (en) Virtual replication for a computer directory system
US11561824B2 (en) Embedded persistent queue
CN110012050A (en) Message Processing, storage method, apparatus and system
US11956313B2 (en) Dynamic storage sharing across network devices
CN115037757B (en) Multi-cluster service management system
CN115756955A (en) Data backup and data recovery method and device and computer equipment
CN112769954B (en) Method and system for automatically storing and routing WEB program
CN114866416A (en) Multi-cluster unified management system and deployment method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination