CN109753245B

CN109753245B - Multi-disk load balancing asynchronous read-write scheduling method and device

Info

Publication number: CN109753245B
Application number: CN201811653968.9A
Authority: CN
Inventors: 董隆超; 陈兴利; 张娇娇
Original assignee: Business Intelligence Of Oriental Nations Corp ltd
Current assignee: Business Intelligence Of Oriental Nations Corp ltd
Priority date: 2018-12-28
Filing date: 2018-12-28
Publication date: 2022-02-18
Anticipated expiration: 2038-12-28
Also published as: CN109753245A

Abstract

A multi-disk load balancing asynchronous read-write scheduling method and device. The HDFS is configured so that a data node of the HDFS system returns disk information to a name node, and a metadata query service process is started at the name node; all local disk information is acquired, and the disks are numbered according to the disk types in the disk information; an independent read-write working thread is established for each disk and waits for the upper layer to send read-write tasks; the metadata query process receives a request from an upper layer user for HDFS data to be read, and acquires the disk number of the HDFS data to be read through the metadata query service process of the name node; the disk number is sent together with the read-write task when the task is sent to the read-write working thread, the read-write working thread asynchronously processes its read-write task according to the disk information on receiving the task, and the read-write working thread notifies the upper layer user after the read-write task is completed. The performance overhead caused by contention among multiple threads and among multiple disks under high concurrency is reduced, and the data read-write efficiency is improved.

Description

Multi-disk load balancing asynchronous read-write scheduling method and device

Technical Field

The embodiment of the invention relates to the technical field of databases, in particular to a multi-disk load balancing asynchronous read-write scheduling method and device.

Background

HDFS (Hadoop Distributed File System) is a distributed file system designed to run on commodity hardware. HDFS is a highly fault-tolerant system, suitable for deployment on inexpensive machines, capable of providing high-throughput data access, and well suited to applications with large data sets. HDFS has a master-slave architecture: an HDFS cluster is composed of a name node, which is a master server that manages the file namespace and regulates client access to files, and data nodes, usually one per machine, each of which manages the storage of its own node. HDFS exposes the file namespace externally and allows user data to be stored in the form of files.

At present, under an HDFS system, a data node places the actual data on multiple disks, including common mechanical disks and solid state disks (SSDs), and a database also needs to store temporary data on these disks of the data node during execution. Most databases read and write the data stored locally by the HDFS directly through the interfaces of the HDFS, and read and write local temporary files through the interfaces of the local file system. As a result, under multiple concurrent requests, reads and writes to the different types of disks are synchronous and are mixed together, which causes performance overhead from contention among multiple threads and among multiple disks under high concurrency. At present, no asynchronous read-write technical scheme for load balancing across multiple disks exists in the HDFS system.

Disclosure of Invention

Therefore, the embodiment of the invention provides a multi-disk load balancing asynchronous read-write scheduling method and device, which are used for numbering disks, acquiring the disk type and the disk number of a data node through an HDFS (Hadoop distributed File System), enabling data reading to be correctly corresponding to the local real disk type and the disk number, and respectively and asynchronously processing read-write requests of different disks through different read-write working threads.

In order to achieve the above object, an embodiment of the present invention provides the following: a multi-disk load balancing asynchronous read-write scheduling method comprises the following steps:

configuring the HDFS so that a data node of the HDFS returns disk information to a name node, and starting a metadata query service process at the name node, wherein the disk information comprises a disk equipment number and a disk type;

when the database is started, acquiring all local disk information through an interface of the local operating system, and numbering the disks according to the disk types in the disk information;

creating an independent read-write working thread for each of the plurality of disks, each thread waiting for the upper layer to send read-write tasks;

when the database executes SQL, the metadata query process receives a request from an upper layer user for HDFS data to be read, and the disk number of the HDFS data to be read is obtained through the metadata query service process of the name node; the disk number is sent together with the read-write task when the task is sent to the read-write working thread; on receiving the read-write task, the read-write working thread asynchronously processes its read-write task according to the disk information, and the upper layer user is informed after the read-write task is completed.

As a preferred scheme of the multi-disk load balancing asynchronous read-write scheduling method, the disk types comprise mechanical disks and solid state disks, and the metadata query service process maintains the mapping of disk numbers according to different disk types.

As a preferred scheme of the multi-disk load balancing asynchronous read-write scheduling method, when acquisition of the disk information fails, the number of disks is set to 1, and the disk type is set to solid state disk;

and when the requested HDFS data is read from a remote device over the network, the disk type is set to network.

As a preferred scheme of the multi-disk load balancing asynchronous read-write scheduling method, different numbers of read-write working threads are allocated for different disk types.

As a preferred scheme of the multi-disk load balancing asynchronous read-write scheduling method, a set of disks that have been read is maintained. After the metadata query process receives an HDFS data request to be read, if the disk equipment number and the disk type cannot be found in the disk set, the disk requested to be read is assigned a new number; if the disk equipment number and the disk type are found in the disk set, the existing disk number is used for the disk requested to be read.

As a preferred scheme of the multi-disk load balancing asynchronous read-write scheduling method, an upper layer user transmits data and a disk number where the data is located to a read-write subsystem when reading HDFS data, the read-write subsystem sends a read task to a read-write working thread of a disk corresponding to the disk number for processing and waits for the completion of reading, and when the reading of the HDFS data is completed, the read-write working thread returns the read data to the upper layer user;

when the upper layer user reads and writes the temporary file to the local disk, the corresponding disk equipment number is found through the local directory, the corresponding disk number is found through the disk equipment number, then the disk number where the temporary file data is located is transmitted to the corresponding read-write working thread for processing, the read-write working thread waits for the completion of the read-write, and the read-write working thread returns the temporary file data to the upper layer user after the read-write working thread completes the read-write of the temporary file data.

The embodiment of the invention also provides a multi-disk load balancing asynchronous read-write scheduling device, which comprises:

the node configuration module is used for configuring the HDFS, so that a data node of the HDFS system returns the disk information to a name node, and a metadata query service process is started at the name node;

the disk information acquisition module is used for acquiring local all disk information through an interface of a local operating system when the database is started;

the numbering module is used for numbering the disks according to the disk types in the disk information;

the task creating module is used for creating an independent read-write working thread for each disk so that the disk waits for an upper layer to send a read-write task;

the read-write receiving module is used for receiving, through the metadata query process, the HDFS data request to be read by an upper layer user when the database executes SQL;

the task processing module is used for asynchronously processing respective read-write tasks according to the disk information when the read-write working thread receives the read-write tasks;

and the read-write feedback module is used for sending the read-write task and the disk number to the read-write working thread and informing an upper layer user.

The device also comprises a maintenance module, used for the metadata query service process to maintain the mapping of disk numbers according to the different disk types, wherein the disk types comprise a mechanical disk and a solid state disk;

when acquisition of the disk information fails, the number of disks is set to 1, and the disk type is set to solid state disk;

when the requested HDFS data is read from a remote device over the network, the disk type is set to network;

different numbers of read-write working threads are allocated for different disk types.

As a preferred scheme of the multi-disk load balancing asynchronous read-write scheduling device, a set of disks that have been read is maintained. After the metadata query process receives an HDFS data request to be read, if the disk equipment number and the disk type cannot be found in the disk set, the disk requested to be read is assigned a new number; if the disk equipment number and the disk type are found in the disk set, the existing disk number is used for the disk requested to be read.

As a preferred scheme of the multi-disk load balancing asynchronous read-write scheduling device, an upper layer user transmits data and a disk number where the data is located to a read-write subsystem when reading HDFS data, the read-write subsystem sends a read task to a read-write working thread of a disk corresponding to the disk number for processing and waits for the completion of reading, and when the reading of the HDFS data is completed, the read-write working thread returns the read data to the upper layer user;

The embodiment of the invention has the following advantages: the disk numbering is carried out on the disks, the disk type and the disk number of the data node can be obtained through the HDFS, data reading and the local real disk type and the disk number are correctly corresponding, the read-write requests of different disks are asynchronously processed through different read-write working threads respectively, the read-write tasks of the HDFS of the same disk and the read-write tasks of local files under multiple disks are asynchronously processed through the same plurality of read-write threads, the performance overhead caused by high-concurrency multi-thread contention and multiple disk contention can be reduced, and the data read-write efficiency is improved.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below. It should be apparent that the drawings in the following description are merely exemplary, and that other embodiments can be derived from the drawings provided by those of ordinary skill in the art without inventive effort.

Fig. 1 is a flowchart of an asynchronous read-write scheduling method for load balancing of multiple disks according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of a multi-disk load balancing asynchronous read-write scheduling apparatus provided in an embodiment of the present invention;

In the figures: 1. a node configuration module; 2. a disk information acquisition module; 3. a numbering module; 4. a task creation module; 5. a read-write receiving module; 6. a task processing module; 7. a read-write feedback module; 8. a maintenance module.

Detailed Description

The present invention is described below in terms of particular embodiments; other advantages and features of the invention will become apparent to those skilled in the art from the following disclosure. It is to be understood that the described embodiments are merely exemplary of the invention and are not intended to limit the invention to the particular embodiments disclosed. All other embodiments that can be derived by a person skilled in the art from the embodiments given herein without creative effort shall fall within the protection scope of the present invention.

Referring to fig. 1, this embodiment provides a multi-disk load balancing asynchronous read-write scheduling method, which includes the following steps:

S1: the HDFS is configured so that a data node of the HDFS returns disk information to a name node, and a metadata query service process is started at the name node, wherein the disk information comprises a disk equipment number and a disk type;

S2: when the database is started, all local disk information is acquired through an interface of the local operating system, and the disks are numbered according to the disk types in the disk information;

S3: an independent read-write working thread is created for each of the plurality of disks, and each thread waits for the upper layer to send read-write tasks;

S4: when the database executes SQL, the metadata query process receives a request from an upper layer user for HDFS data to be read, and the disk number of the HDFS data to be read is acquired through the metadata query service process of the name node;

S5: when the read-write task is sent to the read-write working thread, the disk number is sent together with it; on receiving the read-write task, the read-write working thread asynchronously processes its read-write task according to the disk information and informs the upper layer user after the read-write task is completed.

In an embodiment of the multi-disk load balancing asynchronous read-write scheduling method, the disk types include a mechanical disk and a solid state disk, and the metadata query service process maintains a mapping of disk numbers for each disk type. Specifically, in S2, when the database is started, all disks and the corresponding disk device names are obtained from /proc/partitions of the Linux file system, and whether a device is a common mechanical disk is determined from /sys/block/<device-name>/queue/rotational. For common mechanical disks, numbering starts at 0 and increases by 1 for each mechanical disk; for solid state disks, numbering starts at -10 and decreases by 1 for each solid state disk; for the network type, the disk number is set to -1, since there is only one instance of this type. With this design, the disk numbers of the different types do not overlap. Because the numbering of the different types is not uniform, a Map is established for unified management, and the Map maps each disk number to the local serial number of the disk.
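The numbering rule of S2 can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the function names `list_local_disks` and `number_disks` and the constant names are hypothetical, the sysfs flag is read from the `queue/rotational` attribute of each block device, and filtering of loop or RAM devices is left out.

```python
from pathlib import Path

NETWORK_DISK_NUMBER = -1  # the single pseudo-disk used for remote data
SSD_FIRST_NUMBER = -10    # solid state disks count down from here


def list_local_disks(proc_partitions="/proc/partitions", sys_block="/sys/block"):
    """Return [(device_name, is_rotational)] for whole block devices on Linux."""
    disks = []
    for line in Path(proc_partitions).read_text().splitlines():
        fields = line.split()
        # Data rows look like: major minor #blocks name
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        name = fields[3]
        flag = Path(sys_block) / name / "queue" / "rotational"
        # Partitions (sda1, ...) have no entry directly under /sys/block.
        if flag.exists():
            disks.append((name, flag.read_text().strip() == "1"))
    return disks


def number_disks(disks):
    """Assign 0, 1, 2, ... to mechanical disks and -10, -11, ... to SSDs."""
    numbers = {}
    next_hdd, next_ssd = 0, SSD_FIRST_NUMBER
    for name, is_rotational in disks:
        if is_rotational:
            numbers[name] = next_hdd
            next_hdd += 1
        else:
            numbers[name] = next_ssd
            next_ssd -= 1
    return numbers
```

Because mechanical disks only ever receive non-negative numbers, solid state disks only numbers of -10 or below, and the network type the single value -1, the three ranges cannot collide regardless of how many disks are present.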

Specifically, in S1, the configuration item dfs.datanode.hdfs-blocks-metadata.enabled of the HDFS is set to true, so that the data node can report disk information such as the disk type to the name node, and the metadata query service process is started at the name node.

Specifically, the hdfs-site.xml configuration file of each data node of the HDFS is modified to add the configuration item dfs.datanode.hdfs-blocks-metadata.enabled with the value true, and all processes of the HDFS service are then restarted.

Specifically, when the disk information acquisition fails, the number of disks is set to 1, and the type of the disks is set to be a solid state disk; and when the read HDFS data request is sent through a remote device of the network, setting the disk type as the network.
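The two fallback rules above can be sketched as follows. This is only an illustration under assumptions: the `DiskType` enumeration, the function names, the placeholder device name `"unknown"`, and the use of `OSError` to signal a failed probe are all hypothetical.

```python
from enum import Enum


class DiskType(Enum):
    HDD = "mechanical"
    SSD = "solid_state"
    NETWORK = "network"


def local_disks_or_default(probe):
    """Call probe() to list (device, DiskType) pairs; fall back on failure.

    If the probe raises OSError or finds nothing, behave as if there were
    exactly one disk and treat it as a solid state disk.
    """
    try:
        disks = probe()
    except OSError:
        disks = []
    return disks or [("unknown", DiskType.SSD)]


def request_disk_type(local_type, served_remotely):
    """Data that lives on a remote device is classified as the network type."""
    return DiskType.NETWORK if served_remotely else local_type
```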

Specifically, different numbers of read-write working threads are allocated to different disk types. For a common mechanical disk, 1 independent read-write working thread is allocated; for a solid state disk, 8 independent read-write working threads are allocated; for the network, 16 independent read-write working threads are allocated. These numbers can be modified in the configuration file to facilitate tuning. Each thread corresponds to a disk task queue; when idle, the thread waits on the queue without consuming resources. When a task is put into the queue, a read-write working thread is immediately awakened to carry out the read-write operation.
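A minimal Python sketch of the per-type thread allocation is given below. The class name `DiskWorkers` and the dictionary `THREADS_PER_TYPE` are hypothetical, and the sketch assumes one task queue per disk shared by all of that disk's threads, which is one possible reading of the text. The blocking `queue.Queue.get()` call is what lets an idle thread sleep until a task arrives.

```python
import queue
import threading

# Default thread counts from the text; in practice these would be configurable.
THREADS_PER_TYPE = {"hdd": 1, "ssd": 8, "network": 16}

_STOP = object()  # sentinel telling a worker to exit


class DiskWorkers:
    """One task queue for a disk, served by a type-dependent number of threads."""

    def __init__(self, disk_type, threads_per_type=THREADS_PER_TYPE):
        self.tasks = queue.Queue()
        self.threads = [
            threading.Thread(target=self._run, daemon=True)
            for _ in range(threads_per_type[disk_type])
        ]
        for thread in self.threads:
            thread.start()

    def _run(self):
        while True:
            task = self.tasks.get()  # blocks while idle, no busy-waiting
            if task is _STOP:
                break
            task()

    def submit(self, task):
        """Enqueue a zero-argument callable; a sleeping worker wakes up for it."""
        self.tasks.put(task)

    def close(self):
        """Let queued tasks finish, then stop and join every worker."""
        for _ in self.threads:
            self.tasks.put(_STOP)
        for thread in self.threads:
            thread.join()
```

A single thread for a mechanical disk keeps its accesses sequential, while the larger pools for solid state disks and the network allow several requests to be in flight at once.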

Specifically, after the metadata query service is started, two Maps are created, HDDMap and SSDMap, which hold the disk number mappings of the common mechanical disks and the solid state disks, respectively. The Key of each Map is the node name, and the Value is the device numbers and corresponding numbers of all disks under that node; the Value is itself a Map structure (its Key is the device number of a disk and its Value is the number of that disk). The device number is obtained from the HDFS and is a unique, invariant identifier for each disk.

In one embodiment of the multi-disk load balancing asynchronous read-write scheduling method, a read disk set is maintained, and after the metadata query process receives an HDFS data request to be read, if the disk equipment number and the disk type are not searched in the disk set, the disks requesting to read the data are numbered again; and if the disk equipment number and the disk type are successfully searched in the disk set, adopting the existing disk number for the disk which requests to read data.

In one embodiment of the multi-disk load balancing asynchronous read-write scheduling method, an upper layer user transmits data and a disk number where the data is located to a read-write subsystem when reading HDFS data, the read-write subsystem sends a read task to a read-write working thread of a disk corresponding to the disk number for processing and waits for the completion of reading, and when the reading of the HDFS data is completed, the read-write working thread returns the read data to the upper layer user;

Specifically, when the database executes SQL (Structured Query Language), the disk number of the data to be read is first obtained through the metadata query service process of the name node, and the disk number is then carried along when the read task is actually sent to the read-write working thread. Likewise, if a temporary file needs to be read from or written to a local disk during execution, the disk number is also carried along when the read-write task is sent. On receiving the read-write tasks, the read-write working threads of the different disks asynchronously process their respective read-write tasks according to this information and inform the upper layer user after the tasks are completed.

Specifically, after receiving a request for HDFS data to be read, the metadata query process queries the name node for the data node where the data is located, the device number of the disk where the data is located, and the type of that disk. If the disk is a common mechanical disk and is being read for the first time, that is, the disk cannot be found in the common mechanical disk set, the number of the disk is set to 0 plus the number of elements in the common mechanical disk set, and the disk is added to the common mechanical disk set. If the disk is a solid state disk and is being read for the first time, that is, the disk cannot be found in the solid state disk set, the number is set to -10 minus the number of elements in the solid state disk set, and the disk is added to the solid state disk set. If the data is not on a local disk but on a remote network device, the number is set to -1. For the second and later requests to read data from the same disk, the disk can be found in the corresponding disk set, and the number assigned the first time is returned.

Specifically, if the disk is a common mechanical disk, the HDDMap is searched using the data node as the Key; if no entry exists, a Map is constructed, and the data node and this Map are inserted into the HDDMap. If an entry exists, the corresponding Value is taken out, and the device number of the disk is then looked up in the Value. If it is found, the stored value is taken out as the number of the disk and returned to the caller; if it is not found, the size of the Value is used as the number, the device number of the disk and the number are inserted into the Value, and the number is returned to the caller. In the case of a solid state disk, the same operation is performed on the SSDMap, the only difference being that, when numbering, -10 minus the size of the Value is used as the number.
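The lookup-or-assign procedure on HDDMap and SSDMap can be sketched in Python as below. The class `DiskNumberRegistry`, its method `resolve`, and the string disk types are hypothetical names chosen for illustration; the lock is an added assumption so that concurrent callers cannot assign the same number twice.

```python
import threading


class DiskNumberRegistry:
    """Per-node maps from disk device number to the assigned disk number."""

    def __init__(self):
        self.hdd_map = {}  # node name -> {device number -> disk number}
        self.ssd_map = {}
        self._lock = threading.Lock()

    def resolve(self, node, device_number, disk_type):
        """Return the existing number for a disk, assigning one on first use."""
        if disk_type == "network":
            return -1  # remote data always maps to the single network number
        outer = self.hdd_map if disk_type == "hdd" else self.ssd_map
        with self._lock:
            per_node = outer.setdefault(node, {})  # construct the inner Map if absent
            if device_number not in per_node:
                size = len(per_node)
                # Mechanical disks: 0, 1, 2, ...; solid state disks: -10, -11, ...
                per_node[device_number] = size if disk_type == "hdd" else -10 - size
            return per_node[device_number]
```

For example, the first mechanical disk seen on a node resolves to 0 and the second to 1, while the first solid state disk resolves to -10; repeating a lookup returns the number assigned earlier.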

During the actual execution of the SQL, the corresponding disk queue is found through the disk number, and the read-write task is placed in that disk queue. The read-write thread corresponding to each disk queue acquires the read-write task immediately and performs the read or write, and the upper-layer caller is notified after the read-write task is completed.
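Routing a task to the queue of its disk and notifying the caller on completion might look like the following Python sketch. The class `DiskScheduler` and its methods are hypothetical, and the use of `concurrent.futures.Future` as the notification mechanism is an assumption; the text only states that the upper-layer caller is notified.

```python
import queue
import threading
from concurrent.futures import Future


class DiskScheduler:
    """Routes each task to the queue of its disk number and reports completion."""

    def __init__(self, threads_by_disk):
        # threads_by_disk: {disk number -> number of worker threads}
        self._queues = {}
        self._workers = []
        for disk_number, count in threads_by_disk.items():
            tasks = queue.Queue()
            self._queues[disk_number] = tasks
            for _ in range(count):
                thread = threading.Thread(target=self._run, args=(tasks,), daemon=True)
                thread.start()
                self._workers.append((thread, tasks))

    @staticmethod
    def _run(tasks):
        while True:
            item = tasks.get()
            if item is None:  # shutdown sentinel
                break
            future, fn, args = item
            try:
                future.set_result(fn(*args))  # wakes anyone waiting on the future
            except Exception as exc:
                future.set_exception(exc)

    def submit(self, disk_number, fn, *args):
        """Queue fn(*args) on the given disk; the returned Future is the notification."""
        future = Future()
        self._queues[disk_number].put((future, fn, args))
        return future

    def shutdown(self):
        for _, tasks in self._workers:
            tasks.put(None)
        for thread, _ in self._workers:
            thread.join()
```

The caller can either block on `future.result()` or register a callback with `future.add_done_callback(...)`, so tasks for different disks proceed independently of one another.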

Referring to fig. 2, an embodiment of the present invention further provides a multi-disk load balancing asynchronous read-write scheduling apparatus, including:

the node configuration module 1 is used for configuring the HDFS, so that a data node of the HDFS system returns disk information to a name node, and a metadata query service process is started at the name node;

the disk information acquisition module 2 is used for acquiring all local disk information through an interface of a local operating system when the database is started;

the numbering module 3 is used for numbering the disks according to the disk types in the disk information;

the task creating module 4 is used for creating an independent read-write working thread for each disk so that the disk waits for an upper layer to send a read-write task;

the read-write receiving module 5 is used for receiving, through the metadata query process, the HDFS data request to be read by an upper layer user when the database executes SQL;

the task processing module 6 is used for asynchronously processing respective read-write tasks according to the disk information when the read-write working thread receives the read-write tasks;

and the read-write feedback module 7 is used for sending the read-write task and the disk number to the read-write working thread and informing an upper layer user.

In an embodiment of the multi-disk load balancing asynchronous read-write scheduling apparatus, the apparatus further includes a maintenance module 8, used for the metadata query service process to maintain the mapping of disk numbers according to the different disk types, where the disk types include a mechanical disk and a solid state disk; when acquisition of the disk information fails, the number of disks is set to 1, and the disk type is set to solid state disk; when the requested HDFS data is read from a remote device over the network, the disk type is set to network; different numbers of read-write working threads are allocated for different disk types.

In one embodiment of the multi-disk load balancing asynchronous read-write scheduling device, a read disk set is maintained, and after a metadata query process receives an HDFS data request to be read, if the disk equipment number and the disk type are not searched in the disk set, the disks requesting to read the data are numbered again; and if the disk equipment number and the disk type are successfully searched in the disk set, adopting the existing disk number for the disk which requests to read data.

In one embodiment of the multi-disk load balancing asynchronous read-write scheduling device, an upper layer user transmits data and a disk number where the data is located to a read-write subsystem when reading HDFS data, the read-write subsystem sends a read task to a read-write working thread of a disk corresponding to the disk number for processing and waits for the completion of reading, and when the reading of the HDFS data is completed, the read-write working thread returns the read data to the upper layer user; when the upper layer user reads and writes the temporary file to the local disk, the corresponding disk equipment number is found through the local directory, the corresponding disk number is found through the disk equipment number, then the disk number where the temporary file data is located is transmitted to the corresponding read-write working thread for processing, the read-write working thread waits for the completion of the read-write, and the read-write working thread returns the temporary file data to the upper layer user after the read-write working thread completes the read-write of the temporary file data.
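For the temporary-file path, the step from a local directory to a disk device number and then to a disk number can be sketched in Python as follows. The function names and the `device_to_disk_number` table are hypothetical. Note also that `st_dev` identifies the device holding the file system, which is typically a partition, so the sketch assumes the lookup table is keyed by the same device numbers.

```python
import os


def device_number_for_path(path):
    """Return (major, minor) of the block device that holds the given path."""
    st_dev = os.stat(path).st_dev
    return os.major(st_dev), os.minor(st_dev)


def disk_number_for_path(path, device_to_disk_number, default=None):
    """Map a temporary-file directory to the disk number of its device.

    device_to_disk_number is a table built at startup:
    {(major, minor) -> disk number}.
    """
    return device_to_disk_number.get(device_number_for_path(path), default)
```

The disk number obtained this way can then be attached to the read-write task for the temporary file, so that it is queued on the same disk as any HDFS reads that target that device.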

According to the technical scheme of the embodiment of the invention, a configuration item dfs.datanode.hdfs-blocks-metadata.enabled of an HDFS is set to true, so that a data node can report information such as a disk type to a name node and start a metadata query service process on the name node; when the database is started, an interface of a local Linux operating system acquires all local disks and disk types, and different numbers are respectively given according to the types; when the database is started, establishing independent read-write working threads for a plurality of disks respectively, and waiting for an upper layer to send a read-write task; when the database executes SQL, firstly, a disk number of data to be read is obtained through a metadata query service process of a name node, then the disk number is taken when a read task is actually sent to a read-write working thread, and meanwhile, if a temporary file needs to be read and written to a local disk in the execution process, the disk number is also taken when the read-write task is sent. And when receiving the read-write tasks, the read-write working threads of different disks asynchronously process the respective read-write tasks according to the information, and inform an upper layer user after the read-write working threads are completed. The HDFS reading and writing tasks of the same disk and the reading and writing tasks of the local file under the multiple disks are asynchronously processed by the same plurality of reading and writing threads, so that the performance overhead caused by high-concurrency multi-thread contention and multi-disk contention can be reduced, and the data reading and writing efficiency is improved.

Although the invention has been described in detail above with reference to a general description and specific examples, it will be apparent to one skilled in the art that modifications or improvements may be made thereto based on the invention. Accordingly, such modifications and improvements are intended to be within the scope of the invention as claimed.

Claims

1. A multi-disk load balancing asynchronous read-write scheduling method is characterized by comprising the following steps:

configuring an HDFS (Hadoop distributed file system), enabling a data node of the HDFS to report disk information back to a name node, and starting a metadata query process at the name node, wherein the disk information comprises a disk equipment number and a disk type;

when the database executes SQL, the metadata query process receives an HDFS data request to be read by an upper layer user, and the disk number of the HDFS data to be read is obtained through the metadata query process of the name node;

and when the read-write task is sent to the read-write working thread, the disk number is sent together with it; on receiving the read-write task, the read-write working thread asynchronously processes its read-write task according to the disk information and informs the upper layer user after the read-write task is completed.

2. The multi-disk load balancing asynchronous read-write scheduling method of claim 1, wherein the disk types include a mechanical disk and a solid state disk, and the metadata query process maintains mapping of disk numbers according to different disk types.

3. The multi-disk load balancing asynchronous read-write scheduling method of claim 1, wherein when obtaining the disk information fails, the number of disks is set to 1, and the disk type is set to a solid state disk;

4. The multi-disk load balancing asynchronous read-write scheduling method of claim 1 wherein different numbers of read-write worker threads are allocated for different disk types.

5. The multi-disk load balancing asynchronous read-write scheduling method according to claim 1, characterized in that a set of disks that have been read is maintained, and after the metadata query process receives a request for HDFS data to be read, if the disk equipment number and the disk type cannot be found in the disk set, the disk requested to be read is assigned a new number; and if the disk equipment number and the disk type are found in the disk set, the existing disk number is used for the disk requested to be read.

6. The multi-disk load balancing asynchronous read-write scheduling method according to claim 1, wherein an upper user transmits data and a disk number where the data is located to a read-write subsystem when reading HDFS data, the read-write subsystem sends a read task to a read-write working thread of a disk corresponding to the disk number for processing and waits for completion of reading, and when reading the HDFS data is completed, the read-write working thread returns the read data to the upper user;

7. A multi-disk load balancing asynchronous read-write scheduling device is characterized by comprising:

the node configuration module is used for configuring the HDFS, so that a data node of the HDFS system returns the disk information to a name node, and a metadata query process is started at the name node;

8. The multi-disk load balancing asynchronous read-write scheduling device of claim 7, further comprising a maintenance module, configured to maintain mapping of disk numbers according to different disk types in the metadata query process, where the disk types include a mechanical disk and a solid state disk;

9. The multi-disk load balancing asynchronous read-write scheduling device of claim 7, wherein a read disk set is maintained, and when a metadata query process receives a request for HDFS data to be read, if a disk device number and a disk type are not found in the disk set, the disks requesting for reading the data are numbered again; and if the disk equipment number and the disk type are successfully searched in the disk set, adopting the existing disk number for the disk which requests to read data.

10. The multi-disk load balancing asynchronous read-write scheduling device of claim 7, wherein an upper user transmits data and a disk number where the data is located to the read-write subsystem when reading HDFS data, the read-write subsystem sends a read task to a read-write working thread of a disk corresponding to the disk number for processing and waits for completion of reading, and when reading the HDFS data is completed, the read-write working thread returns the read data to the upper user;