CN115718665A - Asynchronous I/O thread processor resource scheduling control method, device, medium and equipment - Google Patents


Info

Publication number
CN115718665A
CN115718665A (application CN202310030841.6A; granted as CN115718665B)
Authority
CN
China
Prior art keywords
thread, processor, threads, configuration mode, processor resource
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202310030841.6A
Other languages
Chinese (zh)
Other versions
CN115718665B (en)
Inventor
李锐喆
赵彤
Current Assignee
Beijing Carpura Technology Co ltd
Original Assignee
Beijing Carpura Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing Carpura Technology Co ltd filed Critical Beijing Carpura Technology Co ltd
Priority to CN202310030841.6A priority Critical patent/CN115718665B/en
Publication of CN115718665A publication Critical patent/CN115718665A/en
Application granted granted Critical
Publication of CN115718665B publication Critical patent/CN115718665B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to the technical field of asynchronous I/O (input/output) and provides a processor resource scheduling control method, device, medium, and equipment for asynchronous I/O threads. The method comprises the following steps: in response to an application program starting I/O threads, acquiring a processor resource configuration mode for the I/O threads, and scheduling the I/O threads to occupy processor resources based on that mode; in the competitive dynamic configuration mode, monitoring in real time the synchronization waiting overhead of all cooperating I/O threads and scheduling the I/O threads to occupy processor resources in combination with that overhead; and in the contention-free or competitive dynamic configuration mode, allowing each I/O thread to run on multiple processor cores. For the situation where I/O threads and compute threads must run simultaneously, the method controls I/O thread processor resource scheduling using existing operating system facilities, so that all I/O threads acquire processor resources almost simultaneously as far as possible, thereby completing I/O requests efficiently while minimizing the impact of the I/O threads on the compute threads.

Description

Asynchronous I/O thread processor resource scheduling control method, device, medium and equipment
Technical Field
The present invention relates to the field of asynchronous I/O technologies, and in particular, to a method, an apparatus, a medium, and a device for controlling resource scheduling of an asynchronous I/O thread processor.
Background
With the increasing computing power of computers and the growing demands of applications, the amount of data processed by applications keeps growing, and so does the amount of data input/output (I/O) to and from the file system. Because I/O speed improves much more slowly than computing power, data I/O is becoming a performance bottleneck for most applications. Parallel I/O and asynchronous I/O techniques were developed to reduce the impact of I/O overhead.
Parallel I/O technology exploits the parallel storage capacity of a computer: an application's data is distributed across multiple processes, which read and write file data cooperatively, thereby increasing I/O speed. Asynchronous I/O technology uses dedicated I/O processes/threads, separate from the computing processes/threads, to perform data I/O operations, so that the application can continue with other computation without waiting for the I/O to complete; the application's computation thus overlaps with its data I/O. In addition, parallel asynchronous I/O systems in the related art combine both techniques to minimize the impact of I/O overhead on the application's running speed.
There are two main ways of implementing asynchronous I/O technology:
One approach is to use dedicated I/O processes. Each I/O process is an independent process separate from the application's original computing processes, and additional independent computing resources generally must be allocated to each I/O process when the application starts. For example, if an application originally uses 1024 processes/processor cores for parallel computation and wants 16 I/O processes, it must request 1024 + 16 = 1040 processes/processor cores when the job is submitted.
Another approach is to use dedicated I/O threads. Each I/O thread is created automatically by a compute thread of the application and then runs on the same compute node as that compute thread.
The applicant's analysis shows that an I/O thread running on the same compute node as compute threads may face the following situations:
1) The processor of the compute node provides processor cores dedicated to running I/O threads. An I/O thread mainly performs communication and I/O, has essentially no computational load, and relies chiefly on integer arithmetic, so even a simplified processor core can support it. With the rapid development of processor architectures, I/O processor cores dedicated to running I/O threads may appear in the future.
2) The processor of the compute node has no cores dedicated to I/O threads, but processor cores for running I/O threads are reserved on the compute node when the application starts, i.e., the total number of compute threads and I/O threads does not exceed the total number of processor cores. For example, on a compute node with 64 processor cores, 62 compute threads are run and 2 processor cores remain available for running I/O threads.
3) The total number of compute threads and I/O threads exceeds the total number of processor cores. For example, on a compute node with 64 cores, 64 compute threads are run, and a newly started I/O thread must compete with the compute threads for processor cores.
When I/O threads and compute threads compete for processor cores, the operating system alternately schedules them onto the cores using a time-sharing mechanism. Although the related art has proposed sleep mechanisms that let I/O threads sleep when no I/O request is pending, thereby reducing contention, there are still situations in which compute threads and I/O threads must run simultaneously. The applicant's tests show that competitive scheduling of processor resources persists in such situations, and that the speed of both computation and I/O may drop significantly when parallel asynchronous I/O is used. The I/O threads that cooperatively perform asynchronous parallel I/O are typically distributed across different compute nodes and complete each I/O request cooperatively through frequent global communication. If the I/O threads can be kept in step, namely: when an I/O request exists, all I/O threads acquire processor resources almost simultaneously, then the I/O request can be completed efficiently and the impact of the I/O threads on the compute threads is further reduced. However, the prior art cannot schedule I/O threads on different compute nodes uniformly.
Disclosure of Invention
To achieve uniform scheduling of I/O threads on different compute nodes, the invention provides a processor resource scheduling control method, device, medium, and equipment for asynchronous I/O threads.
In a first aspect, an embodiment of the present invention provides a method for controlling resource scheduling of an asynchronous I/O thread processor, including:
in response to an application program starting I/O threads, acquiring a processor resource configuration mode for the I/O threads, wherein the processor resource configuration mode is one of a fixed configuration mode, a contention-free dynamic configuration mode, and a competitive dynamic configuration mode;
scheduling the I/O threads to occupy processor resources based on the processor resource configuration mode;
when the processor resource configuration mode is the competitive dynamic configuration mode, monitoring in real time the synchronization waiting overhead of all cooperating I/O threads and scheduling the I/O threads to occupy processor resources in combination with that overhead; and when the processor resource configuration mode is the contention-free or the competitive dynamic configuration mode, allowing each I/O thread to run on multiple processor cores.
In some implementations, when the processor resource configuration mode is the fixed configuration mode, the numbers or keywords of the processor cores specified by the fixed configuration mode are obtained, and each I/O thread is pinned to those processor cores based on the numbers or keywords.
In some implementations, the monitoring in real time of the synchronization waiting overhead of all cooperating I/O threads includes:
when the I/O threads synchronize in order to cooperatively fetch the same I/O request, exchanging among them the physical times at which each thread begins the synchronization, and taking the difference between the latest and the earliest of these physical times as the synchronization waiting overhead of the cooperating I/O threads.
In some implementations, the scheduling of the I/O threads to occupy processor resources in combination with the synchronization waiting overhead includes:
when responding to an I/O-related instruction initiated by the application program, raising the priority of all I/O threads if the synchronization waiting overhead exceeds a first threshold.
In some implementations, the scheduling of the I/O threads to occupy processor resources in combination with the synchronization waiting overhead further includes:
dynamically monitoring the processor resource occupation time of the I/O threads;
and, when the processor resource occupation time of the I/O threads exceeds a second threshold, letting each I/O thread sleep for a set duration.
In some implementations, the first threshold is a preset value, or the first threshold is determined by automatic sampling when the application starts the asynchronous parallel I/O system.
In some implementations, the method further includes:
when an application program starts an asynchronous parallel I/O system, all computing threads sleep to release processor resources;
under the condition that all I/O threads acquire processor resources, all I/O threads are automatically tested and sampled to carry out synchronous waiting cost, and the numerical value obtained by amplifying the synchronous waiting cost according to a preset multiple is used as a first threshold value.
In some implementations, the dynamically monitoring processor resource occupancy time for an I/O thread includes:
monitoring the processor resource occupation time of an I/O thread when it completes an I/O request, wherein the processor resource occupation time is the difference between an end time and a start time, the start time being the moment the I/O thread occupies processor resources after being reawakened at the end of its previous sleep, and the end time being the moment the I/O request is completed;
the sleeping of each I/O thread for the set duration includes: after completing the current I/O request and before fetching the next I/O request, the I/O thread autonomously sleeps for the set duration.
In some implementations, allowing each I/O thread to run on multiple processor cores, and pinning each I/O thread to specified processor cores, are implemented through the affinity-setting command of the operating system.
In some implementations, when the application program issues an instruction to wait for the completion of asynchronous I/O requests and some I/O requests have not yet completed, the I/O threads do not sleep autonomously while responding to all the I/O requests the application is waiting on.
In a second aspect, an embodiment of the present invention provides an apparatus for controlling resource scheduling of an asynchronous I/O thread processor, including:
an acquisition module, configured to, in response to an application program starting I/O threads, acquire a processor resource configuration mode for the I/O threads, wherein the processor resource configuration mode is one of a fixed configuration mode, a contention-free dynamic configuration mode, and a competitive dynamic configuration mode; and
a scheduling module, configured to schedule the I/O threads to occupy processor resources based on the processor resource configuration mode,
wherein, when the processor resource configuration mode is the competitive dynamic configuration mode, the synchronization waiting overhead of all cooperating I/O threads is monitored in real time and the I/O threads are scheduled to occupy processor resources in combination with that overhead; and when the processor resource configuration mode is the contention-free or the competitive dynamic configuration mode, each I/O thread is allowed to run on multiple processor cores.
In a third aspect, an embodiment of the present invention provides a computer storage medium, where a computer program is stored on the computer storage medium, and when the computer program is executed by one or more processors, the computer program implements the method according to the first aspect.
In a fourth aspect, embodiments of the present invention provide a computer device, including one or more processors and a memory, where the memory stores thereon a computer program, and the computer program, when executed by the one or more processors, implements the method according to the first aspect.
The invention can at least bring the following beneficial effects:
aiming at the condition that the I/O thread and the calculation thread are operated simultaneously, the control of the resource scheduling of the I/O thread processor is realized by utilizing the prior art basis of an operating system, so that all the I/O threads almost simultaneously acquire the processor resource as much as possible, the efficient completion of the I/O request is realized, and the influence of the I/O threads on the calculation thread is reduced as much as possible.
Drawings
To more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are required to be used in the embodiments will be briefly described below, and it should be understood that the following drawings only illustrate some embodiments of the present invention and therefore should not be considered as limiting the scope.
FIG. 1 is a flowchart of a method for controlling resource scheduling of an asynchronous I/O thread processor according to an embodiment of the present invention;
fig. 2 is a block diagram of an asynchronous I/O thread processor resource scheduling control apparatus according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
Example one
Fig. 1 provides a flowchart of a method for controlling resource scheduling of an asynchronous I/O thread processor, and as shown in fig. 1, the method for controlling resource scheduling of an asynchronous I/O thread processor of this embodiment includes the following steps:
and step S101, responding to the starting of the I/O thread by the application program, and acquiring a processor resource configuration mode of the I/O thread.
The processor resource allocation mode comprises one of a fixed allocation mode, a non-competitive dynamic allocation mode and a competitive dynamic allocation mode.
The fixed configuration mode refers to dispatching the I/O threads to several specified processor cores. The contention-free dynamic configuration mode refers to the case where I/O threads do not need to compete with compute threads for processor resources, including the case where processor cores for running I/O threads have been reserved on the compute node, i.e., the total number of compute threads and I/O threads does not exceed the total number of processor cores. The competitive dynamic configuration mode refers to the case where I/O threads must compete with compute threads for processor resources, i.e., the total number of compute threads and I/O threads exceeds the total number of processor cores. When the configuration mode is the contention-free or the competitive dynamic configuration mode, each I/O thread is allowed to run on multiple processor cores; when it is the fixed configuration mode, each I/O thread is pinned to run on the specified processor cores.
Which processor cores a thread may run on can be set through the operating system's affinity-setting command. When an I/O thread is created as a child of a compute thread, it usually inherits the compute thread's processor-core affinity.
In this embodiment, allowing each I/O thread to run on multiple processor cores, and pinning each I/O thread to specified processor cores, are implemented through the affinity-setting command of the operating system.
Specifically, after an I/O thread is created by a compute thread of the application, the affinity of the I/O thread is set to multiple processor cores using the corresponding operating system command; in the competitive dynamic configuration mode, the affinity of the application's compute threads is likewise set to multiple processor cores. These may be all of the processor cores on the compute node or a subset of them.
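As an illustrative sketch (not part of the patent), the affinity setting described above corresponds on Linux to sched_setaffinity(2), exposed in Python as os.sched_setaffinity; the helper names and the core-spec format below are assumptions for illustration only:

```python
import os

def allowed_cores(spec, total_cores):
    """Turn a core spec such as "0-3,6" into the set of core ids,
    keeping only cores that actually exist on this node."""
    cores = set()
    for part in spec.split(","):
        if "-" in part:
            lo, hi = part.split("-")
            cores.update(range(int(lo), int(hi) + 1))
        else:
            cores.add(int(part))
    return {c for c in cores if c < total_cores}

def pin_thread(tid, cores):
    """Pin one thread (kernel thread id; 0 = the calling thread) to a
    set of cores. Linux-specific: os.sched_setaffinity wraps the same
    call that the patent's "affinity-setting command" refers to."""
    os.sched_setaffinity(tid, cores)
```

For the fixed configuration mode, `pin_thread` would be called with the cores named by the mode; for the dynamic modes, with all (or many) cores of the node, leaving placement to the scheduler.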
Step S102: schedule the I/O threads to occupy processor resources based on the processor resource configuration mode.
When the processor resource configuration mode is the competitive dynamic configuration mode, the synchronization waiting overhead of all cooperating I/O threads is monitored in real time, and the I/O threads are scheduled to occupy processor resources in combination with that overhead. Scheduling is influenced by adjusting settings that affect how the operating system on each compute node schedules the I/O threads.
When the configuration mode is the fixed configuration mode, the numbers or keywords of the processor cores specified by the mode are obtained, the I/O threads are scheduled onto those cores accordingly, and each I/O thread is pinned to them using the scheduling control command provided by the operating system. The fixed configuration mode suits both the case where processor cores dedicated to I/O threads exist and the case where general-purpose cores have been reserved for I/O threads.
When the configuration mode is the contention-free dynamic configuration mode, the total number of compute threads and I/O threads does not exceed the total number of processor cores, and both kinds of threads may run on multiple cores. This configuration therefore allows the I/O threads to run on multiple processor cores without any additional operation or setup.
When the asynchronous parallel I/O system works, all I/O threads first cooperatively fetch the same I/O request and then cooperatively complete the work of that request, such as data input/output. Fetching an I/O request may involve synchronization among all I/O threads. The fetch itself involves very little computation and should therefore complete very quickly. However, the fetch can only proceed once all I/O threads hold processor resources: if some I/O threads have acquired processor resources while the rest are still waiting for them, the synchronization waiting overhead of the fetch becomes large. This embodiment therefore monitors the actual synchronization waiting overhead of the fetch in real time and uses it as the synchronization waiting overhead of the cooperating I/O threads.
In some implementations, monitoring the synchronization waiting overhead of all cooperating I/O threads in real time may include: when the I/O threads synchronize in order to cooperatively fetch the same I/O request, exchanging among them the physical times at which each thread begins the synchronization, and taking the difference between the latest and the earliest of these physical times as the synchronization waiting overhead of the cooperating I/O threads.
In this monitoring mode, when all I/O threads (each belonging to its own process) synchronize, they exchange the physical time at which each begins the synchronization (the physical clocks of the compute nodes of a high-performance computer are almost identical; even when they differ, the difference can be corrected for), and the difference between the latest and the earliest of these times is the synchronization waiting overhead. The I/O threads can then be scheduled to occupy processor resources according to the overhead monitored in real time.
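A minimal sketch of the overhead computation just described (illustrative only; in a real parallel asynchronous I/O system the timestamps would be exchanged through global communication, e.g. an MPI allgather):

```python
def sync_wait_overhead(start_times):
    """Synchronization waiting overhead of one cooperative fetch: the
    gap between the last and the first thread to reach the
    synchronization point (times in seconds)."""
    return max(start_times) - min(start_times)

# Each entry is the physical time at which one I/O thread began the
# synchronization for fetching the same I/O request.
times = [12.000, 12.001, 12.047, 12.002]
overhead = sync_wait_overhead(times)  # about 0.047 s: one thread lagged
```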
When the synchronization waiting overhead is used, its magnitude generally needs to be judged. One possible implementation judges against the first threshold: when the actually monitored synchronization waiting overhead is greater than the first threshold, the synchronization overhead is considered too large.
The settings that influence how the operating system schedules the I/O threads on the compute node, used when scheduling the I/O threads in combination with the synchronization waiting overhead, may include: doing nothing, raising the priority of the I/O threads, or letting each I/O thread sleep for a set duration. When the actual synchronization waiting overhead is too large (exceeds the first threshold), the priority of the I/O threads is raised; when the I/O threads have occupied the processor for too long, they sleep for the set duration.
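The three settings above can be sketched as a small decision policy (illustrative; the function name, the Action enum, and the exact precedence of the two tests are assumptions, not the patented implementation):

```python
from enum import Enum

class Action(Enum):
    NONE = "none"                    # do nothing
    RAISE_PRIORITY = "raise_priority"  # threads are badly out of step
    SLEEP = "sleep"                  # threads held the CPU too long

def schedule_action(sync_overhead, occupancy, first_threshold, second_threshold):
    """Pick one of the three scheduling settings described above.
    Threshold names mirror the patent's first/second thresholds."""
    if occupancy > second_threshold:
        return Action.SLEEP
    if sync_overhead > first_threshold:
        return Action.RAISE_PRIORITY
    return Action.NONE
```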
In some cases, scheduling the I/O threads to occupy processor resources in combination with the synchronization waiting overhead may include:
when responding to an I/O-related instruction initiated by the application program, raising the priority of all I/O threads if the synchronization waiting overhead exceeds the first threshold, i.e., the synchronization overhead is too large. The first threshold may be a preset value, or it may be determined by automatic sampling when the application starts the asynchronous parallel I/O system.
Further, when the first threshold is determined by automatic sampling at the start of the asynchronous parallel I/O system, the method of this embodiment may also include:
when the application program starts the asynchronous parallel I/O system, putting all compute threads to sleep to release processor resources; and, once all I/O threads have acquired processor resources, automatically test-sampling the synchronization waiting overhead of a synchronization among all I/O threads and using that overhead, amplified by a preset multiple, as the first threshold. The preset multiple may be set according to actual requirements, which this embodiment does not limit.
During scheduling, the operating system may adjust thread priorities in real time; for example, it may raise the priority of threads that have not occupied the processor for a long time. Although a thread may be unable to change its own scheduling priority, a compute thread can adjust the scheduling priority of an I/O thread, because a parent thread can change a child thread's priority and each I/O thread is created by a compute thread (i.e., the compute thread is the I/O thread's parent). A compute thread interacts with the I/O threads only when responding to an I/O-related instruction initiated by the application, and it is at that point that the I/O threads' priority can be changed. Therefore, when responding to an application-initiated I/O-related instruction, if the compute thread finds that the actual synchronization waiting overhead of the I/O threads is too large, it raises the I/O threads' priority above that of the compute threads.
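On Linux, this parent-adjusts-child mechanism can be sketched with setpriority(2), which acts on a single thread when given a kernel thread id; `boost_nice` and `raise_io_thread_priority` are illustrative names, and lowering nice below 0 normally requires CAP_SYS_NICE:

```python
import os

def boost_nice(current_nice, compute_nice):
    """Pick a nice value for the I/O threads strictly higher in
    priority (numerically lower) than the compute threads', clamped
    to the Linux range [-20, 19]. Pure helper so the policy can be
    tested without privileges."""
    return max(-20, min(current_nice, compute_nice - 1))

def raise_io_thread_priority(tid, compute_nice):
    """Compute (parent) thread raises a child I/O thread's priority.
    With PRIO_PROCESS and a kernel thread id, Linux setpriority(2)
    affects only that thread."""
    current = os.getpriority(os.PRIO_PROCESS, tid)
    os.setpriority(os.PRIO_PROCESS, tid, boost_nice(current, compute_nice))
```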
In other cases, scheduling the I/O threads to occupy processor resources in combination with the synchronization waiting overhead may further include:
dynamically monitoring the processor resource occupation time of the I/O threads; and, when the processor resource occupation time of the I/O threads exceeds a second threshold, letting each I/O thread sleep for a set duration. The second threshold may be set according to actual requirements, which this embodiment does not limit.
As described above, a compute thread can raise the priority of the I/O threads only when the application initiates an I/O-related instruction. Between two I/O-related instructions, the I/O threads may need to complete a large number of I/O requests, and during that period they need to occupy processor resources for a long time in order to finish the I/O efficiently and limit the performance impact on the numerical computation. However, when compute threads and I/O threads compete for processor resources, an I/O thread cannot occupy them for long: its scheduling priority drops as its occupation time grows, so it may be switched out before completing all I/O requests, and after it is scheduled back in, the synchronization waiting overhead among the I/O threads may become large. For this situation, letting the I/O threads sleep for a set duration raises their priority, relative to the compute threads, at the moment they automatically wake, keeping the synchronization waiting overhead among the I/O threads as small as possible.
In this embodiment, whether an I/O thread needs to sleep is decided by dynamically monitoring whether its processor resource occupation time exceeds the second threshold, whereupon the thread starts an autonomous sleep. The aforementioned dynamic monitoring of the processor resource occupation time of an I/O thread may further include:
monitoring the processor resource occupation time of the I/O thread when it completes an I/O request, the occupation time being the difference between an end time and a start time, where the start time is the moment the thread occupies processor resources after being reawakened at the end of its previous sleep and the end time is the moment the I/O request is completed;
correspondingly, the sleeping of each I/O thread for the set duration includes: after completing the current I/O request and before fetching the next one, the I/O thread autonomously sleeps for the set duration.
It should be appreciated that if the thread's processor occupation time after completing a first I/O request has not reached the second threshold, no autonomous sleep is performed; the judgment is made again after a second I/O request completes, and so on. Once autonomous sleep is decided on, the I/O thread sleeps for the set duration after completing the current I/O request and before fetching the next one. The second threshold and the set duration may be preset values or dynamically adjusted values. For example, if testing shows that this autonomous sleep scheme performs poorly (e.g., the synchronization waiting overhead is still large), the second threshold or the set duration can be reduced to ensure that the I/O thread's dynamic priority at the end of its sleep is higher than that of the compute threads.
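The monitor-then-sleep loop above can be sketched as follows (illustrative; the clock and sleep functions are injected so the policy can be exercised with a fake clock, and all names are assumptions):

```python
def io_thread_loop(requests, clock, sleep_fn, second_threshold, sleep_duration):
    """Sketch of the monitored I/O-thread loop: track occupation time
    since the last (re)wake-up and sleep autonomously once it exceeds
    the second threshold. Returns the requests after which the thread
    slept."""
    slept = []
    start = clock()               # occupation starts at (re)wake-up
    for req in requests:
        req()                     # complete the current I/O request
        occupied = clock() - start
        if occupied > second_threshold:
            # sleep after finishing this request, before fetching the next
            sleep_fn(sleep_duration)
            slept.append(req)
            start = clock()       # occupation restarts after waking
    return slept
```

Injecting `clock` and `sleep_fn` keeps the policy separate from the timing machinery, so the sleep decision itself can be verified deterministically.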
In some implementations, the method of this embodiment further includes:
when the application program issues an instruction to wait for asynchronous I/O requests to complete, any outstanding I/O requests need to be completed as soon as possible; the I/O threads therefore do not sleep autonomously while responding to the I/O requests the application is waiting on.
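The suppression of autonomous sleep while the application waits on outstanding requests can be sketched with a simple flag. The names `flush_in_progress` and `maybe_autonomous_sleep` are assumptions for illustration only.

```python
import threading
import time

# Set while the application is waiting for all asynchronous I/O requests to complete.
flush_in_progress = threading.Event()

def maybe_autonomous_sleep(occupation: float, threshold: float, duration: float) -> bool:
    """Sleep only if the occupation time exceeded the threshold and no flush is pending."""
    if flush_in_progress.is_set():
        return False               # finish outstanding requests as soon as possible
    if occupation > threshold:
        time.sleep(duration)       # autonomous sleep for the set duration
        return True
    return False
```

The waiting instruction would set the flag on entry and clear it once all pending requests have completed.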
For the situation where I/O threads and compute threads must run simultaneously, the method of this embodiment uses existing operating-system facilities to control I/O-thread processor resource scheduling so that, as far as possible, all I/O threads acquire processor resources almost simultaneously, completing I/O requests efficiently while minimizing the impact of the I/O threads on the compute threads.
Example two
As shown in fig. 2, the asynchronous I/O thread processor resource scheduling control apparatus provided in this embodiment includes:
an obtaining module 201, configured to, in response to starting of an I/O thread by an application program, obtain a processor resource configuration mode for the I/O thread, where the processor resource configuration mode includes one of a fixed configuration mode, a contention-free dynamic configuration mode, and a competitive dynamic configuration mode;
the scheduling module 202 is configured to schedule the I/O thread to occupy the processor resource based on the processor resource configuration mode.
When the processor resource configuration mode is the competitive dynamic configuration mode, the scheduling module monitors in real time the synchronization waiting overhead of all I/O threads working cooperatively and schedules the I/O threads to occupy processor resources in combination with that overhead; when the mode is the contention-free or competitive dynamic configuration mode, each I/O thread is allowed to run on a plurality of processor cores.
The fixed configuration mode may refer to scheduling the I/O threads onto a plurality of designated processor cores. The contention-free dynamic configuration mode may refer to the case where I/O threads need not compete with compute threads for processor resources, e.g., because processor cores have been reserved on the compute node to run the I/O threads; that is, the total number of compute threads and I/O threads does not exceed the total number of processor cores. The competitive dynamic configuration mode may refer to the case where I/O threads must compete with compute threads for processor resources, i.e., the total number of compute threads and I/O threads exceeds the total number of processor cores. When the mode is the contention-free or competitive dynamic configuration mode, each I/O thread is allowed to run on a plurality of processor cores; when the mode is the fixed configuration mode, each I/O thread is pinned to run on the designated processor cores.
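The distinction between the two dynamic modes reduces to comparing the thread count against the core count. A sketch (the function name and the returned mode labels are assumptions):

```python
def choose_dynamic_mode(num_compute_threads: int, num_io_threads: int, num_cores: int) -> str:
    """Contention-free when every thread can have its own core; competitive otherwise."""
    if num_compute_threads + num_io_threads <= num_cores:
        return "contention-free dynamic"
    return "competitive dynamic"
```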
In this embodiment, both allowing each I/O thread to run on multiple processor cores and pinning each I/O thread to specific processor cores may be implemented with the affinity-setting commands of the operating system.
Specifically, after an I/O thread is created by a compute thread of the application program, the affinity of the I/O thread is set to a plurality of processor cores using the corresponding operating-system command; in the competitive dynamic configuration mode, the affinity of the application's compute threads is likewise set to a plurality of processor cores. The plurality of processor cores may be all processor cores on the compute node or a subset of them.
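On Linux, such affinity setting is available through the `sched_setaffinity` system call, exposed in Python as `os.sched_setaffinity`. A sketch, assuming a Linux host (the helper name `set_thread_affinity` is an assumption; other operating systems may not expose these calls):

```python
import os

def set_thread_affinity(cores) -> None:
    """Allow the calling thread to run on exactly the given set of cores (Linux only)."""
    os.sched_setaffinity(0, cores)  # pid 0 = the calling thread

if hasattr(os, "sched_setaffinity"):       # guard: Linux-specific API
    all_cores = os.sched_getaffinity(0)
    set_thread_affinity(all_cores)         # dynamic modes: any available core
    # set_thread_affinity({0})             # fixed mode: pin to designated core(s) only
```

From the shell, the equivalent pinning can be done with `taskset -c 0,1 <command>`.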
When the processor resource configuration mode is the competitive dynamic configuration mode, the synchronization waiting overhead of all cooperating I/O threads is monitored in real time, and the I/O threads are scheduled to occupy processor resources in combination with that overhead. Scheduling can be influenced by adjusting operating-system settings on the compute node that affect how the I/O threads are scheduled.
When the processor resource configuration mode is the fixed configuration mode, the numbers or keywords of the processor cores designated by that mode are obtained, the I/O threads are scheduled onto processor resources based on those numbers or keywords, and each I/O thread is pinned to the designated cores using the scheduling-control commands provided by the operating system. The fixed configuration mode suits both the case where dedicated processor cores exist for the I/O threads and the case where general-purpose cores are reserved for them.
When the processor resource configuration mode is the contention-free dynamic configuration mode, the total number of compute threads and I/O threads does not exceed the total number of processor cores, and both kinds of thread can run on a plurality of cores. This configuration therefore allows the I/O threads to run on multiple processor cores without additional operations or settings.
In some implementations, monitoring in real time the synchronization waiting overhead of all cooperating I/O threads may include: when all I/O threads synchronize to cooperatively take out the same I/O request, the threads exchange the physical times at which they started synchronizing, and the difference between the latest and the earliest of those physical times is determined to be the synchronization waiting overhead of the cooperating I/O threads.
In this monitoring mode, when all I/O threads (each belonging to a process) synchronize, they exchange the physical times at which synchronization started (the physical times across the compute nodes of a high-performance computer are nearly identical; even where they differ, they can be corrected), and the difference between the latest and earliest physical times is the synchronization waiting overhead. The I/O threads can then be scheduled to occupy processor resources according to the overhead monitored in real time.
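The measurement above can be sketched with threads recording their arrival times at a synchronization point; the spread between the latest and earliest arrival is the overhead. The function and variable names are assumptions, and a `threading.Barrier` stands in for the I/O threads' synchronization (the "exchange" of times happens via a shared list rather than messages):

```python
import threading
import time

def measure_sync_wait_overhead(num_threads: int, work) -> float:
    """Each thread records the physical time at which it starts synchronizing;
    the synchronization waiting overhead is the spread between the latest and
    the earliest of those times."""
    arrival_times = [0.0] * num_threads
    barrier = threading.Barrier(num_threads)     # stands in for the cooperative sync

    def io_thread(rank: int) -> None:
        work(rank)                               # per-thread work before the sync point
        arrival_times[rank] = time.monotonic()   # physical time at start of synchronization
        barrier.wait()

    threads = [threading.Thread(target=io_thread, args=(r,)) for r in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return max(arrival_times) - min(arrival_times)
```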
In some cases, scheduling the I/O threads to occupy processor resources in combination with the synchronization waiting overhead may include:
when responding to an I/O-related instruction initiated by the application program, if the synchronization waiting overhead exceeds a first threshold, the synchronization overhead is considered too large and the priority of all I/O threads is raised. The first threshold may be a preset value, or may be determined by automatic sampling when the application starts the asynchronous parallel I/O system.
Further, when the first threshold is determined by automatic sampling at the time the application starts the asynchronous parallel I/O system, the method of this embodiment may further include:
when the application program starts the asynchronous parallel I/O system, all compute threads sleep to release processor resources; once all I/O threads have acquired processor resources, the synchronization waiting overhead of synchronizing all I/O threads is automatically tested and sampled, and the value obtained by amplifying that overhead by a preset multiple serves as the first threshold. The preset multiple may be set according to actual requirements, which this embodiment does not limit.
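The threshold determination reduces to sampling a baseline overhead while the I/O threads hold the processor and amplifying it by the preset multiple. A sketch (`sample_first_threshold` and its default values are assumptions; `measure_overhead` is any callable returning one sampled overhead):

```python
import statistics

def sample_first_threshold(measure_overhead, samples: int = 5, factor: float = 3.0) -> float:
    """Sample the synchronization waiting overhead several times at startup and
    amplify the mean by a preset multiple to obtain the first threshold."""
    baseline = statistics.mean(measure_overhead() for _ in range(samples))
    return baseline * factor
```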
In other cases, scheduling the I/O threads to occupy processor resources in combination with the synchronization waiting overhead may further include:
dynamically monitoring the processor resource occupation time of the I/O threads; when the processor resource occupation time of an I/O thread exceeds a second threshold, each I/O thread sleeps for a set duration. The second threshold may be set according to actual requirements, which this embodiment does not limit.
As described above, a compute thread raises the priority of the I/O threads only when the application issues an I/O-related instruction. Between two I/O-related instructions, however, the I/O threads may need to complete a large number of I/O requests, and must then occupy processor resources for a long time to complete the I/O efficiently and limit the performance impact on the numerical computation. When compute threads and I/O threads compete for processor resources, an I/O thread cannot occupy them indefinitely: its scheduling priority decreases as its occupation time grows, so it may be switched out before completing all its I/O requests, and once scheduled back in, the synchronization waiting overhead between the I/O threads may become large. In this situation, an I/O thread can autonomously sleep for a certain time, which raises its priority relative to the compute threads when it automatically wakes, thereby reducing the synchronization waiting overhead among I/O threads as much as possible.
In this embodiment, whether an I/O thread needs to sleep is determined by dynamically monitoring whether its processor resource occupation time exceeds the second threshold; when it does, the I/O thread starts autonomous sleep. The aforementioned dynamic monitoring of the processor resource occupation time of an I/O thread may further include:
monitoring the processor resource occupation time of an I/O thread each time it completes an I/O request, where the processor resource occupation time is the difference between an end time and a start time: the start time is the moment the I/O thread begins occupying the processor resource after being awakened from its previous sleep, and the end time is the moment the I/O request is completed;
correspondingly, each I/O thread sleeping for the set duration includes: after the I/O thread completes the current I/O request and before it takes out the next I/O request, the I/O thread starts autonomous sleep for the set duration.
It should be appreciated that if the processor occupation time of an I/O thread after completing a first I/O request does not reach the second threshold, no autonomous sleep is performed; the determination is made again after a second I/O request completes, and so on. Once autonomous sleep is warranted, the I/O thread starts it after completing the current I/O request and before taking out the next one, sleeping for the set duration. The second threshold and the set duration may be preset values or dynamically changing values. For example, if testing shows that this autonomous-sleep method performs poorly (e.g., the synchronization waiting overhead is still large), the second threshold or the set duration can be decreased to ensure that the dynamic priority of an I/O thread at the end of its sleep is higher than that of the compute threads.
In some implementations, the scheduling module 202 is further configured to: when the application program issues an instruction to wait for asynchronous I/O requests to complete, any outstanding I/O requests need to be completed as soon as possible, and the I/O threads do not sleep autonomously while responding to the I/O requests the application is waiting on.
For the situation where I/O threads and compute threads must run simultaneously, the device of this embodiment uses existing operating-system facilities to control I/O-thread processor resource scheduling so that, as far as possible, all I/O threads acquire processor resources almost simultaneously, completing I/O requests efficiently while minimizing the impact of the I/O threads on the compute threads.
Example three
This embodiment provides a computer storage medium having a computer program stored thereon which, when executed by one or more processors, implements the methods of the preceding embodiments.
The computer-readable storage medium may be implemented by any type of volatile or non-volatile memory device or combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disk.
Example four
The present embodiment provides a computer device comprising a memory and one or more processors, the memory having stored thereon a computer program that, when executed by the one or more processors, implements the methods of the preceding embodiments.
The processor may be an application-specific integrated circuit (ASIC), a digital signal processor (DSP), a digital signal processing device (DSPD), a programmable logic device (PLD), a field-programmable gate array (FPGA), a controller, a microcontroller (MCU), a microprocessor, or another electronic component, configured to perform the method of the above embodiments.
In the embodiments provided in the present invention, it should be understood that the disclosed system and method can be implemented in other ways. The system and method embodiments described above are merely illustrative.
It should be noted that, in this document, the terms "first", "second", and the like in the description, claims, and drawings of the present application are used to distinguish similar objects and are not necessarily used to describe a particular order or sequence. The terms "comprises", "comprising", or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element introduced by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, method, article, or apparatus that comprises that element.
Although the embodiments of the present invention have been described above, the above descriptions are only for the convenience of understanding the present invention, and are not intended to limit the present invention. It will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (13)

1. A method for controlling resource scheduling of an asynchronous I/O thread processor is characterized by comprising the following steps:
responding to the starting of an application program to an I/O thread, and acquiring a processor resource configuration mode of the I/O thread, wherein the processor resource configuration mode comprises one of a fixed configuration mode, a non-competitive dynamic configuration mode and a competitive dynamic configuration mode;
scheduling the I/O thread to occupy the processor resource based on the processor resource configuration mode;
under the condition that the processor resource configuration mode is a competitive dynamic configuration mode, monitoring synchronous waiting expenses of all I/O threads in cooperative work in real time, and scheduling the I/O threads to occupy processor resources by combining the synchronous waiting expenses; and allowing each I/O thread to run on a plurality of processor cores under the condition that the processor resource configuration mode is a contention-free dynamic configuration mode or a contention dynamic configuration mode.
2. The method according to claim 1, wherein when the processor resource configuration mode is the fixed configuration mode, numbers or keywords of the plurality of processor cores designated by the fixed configuration mode are acquired, and each I/O thread is fixed to run on the plurality of processor cores based on the numbers or keywords.
3. The asynchronous I/O threaded processor resource scheduling control method of claim 1, wherein the real-time monitoring of the synchronous latency overhead of all I/O threads working in conjunction comprises:
when all I/O threads are synchronized for cooperatively taking out the same I/O request, the physical time for starting synchronization is exchanged among the I/O threads, and the difference between the latest physical time and the earliest physical time in the physical time corresponding to each I/O thread is determined as the synchronous waiting overhead when all the I/O threads are cooperatively operated.
4. The method of claim 1, wherein the scheduling of I/O threads to occupy processor resources in combination with the synchronous waiting overhead comprises:
and when responding to an I/O related instruction initiated by an application program, if the synchronous waiting cost exceeds a first threshold value, the priority of all I/O threads is increased.
5. The asynchronous I/O thread processor resource scheduling control method of claim 4, wherein the scheduling of I/O threads to occupy processor resources in combination with the synchronous waiting overhead further comprises:
dynamically monitoring the processor resource occupation time of the I/O thread;
and under the condition that the processor resource occupation time of the I/O threads exceeds a second threshold value, each I/O thread sleeps for a set time length.
6. The asynchronous I/O threaded processor resource scheduling control method of claim 4 wherein the first threshold is a preset value or is determined by auto-sampling when an application starts the asynchronous parallel I/O system.
7. The asynchronous I/O threaded processor resource scheduling control method of claim 6, further comprising:
when an application program starts an asynchronous parallel I/O system, all computing threads sleep to release processor resources;
under the condition that all I/O threads acquire processor resources, the synchronous waiting overhead of synchronizing all I/O threads is automatically tested and sampled, and the value obtained by amplifying the synchronous waiting overhead by a preset multiple is used as the first threshold.
8. The asynchronous I/O thread processor resource scheduling control method of claim 5, wherein said dynamically monitoring processor resource occupancy time of I/O threads comprises:
monitoring the processor resource occupation time of an I/O thread each time it completes an I/O request, wherein the processor resource occupation time is the difference between an end time and a start time, the start time being the moment the I/O thread occupies the processor resource after being awakened from its previous dormancy, and the end time being the moment the I/O request is completed;
wherein each I/O thread sleeping for the set duration comprises: after the I/O thread completes the current I/O request and before taking out the next I/O request, the I/O thread starts autonomous dormancy for the set duration.
9. The asynchronous I/O threaded processor resource scheduling control method according to claim 1 or 2, wherein the allowing of each I/O thread to run on a plurality of processor cores and the pinning of each I/O thread to run on the plurality of processor cores is performed by an affinity setting command of an operating system.
10. The asynchronous I/O threaded processor resource scheduling control method of claim 9 wherein, when an application initiates an instruction waiting for completion of an asynchronous I/O request, if there are I/O requests that have not yet been completed, the I/O thread does not go to autonomous sleep in responding to all I/O requests that are waiting for completion by the application.
11. An asynchronous I/O thread processor resource scheduling control apparatus, comprising:
the acquisition module is used for responding to the starting of an application program on the I/O thread and acquiring a processor resource configuration mode of the I/O thread, wherein the processor resource configuration mode comprises one of a fixed configuration mode, a non-competitive dynamic configuration mode and a competitive dynamic configuration mode;
the scheduling module is used for scheduling the I/O thread to occupy the processor resource based on the processor resource configuration mode;
under the condition that the processor resource allocation mode is a competitive dynamic allocation mode, monitoring the synchronous waiting expense of all I/O threads in cooperative work in real time, and scheduling the I/O threads to occupy the processor resources by combining the synchronous waiting expense; and under the condition that the processor resource configuration mode is a contention-free dynamic configuration mode or a contention dynamic configuration mode, allowing each I/O thread to run on a plurality of processor cores.
12. A computer storage medium, characterized in that a computer program is stored on the computer-readable storage medium, which computer program, when executed by one or more processors, carries out the method of any one of claims 1 to 10.
13. A computer device comprising one or more processors and memory having stored thereon a computer program that, when executed by the one or more processors, performs the method of any of claims 1-10.
CN202310030841.6A 2023-01-10 2023-01-10 Asynchronous I/O thread processor resource scheduling control method, device, medium and equipment Active CN115718665B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310030841.6A CN115718665B (en) 2023-01-10 2023-01-10 Asynchronous I/O thread processor resource scheduling control method, device, medium and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310030841.6A CN115718665B (en) 2023-01-10 2023-01-10 Asynchronous I/O thread processor resource scheduling control method, device, medium and equipment

Publications (2)

Publication Number Publication Date
CN115718665A true CN115718665A (en) 2023-02-28
CN115718665B CN115718665B (en) 2023-06-13

Family

ID=85257946

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310030841.6A Active CN115718665B (en) 2023-01-10 2023-01-10 Asynchronous I/O thread processor resource scheduling control method, device, medium and equipment

Country Status (1)

Country Link
CN (1) CN115718665B (en)

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050144332A1 (en) * 2003-12-12 2005-06-30 International Business Machines Corporation Autonomic input/output scheduler selector
CN101246437A (en) * 2008-01-28 2008-08-20 中兴通讯股份有限公司 Built-in real-time system course equalization scheduling method
US20090158299A1 (en) * 2007-10-31 2009-06-18 Carter Ernst B System for and method of uniform synchronization between multiple kernels running on single computer systems with multiple CPUs installed
CN101556545A (en) * 2009-05-22 2009-10-14 北京星网锐捷网络技术有限公司 Method for realizing process support, device and multithreading system
CN103279391A (en) * 2013-06-09 2013-09-04 浪潮电子信息产业股份有限公司 Load balancing optimization method based on CPU (central processing unit) and MIC (many integrated core) framework processor cooperative computing
US20150339158A1 (en) * 2014-05-22 2015-11-26 Oracle International Corporation Dynamic Co-Scheduling of Hardware Contexts for Parallel Runtime Systems on Shared Machines
CN109426556A (en) * 2017-08-31 2019-03-05 大唐移动通信设备有限公司 A kind of process scheduling method and device
CN109992366A (en) * 2017-12-29 2019-07-09 华为技术有限公司 Method for scheduling task and dispatching device
US20190258514A1 (en) * 2016-11-02 2019-08-22 Huawei Technologies Co., Ltd. I/O Request Scheduling Method and Apparatus
CN112579277A (en) * 2020-12-24 2021-03-30 海光信息技术股份有限公司 Central processing unit, method, device and storage medium for simultaneous multithreading
CN114048026A (en) * 2021-10-27 2022-02-15 北京航空航天大学 GPU resource dynamic allocation method under multitask concurrency condition
CN114385227A (en) * 2022-01-17 2022-04-22 中国农业银行股份有限公司 Service processing method, device, equipment and storage medium
CN115061730A (en) * 2022-07-06 2022-09-16 中银金融科技有限公司 Thread concurrent management method and device
CN115328564A (en) * 2022-10-17 2022-11-11 北京卡普拉科技有限公司 Asynchronous input output thread processor resource allocation method and device
CN115328662A (en) * 2022-09-13 2022-11-11 国网智能电网研究院有限公司 Process thread resource management control method and system
CN115562838A (en) * 2022-10-27 2023-01-03 Oppo广东移动通信有限公司 Resource scheduling method and device, computer equipment and storage medium


Also Published As

Publication number Publication date
CN115718665B (en) 2023-06-13

Similar Documents

Publication Publication Date Title
US11507420B2 (en) Systems and methods for scheduling tasks using sliding time windows
US9244733B2 (en) Apparatus and method for scheduling kernel execution order
EP3436944B1 (en) Fast transfer of workload between multiple processors
CN109783157B (en) Method and related device for loading algorithm program
US9229765B2 (en) Guarantee real time processing of soft real-time operating system by instructing core to enter a waiting period prior to transferring a high priority task
US20140359636A1 (en) Multi-core system performing packet processing with context switching
CN110300959B (en) Method, system, device, apparatus and medium for dynamic runtime task management
US20190044883A1 (en) NETWORK COMMUNICATION PRIORITIZATION BASED on AWARENESS of CRITICAL PATH of a JOB
JP6464982B2 (en) Parallelization method, parallelization tool, in-vehicle device
CN111459622B (en) Method, device, computer equipment and storage medium for scheduling virtual CPU
EP2282265A1 (en) A hardware task scheduler
CN115061803A (en) Multi-core processing system and task scheduling method, chip and storage medium thereof
CN112925616A (en) Task allocation method and device, storage medium and electronic equipment
JP5726006B2 (en) Task and resource scheduling apparatus and method, and control apparatus
CN109766168B (en) Task scheduling method and device, storage medium and computing equipment
CN115718665A (en) Asynchronous I/O thread processor resource scheduling control method, device, medium and equipment
US20230067432A1 (en) Task allocation method, apparatus, electronic device, and computer-readable storage medium
US20050066093A1 (en) Real-time processor system and control method
CN109189581B (en) Job scheduling method and device
CN115309507B (en) CPU resource occupancy rate calculation method, device, equipment and medium
CN116244073A (en) Resource-aware task allocation method for hybrid key partition real-time operating system
JP6617511B2 (en) Parallelization method, parallelization tool, in-vehicle device
JPS6368934A (en) Task scheduing system
CN111831390B (en) Resource management method and device of server and server
CN116225673A (en) Task processing method and device based on many-core chip, processing core and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant