CN115718665B - Asynchronous I/O thread processor resource scheduling control method, device, medium and equipment - Google Patents

Asynchronous I/O thread processor resource scheduling control method, device, medium and equipment

Info

Publication number: CN115718665B
Application number: CN202310030841.6A
Authority: CN (China)
Other versions: CN115718665A (Chinese, zh)
Inventors: 李锐喆, 赵彤
Assignee: Beijing Carpura Technology Co ltd
Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)

Classifications

    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention relates to the technical field of asynchronous I/O, and provides a processor resource scheduling control method, device, medium and equipment for asynchronous I/O threads. The method comprises the following steps: in response to an application program starting I/O threads, acquiring the processor resource configuration mode of the I/O threads, and scheduling the I/O threads to occupy processor resources based on that mode; in the contention-based dynamic configuration mode, monitoring in real time the synchronization waiting overhead incurred when all I/O threads work cooperatively, and scheduling the I/O threads to occupy processor resources in combination with that overhead; and in either the contention-free or the contention-based dynamic configuration mode, allowing each I/O thread to run on multiple processor cores. For the case where I/O threads and computing threads must run simultaneously, control of I/O thread processor resource scheduling is realized on the basis of existing operating-system mechanisms, so that all I/O threads can acquire processor resources almost simultaneously as far as possible, I/O requests are completed efficiently, and the influence of the I/O threads on the computing threads is reduced as much as possible.

Description

Asynchronous I/O thread processor resource scheduling control method, device, medium and equipment
Technical Field
The present invention relates to the field of asynchronous I/O technologies, and in particular to a method, apparatus, medium, and device for controlling processor resource scheduling of asynchronous I/O threads.
Background
With the continuous growth of computing power and application demands, the amount of data computation handled by application programs keeps increasing, and so does the amount of data Input and Output (I/O) to and from the file system. Because I/O speed grows far more slowly than computing power, data I/O is increasingly a performance bottleneck for the running speed of most application programs. To reduce the impact of I/O overhead, parallel I/O techniques and asynchronous I/O techniques have evolved.
Parallel I/O technology exploits the parallel capability of the computer: the application's data is distributed across multiple processes, which read and write file data in a coordinated way, thereby improving I/O speed. Asynchronous I/O technology employs dedicated processes/threads, distinct from the computing processes/threads, to complete data I/O operations, so that the application can continue with other computations without waiting for the I/O to finish, thereby overlapping the application's computation with its data I/O processing. In addition, the related art includes parallel asynchronous I/O systems that combine both techniques to minimize the impact of I/O overhead on the application's running speed.
Asynchronous I/O technology has two main implementation approaches:
one approach is to use a dedicated I/O process. Each I/O process is an independent process other than the original computing process of the application program, and when the application program is started, additional independent computing resources are generally required to be allocated to each I/O process. For example, an application would originally employ 1024 processes/processor cores for parallel computation, and if 16I/O processes were to be used, then 1024+16=1040 processes/processor cores would need to be applied at the time of application submission.
Another approach is to use dedicated I/O threads. Each I/O thread is automatically created by a computing thread of an application program and then runs on a computing node of the corresponding computing thread.
Through the applicant's analysis, an I/O thread sharing a compute node with computing threads may face the following situations:
1) The processor of the compute node has processor cores dedicated to running I/O threads. I/O threads mainly perform communication and I/O rather than real computation, and what computation they do perform is mostly integer arithmetic, so a processor core of simplified design is sufficient to support them. With the rapid development of processor architectures, I/O processor cores dedicated to running I/O threads may appear in the future.
2) The compute node has no processor core dedicated to running I/O threads, but when the application starts running, a number of processor cores is reserved on the compute node for running I/O threads; that is, the total number of computing threads and I/O threads does not exceed the total number of processor cores. For example, on a compute node with 64 processor cores in total, only 62 computing threads are run, reserving 2 processor cores available for running I/O threads.
3) The total number of compute threads and I/O threads exceeds the total number of cores of the processor. For example, on a compute node with a total number of cores of 64, a total of 64 compute threads are running, and newly started I/O threads need to compete with the compute threads for use of the processor cores.
In the case where I/O threads and computing threads compete for processor cores, the operating system alternately schedules them onto the cores based on a time-sharing mechanism. Although the related art proposes a sleep mechanism that puts I/O threads to sleep when no I/O request is present in order to reduce contention, there are still cases where computing threads and I/O threads must run simultaneously. The applicant has found through testing that competitive scheduling of processor resources still occurs and that, with parallel asynchronous I/O in use, the speed of both computation and I/O may be significantly reduced. The I/O threads that cooperate to accomplish asynchronous parallel I/O are typically distributed among different compute nodes, and they complete each I/O request cooperatively through frequent global communication. If the I/O threads could be paced in lockstep, namely: whenever an I/O request exists, all I/O threads acquire processor resources almost simultaneously, then efficient completion of the I/O request could be ensured and the influence of the I/O threads on the computing threads further reduced. However, the prior art does not enable unified scheduling of I/O threads on different compute nodes.
Disclosure of Invention
In order to realize unified scheduling of I/O threads on different computing nodes, the invention provides a resource scheduling control method, a device, a medium and equipment of an asynchronous I/O thread processor.
In a first aspect, an embodiment of the present invention provides a method for controlling resource scheduling of an asynchronous I/O thread processor, including:
in response to an application program starting I/O threads, acquiring a processor resource configuration mode of the I/O threads, wherein the processor resource configuration mode comprises one of a fixed configuration mode, a contention-free dynamic configuration mode, and a contention-based dynamic configuration mode;
scheduling the I/O threads to occupy processor resources based on the processor resource allocation mode;
in the case that the processor resource allocation mode is the contention-based dynamic allocation mode, monitoring in real time the synchronization waiting overhead when all I/O threads work cooperatively, and scheduling the I/O threads to occupy processor resources in combination with the synchronization waiting overhead; and in the case that the processor resource allocation mode is the contention-free or the contention-based dynamic allocation mode, allowing each I/O thread to run on a plurality of processor cores.
In some implementations, in a case that the processor resource allocation manner is a fixed allocation manner, numbers or keywords of a plurality of processor cores specified for the fixed allocation manner are acquired, and each I/O thread is fixed to run on the plurality of processor cores based on the numbers or the keywords.
In some implementations, the real-time monitoring of the synchronous wait overhead when all I/O threads work in concert includes:
when all I/O threads are synchronized for cooperatively taking out the same I/O request, the I/O threads exchange each other to start the physical time of synchronization, and the difference value between the latest and earliest physical time in the physical time corresponding to each I/O thread is determined as the synchronization waiting overhead when all I/O threads cooperatively work.
In some implementations, the scheduling of the I/O threads to occupy processor resources in combination with the synchronization waiting overhead includes:
and when responding to the I/O related instruction initiated by the application program, if the synchronous waiting overhead exceeds a first threshold, the priority of all I/O threads is improved.
In some implementations, the scheduling of the I/O threads to occupy processor resources in combination with the synchronization waiting overhead further comprises:
dynamically monitoring processor resource occupation time of an I/O thread;
and under the condition that the processor resource occupation time of the I/O threads exceeds a second threshold value, each I/O thread sleeps for a set duration.
In some implementations, the first threshold is a preset value, or the first threshold is determined by automatic sampling when an application starts an asynchronous parallel I/O system.
In some implementations, the method further comprises:
when an application program starts an asynchronous parallel I/O system, all computing threads sleep to release processor resources;
in the case that all I/O threads have acquired processor resources, automatically testing and sampling the synchronization waiting overhead of a synchronization among all the I/O threads, and taking the sampled overhead amplified by a preset multiple as the first threshold.
In some implementations, the dynamically monitoring processor resource occupancy time of an I/O thread includes:
monitoring the processor resource occupation time of an I/O thread when an I/O request is completed, wherein the processor resource occupation time is the difference between an end time and a start time, the start time being the time at which the I/O thread was last woken up after its previous sleep ended, and the end time being the time at which the I/O request is completed;
the sleeping of each I/O thread for the set duration comprises: after the I/O thread completes the current I/O request and before the next I/O request is fetched, the I/O thread starts autonomous sleep for the set duration.
In some implementations, allowing or fixing the I/O threads to run on the plurality of processor cores is accomplished through an affinity-setting command of the operating system.
In some implementations, when the application program initiates an instruction to wait for asynchronous I/O requests to complete, if there are outstanding I/O requests, the I/O threads do not sleep autonomously while responding to all the I/O requests whose completion the application program is waiting for.
In a second aspect, an embodiment of the present invention provides an asynchronous I/O thread processor resource scheduling control apparatus, including:
the system comprises an acquisition module, a control module and a control module, wherein the acquisition module is used for responding to the starting of an application program to an I/O thread and acquiring a processor resource configuration mode of the I/O thread, wherein the processor resource configuration mode comprises one of a fixed configuration mode, a contention-free dynamic configuration mode and a contention-free dynamic configuration mode;
the scheduling module is used for scheduling the I/O threads to occupy the processor resources based on the processor resource allocation mode;
in the case that the processor resource allocation mode is the contention-based dynamic allocation mode, the synchronization waiting overhead when all I/O threads work cooperatively is monitored in real time, and the I/O threads are scheduled to occupy processor resources in combination with the synchronization waiting overhead; and in the case that the processor resource allocation mode is the contention-free or the contention-based dynamic allocation mode, each I/O thread is allowed to run on a plurality of processor cores.
In a third aspect, embodiments of the present invention provide a computer storage medium having a computer program stored thereon, which when executed by one or more processors, implements a method as described in the first aspect.
In a fourth aspect, an embodiment of the present invention provides a computer device comprising a memory and one or more processors, the memory having stored thereon a computer program which, when executed by the one or more processors, implements a method as described in the first aspect.
The invention has at least the following beneficial effects:
For the case where I/O threads and computing threads must run simultaneously, control of I/O thread processor resource scheduling is realized on the basis of existing operating-system mechanisms, so that all I/O threads can acquire processor resources almost simultaneously as far as possible, I/O requests are completed efficiently, and the influence of the I/O threads on the computing threads is reduced as much as possible.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the embodiments will be briefly described below, it being understood that the following drawings only illustrate certain embodiments of the present invention and therefore should not be considered as limiting the scope.
FIG. 1 is a flow chart of a method for controlling resource scheduling of an asynchronous I/O thread processor according to an embodiment of the present invention;
FIG. 2 is a block diagram of an asynchronous I/O thread processor resource scheduling control device provided by an embodiment of the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. The components of the embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the invention, as presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be made by a person skilled in the art without making any inventive effort, are intended to be within the scope of the present invention.
Example 1
FIG. 1 is a flowchart of an asynchronous I/O thread processor resource scheduling control method, as shown in FIG. 1, the asynchronous I/O thread processor resource scheduling control method of the present embodiment includes the following steps:
step S101, responding to the starting of an application program to an I/O thread, and acquiring a processor resource allocation mode of the I/O thread.
The processor resource allocation mode comprises one of a fixed allocation mode, a contention-free dynamic allocation mode and a competitive dynamic allocation mode.
The fixed configuration mode may refer to scheduling I/O threads onto processor cores with specified numbers; the contention-free dynamic configuration mode may refer to the case where I/O threads do not need to compete with computing threads for processor resources, e.g., where a number of processor cores has been reserved on the compute node for running I/O threads, i.e., the total number of computing threads and I/O threads does not exceed the total number of processor cores; the contention-based dynamic configuration mode may refer to the case where I/O threads need to compete with computing threads for processor resources, i.e., the total number of computing threads and I/O threads exceeds the total number of processor cores. In the contention-free or contention-based dynamic configuration mode, each I/O thread is allowed to run on a plurality of processor cores. In the fixed configuration mode, each I/O thread is fixed to run on the specified processor cores.
On which processor cores a thread may run can be set through an affinity setting command of the operating system. When an I/O thread is a child thread created by a computing thread, it will typically inherit the computing thread's processor core affinity.
In this embodiment, both allowing each I/O thread to run on multiple processor cores and fixing each I/O thread to particular processor cores are accomplished through the operating system's affinity setting command.
Specifically, after a computing thread of the application creates an I/O thread, the affinity of the I/O thread is set to a plurality of processor cores using the corresponding operating-system command; in the contention-based dynamic configuration mode, the affinity of the application's computing threads is likewise set to a plurality of processor cores. The plurality of processor cores may be all or part of the processor cores on the compute node.
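The patent describes the affinity setting only in prose. As a minimal illustrative sketch (not part of the patent), on Linux this interface is exposed in Python as os.sched_setaffinity; the function name pin_to_cores and its error handling are assumptions for illustration:

```python
import os

def pin_to_cores(cores):
    """Restrict the calling thread to the given set of processor cores.

    Uses the Linux-only os.sched_setaffinity call (pid 0 means the
    calling thread). In the scheme above, a computing thread would
    apply the equivalent setting to an I/O thread it has just created,
    or an I/O thread would apply it to itself.
    """
    allowed = os.sched_getaffinity(0)     # cores this thread may legally use
    target = set(cores) & allowed         # never request a forbidden core
    if not target:
        raise ValueError("none of the requested cores are available")
    os.sched_setaffinity(0, target)
    return os.sched_getaffinity(0)

# Allow the thread to run on every core it is currently entitled to;
# a contention-free setup would instead pass only the reserved cores.
print(pin_to_cores(os.sched_getaffinity(0)))
```

A child thread created after this call inherits the affinity, which is why the patent has the computing thread set affinity before or right after creating the I/O thread.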
Step S102, the I/O thread is scheduled to occupy the processor resource based on the processor resource allocation mode.
In the case that the processor resource allocation mode is the contention-based dynamic allocation mode, the synchronization waiting overhead of the cooperative work of all I/O threads is monitored, and the I/O threads are scheduled to occupy processor resources in combination with that overhead. By adjusting, on the compute nodes, settings that affect the operating system's scheduling of I/O threads, the scheduling of the I/O threads onto processor resources can be controlled.
In the case that the processor resource allocation mode is the fixed allocation mode, the numbers or keywords of the processor cores specified for the fixed allocation mode are acquired, the I/O threads are scheduled to occupy processor resources based on those numbers or keywords, and the scheduling control commands provided by the operating system are used to fix the I/O threads to run on those processor cores. The fixed allocation mode is suitable both for the case where processor cores dedicated to I/O threads exist and for the case where general-purpose processor cores are reserved for I/O threads.
In the case that the processor resource allocation mode is the contention-free dynamic allocation mode, the total number of computing threads and I/O threads does not exceed the total number of processor cores, and both may run on a plurality of processor cores. Thus, under this configuration nothing further needs to be done, beyond at most setting the I/O threads to be allowed to run on multiple processor cores.
When the asynchronous parallel I/O system works, all I/O threads first cooperatively fetch the same I/O request and then cooperatively complete the data input/output and related work of that request. Fetching an I/O request may involve synchronization among all I/O threads. The fetch process involves little computation, so in theory its completion time is very short. However, the fetch can be completed cooperatively only after every I/O thread has obtained processor resources. When some I/O threads have acquired processor resources while the rest are still waiting for them, the synchronization waiting overhead of the fetch process grows. Therefore, this embodiment monitors the actual synchronization waiting overhead of I/O request fetching in real time and regards it as the synchronization waiting overhead of the cooperative work of the I/O threads.
In some implementations, monitoring the synchronization waiting overhead while all I/O threads work cooperatively may include: when all I/O threads synchronize in order to cooperatively fetch the same I/O request, the I/O threads exchange with each other the physical time at which each started the synchronization, and the difference between the latest and the earliest of these physical times is determined as the synchronization waiting overhead of the cooperative work of all I/O threads.
In this monitoring scheme the I/O threads (each belonging to one process) exchange physical times during the synchronization; on a high-performance computer the physical clocks of the compute nodes are almost identical, and even where they differ they can be corrected. The difference between the latest and the earliest physical time is the synchronization waiting overhead, and the I/O threads can then be scheduled to occupy processor resources according to the overhead monitored in real time.
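The overhead computation itself is trivial once the timestamps have been exchanged. A minimal sketch, assuming the per-thread start times have already been gathered (e.g. by an all-gather during the synchronization, a mechanism the patent does not specify):

```python
def sync_wait_overhead(sync_start_times):
    """Synchronization waiting overhead of one cooperative fetch:
    the difference between the latest and the earliest physical time
    at which an I/O thread entered the synchronization. The list is
    assumed to hold one timestamp per I/O thread."""
    return max(sync_start_times) - min(sync_start_times)

# Hypothetical sample: four I/O threads entered the synchronization
# at these times (in milliseconds); the slowest arrived 3 ms late.
print(sync_wait_overhead([10000, 10001, 10003, 10002]))  # 3
```

If the last thread is still waiting for a processor core while the others have already entered the synchronization, this difference grows, which is exactly the signal the scheduling logic reacts to.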
When the synchronization waiting overhead is taken into account, it is generally necessary to judge its magnitude. One possible implementation is to judge against a first threshold: when the overhead obtained by actual monitoring is greater than the first threshold, the synchronization overhead is considered too large.
When scheduling the I/O threads to occupy processor resources in combination with the synchronization waiting overhead, the settings on a compute node that affect the operating system's scheduling of I/O threads include: doing nothing, raising the priority of the I/O threads, and having each I/O thread sleep for a set duration. When the actual synchronization waiting overhead is too large (exceeds the first threshold), the priority of the I/O threads is raised; when an I/O thread is found to occupy the processor for too long, each I/O thread sleeps for the set duration.
In some cases, scheduling I/O threads in conjunction with synchronous wait overhead to occupy processor resources may include:
and when responding to the I/O related instruction initiated by the application program, if the synchronization waiting cost exceeds a first threshold value and the synchronization cost is overlarge, the priority of all the I/O threads is improved. The first threshold value can be a preset value or can be determined through automatic sampling when an application program starts the asynchronous parallel I/O system.
Further, in the case where the aforementioned first threshold is determined by automatic sampling when the application program starts the asynchronous parallel I/O system, the method of this embodiment may further include:
when an application program starts an asynchronous parallel I/O system, all computing threads sleep to release processor resources; under the condition that all I/O threads acquire processor resources, automatically testing and sampling synchronous waiting cost of all the I/O threads for synchronization, and taking the value obtained after the synchronous waiting cost is amplified according to a preset multiple as a first threshold value. The preset multiple can be set according to actual requirements, which is not limited in this embodiment.
The operating system may adjust the priority of threads in real time during scheduling; for example, the priority of a thread that has not occupied the processor for a long period may be increased. Although a thread may not be able to raise its own scheduling priority, the computing thread can adjust the scheduling priority of the I/O thread, because a parent thread can change the priority of its child thread and the I/O thread is created by the computing thread (i.e., the computing thread is the parent of the I/O thread). The computing thread interacts with the I/O thread only when responding to an I/O related instruction initiated by the application, and that is when it can change the I/O thread's priority. Thus, when responding to an application-initiated I/O related instruction, if the computing thread finds that the actual synchronization waiting overhead of the I/O threads is too great, it raises the priority of the I/O thread so that it exceeds the priority of the computing thread.
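As an illustrative sketch of the parent-side check (not the patent's implementation), on Linux a thread's priority can be raised by lowering its nice value with setpriority. The names io_tid and boost are assumptions, and note that lowering a nice value below its current level normally requires CAP_SYS_NICE or a suitable RLIMIT_NICE, so the call may fail without privileges:

```python
import os

def maybe_raise_io_priority(sync_overhead, first_threshold, io_tid, boost=5):
    """Sketch of the computing (parent) thread's action while handling
    an application I/O instruction: if the monitored synchronization
    waiting overhead exceeds the first threshold, raise the I/O
    thread's priority by lowering its nice value. Returns True if a
    priority change was attempted."""
    if sync_overhead <= first_threshold:
        return False                      # overhead acceptable, do nothing
    current = os.getpriority(os.PRIO_PROCESS, io_tid)
    # Nice values range from -20 (highest priority) to 19 (lowest).
    os.setpriority(os.PRIO_PROCESS, io_tid, max(current - boost, -20))
    return True
```

On Linux, setpriority applied to a thread id affects that single thread, which is what lets the parent target the I/O thread rather than the whole process.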
In other cases, scheduling the I/O threads to occupy processor resources in combination with the synchronization waiting overhead may further include:
dynamically monitoring processor resource occupation time of an I/O thread; and under the condition that the processor resource occupation time of the I/O threads exceeds a second threshold value, each I/O thread sleeps for a set duration. The second threshold may be set according to actual requirements, which is not limited in this embodiment.
As stated above, the computing thread can raise the priority of the I/O thread only when an I/O related instruction is initiated by the application. Between two such instructions, the I/O threads may need to complete a large number of I/O requests, and during that period they need to occupy processor resources for a long time so that the I/O completes efficiently and the performance impact on the numerical computation is reduced. However, when a computing thread competes with an I/O thread for processor resources, the I/O thread cannot hold the processor for long: its scheduling priority decreases as its processor time grows, so it may be switched out before all I/O requests are completed, and after it is scheduled back in, the synchronization waiting among the I/O threads may become significant. For this situation, having the I/O thread autonomously sleep for a certain period raises its priority relative to the computing thread at the moment it automatically wakes, reducing the synchronization waiting overhead among the I/O threads as much as possible.
In this embodiment, whether an I/O thread needs to sleep is decided by dynamically monitoring whether the I/O thread's processor resource occupation time exceeds the second threshold, upon which the I/O thread starts autonomous sleep. The foregoing dynamic monitoring of the processor resource occupation time of an I/O thread may further include:
monitoring the processor resource occupation time of an I/O thread when an I/O request is completed, wherein the processor resource occupation time is the difference between an end time and a start time, the start time being the time at which the I/O thread was last woken up after its previous sleep ended, and the end time being the time at which the I/O request is completed;
accordingly, the sleeping of each I/O thread for the set duration includes: after the I/O thread completes the current I/O request and before the next I/O request is fetched, the I/O thread autonomously sleeps for the set duration.
It should be appreciated that if the processor occupation time of the I/O thread after completing the first I/O request does not reach the second threshold, autonomous sleep is not performed, the determination is made again after the second I/O request completes, and so on. When autonomous sleep is warranted, the I/O thread sleeps for the set duration after completing the current I/O request and before fetching the next one. The second threshold and the set duration may be preset values or dynamically changed values. For example, when testing finds that this autonomous sleep scheme performs poorly (e.g., the synchronization waiting overhead remains large), the second threshold or the set duration may be reduced to ensure that the dynamic priority of an I/O thread at the end of its sleep is higher than that of the computing thread.
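The occupancy-monitoring and autonomous-sleep rule above can be sketched as a small per-thread helper. This is an illustration, not the patent's implementation: the class name, the injectable clock, and the parameter names are assumptions, and the thresholds stand in for the preset or dynamically adjusted values the text allows:

```python
import time

class OccupancyMonitor:
    """Tracks how long an I/O thread has held the processor since it
    last woke, and decides after each completed I/O request whether
    to sleep autonomously for the set duration."""

    def __init__(self, second_threshold, sleep_duration, clock=time.monotonic):
        self.second_threshold = second_threshold
        self.sleep_duration = sleep_duration
        self.clock = clock
        self.start_time = clock()          # start counting from creation

    def on_wakeup(self):
        """Reset the start time when the thread is woken up again."""
        self.start_time = self.clock()

    def on_request_completed(self, app_is_waiting=False):
        """Call after completing a request, before fetching the next.
        Never sleeps while the application is waiting for outstanding
        requests to complete. Returns True if the thread slept."""
        occupied = self.clock() - self.start_time
        if app_is_waiting or occupied <= self.second_threshold:
            return False
        time.sleep(self.sleep_duration)
        self.on_wakeup()
        return True
```

The app_is_waiting flag encodes the rule, stated below, that no autonomous sleep occurs while the application waits for asynchronous I/O requests to complete.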
In some implementations, the method of this embodiment further includes:
when the application program initiates an instruction to wait for asynchronous I/O requests to complete, any outstanding I/O requests need to be completed as soon as possible, so the I/O threads do not sleep autonomously while responding to all the I/O requests whose completion the application program is waiting for.
According to the method, for the situation in which I/O threads and computing threads must run simultaneously, control of I/O thread processor resource scheduling is realized on the existing technical basis of the operating system, so that all I/O threads can obtain processor resources almost simultaneously as far as possible, thereby completing I/O requests efficiently while minimizing the influence of the I/O threads on the computing threads.
Example two
As shown in fig. 2, the asynchronous I/O thread processor resource scheduling control device provided in this embodiment includes:
an obtaining module 201, configured to obtain, in response to the starting of an I/O thread by an application program, a processor resource configuration mode for the I/O thread, where the processor resource configuration mode includes one of a fixed configuration mode, a contention-free dynamic configuration mode, and a contention-based dynamic configuration mode;
the scheduling module 202 is configured to schedule the I/O threads to occupy the processor resources based on the processor resource configuration mode.
When the processor resource configuration mode is the contention-based dynamic configuration mode, the synchronization waiting overhead incurred when all I/O threads work cooperatively is monitored in real time, and the I/O threads are scheduled to occupy processor resources in combination with the synchronization waiting overhead; when the processor resource configuration mode is the contention-free or the contention-based dynamic configuration mode, each I/O thread is allowed to run on a plurality of processor cores.
The fixed configuration mode may refer to scheduling the I/O threads onto a specified number of processor cores. The contention-free dynamic configuration mode may refer to a mode in which the I/O threads do not need to compete with the computing threads for processor resources, including the case where processor cores have been reserved on the compute node to run the I/O threads, i.e., the total number of computing threads and I/O threads does not exceed the total number of processor cores. The contention-based dynamic configuration mode may refer to a mode in which the I/O threads need to compete with the computing threads for processor resources, i.e., the total number of computing threads and I/O threads exceeds the total number of processor cores. When the processor resource configuration mode is the contention-free or the contention-based dynamic configuration mode, each I/O thread is allowed to run on a plurality of processor cores; when the processor resource configuration mode is the fixed configuration mode, each I/O thread is fixed to run on the designated processor cores.
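The distinction between the three modes can be restated as a small selection rule. The helper below is a hypothetical sketch (the function and parameter names are not from the patent): contention-free applies when the combined thread count fits within the core count, and the fixed mode applies when cores have been explicitly dedicated to I/O threads.

```python
import os

def choose_config_mode(num_compute_threads, num_io_threads,
                       dedicated_io_cores=0, total_cores=None):
    """Pick a processor resource configuration mode (illustrative helper).

    'fixed'            - dedicated cores exist or were reserved for I/O threads
    'contention_free'  - compute + I/O threads fit within the core count
    'contention_based' - I/O threads must compete with compute threads
    """
    if total_cores is None:
        total_cores = os.cpu_count()    # cores available on this compute node
    if dedicated_io_cores > 0:
        return 'fixed'
    if num_compute_threads + num_io_threads <= total_cores:
        return 'contention_free'
    return 'contention_based'
```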
In this embodiment, both allowing each I/O thread to run on multiple processor cores and fixing the cores on which each I/O thread runs may be implemented through the affinity setting command of the operating system.
Specifically, after a computing thread of the application program creates an I/O thread, the affinity of the I/O thread is set to a plurality of processor cores using the corresponding operating system command; in the contention-based dynamic configuration mode, the affinity of the application program's computing threads is likewise set to a plurality of processor cores. The plurality of processor cores may be all or some of the processor cores on the compute node.
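On Linux, one concrete form of such an affinity setting command is the `sched_setaffinity` system call, exposed in Python as `os.sched_setaffinity`. The helper name below is hypothetical; passing `pid=0` applies the mask to the calling thread, and the core set may be all or a subset of the node's cores, as the description states.

```python
import os

def set_io_thread_affinity(core_ids, pid=0):
    """Pin the calling thread (pid=0) to the given set of processor cores.

    Linux-specific sketch: os.sched_setaffinity wraps sched_setaffinity(2).
    Returns the resulting affinity mask for verification.
    """
    os.sched_setaffinity(pid, core_ids)
    return os.sched_getaffinity(pid)
```

The same call, issued with a wider core set, implements the "allowed to run on a plurality of processor cores" case; issued with the designated cores only, it implements the fixed configuration mode.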
When the processor resource configuration mode is the contention-based dynamic configuration mode, the synchronization waiting overhead incurred when all I/O threads work cooperatively is monitored, and the I/O threads are scheduled to occupy processor resources in combination with that overhead. Scheduling of the I/O threads' occupation of processor resources can be achieved by adjusting, on the compute node, the settings that affect the operating system's scheduling of the I/O threads.
When the processor resource configuration mode is the fixed configuration mode, the numbers or keywords of the plurality of processor cores designated by the fixed configuration mode are obtained, the I/O threads are scheduled to occupy processor resources based on those numbers or keywords, and the scheduling control commands provided by the operating system are used to fix the I/O threads to run on those processor cores. The fixed configuration mode is suitable both for the case where dedicated processor cores exist for the I/O threads and for the case where general-purpose processor cores have been reserved for them.
When the processor resource configuration mode is the contention-free dynamic configuration mode, the total number of computing threads and I/O threads does not exceed the total number of processor cores, and both the computing threads and the I/O threads can run on a plurality of processor cores. In this configuration, therefore, either no further operation is required, or it is only necessary to set the I/O threads to run on a plurality of processor cores.
In some implementations, monitoring the synchronization waiting overhead when all I/O threads work cooperatively may include: when all I/O threads synchronize in order to cooperatively fetch the same I/O request, the I/O threads exchange with one another the physical time at which each started the synchronization, and the difference between the latest and the earliest of the physical times corresponding to the I/O threads is determined as the synchronization waiting overhead of the cooperative work.
In this monitoring mode, the physical times are exchanged when all I/O threads (each belonging to one process) synchronize; the physical time across the computing nodes of a high-performance computer is almost identical, and even where it differs it can be corrected. The difference between the latest and the earliest physical time is the synchronization waiting overhead, and the I/O threads can then be scheduled to occupy processor resources according to the synchronization waiting overhead monitored in real time.
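The measurement can be illustrated with threads standing in for the I/O processes: each records a monotonic timestamp on reaching the synchronization point, and the overhead is the spread between the latest and earliest arrival. This is an assumption-laden sketch (the function names and the use of a `threading.Barrier` are illustrative, not from the patent).

```python
import threading
import time

def measure_sync_wait_overhead(num_io_threads, work):
    """Each simulated I/O thread records the physical time at which it reaches
    the synchronization point; the overhead is max(arrival) - min(arrival)."""
    arrival = [0.0] * num_io_threads
    barrier = threading.Barrier(num_io_threads)

    def io_thread(rank):
        work(rank)                          # per-thread work before the sync point
        arrival[rank] = time.monotonic()    # the "exchanged" physical time
        barrier.wait()                      # synchronize to fetch the same I/O request

    threads = [threading.Thread(target=io_thread, args=(r,))
               for r in range(num_io_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return max(arrival) - min(arrival)
```

In the patented setting the I/O threads live in different processes on different nodes, so the exchange would go over the interconnect rather than shared memory, but the max-minus-min computation is the same.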
In some cases, scheduling I/O threads in conjunction with synchronous wait overhead to occupy processor resources may include:
when responding to an I/O related instruction initiated by the application program, if the synchronization waiting overhead exceeds a first threshold, the synchronization overhead is considered excessive and the priority of all I/O threads is raised. The first threshold may be a preset value, or may be determined through automatic sampling when the application program starts the asynchronous parallel I/O system.
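On Unix-like systems, one way to raise a thread's scheduling priority is to lower its niceness via `setpriority(2)` (`os.setpriority` in Python). The sketch below is hypothetical (`FIRST_THRESHOLD`, the function name, and the boost amount are placeholders); note that lowering niceness usually requires privileges, so the failure case is reported rather than raised.

```python
import os

FIRST_THRESHOLD = 0.005  # seconds; placeholder - would be preset or auto-sampled

def maybe_boost_io_threads(sync_wait_overhead, boost=-5):
    """If the measured synchronization waiting overhead exceeds the first
    threshold, request a higher scheduling priority (lower niceness) for
    this process's I/O threads."""
    if sync_wait_overhead <= FIRST_THRESHOLD:
        return 'unchanged'
    try:
        current = os.getpriority(os.PRIO_PROCESS, 0)
        os.setpriority(os.PRIO_PROCESS, 0, current + boost)
        return 'boosted'
    except PermissionError:
        return 'denied'   # unprivileged processes may not lower niceness
```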
Further, in the case where the aforementioned first threshold is determined by automatic sampling when the application program starts the asynchronous parallel I/O system, the method of this embodiment may further include:
when the application program starts the asynchronous parallel I/O system, all computing threads sleep to release processor resources; once all I/O threads have obtained processor resources, the synchronization waiting overhead of a synchronization among all I/O threads is automatically tested and sampled, and the value obtained by amplifying that overhead by a preset multiple is taken as the first threshold. The preset multiple can be set according to actual requirements, which is not limited in this embodiment.
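The calibration step reduces to scaling a contention-free baseline by the preset multiple. The helper below is illustrative; the source does not specify how multiple samples are aggregated, so taking the worst observed sample is an assumption.

```python
def calibrate_first_threshold(sample_overheads, preset_multiple=3.0):
    """Derive the first threshold from synchronization-wait overheads sampled
    while the computing threads sleep (i.e., without contention).

    Aggregating with max() is an assumption: it treats the worst
    contention-free sample as the baseline before amplification.
    """
    baseline = max(sample_overheads)
    return baseline * preset_multiple
```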
In other cases, scheduling the I/O threads to occupy processor resources in combination with the synchronization waiting overhead may further include:
dynamically monitoring the processor resource occupation time of the I/O threads; and, when the processor resource occupation time of the I/O threads exceeds a second threshold, having each I/O thread sleep for a set duration. The second threshold may be set according to actual requirements, which is not limited in this embodiment.
As described above, a computing thread can raise the priority of the I/O threads only when the application program initiates an I/O related instruction. Between two I/O related instructions, the I/O threads may need to complete a large number of I/O requests; in that case all I/O threads need to occupy processor resources for a long time so that the I/O can be completed efficiently while limiting the performance impact on the numerical computation. However, when computing threads compete with the I/O threads for processor resources, an I/O thread cannot occupy processor resources for long, because its scheduling priority decreases as its processor time accumulates; it may therefore be switched out before all I/O requests are completed, and after it is scheduled back the synchronization waiting overhead among the I/O threads may become significant. For this situation, having the I/O threads sleep autonomously for a certain period raises their priority, relative to the computing threads, at the moment of automatic wake-up, thereby reducing the synchronization waiting overhead among the I/O threads as far as possible.
In this embodiment, whether an I/O thread needs to sleep is decided by dynamically monitoring whether its processor resource occupation time exceeds the second threshold, whereupon the I/O thread starts autonomous dormancy. The foregoing dynamic monitoring of the processor resource occupation time of an I/O thread may further include:
monitoring the processor resource occupation time of an I/O thread each time an I/O request is completed, wherein the processor resource occupation time is the difference between an ending time and a starting time, the starting time being the time at which the I/O thread ended its previous dormancy and was awakened again, and the ending time being the time at which the current I/O request is completed;
accordingly, the sleeping of each I/O thread for the set duration includes: after the I/O thread completes the current I/O request and before the next I/O request is fetched, the I/O thread starts autonomous dormancy for the set duration.
It should be appreciated that if the processor resource occupation time of the I/O thread after completing the first I/O request does not reach the second threshold, autonomous dormancy is not performed; the determination is made again after the second I/O request is completed, and so on. When autonomous dormancy is determined to be necessary, the I/O thread starts autonomous dormancy for the set duration after it completes the current I/O request and before the next I/O request is fetched. The second threshold and the set duration may be preset values or dynamically changing values. For example, when testing finds that the above autonomous dormancy method performs poorly (for example, the synchronization waiting overhead is still large), the second threshold or the set duration may be reduced so as to ensure that the dynamic priority of an I/O thread at the moment it ends dormancy is higher than that of the computing threads.
In some implementations, the scheduling module 202 is further configured to: when the application program initiates an instruction to wait for the completion of asynchronous I/O requests, if there are I/O requests that have not yet been completed, those requests need to be completed as soon as possible; the I/O threads therefore do not perform autonomous dormancy while responding to all I/O requests whose completion the application program is waiting for.
For the situation in which I/O threads and computing threads must run simultaneously, the device of this embodiment realizes control of I/O thread processor resource scheduling on the existing technical basis of the operating system, so that all I/O threads can obtain processor resources almost simultaneously as far as possible, thereby completing I/O requests efficiently while minimizing the influence of the I/O threads on the computing threads.
Example III
The present embodiment provides a computer storage medium having a computer program stored thereon, which when executed by one or more processors, implements the method of the previous embodiments.
The computer readable storage medium may be implemented by any type of volatile or nonvolatile memory device, or a combination thereof, such as static random access memory (Static Random Access Memory, SRAM), electrically erasable programmable read-only memory (Electrically Erasable Programmable Read-Only Memory, EEPROM), programmable read-only memory (Programmable Read-Only Memory, PROM), read-only memory (Read-Only Memory, ROM), magnetic memory, flash memory, magnetic disk, or optical disk.
Example IV
The present embodiment provides a computer device comprising a memory and one or more processors, the memory having stored thereon a computer program which, when executed by the one or more processors, implements the method of the preceding embodiments.
The processor may be implemented by an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a digital signal processor (Digital Signal Processor, DSP), a digital signal processing device (Digital Signal Processing Device, DSPD), a programmable logic device (Programmable Logic Device, PLD), a field programmable gate array (Field Programmable Gate Array, FPGA), a controller, a microcontroller (Microcontroller Unit, MCU), a microprocessor, or other electronic components for performing the methods in the above embodiments.
In the several embodiments provided in the embodiments of the present invention, it should be understood that the disclosed system and method may be implemented in other manners. The system and method embodiments described above are merely illustrative.
It should be noted that, in this document, the terms "first," "second," and the like in the description and claims of the present application and in the above drawings are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. The terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
Although the embodiments of the present invention are described above, the embodiments are only used for facilitating understanding of the present invention, and are not intended to limit the present invention. Any person skilled in the art can make any modification and variation in form and detail without departing from the spirit and scope of the present disclosure, but the scope of the present disclosure is still subject to the scope of the appended claims.

Claims (11)

1. An asynchronous I/O thread processor resource scheduling control method, comprising:
responding to the starting of an I/O thread by an application program, and acquiring a processor resource configuration mode for the I/O thread, wherein the processor resource configuration mode comprises one of a fixed configuration mode, a contention-free dynamic configuration mode and a contention-based dynamic configuration mode;
scheduling the I/O threads to occupy processor resources based on the processor resource allocation mode;
under the condition that the processor resource configuration mode is the contention-based dynamic configuration mode, monitoring in real time the synchronization waiting overhead when all I/O threads work cooperatively, and scheduling the I/O threads to occupy processor resources in combination with the synchronization waiting overhead; and allowing each I/O thread to run on a plurality of processor cores under the condition that the processor resource configuration mode is the contention-free dynamic configuration mode or the contention-based dynamic configuration mode;
the scheduling of I/O threads in conjunction with the synchronous wait overhead occupies processor resources, including:
when responding to an I/O related instruction initiated by an application program, if the synchronous waiting overhead exceeds a first threshold value, the priority of all I/O threads is improved;
the real-time monitoring of the synchronous waiting overhead when all the I/O threads work cooperatively comprises the following steps:
when all I/O threads are synchronized for cooperatively taking out the same I/O request, the I/O threads exchange physical time when synchronization is started, and the difference value between the latest and earliest physical time in the physical time corresponding to each I/O thread is determined as the synchronization waiting overhead when all I/O threads cooperatively work.
2. The method according to claim 1, wherein, when the processor resource configuration mode is the fixed configuration mode, the numbers or keywords of a plurality of processor cores designated by the fixed configuration mode are acquired, and each I/O thread is fixed to run on the plurality of processor cores based on the numbers or keywords.
3. The method for controlling asynchronous I/O thread processor resource scheduling according to claim 1, wherein said scheduling I/O threads in conjunction with said synchronous wait overhead occupies processor resources, further comprising:
dynamically monitoring processor resource occupation time of an I/O thread;
and under the condition that the processor resource occupation time of the I/O threads exceeds a second threshold value, each I/O thread sleeps for a set duration.
4. The method of claim 1, wherein the first threshold is a preset value or is determined by auto-sampling when an application starts an asynchronous parallel I/O system.
5. The method of asynchronous I/O thread processor resource scheduling control of claim 4, further comprising:
when an application program starts an asynchronous parallel I/O system, all computing threads sleep to release processor resources;
under the condition that all I/O threads acquire processor resources, automatically testing and sampling synchronous waiting cost of all the I/O threads for synchronization, and taking the value obtained after the synchronous waiting cost is amplified according to a preset multiple as a first threshold value.
6. The method for controlling asynchronous I/O thread processor resource scheduling according to claim 3, wherein dynamically monitoring processor resource occupation time of an I/O thread comprises:
monitoring the processor resource occupation time of an I/O thread each time an I/O request is completed, wherein the processor resource occupation time is the difference between an ending time and a starting time, the starting time being the time at which the I/O thread ended its previous dormancy and was awakened again, and the ending time being the time at which the current I/O request is completed;
the sleeping of each I/O thread for the set duration comprises: after the I/O thread completes the current I/O request and before the next I/O request is fetched, the I/O thread starts autonomous dormancy for the set duration.
7. The method according to claim 1 or 2, wherein the allowing each I/O thread to run on a plurality of processor cores and the fixing each I/O thread to run on a plurality of processor cores are implemented by an affinity setting command of an operating system.
8. The method of claim 7, wherein, when the application program initiates an instruction to wait for the completion of asynchronous I/O requests, if there is an I/O request that has not yet been completed, the I/O threads do not perform autonomous dormancy while responding to all I/O requests whose completion the application program is waiting for.
9. An asynchronous I/O thread processor resource scheduling control apparatus, comprising:
an acquisition module, configured to acquire, in response to the starting of an I/O thread by an application program, a processor resource configuration mode for the I/O thread, wherein the processor resource configuration mode comprises one of a fixed configuration mode, a contention-free dynamic configuration mode and a contention-based dynamic configuration mode;
the scheduling module is used for scheduling the I/O threads to occupy the processor resources based on the processor resource allocation mode;
under the condition that the processor resource configuration mode is the contention-based dynamic configuration mode, the synchronization waiting overhead when all I/O threads work cooperatively is monitored in real time, and the I/O threads are scheduled to occupy processor resources in combination with the synchronization waiting overhead; each I/O thread is allowed to run on a plurality of processor cores under the condition that the processor resource configuration mode is the contention-free dynamic configuration mode or the contention-based dynamic configuration mode; the scheduling of I/O threads to occupy processor resources in combination with the synchronization waiting overhead comprises: when responding to an I/O related instruction initiated by the application program, if the synchronization waiting overhead exceeds a first threshold, raising the priority of all I/O threads; the real-time monitoring of the synchronization waiting overhead when all I/O threads work cooperatively comprises: when all I/O threads synchronize in order to cooperatively fetch the same I/O request, the I/O threads exchange the physical times at which synchronization started, and the difference between the latest and the earliest of the physical times corresponding to the I/O threads is determined as the synchronization waiting overhead of the cooperative work.
10. A computer readable storage medium, having stored thereon a computer program which, when executed by one or more processors, implements the method of any of claims 1 to 8.
11. A computer device comprising a memory and one or more processors, the memory having stored thereon a computer program which, when executed by the one or more processors, implements the method of any of claims 1 to 8.
CN202310030841.6A 2023-01-10 2023-01-10 Asynchronous I/O thread processor resource scheduling control method, device, medium and equipment Active CN115718665B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310030841.6A CN115718665B (en) 2023-01-10 2023-01-10 Asynchronous I/O thread processor resource scheduling control method, device, medium and equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310030841.6A CN115718665B (en) 2023-01-10 2023-01-10 Asynchronous I/O thread processor resource scheduling control method, device, medium and equipment

Publications (2)

Publication Number Publication Date
CN115718665A CN115718665A (en) 2023-02-28
CN115718665B (en) 2023-06-13

Family

ID=85257946

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310030841.6A Active CN115718665B (en) 2023-01-10 2023-01-10 Asynchronous I/O thread processor resource scheduling control method, device, medium and equipment

Country Status (1)

Country Link
CN (1) CN115718665B (en)

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115562838A (en) * 2022-10-27 2023-01-03 Oppo广东移动通信有限公司 Resource scheduling method and device, computer equipment and storage medium

Family Cites Families (15)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7197577B2 (en) * 2003-12-12 2007-03-27 International Business Machines Corporation Autonomic input/output scheduler selector
US20090158299A1 (en) * 2007-10-31 2009-06-18 Carter Ernst B System for and method of uniform synchronization between multiple kernels running on single computer systems with multiple CPUs installed
CN101246437B (en) * 2008-01-28 2010-06-09 中兴通讯股份有限公司 Built-in real-time system course equalization scheduling method
CN101556545B (en) * 2009-05-22 2011-04-06 北京星网锐捷网络技术有限公司 Method for realizing process support, device and multithreading system
CN103279391A (en) * 2013-06-09 2013-09-04 浪潮电子信息产业股份有限公司 Load balancing optimization method based on CPU (central processing unit) and MIC (many integrated core) framework processor cooperative computing
US9542221B2 (en) * 2014-05-22 2017-01-10 Oracle International Corporation Dynamic co-scheduling of hardware contexts for parallel runtime systems on shared machines
CN108009006B (en) * 2016-11-02 2022-02-18 华为技术有限公司 Scheduling method and device of I/O (input/output) request
CN109426556B (en) * 2017-08-31 2021-06-04 大唐移动通信设备有限公司 Process scheduling method and device
CN109992366B (en) * 2017-12-29 2023-08-22 华为技术有限公司 Task scheduling method and task scheduling device
CN112579277B (en) * 2020-12-24 2022-09-16 海光信息技术股份有限公司 Central processing unit, method, device and storage medium for simultaneous multithreading
CN114048026A (en) * 2021-10-27 2022-02-15 北京航空航天大学 GPU resource dynamic allocation method under multitask concurrency condition
CN114385227A (en) * 2022-01-17 2022-04-22 中国农业银行股份有限公司 Service processing method, device, equipment and storage medium
CN115061730A (en) * 2022-07-06 2022-09-16 中银金融科技有限公司 Thread concurrent management method and device
CN115328662A (en) * 2022-09-13 2022-11-11 国网智能电网研究院有限公司 Process thread resource management control method and system
CN115328564B (en) * 2022-10-17 2023-04-25 北京卡普拉科技有限公司 Asynchronous input/output thread processor resource allocation method and device

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115562838A (en) * 2022-10-27 2023-01-03 Oppo广东移动通信有限公司 Resource scheduling method and device, computer equipment and storage medium

Also Published As

Publication number Publication date
CN115718665A (en) 2023-02-28

Similar Documents

Publication Publication Date Title
CN108549574B (en) Thread scheduling management method and device, computer equipment and storage medium
JP2017004511A (en) Systems and methods for scheduling tasks using sliding time windows
JPH06139189A (en) Common bus arbitrating mechanism
US9612651B2 (en) Access based resources driven low power control and management for multi-core system on a chip
US20190044883A1 (en) NETWORK COMMUNICATION PRIORITIZATION BASED on AWARENESS of CRITICAL PATH of a JOB
CN111459622B (en) Method, device, computer equipment and storage medium for scheduling virtual CPU
EP2282265A1 (en) A hardware task scheduler
JP2017073000A (en) Parallelization method, parallelization tool, and on-vehicle device
CN115718665B (en) Asynchronous I/O thread processor resource scheduling control method, device, medium and equipment
CN112559176B (en) Instruction processing method and device
CN111767121A (en) Operation method, device and related product
US20050066093A1 (en) Real-time processor system and control method
CN116089049B (en) Asynchronous parallel I/O request-based process synchronous scheduling method, device and equipment
CN115328564B (en) Asynchronous input/output thread processor resource allocation method and device
US20230067432A1 (en) Task allocation method, apparatus, electronic device, and computer-readable storage medium
CN116244073A (en) Resource-aware task allocation method for hybrid key partition real-time operating system
JPH064314A (en) Inter-task synchronizing communication equipment
JP6617511B2 (en) Parallelization method, parallelization tool, in-vehicle device
CN115599459B (en) Cross-power-domain multiprocessor operation device and communication method thereof
TWI823655B (en) Task processing system and task processing method applicable to intelligent processing unit
JP2003140787A (en) Power controller and power control method and power control program
US12045671B2 (en) Time-division multiplexing method and circuit for arbitrating concurrent access to a computer resource based on a processing slack associated with a critical program
JPS6146552A (en) Information processor
CN116225673A (en) Task processing method and device based on many-core chip, processing core and electronic equipment
CN118672743A (en) Deterministic task scheduling method, system and readable storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant