CN103136035B - Method and apparatus for thread management of a program with a mixed thread model - Google Patents

Method and apparatus for thread management of a program with a mixed thread model

Info

Publication number
CN103136035B
CN103136035B CN201110391185.XA CN201110391185A
Authority
CN
China
Prior art keywords
thread
work item
virtual
program
belonging
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201110391185.XA
Other languages
Chinese (zh)
Other versions
CN103136035A (en)
Inventor
刘弢
林海波
王旭
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp filed Critical International Business Machines Corp
Priority to CN201110391185.XA priority Critical patent/CN103136035B/en
Publication of CN103136035A publication Critical patent/CN103136035A/en
Application granted granted Critical
Publication of CN103136035B publication Critical patent/CN103136035B/en
Expired - Fee Related legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Landscapes

  • Debugging And Monitoring (AREA)

Abstract

The present invention relates to thread management for a program with a mixed thread model, and in particular to thread management when a program written with mixed threading models runs on a multi-core computer system with a non-preemptive architecture. A method and apparatus for thread management of such a program are provided, wherein the threads of the program share multiple hardware resources in a non-preemptive manner. The method comprises: establishing multiple virtual threads respectively associated with the multiple hardware resources; intercepting a work item from a thread of the program; analyzing the relationship between the thread to which the work item belongs and other threads; and, according to the analysis result, assigning the work item to one of the multiple virtual threads.

Description

Method and apparatus for thread management of a program with a mixed thread model
Technical field
The present invention relates to thread management for programs with a mixed thread model, and in particular to thread management when a program written with mixed threading models runs on a multi-core computer system (multi-core system) with a non-preemptive architecture.
Background technology
High-performance computing (HPC) increasingly relies on multi-core computer systems. Such systems use multi-core processors (for example, dual-core and quad-core CPUs). Running multiple threads in parallel on a multi-core processor system can improve its efficiency. Therefore, multi-threaded programming is generally adopted in HPC applications.
One example of multi-threaded programming is the combined use of the POSIX thread model (pthread model) and the OpenMP model.
A POSIX thread is a thread conforming to the POSIX standard, abbreviated "Pthread". The POSIX standard defines an API for creating and manipulating threads, and provides a function library for producing and controlling them. Under the POSIX thread model, a program can spawn multiple threads: a main thread splits the work apart, hands the pieces to sub-threads for execution, and finally synchronizes the results of each thread.
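For illustration only (not part of the patent text), a minimal C sketch of this model: a main thread splits a summation between two sub-threads created with pthread_create() and synchronizes their results with pthread_join().

```c
#include <pthread.h>
#include <stdio.h>

typedef struct { const int *data; int len; long sum; } chunk_t;

static void *partial_sum(void *arg) {
    chunk_t *c = (chunk_t *)arg;
    c->sum = 0;
    for (int i = 0; i < c->len; i++)   /* each sub-thread sums its own half */
        c->sum += c->data[i];
    return NULL;
}

int main(void) {
    int data[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    chunk_t chunks[2] = { { data, 4, 0 }, { data + 4, 4, 0 } };
    pthread_t tid[2];

    for (int i = 0; i < 2; i++)        /* main thread splits the work apart */
        pthread_create(&tid[i], NULL, partial_sum, &chunks[i]);
    for (int i = 0; i < 2; i++)        /* main thread synchronizes the results */
        pthread_join(tid[i], NULL);

    printf("sum = %ld\n", chunks[0].sum + chunks[1].sum);
    return 0;
}
```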
OpenMP is an application programming interface (API) designed for writing parallel programs on multiprocessors. It comprises a set of compiler directives and a supporting function library. OpenMP provides a high-level abstract description of a parallel algorithm: the programmer indicates intent by adding special directives to the source code, and the compiler then parallelizes the program automatically.
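Again for illustration only, the same summation expressed in the OpenMP model: a single compiler directive asks the compiler to generate the thread team and combine the partial results.

```c
#include <omp.h>
#include <stdio.h>

int main(void) {
    int data[8] = {1, 2, 3, 4, 5, 6, 7, 8};
    long sum = 0;

    /* The directive tells the compiler to parallelize the loop and to
     * combine the per-thread partial sums into `sum`. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 0; i < 8; i++)
        sum += data[i];

    printf("sum = %ld (threads available: %d)\n", sum, omp_get_max_threads());
    return 0;
}
```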
A program written with the mixed POSIX thread model and OpenMP model and executed on a non-preemptive architecture easily runs into thread deadlock. In such a program, the Pthreads and the OpenMP threads are unaware of each other's coexistence. In a non-preemptive (also called "exclusive") architecture, multiple threads share computing resources, and control over a resource cannot be preempted between threads. For example, suppose a resource S is currently controlled by a thread Ta, and another thread Tb needs to access S. Tb cannot preempt S; it can obtain control of S only after Ta voluntarily relinquishes it. But if Ta is a Pthread and Tb is an OpenMP thread, Tb will try to occupy resource S, producing a conflict that causes the program to loop endlessly or exit.
Summary of the invention
On the one hand, a method of thread management for a program with a mixed thread model is provided, wherein the threads of the program share multiple hardware resources in a non-preemptive manner. The method comprises: establishing multiple virtual threads (VTs) respectively associated with the multiple hardware resources; intercepting a work item from a thread of the program; analyzing the relationship between the thread to which the work item belongs and other threads; and, according to the analysis result, assigning the work item to one of the multiple virtual threads.
On the other hand, an apparatus for thread management of a program with a mixed thread model is provided, wherein the threads of the program share multiple hardware resources in a non-preemptive manner. The apparatus comprises: a virtual thread creation module configured to establish multiple virtual threads (VTs) respectively associated with the multiple hardware resources; a work item interception module configured to intercept a work item from a thread of the program; a thread relationship analysis module configured to analyze the relationship between the thread to which the work item belongs and other threads; and a virtual thread distribution module configured to assign the work item, according to the analysis result, to one of the multiple virtual threads.
With the present invention, the management functions of the OpenMP dynamic library and the pthread dynamic library can be enhanced or replaced, while the other functions of the OpenMP run-time library and the pthread run-time library are inherited.
Accompanying drawing explanation
The features, advantages, and other aspects of embodiments of the present invention will become more apparent from the following detailed description taken in conjunction with the accompanying drawings, in which some embodiments of the invention are shown by way of example and not limitation. In the drawings:
Fig. 1 is a block diagram of an exemplary computer system 100 suitable for implementing embodiments of the present invention;
Figs. 2A and 2B illustrate an example source program adopting the mixed POSIX thread model and OpenMP model;
Fig. 2C schematically shows the thread conflicts that may arise when the program shown in Figs. 2A and 2B is executed;
Fig. 3 is a flow chart of a method according to one embodiment of the invention;
Fig. 4 is a schematic block diagram of an apparatus according to one embodiment of the invention;
Fig. 5 schematically shows the operation of a method according to one embodiment of the invention.
Embodiment
The flow charts and block diagrams in the accompanying drawings illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in a flow chart or block diagram may represent a module, program segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the blocks may occur out of the order noted in the figures. For example, two blocks shown in succession may in fact be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved. It should also be noted that each block of the block diagrams and/or flow charts, and combinations of blocks therein, can be implemented by special-purpose hardware-based systems that perform the specified functions or operations, or by combinations of special-purpose hardware and computer instructions.
The principle and spirit of the present invention are described below with reference to some illustrative embodiments. It should be appreciated that these embodiments are provided only to enable those skilled in the art to better understand and then implement the present invention, and do not limit the scope of the invention in any way.
Fig. 1 shows a block diagram of an exemplary computer system 100 suitable for implementing embodiments of the present invention. As shown, computer system 100 may comprise: a CPU (central processing unit) 101, RAM (random access memory) 102, ROM (read-only memory) 103, a system bus 104, a hard disk controller 105, a keyboard controller 106, a serial interface controller 107, a parallel interface controller 108, a display controller 109, a hard disk 110, a keyboard 111, a serial peripheral device 112, a parallel peripheral device 113, and a display 114. Among these devices, the CPU 101, RAM 102, ROM 103, hard disk controller 105, keyboard controller 106, serial interface controller 107, parallel interface controller 108, and display controller 109 are coupled to the system bus 104. The hard disk 110 is coupled to the hard disk controller 105, the keyboard 111 to the keyboard controller 106, the serial peripheral device 112 to the serial interface controller 107, the parallel peripheral device 113 to the parallel interface controller 108, and the display 114 to the display controller 109. It should be appreciated that the structural block diagram in Fig. 1 is shown for the purpose of example only and does not limit the scope of the invention. In some cases, devices may be added or removed as the case may be. For example, computer system 100 may be configured with a network adapter so as to have access to computer networks.
First, the resource conflicts that may exist at run time between the different kinds of threads of a program written with the mixed POSIX thread model and OpenMP model are illustrated according to the prior art.
Fig. 2A schematically shows an example program 200 adopting the mixed POSIX thread model and OpenMP thread model. Fig. 2B shows the thread structure of the program 200 shown in Fig. 2A.
The following threads are shown in Fig. 2B: a main thread 210, POSIX threads (hereinafter "P threads") 230 and 231, and OpenMP threads (hereinafter "omp threads") 250, 251, 252, and 253. The main thread 210 is the thread corresponding to the main program Main (m).
P threads 230 and 231 are two homogeneous sub-threads of the main thread 210; they are created by the thread handling function pthread_create() of the main program. The vertical lines in Fig. 2B schematically show that pthread_create() creates two parallel execution regions.
Omp threads 250 and 251 are two heterogeneous sub-threads of P thread 230, i.e., threads of a different type from the P thread; they are created by an OpenMP parallel directive. Omp threads 250 and 251 are parallel sibling threads with the common parent thread 230.
Omp threads 252 and 253 are two heterogeneous sub-threads of P thread 231, likewise created by an omp parallel directive. Omp threads 252 and 253 are parallel sibling threads with the common parent thread 231.
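Fig. 2A itself is not reproduced in this text; the following C sketch is an assumed reconstruction of a program with the same thread structure as Fig. 2B: the main thread (210) creates two P threads (230, 231) with pthread_create(), and each P thread opens an OpenMP parallel region containing two omp threads (250/251 and 252/253).

```c
#include <pthread.h>
#include <omp.h>
#include <stdio.h>

static void *p_thread_body(void *arg) {
    long id = (long)arg;
    /* Each P thread becomes the master of its own two-thread omp team
     * (omp threads 250/251 under P thread 230, 252/253 under 231). */
    #pragma omp parallel num_threads(2)
    {
        printf("P thread %ld, omp thread %d\n", id, omp_get_thread_num());
    }
    return NULL;
}

int main(void) {                                             /* main thread 210 */
    pthread_t p0, p1;
    pthread_create(&p0, NULL, p_thread_body, (void *)0L);    /* P thread 230 */
    pthread_create(&p1, NULL, p_thread_body, (void *)1L);    /* P thread 231 */
    pthread_join(p0, NULL);
    pthread_join(p1, NULL);
    return 0;
}
```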
Fig. 2C schematically shows the thread conflicts that may arise when the program 200 shown in Figs. 2A and 2B is executed.
Fig. 2C shows four hardware resources 291, 292, 293, and 294, for example four CPU cores.
Frames 201, 202, and 203 represent three groups of threads. Frame 201 represents the main thread 210 and its sub-threads 230 and 231, which are mutually visible; frame 202 represents parent thread 230 and its sub-threads 250 and 251, which are mutually visible; frame 203 represents parent thread 231 and its sub-threads 252 and 253, which are mutually visible.
When program 200 is executed, the operating system first allocates hardware resource 291 to the program's main thread 210.
The P-thread run-time library knows that there are four hardware resources 291, 292, 293, and 294 in the system, so it allocates hardware resources 292 and 293 to pthread_0 230 and pthread_1 231.
The parallel working region generated by P thread 230 comprises two omp threads, 250 and 251.
The omp-thread run-time library also knows that there are four hardware resources 291, 292, 293, and 294 in the system. When allocating hardware resources for omp threads 250 and 251, it regards thread 230 as the main thread of the current program and, according to its thread management strategy, assigns omp thread 250 to the same hardware resource 292 as thread 230.
Because pthread threads and OpenMP threads are mutually invisible, the omp-thread run-time library does not know that hardware resource 291 has already been allocated to thread 210, so it assigns thread 251 to hardware resource 291 in sequence. Thread 251 thus conflicts with the program's real main thread 210 over that hardware resource.
The other parallel working region, generated by P thread 231, comprises two omp threads, 252 and 253.
Similarly, when the omp-thread run-time library allocates hardware resources for omp threads 252 and 253, it regards thread 231 as the main thread of the current program and, according to its thread management strategy, assigns omp thread 252 to the same hardware resource 293 as thread 231. Because the two parallel regions 271 and 272 are generated by two different P threads 230 and 231, neither knows how the other's threads have been allocated to hardware resources; the omp-thread run-time library knows only that hardware resource 293 has been allocated, not that hardware resources 291 and 292 have been, so it assigns thread 253 to hardware resource 291 in sequence. Thread 253 thus also conflicts with the main thread 210 over that hardware resource.
The inventors found through experiment that, with the thread management described above, there are also cases in which thread 253 never gets to execute while hardware resource 294 goes unused.
To address this, the present invention proposes a method and apparatus for thread management of a program with a mixed thread model. The general idea of the invention is to rearrange the work items of the program's different kinds of threads into virtual threads and to share the hardware resources through the execution of the virtual threads' tasks on the exclusively held hardware resources, so that work items from different types of threads all get a chance to execute.
Various embodiments of the present invention are described in detail below with reference to Figs. 3-5. Referring first to Fig. 3, which shows the flow chart of a method according to one embodiment of the invention.
In short, Fig. 3 shows a method of thread management for a program with a mixed thread model, wherein the threads of the program share multiple hardware resources in a non-preemptive manner. The method includes the following steps: establishing multiple virtual threads (VTs) respectively associated with the multiple hardware resources; intercepting a work item from a thread of the program; analyzing the relationship between the thread to which the work item belongs and other threads; and, according to the analysis result, assigning the work item to one of the multiple virtual threads.
The operation of each step is now described in detail with reference to the drawings. For convenience of description, the program 200 shown in Fig. 2A is used below as the example program in the embodiments of the present invention. It should be noted that although program 200 is written with the mixed POSIX thread model and OpenMP model, those skilled in the art will appreciate that this is only an example, and the present invention is by no means limited to programs written with the mixed POSIX thread and OpenMP models.
As shown in Fig. 2B, the threads produced when program 200 runs comprise the main thread 210, P threads 230 and 231, and omp threads 250, 251, 252, and 253; as shown in Fig. 2C, the available hardware resources are hardware resources 291, 292, 293, and 294. A hardware resource, also called a hardware thread, is a component needed to execute instruction code, such as a CPU core or memory.
Turning to Fig. 3, the method of the embodiment of the present invention starts at step 390.
As initialization, step 390 establishes multiple virtual threads (VTs) respectively associated with the multiple hardware resources.
The multiple virtual threads established in step 390 are shown in Fig. 5, which depicts four virtual threads 581, 582, 583, and 584 associated with hardware resources 291, 292, 293, and 294 respectively. Virtual threads 581, 582, 583, and 584 have the functions of ordinary threads but are not produced by the execution of program 200 itself.
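Purely as an illustration of what step 390 could look like on Linux (the patent does not prescribe an implementation), the sketch below pins one worker per hardware resource with pthread_setaffinity_np() and gives it a task queue corresponding to Q581-Q584 in Fig. 5; the names vthread_t, task_t, and the queue helper are assumptions.

```c
#define _GNU_SOURCE
#include <pthread.h>
#include <sched.h>
#include <stdlib.h>

typedef struct task { void (*fn)(void *); void *arg; struct task *next; } task_t;

typedef struct {
    pthread_t       worker;      /* OS thread backing the virtual thread */
    int             cpu;         /* associated hardware resource         */
    task_t         *queue_head;  /* task queue (Q581..Q584 in Fig. 5)    */
    pthread_mutex_t lock;
    pthread_cond_t  ready;
} vthread_t;

/* Blocks until the virtual thread's queue holds a task, then dequeues it. */
static task_t *task_queue_pop(vthread_t *vt) {
    pthread_mutex_lock(&vt->lock);
    while (vt->queue_head == NULL)
        pthread_cond_wait(&vt->ready, &vt->lock);
    task_t *t = vt->queue_head;
    vt->queue_head = t->next;
    pthread_mutex_unlock(&vt->lock);
    return t;
}

static void *vthread_main(void *arg) {
    vthread_t *vt = (vthread_t *)arg;

    /* Bind the virtual thread to its hardware resource. */
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(vt->cpu, &set);
    pthread_setaffinity_np(pthread_self(), sizeof(set), &set);

    for (;;) {                           /* execute assigned work items in turn */
        task_t *t = task_queue_pop(vt);
        t->fn(t->arg);
        free(t);
    }
    return NULL;
}

/* Step 390: one virtual thread per hardware resource. */
void create_virtual_threads(vthread_t *vts, int n_hw) {
    for (int i = 0; i < n_hw; i++) {
        vts[i].cpu = i;
        vts[i].queue_head = NULL;
        pthread_mutex_init(&vts[i].lock, NULL);
        pthread_cond_init(&vts[i].ready, NULL);
        pthread_create(&vts[i].worker, NULL, vthread_main, &vts[i]);
    }
}
```

In this sketch, the distribution performed later in step 393 would amount to pushing a task onto the chosen virtual thread's queue and signaling its condition variable.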
In step 391, a work item from a thread of the program is intercepted.
Those skilled in the art know that a work item is a piece of work that a thread is to complete; in the target program, a work item is a segment of binary code whose start address is indicated by a pointer. A thread may have multiple work items; when the thread executes, only one current work item is performed at a time, and after the current work item finishes, another work item is executed.
In different threading models, work items may go by different names; for example, in the OpenMP threading model a thread's work item is also referred to as "work" (work item).
Those skilled in the art know that work items of the various kinds of threads are all submitted through corresponding API interfaces to a run-time library for execution. The work items of P threads and of omp threads are executed by the P-thread run-time library and the omp-thread run-time library respectively. Therefore, according to one embodiment of the invention, the work items that the program's threads submit to the run-time libraries can be intercepted at the API interface.
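One way such interception could be realized on a GNU/Linux toolchain is LD_PRELOAD interposition. The sketch below wraps pthread_create() and libgomp's GOMP_parallel() entry point (the function GCC emits for #pragma omp parallel); these particular entry points and the capture_work_item() hook are assumptions made for illustration, since the patent only states that work items are captured where threads submit them to the run-time libraries.

```c
#define _GNU_SOURCE
#include <dlfcn.h>
#include <pthread.h>
#include <stdio.h>

typedef int  (*pthread_create_fn)(pthread_t *, const pthread_attr_t *,
                                  void *(*)(void *), void *);
typedef void (*gomp_parallel_fn)(void (*)(void *), void *, unsigned, unsigned);

/* Hypothetical hand-off to the thread-relationship analysis and
 * virtual-thread distribution of steps 392-393; here it only records
 * that a work item was seen. */
static void capture_work_item(void *fn, void *arg, const char *kind) {
    fprintf(stderr, "intercepted %s work item at %p (arg %p)\n", kind, fn, arg);
}

/* Interpose pthread_create: the start routine is the P thread's work item. */
int pthread_create(pthread_t *tid, const pthread_attr_t *attr,
                   void *(*start)(void *), void *arg) {
    static pthread_create_fn real_create;
    if (!real_create)
        real_create = (pthread_create_fn)dlsym(RTLD_NEXT, "pthread_create");
    capture_work_item((void *)start, arg, "pthread");           /* step 391 */
    return real_create(tid, attr, start, arg);
}

/* Interpose GOMP_parallel: the outlined parallel region is the omp work item. */
void GOMP_parallel(void (*fn)(void *), void *data,
                   unsigned num_threads, unsigned flags) {
    static gomp_parallel_fn real_parallel;
    if (!real_parallel)
        real_parallel = (gomp_parallel_fn)dlsym(RTLD_NEXT, "GOMP_parallel");
    capture_work_item((void *)fn, data, "omp");                  /* step 391 */
    real_parallel(fn, data, num_threads, flags);
}
```

Built as a shared object and loaded with LD_PRELOAD, such a wrapper would see every P-thread and omp work item before the stock run-time libraries do.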
In Fig. 2B, the work items of the threads in program 200 are illustrated by boxed labels. For example, the boxed label w101 represents the work item of the main thread 210. Similarly, labels w201 and w202 represent the work items of P threads 230 and 231 respectively, and labels w301, w302, w303, and w304 represent the work items of omp threads 250, 251, 252, and 253 respectively.
The result of executing step 391 is, for example, that work item w101 from the main thread 210 has been intercepted.
In step 392, the relationship between the thread to which the work item belongs and other threads is analyzed.
Relationships between threads include the parent-child relationship and the sibling relationship: the thread that creates a sub-thread is the parent of the created thread, and multiple sub-threads created by the same thread are siblings of one another.
Relationships between threads also include the homogeneous relationship and the heterogeneous relationship; for example, a P thread and an omp thread are heterogeneous.
For example, according to one embodiment of the invention, for work item w101 it can be determined from the program semantics that the thread 210 to which it belongs is the main thread.
If step 391 intercepts work item w201, it can be learned from the program semantics that the thread 230 to which w201 belongs was created by the pthread_create() function in the main program; therefore, thread 230 is a sub-thread of the main thread 210. Likewise, if step 391 intercepts work item w202, the thread 231 to which it belongs can also be determined to be a sub-thread of the main thread 210.
According to one embodiment of the invention, the relationship between the thread to which the work item belongs and other threads can be analyzed from the work item's contextual information. The contextual information of a work item comprises:
- the identifier and kind of the parent thread of the thread to which the work item belongs;
- the identifiers of the sibling threads of the thread to which the work item belongs;
- the start address of the work item;
- the state of the work item.
Therefore, from the identifier and kind of the parent thread and the identifiers of the sibling threads of the thread to which the work item belongs, the relationship between that thread and other threads can be analyzed.
In step 393, the work item is assigned, according to the analysis result, to one of the multiple virtual threads. A work item assigned to a virtual thread becomes one of that virtual thread's tasks.
The various ways of assigning a work item to a virtual thread according to the analysis result are described below with reference to Fig. 2B and Fig. 5.
For example, step 391 intercepted work item w101 from the main thread 210, and step 392 determined that the thread 210 to which w101 belongs is the main thread.
The operating system usually assigns the main thread to the first available hardware resource. The present invention follows the same principle: in step 393, work item w101 is assigned to the virtual thread 581 associated with the first available hardware resource 291, as one task of virtual thread 581. Specifically, as shown in Fig. 5, work item w101 is distributed to virtual thread 581 by placing it into the task queue Q581 of virtual thread 581. Because virtual thread 581 is associated with hardware resource 291, distributing work item w101 to virtual thread 581 means that the execution of w101 can occupy hardware resource 291.
As another example, step 391 intercepted work item w201 from P thread 230, and step 392 determined that the thread 230 to which w201 belongs is a sub-thread of the main thread 210. In step 393, work item w201 is assigned to virtual thread 582 as one of its tasks; specifically, w201 is placed in the task queue Q582 of virtual thread 582. Because virtual thread 582 is associated with hardware resource 292, the execution of work item w201 can occupy hardware resource 292.
As a further example, step 391 intercepted work item w202 from P thread 231, and step 392 determined that the thread 231 to which w202 belongs is a sub-thread of the main thread 210 and a sibling of thread 230. Because the work item w101 of thread 210 and the work item w201 of thread 230 have already been assigned to virtual threads 581 and 582 respectively, in step 393 work item w202 is assigned to virtual thread 583, i.e., w202 is placed in the task queue Q583 of virtual thread 583 as a task. The execution of work item w202 can thus occupy hardware resource 293.
The above assignment of virtual threads to work items w101, w201, and w202 follows an allocation rule for P threads, namely: the work item of a parent thread and the work item of its sub-thread are not distributed to the same virtual thread.
When work item w301 of omp-type thread 250 arrives, it is intercepted in step 391. Step 392 determines that the thread 250 to which w301 belongs is an omp thread and a sub-thread of P thread 230; in step 393, work item w301 is assigned to virtual thread 582, i.e., w301 is placed in task queue Q582 as a task.
The above assignment of a virtual thread to work item w301 of omp-type thread 250 follows an allocation rule for omp threads, namely: the work item of a parent thread and the work item of one of its sub-threads are distributed to the same virtual thread.
According to one embodiment of the invention, when assigning a work item to one of the multiple virtual threads, if the parent thread and the sub-thread are of different types, the work item is assigned according to the allocation rule of the type of the sub-thread to which it belongs. For example, the thread 250 to which work item w301 belongs is an omp-type thread while thread 230 is a P thread, so their types differ; therefore, when assigning a virtual thread to work item w301, the allocation rule of the type of the thread to which w301 belongs is followed, i.e., the omp-thread rule: the work item of a parent thread and the work item of one of its sub-threads are distributed to the same virtual thread.
Conversely, if the parent thread is an omp thread and the sub-thread is a P thread, the P-thread allocation rule above is followed, namely: the work item of a parent thread and the work item of its sub-thread are not distributed to the same virtual thread.
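A compact sketch of how this rule selection might look in code; assign_vthread(), pick_other_vthread(), and the surrounding types are hypothetical helpers, not part of the patent.

```c
#include <stdbool.h>

typedef enum { KIND_PTHREAD, KIND_OMP } kind_t;
typedef struct { kind_t kind; int vthread; } thread_info_t;

/* Simplest possible "some other virtual thread" choice. */
static int pick_other_vthread(int avoid, int n_vthreads) {
    return (avoid + 1) % n_vthreads;
}

/* Returns the index of the virtual thread the child's work item goes to.
 * When parent and child are of different kinds, the child's kind decides
 * which allocation rule applies. */
int assign_vthread(const thread_info_t *parent, kind_t child_kind,
                   bool child_handles_sibling_sync, int n_vthreads) {
    if (child_kind == KIND_PTHREAD)
        /* P-thread rule: never share a virtual thread with the parent. */
        return pick_other_vthread(parent->vthread, n_vthreads);

    /* omp rule: the sub-thread that handles synchronization among its
     * siblings shares the parent's virtual thread; its siblings do not. */
    return child_handles_sibling_sync
               ? parent->vthread
               : pick_other_vthread(parent->vthread, n_vthreads);
}
```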
Returning to Fig. 2B and Fig. 5, the assignment of virtual threads to the work items from the remaining threads is described next.
When work item w302 of omp-type thread 251 arrives, it is intercepted in step 391. Step 392 determines that the thread 251 to which w302 belongs is an omp thread, a sibling of thread 250, and, like it, a sub-thread of P thread 230.
As described above, the allocation rule for omp threads includes: the work item of a parent thread and the work item of one of its sub-threads are distributed to the same virtual thread.
According to one embodiment of the invention, the allocation rule for omp threads further includes: the work item of the sub-thread responsible for synchronization among the sibling threads is distributed to the same virtual thread as the work item of the parent thread. Those skilled in the art know that whether the thread to which a work item belongs is the thread responsible for synchronization can be determined from the barrier function in the work item.
In step 393, according to the omp-thread allocation rule, it is judged whether thread 251 is the thread responsible for synchronization between the sibling threads (250, 251). If so, work item w302 is assigned to virtual thread 582.
According to one embodiment of the invention, the allocation rule for omp threads further includes: the work item of an omp thread is not distributed to the same virtual thread as the work item of the omp-type sub-thread responsible for synchronization among the sibling threads.
Therefore, if work items w301 and w302 are intercepted at the same time and the thread 251 to which w302 belongs is the thread responsible for synchronization, then w302 is assigned to virtual thread 582 and w301 is not; if the thread 250 to which w301 belongs is the thread responsible for synchronization, then w301 is assigned to virtual thread 582 and w302 is not.
Here, it is assumed that thread 251 is not the thread responsible for synchronization and that work item w301 has already been assigned to virtual thread 582; therefore, work item w302 is not distributed to virtual thread 582.
In accordance with a balancing principle, work item w302 is assigned to virtual thread 584, i.e., w302 is placed in task queue Q584 as a task.
Finally, if work items w303 and w304 are intercepted in step 391: because w303 and w304 are analogous to w301 and w302 described above, belonging respectively to the two omp threads 252 and 253, both of which are sub-threads of P thread 231, they are assigned in the same manner as w301 and w302. As shown in the figure, work items w303 and w304 are assigned to virtual threads 583 and 584 respectively.
It should be pointed out that in the task queue Q584 of virtual thread 584, no ordering needs to be specified between the tasks corresponding to work items w304 and w302, because there is no direct dependence between them.
According to one embodiment of the invention, the precedence relationship between a work item and the other work items in the virtual thread to which it has been assigned can be determined from the relationship between the threads to which they belong. For example, the thread 230 to which work item w201 belongs is the parent of the thread 250 to which work item w301 belongs, so in task queue Q582 the task corresponding to w201 can be given higher priority than the task corresponding to w301.
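As an illustrative sketch of this priority rule (the struct and comparator below are assumptions, not from the patent): within one task queue, a work item whose owning thread is the parent of another work item's owning thread runs first.

```c
typedef struct {
    unsigned long owner_tid;        /* thread the work item belongs to */
    unsigned long owner_parent_tid; /* parent of that thread           */
} queued_item_t;

/* Returns <0 if a should run before b, >0 if b should run before a,
 * and 0 if no ordering is required. */
int work_item_priority_cmp(const queued_item_t *a, const queued_item_t *b) {
    if (b->owner_parent_tid == a->owner_tid)
        return -1;   /* a's owner is b's owner's parent: w201 before w301 in Q582 */
    if (a->owner_parent_tid == b->owner_tid)
        return 1;
    return 0;        /* no direct dependence, e.g. w302 vs w304 in Q584 */
}
```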
The dynamic process of establishing virtual threads in step 390 and of assigning virtual threads to the work items of the program's threads in steps 391 to 393 has been described above. As shown by the dashed box 394 in Fig. 3, once a work item has been assigned to a virtual thread as a task, it can access the hardware resource associated with that virtual thread, and the deadlock situation shown in Fig. 2C does not occur.
Various embodiments of the method of thread management for a program with a mixed thread model according to the present invention have been described above. Based on the same inventive concept, the present invention also provides an apparatus for thread management of a program with a mixed thread model.
Fig. 4 is a block diagram of an apparatus for thread management of a program with a mixed thread model according to one embodiment of the invention.
The apparatus 400 shown in Fig. 4 can be used for thread management of a program with a mixed thread model, wherein the threads of the program share multiple hardware resources in a non-preemptive manner. The apparatus comprises a virtual thread creation module 490, a work item interception module 491, a thread relationship analysis module 492, and a virtual thread distribution module 493.
The virtual thread creation module 490 is configured to establish multiple virtual threads (VTs) respectively associated with the multiple hardware resources.
The work item interception module 491 is configured to intercept a work item from a thread of the program.
The thread relationship analysis module 492 is configured to analyze the relationship between the thread to which the work item belongs and other threads.
The virtual thread distribution module 493 is configured to assign the work item, according to the analysis result, to one of the multiple virtual threads.
According to one embodiment of the invention, the relationship between the thread to which the work item belongs and other threads comprises any one or more of the following: the parent-child relationship; the sibling relationship; the homogeneous relationship; the heterogeneous relationship.
According to one embodiment of the invention, the thread relationship analysis module 492 is further configured to analyze the relationship between the thread to which the work item belongs and other threads according to the program semantics.
According to one embodiment of the invention, the thread relationship analysis module 492 is further configured to analyze the relationship between the thread to which the work item belongs and other threads according to the contextual information of the work item.
According to one embodiment of the invention, the threads of the program comprise P threads and omp threads, and the virtual thread distribution module 493 is further configured, when the parent thread and the sub-thread are of different types, to assign the work item to one of the multiple virtual threads according to the allocation rule of the type of the sub-thread.
According to one embodiment of the invention, the allocation rule for P threads comprises: the work item of a parent thread and the work item of its sub-thread are not distributed to the same virtual thread; and the allocation rule for omp threads comprises: the work item of a parent thread and the work item of one of its sub-threads are assigned to the same virtual thread.
According to one embodiment of the invention, the allocation rule for omp threads further comprises: the work item of the omp-type sub-thread responsible for synchronization among the sibling threads is distributed to the same virtual thread as the work item of the parent thread.
According to one embodiment of the invention, the allocation rule for omp threads further comprises: the work item of an omp thread is not distributed to the same virtual thread as the work item of the omp-type sub-thread responsible for synchronization among the sibling threads.
According to one embodiment of the invention, the work item interception module 491 is configured to intercept, at an API interface, the work items that the program's threads submit to the run-time library.
According to one embodiment of the invention, the virtual thread distribution module is further configured to determine, from the relationship between the thread to which the work item belongs and other threads, the precedence relationship between the work item and the other work items in the virtual thread to which it has been assigned.
The apparatus for thread management of a program with a mixed thread model according to embodiments of the present invention has been described above. Since the corresponding method has already been described in detail according to various embodiments of the invention, the description of the apparatus omits content that would obviously repeat, or can readily be derived from, the description of the method.
It is to be noted that the above description is only an example and not a limitation of the present invention. In other embodiments of the invention, the method may have more, fewer, or different steps; the numbering of the steps is used to make the description clearer and simpler, not as a strict constraint on the ordering of the steps, and the steps and the order between them may differ from those described.
Therefore, in some embodiments of the invention, one or more of the above optional steps may be absent, and the specific manner of executing each step may differ from that described. All such variations fall within the spirit and scope of the present invention.
The present invention may take the form of a hardware embodiment, a software embodiment, or an embodiment containing both hardware and software components. In a preferred embodiment, the present invention is implemented as software, which includes but is not limited to firmware, resident software, microcode, and so on.
Furthermore, the present invention may take the form of a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by, or in connection with, a computer or any instruction execution system. For the purposes of this description, a computer-usable or computer-readable medium may be any tangible apparatus that can contain, store, communicate, propagate, or transport the program for use by, or in connection with, an instruction execution system, apparatus, or device.
The medium may be an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system (or apparatus or device), or a propagation medium. Examples of a computer-readable medium include a semiconductor or solid-state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk, and an optical disk. Current examples of optical disks include compact disk read-only memory (CD-ROM), compact disk read/write (CD-R/W), and DVD.
A data processing system suitable for storing and/or executing program code will include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories that provide temporary storage of at least some program code in order to reduce the number of times code must be retrieved from bulk storage during execution.
Input/output (I/O) devices (including but not limited to keyboards, displays, pointing devices, etc.) can be coupled to the system either directly or through intervening I/O controllers.
Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modems, and Ethernet cards are just a few of the currently available types of network adapters.
It should be appreciated from the foregoing description that modifications and variations can be made to the various embodiments of the present invention without departing from its true spirit. The descriptions in this specification are intended to be illustrative only and are not to be construed in a limiting sense. The scope of the present invention is limited only by the appended claims.

Claims (18)

1. A method of thread management for a program with a mixed thread model, wherein the threads of the program share multiple hardware resources in a non-preemptive manner, the method comprising:
establishing multiple virtual threads respectively associated with the multiple hardware resources;
intercepting a work item from a thread of the program;
analyzing the relationship between the thread to which the work item belongs and other threads, wherein the relationship between the thread to which the work item belongs and other threads comprises any one or more of the following: the parent-child relationship, the sibling relationship, the homogeneous relationship, the heterogeneous relationship; and
assigning the work item, according to the analysis result, to one of the multiple virtual threads.
2. The method according to claim 1, wherein analyzing the relationship between the thread to which the work item belongs and other threads comprises analyzing that relationship according to the program semantics.
3. The method according to claim 1, wherein analyzing the relationship between the thread to which the work item belongs and other threads comprises analyzing that relationship according to the contextual information of the work item.
4. according to the method for Claims 2 or 3, wherein, the thread of described program comprises POSIX thread and OpenMP thread, wherein, it is described that according to analysis result, by work item, the virtual thread be assigned in multiple virtual thread comprises, if father's thread is different with the thread type of sub-thread, then according to the allocation rule of thread type belonging to sub-thread, work item is assigned to a virtual thread in multiple virtual thread.
5. according to the method for claim 4, wherein,
The allocation rule of POSIX thread comprises: the work item of father's thread, is not distributed in same virtual thread with the work item of sub-thread;
The allocation rule of OpenMP thread comprises: the work item of one of them sub-thread of the work item of father's thread and father's thread is assigned to same virtual thread.
6. according to the method for claim 5, wherein, the allocation rule of OpenMP thread comprises further: by the work item of the synchronous OpenMP type thread between responsible fraternal thread, be distributed in the sub-thread of same virtual thread with the work item of father's thread.
7. according to the method for claim 6, wherein, the allocation rule of OpenMP thread comprises further: the work item of an OpenMP thread, is not distributed in same virtual thread with the work item of the synchronous OpenMP type thread be responsible between fraternal thread.
8. The method according to claim 1, wherein intercepting a work item from a thread of the program comprises intercepting, at an application programming interface (API), the work item that the thread of the program submits to a run-time library.
9. according to the method for claim 1, comprise further: the relation of thread and other thread belonging to work item, determine the precedence relationship between other work item in work item and assigned virtual thread.
10. An apparatus for thread management of a program with a mixed thread model, wherein the threads of the program share multiple hardware resources in a non-preemptive manner, the apparatus comprising:
a virtual thread creation module configured to establish multiple virtual threads respectively associated with the multiple hardware resources;
a work item interception module configured to intercept a work item from a thread of the program;
a thread relationship analysis module configured to analyze the relationship between the thread to which the work item belongs and other threads, wherein the relationship between the thread to which the work item belongs and other threads comprises any one or more of the following: the parent-child relationship, the sibling relationship, the homogeneous relationship, the heterogeneous relationship; and
a virtual thread distribution module configured to assign the work item, according to the analysis result, to one of the multiple virtual threads.
11. The apparatus according to claim 10, wherein the thread relationship analysis module is further configured to analyze the relationship between the thread to which the work item belongs and other threads according to the program semantics.
12. The apparatus according to claim 10, wherein the thread relationship analysis module is further configured to analyze the relationship between the thread to which the work item belongs and other threads according to the contextual information of the work item.
13. The apparatus according to claim 11 or 12, wherein the types of the threads of the program comprise POSIX threads and OpenMP threads, and wherein the virtual thread distribution module is further configured, when the parent thread and the sub-thread are of different thread types, to assign the work item to one of the multiple virtual threads according to the allocation rule of the thread type of the sub-thread.
14. The apparatus according to claim 13, wherein the allocation rule for POSIX threads comprises: the work item of a parent thread and the work item of its sub-thread are not distributed to the same virtual thread; and the allocation rule for OpenMP threads comprises: the work item of a parent thread and the work item of one of its sub-threads are assigned to the same virtual thread.
15. The apparatus according to claim 14, wherein the allocation rule for OpenMP threads further comprises: the work item of the OpenMP sub-thread responsible for synchronization among the sibling threads is distributed to the same virtual thread as the work item of the parent thread.
16. The apparatus according to claim 15, wherein the allocation rule for OpenMP threads further comprises: the work item of an OpenMP thread is not distributed to the same virtual thread as the work item of the OpenMP sub-thread responsible for synchronization among the sibling threads.
17. The apparatus according to claim 10, wherein intercepting a work item from a thread of the program comprises intercepting, at an application programming interface (API), the work item that the thread of the program submits to a run-time library.
18. The apparatus according to claim 10, wherein the virtual thread distribution module is further configured to determine, from the relationship between the thread to which the work item belongs and other threads, the precedence relationship between the work item and the other work items in the virtual thread to which it has been assigned.
CN201110391185.XA 2011-11-30 2011-11-30 Method and apparatus for thread management of a program with a mixed thread model Expired - Fee Related CN103136035B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110391185.XA CN103136035B (en) 2011-11-30 2011-11-30 Method and apparatus for thread management of a program with a mixed thread model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110391185.XA CN103136035B (en) 2011-11-30 2011-11-30 Method and apparatus for thread management of a program with a mixed thread model

Publications (2)

Publication Number Publication Date
CN103136035A CN103136035A (en) 2013-06-05
CN103136035B true CN103136035B (en) 2015-11-25

Family

ID=48495899

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110391185.XA Expired - Fee Related CN103136035B (en) 2011-11-30 2011-11-30 Method and apparatus for thread management of a program with a mixed thread model

Country Status (1)

Country Link
CN (1) CN103136035B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103713938A (en) * 2013-12-17 2014-04-09 江苏名通信息科技有限公司 Multi-graphics-processing-unit (GPU) cooperative computing method based on Open MP under virtual environment
CN103617091B (en) * 2013-12-18 2017-06-16 深圳市道通科技股份有限公司 The implementation method and device of hardware resource dynamically distributes
US9588811B2 (en) * 2015-01-06 2017-03-07 Mediatek Inc. Method and apparatus for analysis of thread latency
CN106933534B (en) * 2015-12-31 2020-07-28 阿里巴巴集团控股有限公司 Data synchronization method and device

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101097514A (en) * 2006-06-27 2008-01-02 国际商业机器公司 Managing execution of mixed workloads in a simultaneous multi-threaded (smt) enabled system

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8332852B2 (en) * 2008-07-21 2012-12-11 International Business Machines Corporation Thread-to-processor assignment based on affinity identifiers
JP5173714B2 (en) * 2008-09-30 2013-04-03 ルネサスエレクトロニクス株式会社 Multi-thread processor and interrupt processing method thereof
US9021483B2 (en) * 2009-04-27 2015-04-28 International Business Machines Corporation Making hardware objects and operations thread-safe

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101097514A (en) * 2006-06-27 2008-01-02 国际商业机器公司 Managing execution of mixed workloads in a simultaneous multi-threaded (smt) enabled system

Also Published As

Publication number Publication date
CN103136035A (en) 2013-06-05

Similar Documents

Publication Publication Date Title
US10861214B2 (en) Graphics processor with non-blocking concurrent architecture
Krömer et al. Many-threaded implementation of differential evolution for the CUDA platform
US8601486B2 (en) Deterministic parallelization through atomic task computation
Tillenius Superglue: A shared memory framework using data versioning for dependency-aware task-based parallelization
US11163677B2 (en) Dynamically allocated thread-local storage
CN103136035B (en) For mixing the method and apparatus of the thread management of the program of thread mode
Lopez et al. An OpenMP free agent threads implementation
Khasanov et al. Implicit data-parallelism in Kahn process networks: Bridging the MacQueen Gap
Zheng et al. HiWayLib: A software framework for enabling high performance communications for heterogeneous pipeline computations
Berezovskyi et al. Faster makespan estimation for GPU threads on a single streaming multiprocessor
Mivule et al. A review of cuda, mapreduce, and pthreads parallel computing models
Vo et al. HyperFlow: A Heterogeneous Dataflow Architecture.
Dubrulle et al. A low-overhead dedicated execution support for stream applications on shared-memory CMP
Chandrashekhar et al. Performance analysis of parallel programming paradigms on CPU-GPU clusters
Li et al. Concurrent query processing in a GPU-based database system
Skrzypczak et al. Efficient parallel implementation of crowd simulation using a hybrid CPU+ GPU high performance computing system
Che et al. Work stealing in a shared virtual-memory heterogeneous environment: A case study with betweenness centrality
Dalmia et al. Improving the Scalability of GPU Synchronization Primitives
Evripidou et al. Data-flow vs control-flow for extreme level computing
Quang-Hung et al. Implementing genetic algorithm accelerated by Intel Xeon Phi
Camacho-Mora Ray Tracing acceleration through a custom scheduling policy to take advantage of the cache affinity in a Linux-based Special-Purpose Operating System
Mayer Leveraging Compiled Languages to Optimize Python Frameworks
US20150293780A1 (en) Method and System for Reconfigurable Virtual Single Processor Programming Model
Knorr et al. Automatic Discovery of Collective Communication Patterns in Parallelized Task Graphs
Mahapatra Designing Efficient Barriers and Semaphores for Graphics Processing Units

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20151125

Termination date: 20201130

CF01 Termination of patent right due to non-payment of annual fee