US20070043916A1 - System and method for light weight task switching when a shared memory condition is signaled
- Publication number
- US20070043916A1 (Application US 11/204,424)
- Authority
- US
- United States
- Prior art keywords
- thread
- lock
- data
- line reservation
- handler
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F12/00—Accessing, addressing or allocating within memory systems or architectures
- G06F12/02—Addressing or allocation; Relocation
- G06F12/08—Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
- G06F12/0802—Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
- G06F12/0806—Multiuser, multiprocessor or multiprocessing cache systems
- G06F12/0842—Multiuser, multiprocessor or multiprocessing cache systems for multiprocessing or multitasking
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/52—Program synchronisation; Mutual exclusion, e.g. by means of semaphores
- G06F9/526—Mutual exclusion algorithms
Definitions
- When the handler determines that the requested cache line is available, the handler performs a “conditional store” using the first thread's task identifier in an attempt to secure a mutex lock for the first thread's requested data that is located in the cache.
- the conditional store operation has an associated status register, which indicates whether or not the conditional store operation succeeds. If the conditional store is successful, the SPU switches from the second thread back to the first thread and processes the first thread's original external data request.
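- The reservation-then-conditional-store sequence described above behaves like a load-reserve/store-conditional pair: the store succeeds only if no other write has touched the line since the reservation was taken. A minimal software analogue follows (a Python sketch; the class name and the version counter are illustrative stand-ins for the hardware mechanism, not part of the patent):

```python
import threading

class ReservationCell:
    """Toy analogue of a lock line with load-reserve / store-conditional."""

    def __init__(self, value=0):
        self._guard = threading.Lock()
        self._value = value
        self._version = 0            # bumped on every successful store

    def get_reservation(self):
        """Read the line and take a reservation: returns (value, version)."""
        with self._guard:
            return self._value, self._version

    def conditional_store(self, reserved_version, new_value):
        """Store only if no other store intervened; else the reservation is lost."""
        with self._guard:
            if self._version != reserved_version:
                return False         # reservation lost
            self._value = new_value
            self._version += 1
            return True

cell = ReservationCell(0)
_, stale = cell.get_reservation()    # the first thread takes a reservation
_, fresh = cell.get_reservation()    # another writer reserves afterwards
assert cell.conditional_store(fresh, 42)     # the intervening store succeeds
assert not cell.conditional_store(stale, 7)  # the first reservation is now lost
assert cell.get_reservation()[0] == 42
```

The success/failure result plays the role of the status register mentioned above.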
- For condition wait primitives, when the SPU receives cache line data, the SPU identifies whether a particular condition is true by analyzing the cache line data. When the thread's requested condition is not true, the SPU switches from the first thread to the second thread, enables asynchronous interrupts, and invokes a handler to monitor incoming asynchronous interrupts.
- When the handler detects a lock line reservation lost event, the handler issues a get lock line reservation to L2 memory in order to receive updated cache line data.
- the handler analyzes the cache line data and determines whether the requested condition is true. If the condition is still not true, the handler waits for another asynchronous interrupt and the SPU continues to process the second thread. When the condition is true, the SPU switches from the second thread back to the first thread and processes the thread's request.
- FIG. 1 is a diagram of a processor switching tasks based upon acquiring a mutex lock;
- FIG. 2 is a diagram of a processor switching tasks based upon determining a condition variable is true;
- FIG. 3 is a flowchart showing steps taken in task switching between threads using a mutex lock primitive;
- FIG. 4 is a flowchart showing steps taken in invoking a handler to process asynchronous interrupts corresponding to mutex lock requests;
- FIG. 5 is a flowchart showing steps taken in task switching between threads using a condition wait primitive;
- FIG. 6 is a flowchart showing steps taken in receiving asynchronous interrupts corresponding to a condition wait primitive and switching tasks in response to determining that a condition is true;
- FIG. 7A is a diagram showing an example of a processor's mutex lock pseudo-code;
- FIG. 7B is a diagram showing an example of mutex lock handler pseudo-code that acquires a mutex lock based upon detecting a lock line reservation lost event;
- FIG. 8A is a diagram showing an example of a processor's condition wait pseudo-code;
- FIG. 8B is a diagram showing an example of condition wait pseudo-code that an event handler performs upon receiving a lock line reservation lost event; and
- FIG. 9 is a block diagram of an information handling system capable of implementing the present invention.
- FIG. 1 is a diagram of a processor switching tasks based upon acquiring a mutex lock.
- Synergistic processing complex (SPC) 100 includes synergistic processing unit (SPU) 110 , which processes thread A 120 (e.g., a first thread).
- SPU 110 is preferably a single instruction, multiple data (SIMD) processor, such as a digital signal processor, a microcontroller, a microprocessor, or a combination of these cores.
- thread A 120 requests external data that is located in cache 150 .
- thread A 120 may wish to update a linked data structure that is located in cache 150 .
- SPU 110 issues get lock line reservation 160 to L2 140 corresponding to thread A 120 's request. Get lock line reservation 160 instructs L2 140 to provide data from a particular cache line.
- L2 140 includes cache bus interface controller 145 and cache 150 .
- Cache bus interface controller 145 receives get lock line reservation 160 , and retrieves corresponding data from cache 150 .
- Cache bus interface controller 145 then sends cache line data 165 to SPU 110 .
- Cache line data 165 includes data corresponding to a particular cache line.
- SPU 110 analyzes cache line data 165 and determines, based upon the analysis, that the requested cache line is not available.
- cache line data 165 may include a different thread's task identifier that is currently accessing the same data as thread A 120 wishes to access.
- SPU 110 switches from thread A 120 to thread B 125 (e.g., a second thread). SPU 110 also enables asynchronous interrupts and invokes handler 115 (e.g., a software subroutine) to monitor incoming asynchronous interrupts (see FIGS. 7B, 8B , and corresponding text for further details regarding handler properties).
- When cache bus interface controller 145 determines that a reservation is lost for one of cache 150 's cache lines, cache bus interface controller 145 issues lock line reservation lost 170 to inform SPU 110 that a reservation has been lost corresponding to one of the cache lines.
- Handler 115 detects lock line reservation lost 170 , and sends get lock line reservation 175 to L2 140 in order to receive subsequent cache line data.
- Cache bus interface controller 145 receives get lock line reservation 175 and provides cache line data 180 to SPU 110 .
- Handler 115 analyzes cache line data 180 and determines whether the requested cache line is now available. If the requested cache line is still not available, handler 115 waits for another asynchronous interrupt and SPU 110 continues to process thread B 125 .
- When handler 115 determines that the requested cache line is available, handler 115 performs conditional store 190 using thread A 120 's task identifier, which attempts to secure a mutex lock for thread A 120 's requested data that is located in cache 150 .
- the conditional store operation has an associated status register, which indicates whether or not the conditional store operation succeeds. If the conditional store is successful, SPU 110 switches from thread B 125 back to thread A 120 and processes thread A 120 's original external data request (see FIGS. 3, 4 , and corresponding text for further details regarding mutex lock primitive and handler steps).
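- The FIG. 1 sequence can be compressed into a scripted walk-through (a Python sketch in which an integer lock word stands in for the cache line and a list stands in for asynchronous interrupts; all names are illustrative):

```python
FREE = 0
TASK_A, TASK_OTHER = 1, 2            # task identifiers (illustrative)

lock_word = TASK_OTHER               # cache line: currently held by another task
pending_events = []                  # stands in for asynchronous interrupts
log = []

def get_lock_line_reservation():
    return lock_word                 # cache line data returned to the SPU

def try_acquire(task_id):
    """Check the line and attempt the conditional store of the task id."""
    global lock_word
    if get_lock_line_reservation() != FREE:
        return False                 # requested cache line not available
    lock_word = task_id              # conditional store accepted
    return True

# Thread A's request fails, so the SPU switches to thread B.
if not try_acquire(TASK_A):
    log.append("switch to thread B")

# The other task releases the line; the cache bus interface controller
# raises a "lock line reservation lost" event.
lock_word = FREE
pending_events.append("lock line reservation lost")

# The handler drains events, re-reads the line, and switches back.
while pending_events:
    pending_events.pop()
    if try_acquire(TASK_A):
        log.append("switch back to thread A")

assert lock_word == TASK_A
assert log == ["switch to thread B", "switch back to thread A"]
```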
- FIG. 2 is a diagram of a processor switching tasks using a condition wait primitive.
- a condition wait primitive allows a thread to identify whether a condition has occurred by accessing cache line data, such as whether a video card has completed a vertical retrace.
- FIG. 2 is similar to FIG. 1 with the exception that handler 115 analyzes cache line data to determine whether a particular condition is true, such as a video card indicating that a vertical retrace is complete.
- SPC 100 , SPU 110 , handler 115 , thread A 120 , thread B 125 , L2 140 , cache bus interface controller 145 , and cache 150 are the same as that shown in FIG. 1 .
- SPU 110 invokes thread A 120 .
- thread A 120 requests external data that is located in cache 150 that identifies whether a particular condition is true.
- SPU 110 sends get lock line reservation 200 to L2 140 in order to receive cache line data 210 from cache bus interface controller 145 .
- SPU 110 analyzes cache line data 210 and determines, based upon the analysis, that the particular condition is not true. For example, SPU 110 may check a bit included in cache line data 210 that identifies that a video card has not completed a vertical retrace. Since thread A 120 's requested condition is not true, SPU 110 switches from thread A 120 to thread B 125 , enables asynchronous interrupts, and invokes handler 115 to monitor incoming asynchronous interrupts.
- When cache bus interface controller 145 determines that a reservation is lost for a cache line included in cache 150 , cache bus interface controller 145 issues lock line reservation lost 220 to inform SPU 110 .
- Handler 115 detects lock line reservation lost 220 , and sends get lock line reservation 275 to L2 140 in order to receive subsequent cache line data.
- Cache bus interface controller 145 receives get lock line reservation 275 and, in turn, provides cache line data 280 to SPU 110 .
- Handler 115 analyzes cache line data 280 and determines whether the requested condition is true. If the condition is still not true, handler 115 waits for another asynchronous interrupt and SPU 110 continues to process thread B 125 .
- When the condition is true, SPU 110 switches from thread B 125 to thread A 120 and processes thread A 120 's request (see FIGS. 5, 6 , and corresponding text for further details regarding condition wait primitive and handler steps).
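- The FIG. 2 variant differs only in what the handler tests against the cache line data. A scripted Python sketch (the flag, the event list, and all names are illustrative):

```python
condition_true = False               # e.g., vertical retrace not yet complete
pending_events = []                  # stands in for asynchronous interrupts
log = []

def get_lock_line_reservation():
    return condition_true            # cache line data encoding the condition

# Thread A checks the condition; it is not true, so the SPU switches
# to thread B and enables asynchronous interrupts.
if not get_lock_line_reservation():
    log.append("switch to thread B")

# The condition becomes true (the video card finishes the retrace) and a
# "lock line reservation lost" event is raised for the line.
condition_true = True
pending_events.append("lock line reservation lost")

# Handler: on each event, re-read the line and retest the condition.
while pending_events:
    pending_events.pop()
    if get_lock_line_reservation():
        log.append("switch back to thread A")

assert log == ["switch to thread B", "switch back to thread A"]
```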
- FIG. 3 is a flowchart showing steps taken in task switching between threads using a mutex lock primitive.
- a system may use a mutex lock primitive to guarantee mutual exclusion among processors operating on data within critical sections of code, such as updating a linked list data structure.
- Thread A 120 is the same as that shown in FIG. 1 .
- processing receives an external data request from thread A 120 .
- thread A 120 may request data corresponding to a linked data structure that is located in external memory.
- processing sends a get lock line reservation request to L2 140 , which instructs L2 140 to provide data corresponding to a particular cache line.
- processing receives the requested cache line data from L2 140 .
- L2 140 is the same as that shown in FIG. 1 .
- A determination is made as to whether the mutex lock is available (decision 330 ). If the lock is available, decision 330 branches to “Yes” branch 338 whereupon processing enters thread A 120 's task identifier and performs a conditional store at step 340 (see FIG. 7A and corresponding text for further details).
- a determination is made as to whether the conditional store is accepted in L2 140 by reading the corresponding memory location (decision 350 ). If the conditional store was not accepted, decision 350 branches to “No” branch 352 , which loops back to send another get lock line reservation request. This looping continues until L2 140 accepts the conditional store, at which point decision 350 branches to “Yes” branch 358 whereupon processing acquires a mutex lock (step 360 ).
- A determination is made as to whether to continue processing (decision 380 ). If processing should continue, decision 380 branches to “Yes” branch 382 which loops back to receive and process more external data requests. This looping continues until processing should terminate, at which point decision 380 branches to “No” branch 388 whereupon processing ends at 390 .
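- The FIG. 3 loop, with decisions 330 and 350 folded into one retry function, might be sketched as follows (Python; the callback names and the stub L2 below are illustrative):

```python
def acquire_mutex(read_line, conditional_store, task_id, free=0):
    """Re-reserve and re-attempt until the conditional store is accepted
    (decision 350 "Yes") or the lock is seen to be busy."""
    while True:
        if read_line() != free:
            return False             # lock busy: caller task-switches (FIG. 4)
        if conditional_store(task_id):
            return True              # conditional store accepted: lock held

# A stub L2 in which the first conditional store is rejected (the
# reservation is lost once) and the second is accepted.
line = {"value": 0}
attempts = {"n": 0}

def read_line():
    return line["value"]

def conditional_store(tid):
    attempts["n"] += 1
    if attempts["n"] == 1:
        return False                 # simulated lost reservation
    line["value"] = tid
    return True

assert acquire_mutex(read_line, conditional_store, task_id=7)
assert line["value"] == 7 and attempts["n"] == 2
```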
- FIG. 4 is a flowchart showing steps taken in invoking a handler to process asynchronous interrupts corresponding to mutex lock requests. Processing commences at 400 , whereupon processing puts thread A 120 to sleep and invokes thread B 125 (step 405 ). Thread A 120 previously requested external data that was not available and, as such, processing switches from thread A 120 to thread B 125 until the data is available (see FIG. 3 and corresponding text for further details). Thread A 120 and thread B 125 are the same as that shown in FIG. 1 .
- processing enables asynchronous interrupts in order for a handler to detect a lock line reservation lost event that corresponds to thread A 120 's external data request.
- Processing invokes the handler at step 415 , such as handler 115 shown in FIG. 1 .
- the handler waits for a lock line reservation lost event from L2 140 .
- the handler sends a get lock line reservation request to L2 140 and receives corresponding cache line data (step 425 ).
- L2 140 is the same as that shown in FIG. 1 .
- A determination is made as to whether the mutex lock is available (decision 430 ). For example, the cache line data may include a different thread's task identifier that is currently accessing the same data as the thread wishes to access, which makes the mutex lock unavailable. If the mutex lock is not available, decision 430 branches to “No” branch 432 whereupon processing loops back to wait for another lock line reservation lost event from L2 140 . This looping continues until the mutex lock is available, at which point decision 430 branches to “Yes” branch 438 whereupon the handler enters thread A 120 's task identifier and performs a conditional store on the corresponding memory location in L2 140 (step 440 , see FIG. 7B and corresponding text for further details).
- processing acquires a mutex lock and switches back to process thread A 120 's external data request. Processing returns at 470 .
- FIG. 5 is a flowchart showing steps taken in task switching between threads using a condition wait primitive.
- a processor may use a condition wait primitive to determine when a condition becomes true, such as a video card indicating that a vertical retrace is complete.
- Thread A 120 is the same as that shown in FIG. 1 .
- processing receives an external data request from thread A 120 .
- thread A 120 may request data corresponding to whether a video card has completed a vertical retrace.
- processing sends a get lock line reservation request to L2 140 , which instructs L2 140 to provide data corresponding to a particular cache line (step 530 ).
- processing receives the requested cache line data from L2 140 .
- L2 140 is the same as that shown in FIG. 1 .
- A determination is made as to whether the condition is true (decision 550 ). If the condition is not true, decision 550 branches to “No” branch 558 whereupon processing switches threads and monitors lock line reservation lost events (pre-defined process block 560 , see FIG. 6 and corresponding text for further details).
- A determination is made as to whether to continue task switching steps (decision 570 ). If task switching should continue, decision 570 branches to “Yes” branch 572 which loops back to receive and process more external data requests. This looping continues until processing should stop executing task switching steps, at which point decision 570 branches to “No” branch 578 whereupon processing ends at 580 .
- FIG. 6 is a flowchart showing steps taken in receiving asynchronous interrupts corresponding to a condition wait primitive and switching tasks in response to determining that a condition is true.
- Processing commences at 600 , whereupon processing puts thread A 120 to sleep and invokes thread B 125 (step 610 ).
- Thread A 120 previously requested external data that was not available and, as such, processing switches from thread A 120 to thread B 125 until the data is available (see FIG. 5 and corresponding text for further details).
- Thread A 120 and thread B 125 are the same as that shown in FIG. 1 .
- processing enables asynchronous interrupts in order for processing to detect a lock line reservation lost event that corresponds to thread A 120 's external data request.
- Processing invokes a lock line reservation handler at step 630 , such as handler 115 shown in FIG. 1 .
- the handler waits for a lock line reservation lost event from L2 140 .
- the handler issues a get lock line reservation request to L2 140 and receives corresponding cache line data (step 650 ).
- a determination is made as to whether the condition is true by reading the cache line data (decision 660 ). If the condition is not true, decision 660 branches to “No” branch 662 whereupon processing loops back to wait for another lock line reservation lost event from L2 140 . This looping continues until the condition is true, at which point decision 660 branches to “Yes” branch 668 whereupon the handler switches threads and performs thread A 120 's task (step 670 ). Processing returns at 680 .
- FIG. 7A is a diagram showing an example of a processor's mutex lock pseudo-code.
- Code 700 includes pseudo-code that tests whether a mutex lock is available and, if not, switches threads and enables asynchronous interrupts.
- Code 700 includes lines 710 through 740 .
- Line 710 performs a lock line reservation request in order to receive cache line data that signifies whether a mutex lock is available for a particular memory line.
- Lines 720 and 730 show that, if the received cache line data is “0,” indicating that a mutex lock is available, the code enters a task identifier and performs a conditional store.
- line 740 switches to another thread, invokes a handler, and enables asynchronous interrupts.
- the handler includes pseudo code such that, when it receives a lock line reservation lost event, the handler attempts to acquire the mutex lock (see FIG. 7B and corresponding text for further details regarding mutex lock handler pseudo code).
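- The FIG. 7A flow can be restated as executable pseudo-code (a Python sketch; the function names are illustrative, and the stubs below stand in for the SPU's lock line operations):

```python
AVAILABLE = 0

def mutex_lock(get_lock_line, conditional_store, task_id, task_switch):
    """Sketch of code 700: test the lock, conditionally store the task id,
    or switch threads and leave the rest to the handler of FIG. 7B."""
    if get_lock_line() == AVAILABLE:          # lines 710-720: lock free?
        if conditional_store(task_id):        # line 730: enter task id
            return True                       # lock acquired immediately
    task_switch()                             # line 740: switch thread,
    return False                              #   enable asynchronous interrupts

switched = []
# Lock free and conditional store accepted: acquired without switching.
assert mutex_lock(lambda: AVAILABLE, lambda tid: True, 3,
                  lambda: switched.append(True))
assert switched == []
# Lock held by another task: the processor switches threads instead.
assert not mutex_lock(lambda: 9, lambda tid: True, 3,
                      lambda: switched.append(True))
assert switched == [True]
```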
- FIG. 7B is a diagram showing an example of mutex lock handler pseudo-code that acquires a mutex lock based upon detecting a lock line reservation lost event.
- Code 750 includes lines 760 through 790 .
- Line 760 performs a lock line reservation request in order to receive cache line data that identifies whether a mutex lock is available for a particular cache line.
- Lines 770 and 780 show that, if the received cache line data is “0,” indicating that a mutex lock is available, the code enters a task identifier and performs a conditional store.
- When the mutex lock is not available, line 785 exits the handler and waits for another lock line reservation lost event.
- When the conditional store is accepted, the mutex lock is acquired and line 790 switches threads for further processing.
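- The FIG. 7B handler logic, restated as executable pseudo-code (a Python sketch with illustrative names; the lambdas stand in for the lock line operations):

```python
AVAILABLE = 0

def mutex_lock_handler(get_lock_line, conditional_store, task_id, switch_back):
    """Sketch of code 750, run on each lock line reservation lost event."""
    if get_lock_line() != AVAILABLE:     # line 760: lock still held
        return False                     # line 785: exit, await the next event
    if not conditional_store(task_id):   # lines 770-780: enter task id
        return False                     # lost the race; await the next event
    switch_back()                        # line 790: resume the waiting thread
    return True

resumed = []
# First event: the lock is still held, so the handler simply exits.
assert not mutex_lock_handler(lambda: 9, lambda tid: True, 3,
                              lambda: resumed.append(True))
# Later event: the lock is free, the store is accepted, threads switch.
assert mutex_lock_handler(lambda: AVAILABLE, lambda tid: True, 3,
                          lambda: resumed.append(True))
assert resumed == [True]
```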
- FIG. 8A is a diagram showing an example of a processor's condition wait pseudo-code.
- Code 800 includes pseudo-code that tests whether a condition is true and, if not, switches threads and enables asynchronous interrupts.
- Code 800 includes lines 810 through 830 .
- Line 810 performs a lock line reservation request in order to receive cache line data that signifies whether a particular condition is true.
- Lines 820 and 825 show that, if the condition is true, processing returns and continues the existing thread.
- line 830 switches to another thread, invokes a handler, and enables asynchronous interrupts.
- the handler includes pseudo code such that, when it receives a lock line reservation lost event, the handler tests the condition again (see FIG. 8B and corresponding text for further details regarding condition wait handler pseudo code).
- FIG. 8B is a diagram showing an example of condition wait pseudo-code that an event handler performs upon receiving a lock line reservation lost event.
- Code 840 includes lines 850 through 880 .
- Line 850 performs a lock line reservation request in order to receive cache line data that signifies whether a condition is true.
- Lines 860 and 870 show that, if the condition is true, the code switches threads and continues processing. When the condition is not true, line 880 exits the handler and waits for another lock line reservation lost event.
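- Codes 800 and 840 together can be restated as two small functions (a Python sketch; the names and stub callbacks are illustrative):

```python
def condition_wait(read_condition, task_switch):
    """Sketch of code 800: return at once if the condition already holds,
    otherwise switch threads and rely on the handler below."""
    if read_condition():                 # lines 810-825: condition true?
        return True
    task_switch()                        # line 830: switch thread, enable
    return False                         #   asynchronous interrupts

def condition_wait_handler(read_condition, switch_back):
    """Sketch of code 840, run on each lock line reservation lost event."""
    if read_condition():                 # lines 850-860: retest the condition
        switch_back()                    # line 870: resume the waiting thread
        return True
    return False                         # line 880: await the next event

events = []
assert not condition_wait(lambda: False, lambda: events.append("switched"))
assert not condition_wait_handler(lambda: False,
                                  lambda: events.append("resumed"))
assert condition_wait_handler(lambda: True, lambda: events.append("resumed"))
assert events == ["switched", "resumed"]
```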
- FIG. 9 illustrates an information handling system, which is a simplified example of a computer system capable of performing the computing operations described herein.
- Broadband processor architecture (BPA) 900 includes a plurality of heterogeneous processors, a common memory, and a common bus.
- the heterogeneous processors are processors with different instruction sets that share the common memory and the common bus.
- one of the heterogeneous processors may be a digital signal processor and the other heterogeneous processor may be a microprocessor, both sharing the same memory space.
- BPA 900 sends and receives information to/from external devices through input output 970 , and distributes the information to control plane 910 and data plane 940 using processor element bus 960 .
- Control plane 910 manages BPA 900 and distributes work to data plane 940 .
- Control plane 910 includes processing unit 920 , which runs operating system (OS) 925 .
- processing unit 920 may be a Power PC core that is embedded in BPA 900 and OS 925 may be a Linux operating system.
- Processing unit 920 manages a common memory map table for BPA 900 .
- the memory map table corresponds to memory locations included in BPA 900 , such as L2 memory 140 as well as non-private memory included in data plane 940 .
- L2 memory 140 is the same as that shown in FIG. 1 .
- Data plane 940 includes Synergistic Processing Complexes (SPCs) 100 , 950 , and 955 .
- SPC 100 is the same as that shown in FIG. 1 .
- Each SPC is used to process data information and each SPC may have different instruction sets.
- BPA 900 may be used in a wireless communications system and each SPC may be responsible for separate processing tasks, such as modulation, chip rate processing, encoding, and network interfacing.
- each SPC may have identical instruction sets and may be used in parallel to perform operations benefiting from parallel processes.
- Each SPC includes a synergistic processing unit (SPU).
- An SPU is preferably a single instruction, multiple data (SIMD) processor, such as a digital signal processor, a microcontroller, a microprocessor, or a combination of these cores.
- each SPU includes a local memory, registers, four floating point units, and four integer units. However, depending upon the processing power required, a greater or lesser number of floating point units and integer units may be employed.
- SPC 100 , 950 , and 955 are connected to processor element bus 960 , which passes information between control plane 910 , data plane 940 , and input/output 970 .
- Bus 960 is an on-chip coherent multi-processor bus that passes information between I/O 970 , control plane 910 , and data plane 940 .
- Input/output 970 includes flexible input-output logic, which dynamically assigns interface pins to input output controllers based upon peripheral devices that are connected to BPA 900 .
- While the information handling system described in FIG. 9 is capable of executing the processes described herein, this computer system is simply one example of a computer system. Those skilled in the art will appreciate that many other computer system designs are capable of performing the processes described herein, such as gaming systems, imaging systems, seismic computer systems, and animation systems.
- One of the preferred implementations of the invention is a client application, namely, a set of instructions (program code) in a code module that may, for example, be resident in the random access memory of the computer.
- the set of instructions may be stored in another computer memory, for example, in a hard disk drive, or in a removable memory such as an optical disk (for eventual use in a CD ROM) or floppy disk (for eventual use in a floppy disk drive), or downloaded via the Internet or other computer network.
- the present invention may be implemented as a computer program product for use in a computer.
Abstract
A system and method for using a handler to detect asynchronous lock line reservation lost events, and switching tasks based upon whether a condition is true or a mutex lock is acquired is presented. A synergistic processing unit (SPU) invokes a first thread and, during execution, the first thread requests external data that is shared with other threads or processors in the system. This shared data may be protected with a mutex lock or other shared memory synchronization constructs. When requested data is not available, the SPU switches to a second thread and monitors lock line reservation lost events in order to check when the data is available. When the data is available, the SPU switches back to the first thread and processes the first thread's request.
Description
- 1. Technical Field
- The present invention relates in general to a system and method for lightweight task switching when a shared memory condition is signaled. More particularly, the present invention relates to a system and method for using a handler to detect asynchronous lock line reservation lost events, and switching tasks based upon whether a condition is true or whether a processor acquires a mutex lock.
- 2. Description of the Related Art
- Computer applications typically run multiple threads to perform different tasks that request access to shared data. Common approaches to accessing shared data are 1) using a mutual exclusion (mutex) lock primitive or 2) using a condition wait primitive.
- A mutex lock allows multiple threads to “take turns” sharing the same resource, such as accessing a file. Typically, when a program starts, the program creates a mutex object for a given resource by requesting the resource from the system, whereby the system returns a unique name or identifier for the resource. After that, a thread requiring the resource uses the mutex to “lock” the resource from other threads while the thread uses the resource. When the mutex is locked, the system typically queues threads requesting the resource and then gives control to the threads when the mutex becomes unlocked.
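- This take-turns discipline can be illustrated with a standard mutex (Python's threading.Lock here, rather than the patent's lock line mechanism):

```python
import threading

counter = 0
mutex = threading.Lock()             # the mutex object for the shared resource

def worker():
    global counter
    for _ in range(10_000):
        with mutex:                  # lock the resource while using it
            counter += 1             # critical section: no lost updates

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

assert counter == 40_000             # every increment took its turn
```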
- A condition wait primitive allows a thread to identify whether a condition has occurred by accessing cache line data, such as whether a video card has completed a vertical retrace. The condition wait primitive allows a thread to wait until a particular condition is met before proceeding. A challenge found with both mutex lock primitives and condition wait primitives is that system performance decreases when threads wait for data to become available.
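- The condition wait pattern, illustrated with a standard condition variable (Python's threading.Condition; the vertical-retrace flag is the example condition from the text):

```python
import threading

cond = threading.Condition()
retrace_done = False                 # the condition being waited on
order = []

def waiter():
    with cond:
        while not retrace_done:      # wait until the condition is met
            cond.wait()
        order.append("proceed")      # condition true: continue the thread

def signal_retrace():
    global retrace_done
    with cond:
        retrace_done = True
        cond.notify_all()            # wake every thread waiting on the condition

t = threading.Thread(target=waiter)
t.start()
signal_retrace()
t.join()

assert order == ["proceed"]
```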
- As such, task switching is a common approach to increasing system performance when a system invokes multiple threads. Task switching allows a processor to switch from one thread to another thread without losing its “spot” in the first thread. Task switching is different than multitasking because, in multitasking, a processor switches back and forth quickly between threads, giving the appearance that all programs are running simultaneously. In task switching, the processor does not switch back and forth between threads, but executes one thread at a time. A challenge found with task switching, however, is that task switching typically occurs at pre-determined intervals. For example, a processor may check every 10 milliseconds as to whether a particular lock has been acquired for a requesting thread.
- What is needed, therefore, is a system and method to efficiently task switch between threads when a thread's requested resource becomes available.
- It has been discovered that the aforementioned challenges are resolved by using a handler to detect asynchronous lock line reservation lost events and switching tasks based upon whether a condition is true or a mutex lock is acquired.
- A synergistic processing unit (SPU) invokes a first thread and, during execution, the first thread requests to update external data. The external data may be shared with other threads or processors in the system and is protected by a mutex lock or another shared memory synchronization construct. For example, the thread may request to update a linked data structure, which is protected by a mutex lock that prevents other threads or processors from traversing the linked data structure while the linked data structure is being updated. To participate in a mutex lock or other shared memory synchronization construct, the SPU issues a lock line reservation to L2 memory corresponding to the thread's request. L2 memory includes a cache bus interface controller and a cache. The cache bus interface controller receives the lock line reservation, retrieves corresponding data from the cache, and then sends the cache line data to the SPU.
- For mutex lock primitives, the SPU analyzes the cache line data and determines, based upon the analysis, that the requested cache line is not available. For example, the cache line data may include a different thread's task identifier that is currently accessing the same data that the first thread wishes to access. Since the requested data is not available, the SPU switches from the first thread to a second thread. The SPU also enables asynchronous interrupts and invokes the handler to monitor incoming asynchronous interrupts.
- When the cache bus interface controller determines that a “reservation is lost” for one of the cache's cache lines, the cache bus interface controller issues a “lock line reservation lost” event to inform the SPU. The handler detects the lock line reservation lost event, and sends a “get lock line reservation” to the L2 memory in order to receive updated cache line data. The cache bus interface controller receives the get lock line reservation and provides the cache line data to the SPU. The handler analyzes the cache line data and determines whether the first thread's requested cache line is now available. If the requested cache line is still not available, the handler waits for another asynchronous interrupt and the SPU continues to process the second thread.
- When the handler determines that the requested cache line is available, the handler performs a “conditional store” using the first thread's task identifier in an attempt to secure a mutex lock for the first thread's requested data that is located in the cache. The conditional store operation has an associated status register, which indicates whether or not the conditional store operation succeeds. If the conditional store is successful, the SPU switches from the second thread back to the first thread and processes the first thread's original external data request.
- For condition wait primitives, when the SPU receives cache line data, the SPU identifies whether a particular condition is true by analyzing the cache line data. When the thread's requested condition is not true, the SPU switches from the first thread to the second thread, enables asynchronous interrupts, and invokes a handler to monitor incoming asynchronous interrupts.
- In turn, when the handler detects a lock line reservation lost event, the handler issues a get lock line reservation to L2 memory in order to receive updated cache line data. The handler analyzes the cache line data and determines whether the requested condition is true. If the condition is still not true, the handler waits for another asynchronous interrupt and the SPU continues to process the second thread. When the condition is true, the SPU switches from the second thread back to the first thread and processes the first thread's request.
- The foregoing is a summary and thus contains, by necessity, simplifications, generalizations, and omissions of detail; consequently, those skilled in the art will appreciate that the summary is illustrative only and is not intended to be in any way limiting. Other aspects, inventive features, and advantages of the present invention, as defined solely by the claims, will become apparent in the non-limiting detailed description set forth below.
- The present invention may be better understood, and its numerous objects, features, and advantages made apparent to those skilled in the art by referencing the accompanying drawings.
- FIG. 1 is a diagram of a processor switching tasks based upon acquiring a mutex lock;
- FIG. 2 is a diagram of a processor switching tasks based upon determining a condition variable is true;
- FIG. 3 is a flowchart showing steps taken in task switching between threads using a mutex lock primitive;
- FIG. 4 is a flowchart showing steps taken in invoking a handler to process asynchronous interrupts corresponding to mutex lock requests;
- FIG. 5 is a flowchart showing steps taken in task switching between threads using a condition wait primitive;
- FIG. 6 is a flowchart showing steps taken in receiving asynchronous interrupts corresponding to a condition wait primitive and switching tasks in response to determining that a condition is true;
- FIG. 7A is a diagram showing an example of a processor's mutex lock pseudo-code;
- FIG. 7B is a diagram showing an example of mutex lock handler pseudo-code that acquires a mutex lock based upon detecting a lock line reservation lost event;
- FIG. 8A is a diagram showing an example of a processor's condition wait pseudo-code;
- FIG. 8B is a diagram showing an example of condition wait pseudo-code that an event handler performs upon receiving a lock line reservation lost event; and
- FIG. 9 is a block diagram of an information handling system capable of implementing the present invention.
- The following is intended to provide a detailed description of an example of the invention and should not be taken to be limiting of the invention itself. Rather, any number of variations may fall within the scope of the invention, which is defined in the claims following the description.
- FIG. 1 is a diagram of a processor switching tasks based upon acquiring a mutex lock. Synergistic processing complex (SPC) 100 includes synergistic processing unit (SPU) 110, which processes thread A 120 (e.g., a first thread). SPU 110 is preferably a single instruction, multiple data (SIMD) processor, such as a digital signal processor, a microcontroller, a microprocessor, or a combination of these cores. During execution, thread A 120 requests external data that is located in cache 150. For example, thread A 120 may wish to update a linked data structure that is located in cache 150. SPU 110 issues get lock line reservation 160 to L2 140 corresponding to thread A 120's request. Get lock line reservation 160 instructs L2 140 to provide data from a particular cache line.
- L2 140 includes cache bus interface controller 145 and cache 150. Cache bus interface controller 145 receives get lock line reservation 160, and retrieves corresponding data from cache 150. Cache bus interface controller 145 then sends cache line data 165 to SPU 110. Cache line data 165 includes data corresponding to a particular cache line. SPU 110 analyzes cache line data 165 and determines, based upon the analysis, that the requested cache line is not available. For example, cache line data 165 may include a different thread's task identifier that is currently accessing the same data that thread A 120 wishes to access.
- Since the requested data is not available for thread A 120, SPU 110 switches from thread A 120 to thread B 125 (e.g., a second thread). SPU 110 also enables asynchronous interrupts and invokes handler 115 (e.g., a software subroutine) to monitor incoming asynchronous interrupts (see FIGS. 7B, 8B, and corresponding text for further details regarding handler properties).
- When cache bus interface controller 145 determines that a reservation is lost for one of cache 150's cache lines, cache bus interface controller 145 issues lock line reservation lost 170 to inform SPU 110 that a reservation has been lost corresponding to one of the cache lines. Handler 115 detects lock line reservation lost 170, and sends get lock line reservation 175 to L2 140 in order to receive subsequent cache line data.
- Cache bus interface controller 145 receives get lock line reservation 175 and provides cache line data 180 to SPU 110. Handler 115 analyzes cache line data 180 and determines whether the requested cache line is now available. If the requested cache line is still not available, handler 115 waits for another asynchronous interrupt and SPU 110 continues to process thread B 125.
- When handler 115 determines that the requested cache line is available, handler 115 performs conditional store 190 using thread A 120's task identifier, which attempts to secure a mutex lock for thread A 120's requested data that is located in cache 150. The conditional store operation has an associated status register, which indicates whether or not the conditional store operation succeeds. If the conditional store is successful, SPU 110 switches from thread B 125 back to thread A 120 and processes thread A 120's original external data request (see FIGS. 3, 4, and corresponding text for further details regarding mutex lock primitive and handler steps).
- FIG. 2 is a diagram of a processor switching tasks using a condition wait primitive. A condition wait primitive allows a thread to identify whether a condition has occurred by accessing cache line data, such as whether a video card has completed a vertical retrace. FIG. 2 is similar to FIG. 1 with the exception that handler 115 analyzes cache line data to determine whether a particular condition is true, such as a video card indicating that a vertical retrace is complete. SPC 100, SPU 110, handler 115, thread A 120, thread B 125, L2 140, cache bus interface controller 145, and cache 150 are the same as that shown in FIG. 1.
- SPU 110 invokes thread A 120. During execution, thread A 120 requests external data that is located in cache 150 that identifies whether a particular condition is true. SPU 110 sends get lock line reservation 200 to L2 140 in order to receive cache line data 210 from cache bus interface controller 145.
- SPU 110 analyzes cache line data 210 and determines, based upon the analysis, that the particular condition is not true. For example, SPU 110 may check a bit included in cache line data 210 that identifies that a video card has not completed a vertical retrace. Since thread A 120's requested condition is not true, SPU 110 switches from thread A 120 to thread B 125, enables asynchronous interrupts, and invokes handler 115 to monitor incoming asynchronous interrupts.
- When cache bus interface controller 145 determines that a reservation is lost for a cache line included in cache 150, cache bus interface controller 145 issues lock line reservation lost 220 to inform SPU 110. Handler 115 detects lock line reservation lost 220, and sends get lock line reservation 275 to L2 140 in order to receive subsequent cache line data.
- Cache bus interface controller 145 receives get lock line reservation 275 and, in turn, provides cache line data 280 to SPU 110. Handler 115 analyzes cache line data 280 and determines whether the requested condition is true. If the condition is still not true, handler 115 waits for another asynchronous interrupt and SPU 110 continues to process thread B 125.
- When the condition is true, SPU 110 switches from thread B 125 to thread A 120 and processes thread A 120's request (see FIGS. 5, 6, and corresponding text for further details regarding condition wait primitive and handler steps).
- FIG. 3 is a flowchart showing steps taken in task switching between threads using a mutex lock primitive. A system may use a mutex lock primitive to guarantee mutual exclusion among processors operating on data within critical sections of code, such as updating a linked list data structure.
- Processing commences at 300, whereupon processing invokes thread A 120 at step 305. Thread A 120 is the same as that shown in FIG. 1. At step 310, processing receives an external data request from thread A 120. For example, thread A 120 may request data corresponding to a linked data structure that is located in external memory.
- At step 315, processing sends a get lock line reservation request to L2 140, which instructs L2 140 to provide data corresponding to a particular cache line. At step 320, processing receives the requested cache line data from L2 140. L2 140 is the same as that shown in FIG. 1.
- A determination is made as to whether a mutex lock is available (decision 330). Processing determines this by analyzing the received cache line data and determining whether it includes an existing task identifier corresponding to a different thread. If the mutex lock is not available, decision 330 branches to “No” branch 332, whereupon processing switches threads and waits for a lock line reservation lost event from L2 140 (pre-defined process block 335, see FIG. 4 and corresponding text for further details).
- On the other hand, if the mutex lock is available, decision 330 branches to “Yes” branch 338, whereupon processing enters thread A 120's task identifier and performs a conditional store at step 340 (see FIG. 7A and corresponding text for further details). A determination is made as to whether the conditional store is accepted in L2 140 by reading the corresponding memory location (decision 350). If the conditional store was not accepted, decision 350 branches to “No” branch 352, which loops back to send another get lock line reservation request. This looping continues until L2 140 accepts the conditional store, at which point decision 350 branches to “Yes” branch 358, whereupon processing acquires a mutex lock (step 360).
- A determination is made as to whether to continue processing (decision 380). If processing should continue, decision 380 branches to “Yes” branch 382, which loops back to receive and process more external data requests. This looping continues until processing should terminate, at which point decision 380 branches to “No” branch 388, whereupon processing ends at 390.
- FIG. 4 is a flowchart showing steps taken in invoking a handler to process asynchronous interrupts corresponding to mutex lock requests. Processing commences at 400, whereupon processing puts thread A 120 to sleep and invokes thread B 125 (step 405). Thread A 120 previously requested external data that was not available and, as such, processing switches from thread A 120 to thread B 125 until the data is available (see FIG. 3 and corresponding text for further details). Thread A 120 and thread B 125 are the same as that shown in FIG. 1.
- At step 410, processing enables asynchronous interrupts in order for a handler to detect a lock line reservation lost event that corresponds to thread A 120's external data request. Processing invokes the handler at step 415, such as handler 115 shown in FIG. 1.
- At step 420, the handler waits for a lock line reservation lost event from L2 140. When it receives a lock line reservation lost event, the handler sends a get lock line reservation request to L2 140 and receives corresponding cache line data (step 425). L2 140 is the same as that shown in FIG. 1.
- A determination is made as to whether the mutex lock is available by reading the cache line data (decision 430). For example, the cache line data may include a different thread's task identifier that is currently accessing the same data that the thread wishes to access, which makes the mutex lock unavailable. If the mutex lock is not available, decision 430 branches to “No” branch 432, whereupon processing loops back to wait for another lock line reservation lost event from L2 140. This looping continues until the mutex lock is available, at which point decision 430 branches to “Yes” branch 438, whereupon the handler enters thread A 120's task identifier and performs a conditional store on the corresponding memory location in L2 140 (step 440, see FIG. 7B and corresponding text for further details). A determination is made as to whether the conditional store is accepted in L2 140 by reading the corresponding memory location (decision 450). If the conditional store was not accepted, decision 450 branches to “No” branch 452, which loops back to wait for another lock line reservation lost event. This looping continues until L2 140 accepts the conditional store, at which point decision 450 branches to “Yes” branch 458.
- At step 460, processing acquires a mutex lock and switches back to process thread A 120's external data request. Processing returns at 470.
- FIG. 5 is a flowchart showing steps taken in task switching between threads using a condition wait primitive. A processor may use a condition wait primitive to determine when a condition becomes true, such as a video card indicating that a vertical retrace is complete.
- Processing commences at 500, whereupon processing invokes thread A 120 at step 510. Thread A 120 is the same as that shown in FIG. 1. At step 520, processing receives an external data request from thread A 120. For example, thread A 120 may request data corresponding to whether a video card has completed a vertical retrace. In turn, processing sends a get lock line reservation request to L2 140, which instructs L2 140 to provide data corresponding to a particular cache line (step 530). At step 540, processing receives the requested cache line data from L2 140. L2 140 is the same as that shown in FIG. 1.
- A determination is made as to whether the condition corresponding to thread A 120's request is true, such as whether a video card has completed a vertical retrace, by checking one of the cache line data's corresponding bits (decision 550). If processing determines that the condition is true, decision 550 branches to “Yes” branch 552, whereupon processing performs thread A 120's task at step 555.
- On the other hand, if the condition is not true, decision 550 branches to “No” branch 558, whereupon processing switches threads and monitors lock line reservation lost events (pre-defined process block 560, see FIG. 6 and corresponding text for further details).
- A determination is made as to whether to continue task switching steps (decision 570). If task switching should continue, decision 570 branches to “Yes” branch 572, which loops back to receive and process more external data requests. This looping continues until processing should stop executing task switching steps, at which point decision 570 branches to “No” branch 578, whereupon processing ends at 580.
- FIG. 6 is a flowchart showing steps taken in receiving asynchronous interrupts corresponding to a condition wait primitive and switching tasks in response to determining that a condition is true. Processing commences at 600, whereupon processing puts thread A 120 to sleep and invokes thread B 125 (step 610). Thread A 120 previously requested external data that was not available and, as such, processing switches from thread A 120 to thread B 125 until the data is available (see FIG. 5 and corresponding text for further details). Thread A 120 and thread B 125 are the same as that shown in FIG. 1.
- At step 620, processing enables asynchronous interrupts in order for processing to detect a lock line reservation lost event that corresponds to thread A 120's external data request. Processing invokes a lock line reservation handler at step 630, such as handler 115 shown in FIG. 1.
- At step 640, the handler waits for a lock line reservation lost event from L2 140. When it receives a lock line reservation lost event, the handler issues a get lock line reservation request to L2 140 and receives corresponding cache line data (step 650). A determination is made as to whether the condition is true by reading the cache line data (decision 660). If the condition is not true, decision 660 branches to “No” branch 662, whereupon processing loops back to wait for another lock line reservation lost event from L2 140. This looping continues until the condition is true, at which point decision 660 branches to “Yes” branch 668, whereupon the handler switches threads and performs thread A 120's task (step 670). Processing returns at 680.
- FIG. 7A is a diagram showing an example of a processor's mutex lock pseudo-code. Code 700 includes pseudo-code that tests whether a mutex lock is available and, if not, switches threads and enables asynchronous interrupts.
- Code 700 includes lines 710 through 740. Line 710 performs a lock line reservation request in order to receive cache line data that signifies whether a mutex lock is available for a particular memory line.
- If a mutex lock is not available (e.g., the cache line data is not “0”), line 740 switches to another thread, invokes a handler, and enables asynchronous interrupts. The handler includes pseudo-code such that, when it receives a lock line reservation lost event, the handler attempts to acquire the mutex lock (see FIG. 7B and corresponding text for further details regarding mutex lock handler pseudo-code).
- FIG. 7B is a diagram showing an example of mutex lock handler pseudo-code that acquires a mutex lock based upon detecting a lock line reservation lost event. Code 750 includes lines 760 through 790. Line 760 performs a lock line reservation request in order to receive cache line data that identifies whether a mutex lock is available for a particular cache line.
- If a mutex lock is not available (e.g., the cache line data is not “0”), line 785 exits the handler and waits for another lock line reservation lost event. When the conditional store is successful, the mutex lock is acquired and line 790 switches threads for further processing.
- FIG. 8A is a diagram showing an example of a processor's condition wait pseudo-code. Code 800 includes pseudo-code that tests whether a condition is true and, if not, switches threads and enables asynchronous interrupts.
- Code 800 includes lines 810 through 830. Line 810 performs a lock line reservation request in order to receive cache line data that signifies whether a particular condition is true.
- When the condition is not true, line 830 switches to another thread, invokes a handler, and enables asynchronous interrupts. The handler includes pseudo-code such that, when it receives a lock line reservation lost event, the handler tests the condition again (see FIG. 8B and corresponding text for further details regarding condition wait handler pseudo-code).
- FIG. 8B is a diagram showing an example of condition wait pseudo-code that an event handler performs upon receiving a lock line reservation lost event. Code 840 includes lines 850 through 880. Line 850 performs a lock line reservation request in order to receive cache line data that signifies whether a condition is true.
- If the condition is not true, line 880 exits the handler and waits for another lock line reservation lost event.
- FIG. 9 illustrates an information handling system, which is a simplified example of a computer system capable of performing the computing operations described herein. Broadband processor architecture (BPA) 900 includes a plurality of heterogeneous processors, a common memory, and a common bus. The heterogeneous processors are processors with different instruction sets that share the common memory and the common bus. For example, one of the heterogeneous processors may be a digital signal processor and the other heterogeneous processor may be a microprocessor, both sharing the same memory space.
- BPA 900 sends and receives information to/from external devices through input/output 970, and distributes the information to control plane 910 and data plane 940 using processor element bus 960. Control plane 910 manages BPA 900 and distributes work to data plane 940.
- Control plane 910 includes processing unit 920, which runs operating system (OS) 925. For example, processing unit 920 may be a Power PC core that is embedded in BPA 900 and OS 925 may be a Linux operating system. Processing unit 920 manages a common memory map table for BPA 900. The memory map table corresponds to memory locations included in BPA 900, such as L2 memory 140 as well as non-private memory included in data plane 940. L2 memory 140 is the same as that shown in FIG. 1.
- Data plane 940 includes synergistic processing complexes (SPC) 100, 950, and 955. SPC 100 is the same as that shown in FIG. 1. Each SPC is used to process data and each SPC may have different instruction sets. For example, BPA 900 may be used in a wireless communications system and each SPC may be responsible for separate processing tasks, such as modulation, chip rate processing, encoding, and network interfacing. In another example, each SPC may have identical instruction sets and may be used in parallel to perform operations benefiting from parallel processes. Each SPC includes a synergistic processing unit (SPU). An SPU is preferably a single instruction, multiple data (SIMD) processor, such as a digital signal processor, a microcontroller, a microprocessor, or a combination of these cores. In a preferred embodiment, each SPU includes a local memory, registers, four floating point units, and four integer units. However, depending upon the processing power required, a greater or lesser number of floating point units and integer units may be employed.
- Each SPC connects to processor element bus 960, which passes information between control plane 910, data plane 940, and input/output 970. Bus 960 is an on-chip coherent multi-processor bus that passes information between I/O 970, control plane 910, and data plane 940. Input/output 970 includes flexible input-output logic, which dynamically assigns interface pins to input output controllers based upon peripheral devices that are connected to BPA 900.
- While the information handling system described in FIG. 9 is capable of executing the processes described herein, this computer system is simply one example of a computer system. Those skilled in the art will appreciate that many other computer system designs are capable of performing the processes described herein, such as gaming systems, imaging systems, seismic computer systems, and animation systems.
- One of the preferred implementations of the invention is a client application, namely, a set of instructions (program code) in a code module that may, for example, be resident in the random access memory of the computer. Until required by the computer, the set of instructions may be stored in another computer memory, for example, in a hard disk drive, or in a removable memory such as an optical disk (for eventual use in a CD ROM) or floppy disk (for eventual use in a floppy disk drive), or downloaded via the Internet or other computer network. Thus, the present invention may be implemented as a computer program product for use in a computer. In addition, although the various methods described are conveniently implemented in a general purpose computer selectively activated or reconfigured by software, one of ordinary skill in the art would also recognize that such methods may be carried out in hardware, in firmware, or in more specialized apparatus constructed to perform the required method steps.
- While particular embodiments of the present invention have been shown and described, it will be obvious to those skilled in the art that, based upon the teachings herein, changes and modifications may be made without departing from this invention and its broader aspects. Therefore, the appended claims are to encompass within their scope all such changes and modifications as are within the true spirit and scope of this invention. Furthermore, it is to be understood that the invention is solely defined by the appended claims. It will be understood by those with skill in the art that if a specific number of an introduced claim element is intended, such intent will be explicitly recited in the claim, and in the absence of such recitation no such limitation is present. For non-limiting example, as an aid to understanding, the following appended claims contain usage of the introductory phrases “at least one” and “one or more” to introduce claim elements. However, the use of such phrases should not be construed to imply that the introduction of a claim element by the indefinite articles “a” or “an” limits any particular claim containing such introduced claim element to inventions containing only one such element, even when the same claim includes the introductory phrases “one or more” or “at least one” and indefinite articles such as “a” or “an”; the same holds true for the use in the claims of definite articles.
Claims (20)
1. A computer-implemented method comprising:
receiving an external data request from a first thread;
identifying that data corresponding to the external data request is not available;
switching to a second thread in response to the identifying;
receiving a lock line reservation lost event after the switching;
determining whether to switch from the second thread to the first thread in response to the lock line reservation lost event; and
switching to the first thread in response to the determination.
2. The method of claim 1 wherein the determining further comprises:
requesting a lock line reservation; and
receiving cache line data from a cache bus interface controller in response to the lock line reservation, the determining based upon the cache line data.
3. The method of claim 2 wherein the lock line reservation corresponds to a mutex lock, the determining further comprising:
detecting, based upon the cache line data, that the mutex lock is available;
performing a conditional store that includes a task identifier that corresponds to the second thread; and
determining whether the conditional store is accepted.
4. The method of claim 2 wherein the lock line reservation lost event corresponds to a condition wait primitive, the determining further comprising:
determining whether the cache line data indicates that a condition is true that corresponds to the condition wait primitive.
5. The method of claim 1 further comprising:
invoking a handler to monitor asynchronous interrupts, wherein the lock line reservation lost event is one of the asynchronous interrupts; and
detecting the lock line reservation lost event using the handler.
6. The method of claim 5 further comprising:
enabling asynchronous interrupts in order for the handler to perform the monitoring.
7. The method of claim 1 further comprising:
wherein the method is performed using a broadband processor architecture, the broadband processor architecture including a plurality of heterogeneous processors, a common memory, and a common bus; and
wherein the plurality of heterogeneous processors use different instruction sets and share the common memory and the common bus.
8. A computer program product comprising:
a computer operable medium having computer readable code, the computer readable code being effective to:
receive an external data request from a first thread;
identify that data corresponding to the external data request is not available;
switch to a second thread in response to the identifying;
receive a lock line reservation lost event after the switching;
determine whether to switch from the second thread to the first thread in response to the lock line reservation lost event; and
switch to the first thread in response to the determination.
9. The computer program product of claim 8 wherein the computer readable code is further effective to:
request a lock line reservation; and
receive cache line data from a cache bus interface controller in response to the lock line reservation, the determining based upon the cache line data.
10. The computer program product of claim 9 wherein the lock line reservation corresponds to a mutex lock, the computer readable code further effective to:
detect, based upon the cache line data, that the mutex lock is available;
perform a conditional store that includes a task identifier that corresponds to the second thread; and
determine whether the conditional store is accepted.
11. The computer program product of claim 9 wherein the lock line reservation lost event corresponds to a condition wait primitive, the computer readable code further effective to:
determine whether the cache line data indicates that a condition is true that corresponds to the condition wait primitive.
12. The computer program product of claim 8 wherein the computer readable code is further effective to:
invoke a handler to monitor asynchronous interrupts, the lock line reservation lost event being one of the asynchronous interrupts; and
detect the lock line reservation lost event using the handler.
13. The computer program product of claim 12 wherein the computer readable code is further effective to:
enable asynchronous interrupts in order for the handler to perform the monitoring.
14. The computer program product of claim 8 wherein the computer readable code is executed using a broadband processor architecture.
15. An information handling system comprising:
one or more processors;
a memory accessible by the processors;
one or more nonvolatile storage devices accessible by the processors; and
a task-switching tool for switching tasks, the task-switching tool being effective to:
receive an external data request from a first thread;
identify that data included in the memory corresponding to the external data request is not available;
switch to a second thread in response to the identifying;
receive a lock line reservation lost event after the switching;
determine whether to switch from the second thread to the first thread in response to the lock line reservation lost event; and
switch to the first thread in response to the determination.
16. The information handling system of claim 15 wherein the task-switching tool is further effective to:
request a lock line reservation; and
receive cache line data included in the memory from a cache bus interface controller in response to the lock line reservation, the determining based upon the cache line data.
17. The information handling system of claim 16 wherein the lock line reservation corresponds to a mutex lock, the task-switching tool further effective to:
detect, based upon the cache line data, that the mutex lock is available;
perform a conditional store to the memory that includes a task identifier that corresponds to the second thread; and
determine whether the conditional store is accepted.
18. The information handling system of claim 16 wherein the lock line reservation lost event corresponds to a condition wait primitive, the task-switching tool further effective to:
determine whether the cache line data indicates that a condition is true that corresponds to the condition wait primitive.
19. The information handling system of claim 15 wherein the task-switching tool is further effective to:
invoke a handler to monitor asynchronous interrupts, the lock line reservation lost event being one of the asynchronous interrupts; and
detect the lock line reservation lost event using the handler.
20. The information handling system of claim 15 wherein the information handling system is a broadband processor architecture that includes a plurality of heterogeneous processors that share the memory, the plurality of heterogeneous processors using different instruction sets.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/204,424 US20070043916A1 (en) | 2005-08-16 | 2005-08-16 | System and method for light weight task switching when a shared memory condition is signaled |
US12/049,317 US8458707B2 (en) | 2005-08-16 | 2008-03-15 | Task switching based on a shared memory condition associated with a data request and detecting lock line reservation lost events |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US11/204,424 US20070043916A1 (en) | 2005-08-16 | 2005-08-16 | System and method for light weight task switching when a shared memory condition is signaled |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/049,317 Continuation US8458707B2 (en) | 2005-08-16 | 2008-03-15 | Task switching based on a shared memory condition associated with a data request and detecting lock line reservation lost events |
Publications (1)
Publication Number | Publication Date |
---|---|
US20070043916A1 true US20070043916A1 (en) | 2007-02-22 |
Family
ID=37768494
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US11/204,424 Abandoned US20070043916A1 (en) | 2005-08-16 | 2005-08-16 | System and method for light weight task switching when a shared memory condition is signaled |
US12/049,317 Expired - Fee Related US8458707B2 (en) | 2005-08-16 | 2008-03-15 | Task switching based on a shared memory condition associated with a data request and detecting lock line reservation lost events |
Family Applications After (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US12/049,317 Expired - Fee Related US8458707B2 (en) | 2005-08-16 | 2008-03-15 | Task switching based on a shared memory condition associated with a data request and detecting lock line reservation lost events |
Country Status (1)
Country | Link |
---|---|
US (2) | US20070043916A1 (en) |
Cited By (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20070271450A1 (en) * | 2006-05-17 | 2007-11-22 | Doshi Kshitij A | Method and system for enhanced thread synchronization and coordination |
EP1939752A1 (en) * | 2006-12-27 | 2008-07-02 | Intel Corporation | Obscuring memory access patterns |
US7512773B1 (en) * | 2005-10-18 | 2009-03-31 | Nvidia Corporation | Context switching using halt sequencing protocol |
US7768515B1 (en) * | 2006-11-03 | 2010-08-03 | Nvidia Corporation | Apparatus, system, and method for reducing shadowed state memory requirements for identifying driver command exceptions in a graphics system |
US7898546B1 (en) | 2006-11-03 | 2011-03-01 | Nvidia Corporation | Logical design of graphics system with reduced shadowed state memory requirements |
US7916146B1 (en) | 2005-12-02 | 2011-03-29 | Nvidia Corporation | Halt context switching method and system |
US20110093857A1 (en) * | 2009-10-20 | 2011-04-21 | Infineon Technologies Ag | Multi-Threaded Processors and Multi-Processor Systems Comprising Shared Resources |
US20150312165A1 (en) * | 2014-04-29 | 2015-10-29 | Silicon Graphics International Corp. | Temporal based collaborative mutual exclusion control of a shared resource |
US20220121451A1 (en) * | 2019-10-29 | 2022-04-21 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Inter-core data processing method, system on chip and electronic device |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
DE102006014019A1 (en) * | 2006-03-27 | 2007-10-11 | Siemens Ag | A method of controlling accesses to resources of a computer system |
US8266607B2 (en) * | 2007-08-27 | 2012-09-11 | International Business Machines Corporation | Lock reservation using cooperative multithreading and lightweight single reader reserved locks |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6018785A (en) * | 1993-12-30 | 2000-01-25 | Cypress Semiconductor Corp. | Interrupt-generating hardware semaphore |
US6493741B1 (en) * | 1999-10-01 | 2002-12-10 | Compaq Information Technologies Group, L.P. | Method and apparatus to quiesce a portion of a simultaneous multithreaded central processing unit |
US6578065B1 (en) * | 1999-09-23 | 2003-06-10 | Hewlett-Packard Development Company L.P. | Multi-threaded processing system and method for scheduling the execution of threads based on data received from a cache memory |
US7174554B2 (en) * | 2002-12-20 | 2007-02-06 | Microsoft Corporation | Tools and methods for discovering race condition errors |
Family Cites Families (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US5179702A (en) * | 1989-12-29 | 1993-01-12 | Supercomputer Systems Limited Partnership | System and method for controlling a highly parallel multiprocessor using an anarchy based scheduler for parallel execution thread scheduling |
US6513057B1 (en) * | 1996-10-28 | 2003-01-28 | Unisys Corporation | Heterogeneous symmetric multi-processing system |
GB0118294D0 (en) * | 2001-07-27 | 2001-09-19 | Ibm | Method and system for deadlock detection and avoidance |
US7069556B2 (en) * | 2001-09-27 | 2006-06-27 | Intel Corporation | Method and apparatus for implementing a parallel construct comprised of a single task |
US7480909B2 (en) * | 2002-02-25 | 2009-01-20 | International Business Machines Corporation | Method and apparatus for cooperative distributed task management in a storage subsystem with multiple controllers using cache locking |
US7234143B2 (en) * | 2002-06-20 | 2007-06-19 | Hewlett-Packard Development Company, L.P. | Spin-yielding in multi-threaded systems |
US6829762B2 (en) * | 2002-10-10 | 2004-12-07 | International Business Machines Corporation | Method, apparatus and system for allocating and accessing memory-mapped facilities within a data processing system |
US7882488B2 (en) * | 2003-10-20 | 2011-02-01 | Robert Zeidman | Software tool for synthesizing a real-time operating system |
US7240137B2 (en) | 2004-08-26 | 2007-07-03 | International Business Machines Corporation | System and method for message delivery across a plurality of processors |
US7765547B2 (en) * | 2004-11-24 | 2010-07-27 | Maxim Integrated Products, Inc. | Hardware multithreading systems with state registers having thread profiling data |
US7856636B2 (en) * | 2005-05-10 | 2010-12-21 | Hewlett-Packard Development Company, L.P. | Systems and methods of sharing processing resources in a multi-threading environment |
- 2005-08-16: US 11/204,424 filed; published as US20070043916A1 (status: Abandoned)
- 2008-03-15: US 12/049,317 filed; published as US8458707B2 (status: Expired - Fee Related)
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6018785A (en) * | 1993-12-30 | 2000-01-25 | Cypress Semiconductor Corp. | Interrupt-generating hardware semaphore |
US6578065B1 (en) * | 1999-09-23 | 2003-06-10 | Hewlett-Packard Development Company L.P. | Multi-threaded processing system and method for scheduling the execution of threads based on data received from a cache memory |
US6493741B1 (en) * | 1999-10-01 | 2002-12-10 | Compaq Information Technologies Group, L.P. | Method and apparatus to quiesce a portion of a simultaneous multithreaded central processing unit |
US7174554B2 (en) * | 2002-12-20 | 2007-02-06 | Microsoft Corporation | Tools and methods for discovering race condition errors |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7512773B1 (en) * | 2005-10-18 | 2009-03-31 | Nvidia Corporation | Context switching using halt sequencing protocol |
US7916146B1 (en) | 2005-12-02 | 2011-03-29 | Nvidia Corporation | Halt context switching method and system |
US20070271450A1 (en) * | 2006-05-17 | 2007-11-22 | Doshi Kshitij A | Method and system for enhanced thread synchronization and coordination |
US7768515B1 (en) * | 2006-11-03 | 2010-08-03 | Nvidia Corporation | Apparatus, system, and method for reducing shadowed state memory requirements for identifying driver command exceptions in a graphics system |
US7898546B1 (en) | 2006-11-03 | 2011-03-01 | Nvidia Corporation | Logical design of graphics system with reduced shadowed state memory requirements |
US7610448B2 (en) | 2006-12-27 | 2009-10-27 | Intel Corporation | Obscuring memory access patterns |
US20100299479A1 (en) * | 2006-12-27 | 2010-11-25 | Mark Buxton | Obscuring memory access patterns |
US20080162816A1 (en) * | 2006-12-27 | 2008-07-03 | Mark Buxton | Obscuring memory access patterns |
EP1939752A1 (en) * | 2006-12-27 | 2008-07-02 | Intel Corporation | Obscuring memory access patterns |
US8078801B2 (en) | 2006-12-27 | 2011-12-13 | Intel Corporation | Obscuring memory access patterns |
US20110093857A1 (en) * | 2009-10-20 | 2011-04-21 | Infineon Technologies Ag | Multi-Threaded Processors and Multi-Processor Systems Comprising Shared Resources |
US8695002B2 (en) * | 2009-10-20 | 2014-04-08 | Lantiq Deutschland Gmbh | Multi-threaded processors and multi-processor systems comprising shared resources |
US20150312165A1 (en) * | 2014-04-29 | 2015-10-29 | Silicon Graphics International Corp. | Temporal based collaborative mutual exclusion control of a shared resource |
US9686206B2 (en) * | 2014-04-29 | 2017-06-20 | Silicon Graphics International Corp. | Temporal based collaborative mutual exclusion control of a shared resource |
US20220121451A1 (en) * | 2019-10-29 | 2022-04-21 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Inter-core data processing method, system on chip and electronic device |
US11853767B2 (en) * | 2019-10-29 | 2023-12-26 | Guangdong Oppo Mobile Telecommunications Corp., Ltd. | Inter-core data processing method, system on chip and electronic device |
Also Published As
Publication number | Publication date |
---|---|
US8458707B2 (en) | 2013-06-04 |
US20080163241A1 (en) | 2008-07-03 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US8458707B2 (en) | Task switching based on a shared memory condition associated with a data request and detecting lock line reservation lost events | |
US10884822B2 (en) | Deterministic parallelization through atomic task computation | |
CN105579961B (en) | Data processing system, operating method and hardware unit for data processing system | |
US9830158B2 (en) | Speculative execution and rollback | |
US20070150895A1 (en) | Methods and apparatus for multi-core processing with dedicated thread management | |
US7506339B2 (en) | High performance synchronization of accesses by threads to shared resources | |
US7631308B2 (en) | Thread priority method for ensuring processing fairness in simultaneous multi-threading microprocessors | |
US9891949B2 (en) | System and method for runtime scheduling of GPU tasks | |
US10579413B2 (en) | Efficient task scheduling using a locking mechanism | |
US6785887B2 (en) | Technique for using shared resources on a multi-threaded processor | |
CN104731560B (en) | Functional unit supporting multithread processing, processor and operation method thereof | |
US8645963B2 (en) | Clustering threads based on contention patterns | |
JPH1115793A (en) | Protection method for resource maintainability | |
US9886327B2 (en) | Resource mapping in multi-threaded central processor units | |
US7475397B1 (en) | Methods and apparatus for providing a remote serialization guarantee | |
US9652301B2 (en) | System and method providing run-time parallelization of computer software using data associated tokens | |
US20050251790A1 (en) | Systems and methods for instrumenting loops of an executable program | |
US20120159487A1 (en) | Identifying threads that wait for a mutex | |
US20170068306A1 (en) | Managing a free list of resources to decrease control complexity and reduce power consumption | |
US10360652B2 (en) | Wavefront resource virtualization | |
KR20080008683A (en) | Method and apparatus for processing according to multi-threading/out-of-order merged scheme | |
US20160328276A1 (en) | System, information processing device, and method | |
Francis et al. | Implementation of parallel clustering algorithms using Join and Fork model | |
US10922128B1 (en) | Efficiently managing the interruption of user-level critical sections | |
US12039363B2 (en) | Synchronizing concurrent tasks using interrupt deferral instructions |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: GERHARDT, DIANA R., TEXAS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:AGUILAR, MAXIMINO;DAY, MICHAEL NORMAN;NUTTER, MARK RICHARD;REEL/FRAME:016501/0023;SIGNING DATES FROM 20050729 TO 20050809 Owner name: INTERNATIONAL BUSINESS MACHINES, TEXAS Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:AGUILAR, MAXIMINO;DAY, MICHAEL NORMAN;NUTTER, MARK RICHARD;REEL/FRAME:016501/0023;SIGNING DATES FROM 20050729 TO 20050809 |
STCB | Information on status: application discontinuation |
Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION |