US20130014120A1 - Fair Software Locking Across a Non-Coherent Interconnect - Google Patents


Info

Publication number
US20130014120A1
Authority
US
United States
Prior art keywords
resource
owner
shared
hardware
access
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
US13/179,344
Other versions
US9158597B2 (en)
Inventor
Jonathan Ross
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Microsoft Technology Licensing LLC
Original Assignee
Microsoft Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Microsoft Corp
Priority to US13/179,344
Assigned to Microsoft Corporation (assignor: Jonathan Ross)
Publication of US20130014120A1
Assigned to Microsoft Technology Licensing, LLC (assignor: Microsoft Corporation)
Application granted
Publication of US9158597B2
Legal status: Active
Adjusted expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/52: Program synchronisation; Mutual exclusion, e.g. by means of semaphores
    • G06F 9/526: Mutual exclusion algorithms
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 2209/00: Indexing scheme relating to G06F9/00
    • G06F 2209/52: Indexing scheme relating to G06F9/52
    • G06F 2209/522: Manager

Definitions

  • Chip 100 includes ticket generation unit 108 that generates tickets 109 .
  • Ticket generation unit 108 is a hardware atomic primitive that returns a value T, an atomically incremented number, in each ticket 109. The atomic increment of T is well suited to non-coherent systems because there is no requirement to gain ownership of a cache line or bus lock.
  • Chip 100 may have multiple shared resources, such as cache 103 - 1 , 103 - 2 .
  • Chip 100 further comprises owner storage locations 111 associated with each shared resource.
  • An owner storage location 111 may be any dedicated hardware location or a software-determined general-purpose memory location. For example, the owner storage location may be a direct-map cache location, a hardware register, or a memory location.
  • The owner storage location 111 identifies the resource owner: the value O in storage location 111 indicates the ticket value T of the current owner of the associated resource. If the shared resource is to be initialized as available, then the value O 111 is initialized to contain the next value T 109 that will be returned from the ticket generation unit 108. If a resource is to be initialized as already held, then O 111 is set to a value that is one less than the next value T 109 to be returned from the ticket generation unit 108.
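As a small illustrative sketch (hypothetical function names, assuming a 32-bit ticket counter), the two initialization states described above can be expressed as:

```c
#include <stdint.h>

/* Sketch of the owner-field initialization rule: next_t is the next
   value T that the ticket generation unit will return. Names are
   hypothetical, not from the patent. */
static uint32_t init_owner_available(uint32_t next_t) {
    /* resource starts free: O holds the next ticket to be issued */
    return next_t;
}

static uint32_t init_owner_held(uint32_t next_t) {
    /* resource starts held: O is one less than the next ticket, so no
       outstanding ticket matches it yet (unsigned wraparound is fine) */
    return next_t - 1;
}
```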
  • A thread X that requires access to a shared resource first requests a ticket from ticket generation unit 108. Ticket generation unit 108 issues a ticket T X to thread X and then atomically increments the hardware counter.
  • Thread X compares the value of the ticket T X to the current owner value O 111 for the shared resource. If the value of O 111 does not match the ticket T X, then thread X periodically reads the value O 111 for the resource until O 111 matches the waiting thread's ticket value T X. Thread X then owns the shared resource and can operate upon or interact with the shared resource accordingly.
  • Owner field O 111 can be considered as protected by the resource and, therefore, does not require atomic accesses or special hardware support for updating O 111 .
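A minimal software sketch of this acquire/release sequence, assuming C11 atomics stand in for ticket generation unit 108 and owner field O 111 (the struct and function names are illustrative, not from the patent):

```c
#include <stdatomic.h>

/* Minimal sketch of the ticket/owner scheme. */
typedef struct {
    atomic_uint next_ticket;   /* models ticket generation unit 108 */
    atomic_uint owner;         /* models owner field O 111 */
} ticket_lock;

static void ticket_lock_init(ticket_lock *l) {
    atomic_init(&l->next_ticket, 0);
    /* initialized as available: O holds the next ticket value to be issued */
    atomic_init(&l->owner, 0);
}

static unsigned ticket_lock_acquire(ticket_lock *l) {
    /* request a unique ticket T (atomic increment) */
    unsigned t = atomic_fetch_add(&l->next_ticket, 1);
    /* poll the owner field until it matches our ticket */
    while (atomic_load(&l->owner) != t)
        ;  /* a real implementation would delay or back off here */
    return t;
}

static void ticket_lock_release(ticket_lock *l) {
    /* at concurrency one the owner field is protected by the resource
       itself, so a plain read-modify-write suffices */
    atomic_store(&l->owner, atomic_load(&l->owner) + 1);
}
```

Because only the current holder is entitled to advance O at concurrency one, the release needs no atomic read-modify-write or special hardware support.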
  • Conditional acquisition may be implemented using compare-and-swap hardware to issue a ticket T 109 only if an incremented T matches the current value in O.
  • The conditional sequence, with the hardware compare-and-swap as the atomic step, is: read the current owner value O; compare-and-swap the ticket counter from O to O + 1; on success, the execution unit holds ticket O and has immediate access to the resource; on failure, no ticket has been taken.
  • Once an execution unit has taken a ticket, it must continue to monitor the current value of the owner field O and, when its ticket value T equals the owner field value O, the execution unit must access the resource or, at a minimum, increment the owner field value if it does not access the resource.
  • An execution unit cannot ignore the owner field after it has taken a ticket, or the resource will become stalled and other devices will not be able to access the resource until the execution unit updates the owner field and allows the next device in line to access the resource.
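The conditional sequence can be sketched with a C11 compare-and-swap standing in for the hardware compare-and-swap (names are hypothetical). A ticket is dispensed only when taking it grants immediate ownership, so a failed attempt leaves nothing to monitor or clean up:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Conditional acquisition sketch: issue a ticket only if the freshly
   issued ticket T would equal the current owner value O. */
static bool try_acquire(atomic_uint *next_ticket, atomic_uint *owner) {
    unsigned o = atomic_load(owner);
    /* atomic step: advance the ticket counter from O to O + 1 only if O
       is still the next ticket to be dispensed; on success the caller
       holds ticket O and owns the resource immediately */
    return atomic_compare_exchange_strong(next_ticket, &o, o + 1);
}
```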
  • The example above has a concurrency level of one, meaning only one thread may access the resource at a time.
  • The ticket/owner mechanism described herein may be generalized to an arbitrary concurrency level. For a concurrency level N, where N threads are allowed to operate concurrently, a thread is allowed to access the resource if T − O < N.
  • For concurrency levels greater than one, the update of O 111 must be performed atomically.
  • A hardware mechanism identical to the ticket generation unit, which provides an atomic update for T, can be used to update O.
  • The hardware atomic mechanism for updating O may be configured to provide no return value, so the mechanism for updating O may be streamlined as a write for which the thread does not need to wait for completion.
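A sketch of the generalized admission test and release, assuming a 32-bit ticket counter (unsigned modular arithmetic keeps T − O < N correct across counter wraparound); names are hypothetical:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Admission test for concurrency level n: a ticket may enter when it is
   within n of the owner field. The unsigned subtraction handles ticket
   counter wraparound correctly. */
static bool may_enter(uint32_t ticket, uint32_t owner, uint32_t n) {
    return (uint32_t)(ticket - owner) < n;
}

/* For n > 1 the release increment of O must be atomic, since several
   holders may release at once; no return value is needed, so the write
   can be fire-and-forget. */
static void release_slot(atomic_uint *owner) {
    atomic_fetch_add(owner, 1);
}
```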
  • FIG. 2 illustrates a system 200 , such as a multicore processor, comprising a core 201 running a plurality of applications or threads A X-Z 202 .
  • System 200 includes a shared resource 203 that is used by each of the threads A X-Z 202 .
  • Owner field 204 identifies the current owner of shared resource 203 .
  • Each of the threads A X-Z 202 may access ticket generation unit 205 to request a ticket T to access shared resource 203 .
  • Each thread A X-Z 202 compares its ticket, T X-Z , to owner field O 204 to determine if it is allowed to access shared resource 203 .
  • For a concurrency level N, each thread A X-Z 202 compares its ticket T X-Z to the owner field and evaluates whether it meets the criterion T − O < N. Any of the threads A X-Z 202 that has a ticket T X-Z within N of O is allowed to access shared resource 203.
  • The width, in bits, of the atomic counter that is used to generate the tickets should be wide enough to count the maximum number of threads, which may be determined by the number of waiting threads plus the concurrency level.
  • The atomic increment is implemented as a read to a defined address, which returns an atomically incremented number.
  • The owner field is implemented as regular memory or as dedicated hardware storage.
  • Releasing a concurrency-level-one resource can be a non-atomic or an atomic increment of the owner field value O.
  • Releasing a resource is implemented as a load, increment, and store, or as one transaction that causes hardware to increment O, thereby reducing the number of hardware transactions required to release the resource.
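The read-to-a-defined-address dispenser can be mocked in software as follows (a hypothetical model, not hardware code): a single load from the ticket address behaves as a fetch-and-increment that the device performs atomically.

```c
#include <stdint.h>

/* Software mock of a memory-mapped ticket dispenser: reading the register
   returns the current ticket and advances the counter, an operation that
   real hardware would perform atomically inside the device. */
typedef struct {
    uint32_t counter;
} mock_ticket_reg;

static uint32_t ticket_reg_read(mock_ticket_reg *reg) {
    return reg->counter++;  /* modeled as atomic at the hardware level */
}
```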
  • FIG. 3 is a flowchart 300 illustrating a process for providing fair access to a shared resource.
  • The process illustrated in FIG. 3 may be applied to a shared resource that may be accessed by one or many execution units at the same time.
  • The concurrency parameter N is the number of execution units that may simultaneously access the shared resource. For concurrency of one, as discussed above, N = 1.
  • In step 301, an execution unit, such as an application, thread, or process, that requires access to the shared resource requests a ticket from a hardware atomic unit configured to distribute tickets having unique values.
  • The shared resource may be hardware or data, such as a memory block, register, device driver, or other resource.
  • In step 302, the execution unit reads or otherwise obtains the current value of the owner field associated with the shared resource.
  • The owner field identifies the ticket value of the execution unit that is currently in control of the shared resource.
  • In step 303, the execution unit compares the ticket value (obtained in step 301) and the current owner field value (read in step 302) against the concurrency level N for the shared resource. If T − O ≥ N, then the execution unit's ticket is not yet “up”, and the execution unit moves to step 304 and continues to wait. The execution unit then returns to step 302, where it obtains a new current value of the owner field, and the process continues to the comparison in step 303.
  • In step 304, the execution unit may immediately move to step 302 to obtain an updated owner field value, or it may delay for a predetermined period before moving back to step 302.
  • The predetermined period may be a fixed or variable interval. For example, the execution unit may use a backoff procedure to adjust the predetermined period, which may be employed to minimize traffic on a communication bus and/or to avoid collisions with other execution units that may be reading the owner field.
  • In step 306, the execution unit releases the shared resource, and the process then moves to step 307, where the execution unit increments the owner field value.
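The acquisition loop of FIG. 3, including a simple exponential backoff for step 304, might be sketched as follows (hypothetical names; step numbers in the comments refer to the flowchart):

```c
#include <stdatomic.h>

/* Sketch of the FIG. 3 acquisition loop for concurrency level n. */
static unsigned fair_acquire(atomic_uint *next_ticket, atomic_uint *owner,
                             unsigned n) {
    unsigned t = atomic_fetch_add(next_ticket, 1);      /* step 301: ticket */
    unsigned delay = 1;
    for (;;) {
        unsigned o = atomic_load(owner);                /* step 302: read O */
        if ((unsigned)(t - o) < n)                      /* step 303: compare */
            return t;                                   /* ticket is "up" */
        /* step 304: wait a predetermined, here exponentially growing,
           period before re-reading the owner field */
        for (volatile unsigned i = 0; i < delay; i++)
            ;
        if (delay < (1u << 16))
            delay *= 2;
    }
}
```

Steps 306 and 307 (release and owner-field increment) would follow once the unit finishes with the resource.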
  • FIG. 4 is a flowchart 400 illustrating a conditional access process for the shared resource according to one embodiment.
  • Once an execution unit receives a ticket, it must continue to monitor the current owner field to prevent the shared resource from being stalled. When the issued ticket number matches the owner field, the execution unit must at a minimum increment the owner field, whether or not it actually accesses the shared resource. In some embodiments, an execution unit may not want to wait for the shared resource if it is not immediately available. The process illustrated in FIG. 4 allows an execution unit to determine whether it can gain immediate access to the shared resource by “pulling” the next ticket.
  • In step 401, the execution unit reads the current owner field value O associated with the shared resource.
  • In step 402, the execution unit reads the value L of the last ticket issued by the hardware atomic unit.
  • In step 403, the execution unit compares the last ticket value L to the current owner field value O.
  • As illustrated in FIG. 3, when an execution unit completes its access and releases the shared resource (306), it then increments the owner field value (307). Accordingly, the next ticket in line will have access to the resource.
  • When the execution unit cannot gain immediate access to the shared resource (i.e., L ≠ O − 1), the process moves to step 404 and the execution unit does not take a ticket. Instead, the execution unit may proceed with other operations and may reattempt access to the shared resource at a later time and/or attempt to access a different resource.
  • Otherwise, in step 405, the execution unit requests a ticket from the hardware atomic unit.
  • The process may then move immediately to step 406, where the execution unit accesses the shared resource.
  • Alternatively, the execution unit may follow the process illustrated in FIG. 3 to verify that it actually has immediate access to the shared resource.
  • In step 407, the execution unit releases the shared resource, and the process then moves to step 408, where the execution unit increments the owner field value.
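A sketch of the FIG. 4 flow (hypothetical names; `next_ticket - 1` models the stored last-issued value L). Note that the read-then-pull sequence here is not atomic, so another unit could take a ticket between steps 403 and 405; the hardware compare-and-swap variant described earlier closes that window:

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of a ticketed resource: a ticket counter plus the
   owner field O. The last-issued ticket L is next_ticket - 1. */
typedef struct {
    atomic_uint next_ticket;
    atomic_uint owner;
} ticketed_resource;

/* Performs the access only if a freshly pulled ticket would be served
   immediately; otherwise takes no ticket and returns false. */
static bool conditional_access(ticketed_resource *r) {
    unsigned o = atomic_load(&r->owner);                /* step 401 */
    unsigned last = atomic_load(&r->next_ticket) - 1;   /* step 402: L */
    if (last != o - 1)                                  /* step 403 */
        return false;                                   /* step 404 */
    unsigned t = atomic_fetch_add(&r->next_ticket, 1);  /* step 405 */
    (void)t;                                            /* step 406: access */
    atomic_fetch_add(&r->owner, 1);                     /* steps 407-408 */
    return true;
}
```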
  • Alternatively, the execution unit could simply read the next ticket value from the hardware atomic unit to determine if the next ticket matches the current owner of the shared resource.
  • However, such a read of the next value in the hardware atomic unit may be equivalent to issuing a new ticket, which would then require the device to continue to monitor the owner field and to wait for its turn to access the shared resource and/or to increment the owner field.
  • For this reason, the value of the last-issued ticket may instead be stored in a location that is accessible to the cores.
  • It will be understood that steps 301-307 of the process illustrated in FIG. 3 and steps 401-408 of the process illustrated in FIG. 4 may be executed simultaneously and/or sequentially. It will be further understood that each step may be performed in any order and may be performed once or repetitiously.
  • A processor-readable medium may include any device or medium that can store or transfer information. Examples of such a processor-readable medium include an electronic circuit, a semiconductor memory device, a flash memory, a ROM, an erasable ROM (EROM), a floppy diskette, a compact disk, an optical disk, a hard disk, a fiber optic medium, and the like.
  • The software code segments may be stored in any volatile or non-volatile storage device, such as a hard drive, flash memory, solid state memory, optical disk, CD, DVD, computer program product, or other memory device, that provides computer-readable or machine-readable storage for a processor or a middleware container service.
  • The memory may be a virtualization of several physical storage devices, wherein the physical storage devices are of the same or different kinds.
  • The code segments may be downloaded or transferred from storage to a processor or container via an internal bus, another computer network, such as the Internet or an intranet, or via other wired or wireless networks.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

Access to a shared resource by a plurality of execution units is organized and controlled by issuing tickets to each execution unit as they request access to the resource. The tickets are issued by a hardware atomic unit so that each execution unit receives a unique ticket number. A current owner field indicates the ticket number of the execution unit that currently has access to the shared resource. When an execution unit has completed its access, it releases the shared resource and increments the owner field. Execution units awaiting access to the shared resource periodically check the current value of the owner field and take control of the shared resource when their respective ticket values match the owner field.

Description

    BACKGROUND
  • Multiple computer programs, processes, applications, and/or threads running on a computer or processor often need to access shared data or hardware, such as a memory block, register, device driver, or other common resource. To avoid data collisions and data corruption, locks are typically used to limit access to a shared resource to only one process at a time. This prevents multiple users from concurrently modifying the same shared data. For example, a group of processes may each have to acquire a lock before accessing a particular shared resource. When one process has acquired the lock, none of the other processes can acquire the lock, which provides exclusive access and control of the shared resource to the process that first acquired the lock.
  • Where multiple execution units try to acquire the same lock, the ability to acquire the lock may depend in part upon how fast an execution unit accesses the lock and how often the execution unit reattempts to acquire the lock when a first attempt is unsuccessful. For example, an execution unit that is remote from other execution units may be at a disadvantage due to the transmission delay of lock acquisition signals compared to the delays associated with closer execution units. If two units begin an attempt to acquire the lock at approximately the same time, the closer execution unit is likely to always have its request arrive first, and requests from a farther execution unit are likely to be too late. Additionally, when an execution unit cannot acquire a lock that was already in use by another device, the execution unit may back off for a period and will reattempt to acquire the lock at a later time. In the meantime, other devices may acquire the lock before the execution unit has reattempted acquiring the lock. As a result, if a number of other devices are attempting to acquire the lock, the execution device may have difficulty acquiring the lock in a timely manner.
  • SUMMARY
  • This Summary is provided to introduce a selection of concepts in a simplified form that are further described below in the Detailed Description. This Summary is not intended to identify key features or essential features of the claimed subject matter, nor is it intended to be used to limit the scope of the claimed subject matter.
  • Access to a shared resource by a plurality of execution units is organized and controlled by issuing tickets to each execution unit as they request access to the resource. The tickets are issued by a hardware atomic unit so that each execution unit receives a unique ticket number. A current owner field indicates the ticket number of the execution unit that currently has access to the shared resource. When an execution unit has completed its access, it releases the shared resource and increments the owner field. Execution units awaiting access to the shared resource periodically check the current value of the owner field and take control of the shared resource when their respective ticket values match the owner field.
  • Existing mechanisms require cache coherence to control ticket generation. Increasing cache coherence requirements limit scalability in the system. The mechanism described herein allows, through implementation of the hardware atomic unit, scalable non-cache coherent systems that still support an efficient shared resource arbitration mechanism.
  • In one embodiment, multiple execution units may access the shared resource concurrently. The execution units determine if they are allowed to access the shared resource by determining if their unique ticket number is within a concurrency number of the owner field value.
  • The execution units release the shared resource upon completion of their required access. The execution units increment the owner field value after releasing the shared resource.
  • In one embodiment, the execution units identify a last ticket number issued by the hardware atomic unit. The execution units compare the last issued ticket number to a number one less than the current value of the owner field. If the last issued ticket number is equal to the number one less than the current owner field value, then the execution unit may expect to achieve immediate access to the shared resource and, therefore, requests a new unique ticket from the hardware atomic unit. If the last issued ticket number is not equal to the number one less than the current owner field value, then the execution unit does not expect to achieve immediate access to the shared resource and, therefore, does not request a new unique ticket from the hardware atomic unit.
  • DRAWINGS
  • FIG. 1 illustrates a multicore processor chip according to an example embodiment;
  • FIG. 2 illustrates a system, such as a multicore processor, comprising a core running a plurality of applications or threads according to one embodiment;
  • FIG. 3 is a flowchart illustrating a process for providing fair access to a shared resource; and
  • FIG. 4 is a flowchart illustrating a conditional access process for a shared resource according to one embodiment.
  • DETAILED DESCRIPTION
  • FIG. 1 illustrates a multicore processor chip 100 having cores 101. Although only two cores 101-1, 101-2 are illustrated, it will be understood that chip 100 may have any number of cores 101. Each core 101 has a processing unit 102, a cache 103, and configuration registers 104. Core bus 105 provides a communication medium for the components of core 101. Cores 101 communicate via a chip bus 106. Cores 101 may also access an on-chip memory 107 using chip bus 106. One core 101-1 may access and manipulate the cache 103 of another core 101-2. Often, intra-core communications on bus 105 will be faster than inter-core communications on bus 106. Multicore chip 100 may have a coherency protocol or a locking mechanism to allow multiple cores 101 to manipulate a cache 103 or memory 107 in a coherent and deterministic manner. Alternatively, FIG. 1 may represent any system with a form of parallel independent processing. It will be understood that the present invention is not limited to applications on a multi-core chip.
  • Shared data or resources, such as shared memory 107 or shared cache 103, may be simultaneously required by two or more execution units, such as threads, applications, or processes. In prior systems, an atomic lock is often used to prevent data collisions where two execution units attempt to access the shared resource at the same time. For example, an atomic lock instruction is implemented when a first device accesses the shared resource, which prevents other devices from accessing the shared resource or changing the lock state. The lock is a hardware atomic primitive that provides mutual exclusion among the execution units. An execution unit that requires exclusive access to a shared resource will repeatedly request access until the request is granted. The waiting execution unit may use any one of a number of well-known mechanisms to reduce communication resource consumption while requesting access. For example, the waiting execution unit may issue a new request at regular intervals, or the execution unit may use exponential back-off to determine when to issue new requests.
  • However, there are certain problems with the mechanisms used in prior systems. One problem involves the timing of requests to access the resource. A requesting execution unit, such as a processor or thread, may attempt to reduce communication congestion by backing off on its retry interval. In this case, as the requesting execution unit uses longer periods between attempts to access the resource, it allows other devices more opportunities to acquire the desired resource instead. As a result, by backing off, the requesting execution unit is at a disadvantage compared to other requesters whose requests arrive soon after the release of the resource.
  • For example, two threads A and B may be waiting for a resource while a third thread C currently owns the resource. Thread A tries to acquire the resource, but is denied since the resource is owned by C. After a brief interval of trying to access the resource, thread A backs off and waits for a number of cycles before trying again. While thread A is waiting to re-try its access, thread C releases the resource and thread B begins attempts to access the resource. Thread B, which started its attempts to access the resource after thread A, will acquire the resource before thread A.
  • Another problem involves differences in access latencies within hardware implementing the request. For systems with non-uniform access latency among components, requesting execution units that are further away from the atomic lock hardware are at a disadvantage due to propagation delay of the request. As a result, a more remote execution unit may be starved for forward progress by requesters that are closer to the resource.
  • For example, three threads A, B, and C may be waiting for a resource, and thread C may have longer access latency for the resource than either thread A or B. If all three threads contend for the resource, then thread A or B will be more likely to acquire the resource than thread C. Moreover, in the event that thread A acquires the resource and threads B and C continue to contend for access, when A releases the resource, then thread B will be more likely to acquire the resource than thread C. Furthermore, in the event that thread A attempts to acquire the resource again before B releases the resource, when B releases the resource, then thread A will again be more likely to acquire the resource than thread C because of thread A's proximity. As a result, threads A and B may starve thread C from resource access and may limit thread C's forward progress.
  • In one embodiment, requesters' access requests for a shared resource are ordered to make the access process fairer. A hardware device dispenses “tickets” that guarantee a spot in a queue of requesting threads. An owner field identifies the current owner of the shared resource—like a “now serving” sign—and is used to indicate which ticket currently owns the resource. When a requesting thread sees the value of its ticket in the owner field, then that thread has exclusive access to the associated resources.
  • Chip 100 includes ticket generation unit 108 that generates tickets 109. Ticket generation unit 108 is a hardware atomic primitive that returns a value T, which is an atomically incremented number. The atomic increment of T in each ticket 109 is suited to non-coherent systems, as there is no requirement to gain ownership of a cache-line or bus-lock. Chip 100 may have multiple shared resources, such as caches 103-1, 103-2. Chip 100 further comprises owner storage locations 111 associated with each shared resource. Owner storage locations 111 may be any dedicated hardware location or a software-determined general-purpose memory location. For example, the owner storage location may be a direct-map cache location, a hardware register, or a memory location.
  • The Owner storage location 111 identifies the resource owner. The value O in storage location 111 indicates the ticket value T for the current owner of the associated resource. If the shared resource is to be initialized as available, then the value O 111 is initialized to contain the next value T 109 that will be returned from the ticket generation unit 108. If a resource is to be initialized as already held, then O 111 is set to a value that is one less than the next value T 109 to be returned from the ticket generation unit 108.
  • A thread X that requires access to a shared resource first requests a ticket from ticket generation unit 108. Ticket generation unit 108 issues a ticket TX to thread X and then atomically increases the hardware counter 109. Thread X compares the value of the ticket TX to the current owner O value 111 for the shared resource. If the value of O 111 does not match the ticket TX, then thread X periodically reads the value O 111 for the resource until O 111 matches the waiting thread's ticket value TX. When O matches the ticket value TX, thread X then owns the shared resource and can operate upon or interact with the shared resource accordingly. When thread X is finished with the resource, it increments O 111, which effectively passes ownership of the resource to the next waiting thread. Owner field O 111 can be considered as protected by the resource and, therefore, does not require atomic accesses or special hardware support for updating O 111.
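  • The acquire/release sequence described above can be sketched in software. The following is a minimal sketch, assuming C11 atomics stand in for the dedicated ticket generation unit and owner storage location; the type and function names are illustrative, not taken from the patent.

```c
#include <stdatomic.h>

/* Minimal software model of the ticket/owner scheme: the atomic
 * fetch-and-increment on `ticket` models the hardware ticket
 * generation unit, and `owner` models the owner storage location. */
typedef struct {
    atomic_uint ticket; /* next ticket T to be issued */
    atomic_uint owner;  /* owner field O: ticket that holds the resource */
} ticket_lock;

void ticket_lock_init(ticket_lock *l) {
    atomic_init(&l->ticket, 0);
    atomic_init(&l->owner, 0); /* O equals next T: resource available */
}

unsigned ticket_lock_acquire(ticket_lock *l) {
    /* Take a ticket: a single atomic reference per acquisition. */
    unsigned t = atomic_fetch_add(&l->ticket, 1);
    /* Spin with plain reads until the owner field reaches our ticket;
     * a back-off delay could be inserted here without losing our place. */
    while (atomic_load(&l->owner) != t)
        ;
    return t;
}

void ticket_lock_release(ticket_lock *l) {
    /* Pass ownership to the next waiting thread by incrementing O. */
    atomic_fetch_add(&l->owner, 1);
}
```

Note that the waiting loop performs only reads of the owner field; the sole atomic read-modify-write per acquisition is the ticket fetch, which is the property that makes the scheme attractive on a non-coherent interconnect.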
  • Conditional Acquisition
  • Once a waiting thread is granted a ticket T, the thread must continue waiting until it obtains the resource and then must increment O 111 when finished. To avoid this obligation when the resource is not immediately available, conditional acquisition may be implemented using compare-and-swap hardware that issues a ticket T 109 only if the issued ticket would immediately match the current value in O. The conditional sequence, with the hardware compare-and-swap as the atomic step, is:
  • Owner = O;              // current owner value, read by software
    P = Owner - 1;          // value the ticket counter T must hold for
                            // the conditional acquisition to succeed
    Y = Atomic(P, Owner) {  // executed atomically by the hardware
       if (P == T) {        // next ticket would immediately own the resource
          T = T + 1;        // issue the ticket; T now equals Owner
          return P;         // old counter value signals success
       } else {
          return T;         // no ticket issued; T != P signals failure
       }
    }
  • If Y—the returned value—is equal to P, then the resource has been acquired; otherwise the resource has not been acquired and a ticket has not been granted.
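  • The conditional sequence above can be modeled in software as a single compare-and-increment step. This is a sketch under the assumption that the hardware counter holds the last-issued ticket value and that the function body executes atomically; the names are illustrative.

```c
/* Software model of the conditional-acquisition primitive: if the
 * ticket counter holds owner - 1, the next ticket would immediately
 * own the resource, so issue it and return the old counter value
 * (== p) to signal success; otherwise return the counter unchanged.
 * The body is assumed to execute as one atomic hardware operation. */
typedef struct { unsigned counter; } ticket_unit;

unsigned cond_acquire(ticket_unit *u, unsigned owner) {
    unsigned p = owner - 1;          /* value T must hold to succeed */
    if (u->counter == p) {
        u->counter = u->counter + 1; /* issue the ticket; now == owner */
        return p;                    /* Y == P: resource acquired */
    }
    return u->counter;               /* Y != P: no ticket granted */
}
```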
  • In one embodiment, once an execution unit has taken a ticket, it must continue to monitor the current value of the owner field O and, when its ticket value T equals the owner field value O, the execution unit must access the resource or—at a minimum—increment the owner field value if it does not access the resource. An execution unit cannot ignore the owner field after it has taken a ticket, or the resource will become stalled and other devices will not be able to access the resource until the execution unit updates the owner field and allows the next device in line to access the resource.
  • Variable Concurrency Level
  • The example above has a concurrency level of one, meaning only one thread may access the resource at a time. To avoid stalling the resource and/or to allow multiple concurrent users, if supported by the resource, the ticket/owner mechanism described herein may be generalized to an arbitrary concurrency level. For a concurrency level “N”—where N threads are allowed to operate concurrently—a thread is allowed to access the resource if: T−O<N.
  • Because multiple threads operate concurrently on the same shared resource, the update of O 111 must be performed atomically. In one embodiment, a hardware mechanism identical to the ticket generation unit, which provides an atomic update for T, can be used to update O. Alternatively, because the return value of O is not required, the hardware atomic mechanism for updating O may be configured to provide no return value. In one embodiment, the mechanism for updating O may be streamlined as a write for which the thread does not need to wait for completion.
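  • The generalized T−O<N test and the atomic release can be sketched as follows, again using C11 atomics in place of the hardware units; the names and the choice of N are illustrative.

```c
#include <stdatomic.h>

/* Sketch of the scheme at concurrency level N: a thread whose
 * ticket t satisfies t - O < N may proceed, and releases increment
 * O atomically because up to N owners may release concurrently. */
enum { N = 3 };              /* up to three concurrent users */

atomic_uint ticket = 0;      /* next ticket T to be issued */
atomic_uint owner  = 0;      /* owner field O */

unsigned take_ticket(void) {
    return atomic_fetch_add(&ticket, 1);
}

int may_access(unsigned t) {
    /* unsigned subtraction keeps the test correct across wrap-around */
    return (t - atomic_load(&owner)) < (unsigned)N;
}

void release_resource(void) {
    /* atomic increment: several owners may release at the same time */
    atomic_fetch_add(&owner, 1);
}
```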
  • FIG. 2 illustrates a system 200, such as a multicore processor, comprising a core 201 running a plurality of applications or threads AX-Z 202. System 200 includes a shared resource 203 that is used by each of the threads AX-Z 202. Owner field 204 identifies the current owner of shared resource 203. Each of the threads AX-Z 202 may access ticket generation unit 205 to request a ticket T to access shared resource 203. Each thread AX-Z 202 compares its ticket, TX-Z, to owner field O 204 to determine if it is allowed to access shared resource 203.
  • For the case of concurrency level of 1 (N=1), each thread AX-Z 202 evaluates whether its ticket is equal to the owner field 204 (TX-Z=O) and whichever thread has the matching ticket is allowed to access shared resource 203.
  • For the case of concurrency level N, each thread AX-Z 202 compares its ticket TX-Z to the owner field and evaluates whether it meets the criteria T−O<N. Any of the threads AX-Z 202 that have a ticket TX-Z that is within N of O is allowed to access shared resource 203.
  • Using the shared resource access mechanisms described herein provides the following benefits:
      • 1) Threads gain access to the shared resource in the order in which they present their first request to the ticket-generating hardware atomic unit.
      • 2) Communication traffic to the hardware atomic unit is greatly reduced because only one reference per lock acquisition is required without regard to the level of contention.
      • 3) Back-off mechanisms implemented by threads waiting for resource ownership to be passed to them do not subject those threads to fairness imbalances caused by the waiting patterns or inter-arrival rates of other threads.
      • 4) Latency to the hardware atomic unit determines, at most, which position in line—or which ticket number—is granted to a thread, but such latency will not lead to starvation or a continuing arbitration disadvantage.
    Implementation Considerations
  • In one embodiment, the width—in bits—of the atomic counter that is used to generate the tickets should be wide enough to count the maximum number of threads, which may be determined by the number of waiting threads plus the concurrency level. The minimum number of bits is log2(maximum number of threads plus concurrency level), where that total is rounded up to the next power of 2. For example, if the maximum number of threads is 64, then the bit-width must be at least six bits, since log2(64)=6. In some embodiments, this is the number of hardware threads or logical processors in the system.
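  • As a sketch of the width calculation, and of why a narrow counter remains correct when the ticket/owner comparison is done in modular arithmetic (an illustration consistent with the text, not mandated by it; helper names are hypothetical):

```c
/* Smallest bit-width whose counter can distinguish max_outstanding
 * simultaneously live ticket values (waiting threads plus the
 * concurrency level). */
unsigned min_bits(unsigned max_outstanding) {
    unsigned bits = 0;
    while ((1u << bits) < max_outstanding)
        bits++;
    return bits;
}

/* With a w-bit counter, tickets wrap; computing (t - o) mod 2^w
 * before comparing to N keeps the access test correct across wrap. */
int may_access_w(unsigned t, unsigned o, unsigned n, unsigned w) {
    unsigned mask = (w >= 32) ? 0xFFFFFFFFu : ((1u << w) - 1u);
    return ((t - o) & mask) < n;
}
```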
  • In some embodiments, the atomic increment is implemented as a read to a defined address, which returns an atomically incremented number.
  • In some embodiments, the owner field is implemented as regular memory or as dedicated hardware storage.
  • In other embodiments, releasing a concurrency level 1 resource can be a non-atomic or an atomic increment of the owner field value O.
  • In other embodiments, releasing a resource is implemented either as a load, increment, and store sequence, or as a single transaction that causes hardware to increment O, thereby reducing the number of hardware transactions required to release the resource.
  • FIG. 3 is a flowchart 300 illustrating a process for providing fair access to a shared resource. The process illustrated in FIG. 3 may be applied to a shared resource that may be accessed by one or many execution units at the same time. The concurrency parameter—N—is the number of execution units that may simultaneously access the shared resource. For concurrency of one, as discussed above, N=1. In step 301, an execution unit, such as an application, thread, or process that requires access to the shared resource, requests a ticket from a hardware atomic unit configured to distribute tickets having unique values. The shared resource may be hardware or data, such as a memory block, register, device driver, or other resource. In step 302, the execution unit reads or otherwise obtains the current value of the owner field associated with the shared resource. The owner field identifies the ticket value of the execution unit that is currently in control of the shared resource.
  • In step 303, the execution unit compares the ticket value (obtained in step 301) and the current owner field value (read in step 302) to the concurrency level N for the shared resource. If T−O≥N, then the execution unit's ticket is not yet “up,” and the execution unit moves to step 304, where it continues to wait before returning to step 302 to obtain an updated owner field value. In step 304, the execution unit may move immediately back to step 302, or it may delay for a predetermined period, which may be a fixed or variable interval. For example, the execution unit may use a backoff procedure to adjust the predetermined period in order to minimize traffic on a communication bus and/or to avoid collisions with other execution units that may be reading the owner field. The process then returns to the comparison in step 303.
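  • The wait loop of steps 302 through 304 might look like the following sketch. The backoff bounds and the spin_pause helper are illustrative assumptions; because the thread's place in line is fixed by its ticket, backing off cannot cost it its turn.

```c
#include <stdatomic.h>

/* Stand-in for a platform pause/delay primitive. */
static void spin_pause(unsigned iters) {
    for (volatile unsigned i = 0; i < iters; i++)
        ;
}

/* Steps 302-304: re-read the owner field until it matches the
 * thread's ticket, with a capped exponential backoff between reads
 * to reduce bus traffic.  Shown for concurrency level one. */
void wait_for_turn(atomic_uint *owner, unsigned ticket_val) {
    unsigned delay = 1;
    while (atomic_load(owner) != ticket_val) { /* steps 302-303 */
        spin_pause(delay);                     /* step 304 */
        if (delay < 1024u)
            delay *= 2;                        /* exponential backoff */
    }
}
```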
  • If the difference between the values of the ticket and the owner field are less than the concurrency level (i.e. T−O<N), then the process moves to step 305 and the execution unit is granted access to the shared resource. If the shared resource has a concurrency level of one (N=1), for example, then the execution unit is granted access when the ticket and owner field values are the same (i.e. when T=O, then T−O=0<N=1).
  • After the execution unit has completed its use of the shared resource, the process moves to step 306 where the execution unit releases the shared resource and then to step 307 where the execution unit increments the owner field value.
  • FIG. 4 is a flowchart 400 illustrating a conditional access process for the shared resource according to one embodiment. As noted above, once an execution unit receives a ticket, it must continue to monitor the current owner field to prevent the shared resource from being stalled. When the issued ticket number matches the owner field, then the execution unit must increment the owner field at a minimum, whether or not the execution unit actually accesses the shared resource. In some embodiments, an execution unit may not want to wait to access the shared resource if it is not immediately available. The process illustrated in FIG. 4 allows an execution unit to determine whether it can gain immediate access to the shared resource by “pulling” the next ticket.
  • In step 401, the execution unit reads the current owner field value O associated with the shared resource. In step 402, the execution unit reads the value L of the last ticket issued by the hardware atomic unit. In step 403, the execution unit compares the last ticket value L to the current owner field value O.
  • If the last ticket value L is one less than the current owner field value O (i.e. L=O−1), then the next ticket issued (i.e. L+1=T) will immediately own the resource. As illustrated in FIG. 3, when an execution unit completes its access and releases the shared resource (306), it then increments the owner field value (307). Accordingly, the next ticket in line will have access to the resource.
  • However, if the last ticket value L issued is greater than (O−1) where O is the current Owner field value, then the next ticket pulled will have to wait for access to the resource.
  • In flowchart 400, when the execution unit cannot gain immediate access to the shared resource (i.e. L≠O−1), then the process moves to step 404 and the execution unit does not take a ticket. Instead, the execution unit may proceed with other operations and may reattempt access to the shared resource at a later time and/or attempt to access a different resource.
  • On the other hand, when the execution will gain immediate access to the shared resource (i.e. L=O−1), then the process moves to step 405 where the execution unit requests a ticket from the hardware atomic unit. The process may then move immediately to step 406 where the execution unit accesses the shared resource. Alternatively, between steps 405 and 406, the execution unit may follow the process illustrated in FIG. 3 to verify that it actually has immediate access to the shared resource.
  • After the execution unit has completed its use of the shared resource, the process moves to step 407 where the execution unit releases the shared resource and then to step 408 where the execution unit increments the owner field value.
  • In other embodiments, the execution unit could simply read the next ticket value from the hardware atomic unit to determine if the next ticket matches the current owner of the shared resource. However, in some embodiments, such reading of the next value in the hardware atomic unit may be equivalent to issuing a new ticket, which would then require a device to continue to monitor owner field and to wait for a turn to access the shared resource and/or to increment the owner field. Instead, when a ticket is issued, the value of the last-issued ticket may be stored in a location that is accessible to the cores.
  • The process illustrated in flowchart 400 is for the case of concurrency level one, but may be generalized to allow higher concurrency levels N. For example, if the next ticket T minus the concurrency level N is less than the current owner value (i.e. T−N<O), then the next ticket T will not have to wait for access to the resource. In terms of the last ticket value L (i.e. L=T−1), this can be represented as L−N<O−1.
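  • The generalized immediate-access test can be sketched as a single predicate over the last-issued ticket L, the owner value O, and the concurrency level N as defined above; the function name is illustrative.

```c
/* Returns nonzero when the NEXT ticket (L + 1) would gain immediate
 * access under concurrency level n: (L + 1) - O < N, which is the
 * condition written in the text as L - N < O - 1.  For n == 1 this
 * reduces to the L == O - 1 test of flowchart 400. */
int next_ticket_immediate(unsigned l, unsigned o, unsigned n) {
    return (l + 1u) - o < n;
}
```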
  • It will be understood that steps 301-307 of the process illustrated in FIG. 3 and steps 401-408 of the process illustrated in FIG. 4 may be executed simultaneously and/or sequentially. It will be further understood that each step may be performed in any order and may be performed once or repetitiously.
  • Many of the functions described herein may be implemented in hardware, software, and/or firmware, and/or any combination thereof. When implemented in software, code segments perform the necessary tasks or steps. The program or code segments may be stored in a processor-readable, computer-readable, or machine-readable medium. The processor-readable, computer-readable, or machine-readable medium may include any device or medium that can store or transfer information. Examples of such a processor-readable medium include an electronic circuit, a semiconductor memory device, a flash memory, a ROM, an erasable ROM (EROM), a floppy diskette, a compact disk, an optical disk, a hard disk, a fiber optic medium, etc.
  • The software code segments may be stored in any volatile or non-volatile storage device, such as a hard drive, flash memory, solid state memory, optical disk, CD, DVD, computer program product, or other memory device, that provides computer-readable or machine-readable storage for a processor or a middleware container service. In other embodiments, the memory may be a virtualization of several physical storage devices, wherein the physical storage devices are of the same or different kinds. The code segments may be downloaded or transferred from storage to a processor or container via an internal bus, another computer network, such as the Internet or an intranet, or via other wired or wireless networks.
  • Although the subject matter has been described in language specific to structural features and/or methodological acts, it is to be understood that the subject matter defined in the appended claims is not necessarily limited to the specific features or acts described above. Rather, the specific features and acts described above are disclosed as example forms of implementing the claims.

Claims (18)

1. A method, comprising:
receiving a unique number from a hardware atomic unit;
receiving an owner number associated with a shared system resource; and
accessing the shared system resource when the unique number matches the owner number.
2. The method of claim 1, further comprising:
releasing the shared system resource; and
incrementing the owner number.
3. The method of claim 1, further comprising:
comparing the unique number to the owner number;
receiving an updated owner number after a delay period; and
accessing the shared system resource when the unique number matches the updated owner number.
4. The method of claim 3, wherein the delay period is a fixed interval.
5. The method of claim 3, wherein the delay period is an exponential backoff interval.
6. The method of claim 1, wherein the shared system resource is a hardware device.
7. The method of claim 1, wherein the shared system resource is a data storage location.
8. A method, comprising:
receiving a current number associated with a hardware atomic unit;
receiving an owner number associated with a shared system resource;
comparing the current number associated with the hardware atomic unit to a number that is one less than the owner number; and
either requesting a new number when the current number associated with the hardware atomic unit is equal to a number that is one less than the owner number, or not requesting a new number when the current number associated with the hardware atomic unit is not equal to a number that is one less than the owner number.
9. The method of claim 8, further comprising:
accessing the shared system resource when the new current number matches the owner number.
10. The method of claim 9, further comprising:
releasing the shared system resource; and
incrementing the owner number.
11. The method of claim 8, wherein the current number associated with the hardware atomic unit is equal to a last number issued by the hardware atomic unit.
12. The method of claim 8, wherein the current number associated with the hardware atomic unit is equal to a last number issued by the hardware atomic unit minus a concurrency level for the shared system resource.
13. The method of claim 8, wherein the shared system resource is a hardware device.
14. The method of claim 8, wherein the shared system resource is a data storage location.
15. A system, comprising:
a hardware atomic unit adapted to issue ticket numbers upon request from execution units operating on the system;
a shared resource accessible by the execution units; and
a storage device adapted to hold an owner value associated with the shared resource;
wherein the execution units receive a unique ticket number from the hardware atomic unit and are permitted access to the shared resource when the unique ticket number matches the owner value.
16. The system of claim 15, wherein the execution units are selected from one or more of threads, applications, and processes.
17. The system of claim 15, wherein the storage device is selected from a direct-map cache location, a hardware register, or a memory location.
18. The system of claim 15, wherein the shared resource permits concurrent access by a plurality of execution units, and wherein the execution units are permitted access to the shared resource when the unique ticket number is within a concurrency level number of the owner value.
US13/179,344 2011-07-08 2011-07-08 Controlling access to shared resource by issuing tickets to plurality of execution units Active 2032-07-16 US9158597B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US13/179,344 US9158597B2 (en) 2011-07-08 2011-07-08 Controlling access to shared resource by issuing tickets to plurality of execution units

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US13/179,344 US9158597B2 (en) 2011-07-08 2011-07-08 Controlling access to shared resource by issuing tickets to plurality of execution units

Publications (2)

Publication Number Publication Date
US20130014120A1 true US20130014120A1 (en) 2013-01-10
US9158597B2 US9158597B2 (en) 2015-10-13

Family

ID=47439448

Family Applications (1)

Application Number Title Priority Date Filing Date
US13/179,344 Active 2032-07-16 US9158597B2 (en) 2011-07-08 2011-07-08 Controlling access to shared resource by issuing tickets to plurality of execution units

Country Status (1)

Country Link
US (1) US9158597B2 (en)

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104199800A (en) * 2014-07-21 2014-12-10 上海寰创通信科技股份有限公司 Method for eliminating mutual exclusion of table items in multi-core system
US9313208B1 (en) * 2014-03-19 2016-04-12 Amazon Technologies, Inc. Managing restricted access resources
WO2016126516A1 (en) * 2015-02-02 2016-08-11 Optimum Semiconductor Technologies, Inc. Vector processor configured to operate on variable length vectors with asymmetric multi-threading
WO2017018976A1 (en) * 2015-07-24 2017-02-02 Hewlett Packard Enterprise Development Lp Lock manager
US20190129846A1 (en) * 2017-10-30 2019-05-02 International Business Machines Corporation Dynamic Resource Visibility Tracking to Avoid Atomic Reference Counting
US10423464B2 (en) 2016-09-30 2019-09-24 Hewlett Packard Enterprise Patent Development LP Persistent ticket operation
US20200034214A1 (en) * 2019-10-02 2020-01-30 Juraj Vanco Method for arbitration and access to hardware request ring structures in a concurrent environment
US20200356485A1 (en) * 2019-05-09 2020-11-12 International Business Machines Corporation Executing multiple data requests of multiple-core processors
US11269692B2 (en) * 2011-12-29 2022-03-08 Oracle International Corporation Efficient sequencer for multiple concurrently-executing threads of execution
US11321146B2 (en) 2019-05-09 2022-05-03 International Business Machines Corporation Executing an atomic primitive in a multi-core processor system
US11681567B2 (en) * 2019-05-09 2023-06-20 International Business Machines Corporation Method and processor system for executing a TELT instruction to access a data item during execution of an atomic primitive

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
GB2532424B (en) 2014-11-18 2016-10-26 Ibm An almost fair busy lock
US10146689B2 (en) 2017-01-20 2018-12-04 Hewlett Packard Enterprise Development Lp Locally poll flag in multi processing node system to determine whether a resource is free to use for thread
CN113535412B (en) 2020-04-13 2024-05-10 伊姆西Ip控股有限责任公司 Method, apparatus and computer program product for tracking locks

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030195920A1 (en) * 2000-05-25 2003-10-16 Brenner Larry Bert Apparatus and method for minimizing lock contention in a multiple processor system with multiple run queues
US20040098723A1 (en) * 2002-11-07 2004-05-20 Zoran Radovic Multiprocessing systems employing hierarchical back-off locks
US20040215858A1 (en) * 2003-04-24 2004-10-28 International Business Machines Corporation Concurrent access of shared resources
US20070300226A1 (en) * 2006-06-22 2007-12-27 Bliss Brian E Efficient ticket lock synchronization implementation using early wakeup in the presence of oversubscription
US20080098180A1 (en) * 2006-10-23 2008-04-24 Douglas Larson Processor acquisition of ownership of access coordinator for shared resource
US20110252166A1 (en) * 2009-01-23 2011-10-13 Pradeep Padala System and Methods for Allocating Shared Storage Resources

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7529844B2 (en) 2002-04-26 2009-05-05 Sun Microsystems, Inc. Multiprocessing systems employing hierarchical spin locks
US7698523B2 (en) 2006-09-29 2010-04-13 Broadcom Corporation Hardware memory locks
US8392925B2 (en) 2009-03-26 2013-03-05 Apple Inc. Synchronization mechanisms based on counters
US8838944B2 (en) 2009-09-22 2014-09-16 International Business Machines Corporation Fast concurrent array-based stacks, queues and deques using fetch-and-increment-bounded, fetch-and-decrement-bounded and store-on-twin synchronization primitives

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030195920A1 (en) * 2000-05-25 2003-10-16 Brenner Larry Bert Apparatus and method for minimizing lock contention in a multiple processor system with multiple run queues
US20040098723A1 (en) * 2002-11-07 2004-05-20 Zoran Radovic Multiprocessing systems employing hierarchical back-off locks
US20040215858A1 (en) * 2003-04-24 2004-10-28 International Business Machines Corporation Concurrent access of shared resources
US20070300226A1 (en) * 2006-06-22 2007-12-27 Bliss Brian E Efficient ticket lock synchronization implementation using early wakeup in the presence of oversubscription
US20080098180A1 (en) * 2006-10-23 2008-04-24 Douglas Larson Processor acquisition of ownership of access coordinator for shared resource
US20110252166A1 (en) * 2009-01-23 2011-10-13 Pradeep Padala System and Methods for Allocating Shared Storage Resources

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11269692B2 (en) * 2011-12-29 2022-03-08 Oracle International Corporation Efficient sequencer for multiple concurrently-executing threads of execution
US9313208B1 (en) * 2014-03-19 2016-04-12 Amazon Technologies, Inc. Managing restricted access resources
CN104199800A (en) * 2014-07-21 2014-12-10 上海寰创通信科技股份有限公司 Method for eliminating mutual exclusion of table items in multi-core system
KR102255313B1 (en) 2015-02-02 2021-05-24 옵티멈 세미컨덕터 테크놀로지스 인코포레이티드 Vector processor configured to operate on variable length vectors using asymmetric multi-threading
WO2016126516A1 (en) * 2015-02-02 2016-08-11 Optimum Semiconductor Technologies, Inc. Vector processor configured to operate on variable length vectors with asymmetric multi-threading
KR20170110685A (en) * 2015-02-02 2017-10-11 옵티멈 세미컨덕터 테크놀로지스 인코포레이티드 A vector processor configured to operate on variable length vectors using asymmetric multi-threading;
US10339094B2 (en) 2015-02-02 2019-07-02 Optimum Semiconductor Technologies, Inc. Vector processor configured to operate on variable length vectors with asymmetric multi-threading
WO2017018976A1 (en) * 2015-07-24 2017-02-02 Hewlett Packard Enterprise Development Lp Lock manager
US10423464B2 (en) 2016-09-30 2019-09-24 Hewlett Packard Enterprise Patent Development LP Persistent ticket operation
US20190129846A1 (en) * 2017-10-30 2019-05-02 International Business Machines Corporation Dynamic Resource Visibility Tracking to Avoid Atomic Reference Counting
US10621086B2 (en) * 2017-10-30 2020-04-14 International Business Machines Corporation Dynamic resource visibility tracking to avoid atomic reference counting
CN113767372A (en) * 2019-05-09 2021-12-07 国际商业机器公司 Executing multiple data requests of a multi-core processor
US20200356485A1 (en) * 2019-05-09 2020-11-12 International Business Machines Corporation Executing multiple data requests of multiple-core processors
US11321146B2 (en) 2019-05-09 2022-05-03 International Business Machines Corporation Executing an atomic primitive in a multi-core processor system
US11681567B2 (en) * 2019-05-09 2023-06-20 International Business Machines Corporation Method and processor system for executing a TELT instruction to access a data item during execution of an atomic primitive
US20200034214A1 (en) * 2019-10-02 2020-01-30 Juraj Vanco Method for arbitration and access to hardware request ring structures in a concurrent environment
US11748174B2 (en) * 2019-10-02 2023-09-05 Intel Corporation Method for arbitration and access to hardware request ring structures in a concurrent environment

Also Published As

Publication number Publication date
US9158597B2 (en) 2015-10-13

Similar Documents

Publication Publication Date Title
US9158597B2 (en) Controlling access to shared resource by issuing tickets to plurality of execution units
US7861042B2 (en) Processor acquisition of ownership of access coordinator for shared resource
US8539486B2 (en) Transactional block conflict resolution based on the determination of executing threads in parallel or in serial mode
US9996402B2 (en) System and method for implementing scalable adaptive reader-writer locks
US9619303B2 (en) Prioritized conflict handling in a system
US9170844B2 (en) Prioritization for conflict arbitration in transactional memory management
JP3871305B2 (en) Dynamic serialization of memory access in multiprocessor systems
US8015248B2 (en) Queuing of conflicted remotely received transactions
US8689221B2 (en) Speculative thread execution and asynchronous conflict events
JP5787629B2 (en) Multi-processor system on chip for machine vision
US11461151B2 (en) Controller address contention assumption
JP2000076217A (en) Lock operation optimization system and method for computer system
US8141089B2 (en) Method and apparatus for reducing contention for computer system resources using soft locks
US9747210B2 (en) Managing a lock to a resource shared among a plurality of processors
EP3379421B1 (en) Method, apparatus, and chip for implementing mutually-exclusive operation of multiple threads
US20110320659A1 (en) Dynamic multi-level cache including resource access fairness scheme
CN106068497B (en) Transactional memory support
US9442971B2 (en) Weighted transaction priority based dynamically upon phase of transaction completion
Zhang et al. Scalable adaptive NUMA-aware lock
CN112306703A (en) Critical region execution method and device in NUMA system
WO2017131624A1 (en) A unified lock
US11880304B2 (en) Cache management using cache scope designation
EP2707793B1 (en) Request to own chaining in multi-socketed systems
US8930628B2 (en) Managing in-line store throughput reduction
CN117687744A (en) Method for dynamically scheduling transaction in hardware transaction memory

Legal Events

Date Code Title Description
AS Assignment

Owner name: MICROSOFT CORPORATION, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:ROSS, JONATHAN;REEL/FRAME:026564/0939

Effective date: 20110707

AS Assignment

Owner name: MICROSOFT TECHNOLOGY LICENSING, LLC, WASHINGTON

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MICROSOFT CORPORATION;REEL/FRAME:034544/0001

Effective date: 20141014

STCF Information on status: patent grant

Free format text: PATENTED CASE

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 4

MAFP Maintenance fee payment

Free format text: PAYMENT OF MAINTENANCE FEE, 8TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1552); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

Year of fee payment: 8