US10061568B2 - Dynamic alias checking with transactional memory - Google Patents

Dynamic alias checking with transactional memory

Info

Publication number
US10061568B2
Authority
US
United States
Prior art keywords
code
region
alias
optimized
thread
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
US15/850,668
Other versions
US20180095736A1 (en
Inventor
Yaoqing Gao
William G. O'Farrell
Denis Palmeiro
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
International Business Machines Corp
Original Assignee
International Business Machines Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by International Business Machines Corp
Priority to US15/850,668
Assigned to INTERNATIONAL BUSINESS MACHINES CORPORATION. Assignment of assignors' interest (see document for details). Assignors: GAO, YAOQING; O'FARRELL, WILLIAM G.; PALMEIRO, DENIS
Publication of US20180095736A1
Application granted
Publication of US10061568B2
Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/43 Checking; Contextual analysis
    • G06F8/433 Dependency analysis; Data or control flow analysis
    • G06F8/434 Pointers; Aliasing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/30 Creation or generation of source code
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/42 Syntactic analysis
    • G06F8/427 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/44 Encoding
    • G06F8/443 Optimisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/44 Encoding
    • G06F8/443 Optimisation
    • G06F8/4434 Reducing the memory space required by the program code
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/44 Encoding
    • G06F8/443 Optimisation
    • G06F8/4441 Reducing the execution time required by the program code
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F8/00 Arrangements for software engineering
    • G06F8/40 Transformation of program code
    • G06F8/41 Compilation
    • G06F8/45 Exploiting coarse grain parallelism in compilation, i.e. parallelism between groups of instructions
    • G06F8/458 Synchronisation, e.g. post-wait, barriers, locks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/466 Transaction processing
    • G06F9/467 Transactional memory
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/52 Program synchronisation; Mutual exclusion, e.g. by means of semaphores

Definitions

  • the present invention relates generally to the field of compiler optimizations, and more particularly to aliasing.
  • compilation is a process in which a compiler transforms source code written in a programming language (e.g., Java, C++, etc.) into machine instructions for creating an executable program.
  • a compiler may optimize the code to be compiled. For example, a compiler may minimize or transform certain attributes and/or segments of the code.
  • One example of a compiler optimization technique is automatic vectorization.
  • Aliasing is a situation in computer programming where a data object associated with a defined location in computer memory can be accessed through a plurality of different pointers in code. Therefore, changing the data object via any one of the plurality of pointers implicitly changes the data object for all the other associated pointers. This situation may also be referred to as an alias dependence.
  • a method for dynamic run-time alias checking comprising creating, by a dependency checker, a main thread and a helper thread; computing, by the dependency checker, an optimized first region of code in a rollback-only transactional memory associated with the main thread; checking, by the dependency checker, for one or more alias dependencies in an un-optimized first region of code; responsive to a determination in a predetermined amount of time that no alias dependencies are present in the un-optimized first region of code, committing, by the dependency checker, a transaction; and responsive to at least one of a failure to determine results of the check for one or more alias dependencies in the predetermined amount of time and a determination in the predetermined amount of time that alias dependencies are present in the un-optimized first region of code, performing, by the dependency checker, a rollback of the transaction and executing the un-optimized first region of code.
  • FIG. 1 is a functional block diagram illustrating a distributed data processing environment, in accordance with an embodiment of the present invention.
  • FIG. 2 is a flowchart depicting operational steps of a dependency checker on a computer system within the data processing environment of FIG. 1, in accordance with an embodiment of the present invention.
  • FIG. 3 is a block diagram of components of the computer system executing the dependency checker, in accordance with an embodiment of the present invention.
  • Embodiments of the present invention recognize that with alias dependencies present, an execution of an optimized version of code can lead to incorrect results.
  • the memory locations of data objects associated with pointers are not known at the time of compilation. Therefore, unforeseen alias dependencies may cause optimized executable programs to function incorrectly or differently than expected.
  • embodiments of the present invention provide a solution for dynamic run-time alias checking for the prevention of a transaction to memory associated with an execution of optimized code containing alias dependencies.
  • references in the specification to “an embodiment,” “other embodiments,” etc. indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, describing a particular feature, structure or characteristic in connection with an embodiment, one skilled in the art has the knowledge to affect such feature, structure or characteristic in connection with other embodiments whether or not explicitly described.
  • FIG. 1 is a functional block diagram illustrating a distributed data processing environment 100, in accordance with one embodiment of the present invention.
  • Distributed data processing environment 100 comprises computer system 102 and communication device 120, interconnected over network 140.
  • Computer system 102 can be a laptop computer, tablet computer, netbook computer, personal computer (PC), a desktop computer, a personal digital assistant (PDA), a smart phone, or any programmable electronic device capable of communicating with communication device 120 via network 140.
  • Computer system 102 comprises dependency checker 104, a component for dynamically checking program instructions for alias dependencies.
  • Computer system 102 may include internal and external hardware components, as depicted and described in further detail with respect to FIG. 3.
  • Dependency checker 104 comprises thread creator 106 .
  • Thread creator 106 can create a plurality of threads, e.g., main thread 108 and helper thread 110 , which can reside on a single processor. In some embodiments, the plurality of threads can reside on a plurality of separate computer processors.
  • Main thread 108 can perform a speculative computation of an optimized version of code in a form of transactional memory such as, but not limited to, a rollback-only transaction.
  • Helper thread 110 is synchronized with main thread 108 and performs address range computations between any pointers which are suspected to be aliased in the optimized region of code that has been speculatively computed by main thread 108 .
  • helper thread 110 can dynamically check for alias dependencies that may be present in the optimized code in parallel with main thread 108 and without having to write any data to memory.
  • Helper thread 110 can communicate results of an alias dependency check to main thread 108 via a synchronization mechanism such as, but not limited to, a pairing of larx (Load and Reserve) and stcx (Store Conditional) instructions.
  • a speculative computation can refer to an execution of optimized code, e.g., a region of optimized code, by main thread 108 wherein it is assumed that no alias dependencies are present.
  • main thread 108 can be configured not to commit the transaction to memory, e.g., computer memory comprising computer system 102 , until it is confirmed that no alias dependencies are present, as will be discussed subsequently in greater detail.
  • Embodiments of the present invention can utilize rollback-only transactions for performing speculative computations of optimized code.
  • Rollback-only transactions are a form of transactional memory with several advantages over conventional transactions, i.e., atomic transactions.
  • Rollback-only transactions are not used to manipulate shared data and therefore do not adhere to memory coherence protocols for synchronization and conflict tracking required by atomic transactions.
  • Memory barriers associated with instructions for performing atomic transactions add overhead in terms of computing resources consumed.
  • Rollback-only transactions provide greater efficiency in that they do not require memory barriers nor the serialization needed to synchronize with other concurrent (conventional) transactions.
  • Main thread 108 and helper thread 110 are therefore able to run in parallel without conflicting with one another's data.
  • computing device 120 can be a laptop computer, tablet computer, netbook computer, personal computer (PC), a desktop computer, a personal digital assistant (PDA), a smart phone, or any programmable electronic device capable of communicating with computer system 102 via network 140 .
  • Computing device 120 can be generally representative of any number of such computing devices.
  • main thread 108 and/or helper thread 110 can reside on computer processors comprising computing device 120 .
  • Network 140 can be, for example, a local area network (LAN), a wide area network (WAN) such as the Internet, or a combination of the two, and can include wired, wireless, or fiber optic connections.
  • network 140 can be any combination of connections and protocols that will support communications between computer system 102 and communication device 120 .
  • FIG. 2 is a flowchart 200 depicting operational steps of dependency checker 104 for dynamically checking program instructions for alias dependencies, in accordance with an embodiment of the present invention.
  • the illustrative example of FIG. 2 is provided to facilitate discussion of aspects of the present invention, and it should be appreciated that FIG. 2 provides only an illustration of an embodiment of the present invention and does not imply any limitations with regard to the variations or configurations in which different embodiments may be implemented.
  • Thread creator 106 creates main thread 108 and helper thread 110 for executing program instructions and checking for alias dependencies (step 202 ).
  • main thread 108 and helper thread 110 reside in a process associated with a computer processor comprising computer system 102 .
  • Main thread 108 can execute program instructions, e.g., machine code, in a form of transactional memory such as a rollback-only transaction.
  • Helper thread 110 is synchronized with main thread 108 and placed in a busy-wait state until receiving further instructions. For example, main thread 108 executes larx/stcx instructions to tell the helper thread 110 to exit the busy-wait state.
  • main thread 108 performs a speculative computation of an optimized first region of code, associated with an optimized version of code (step 204 ).
  • the optimized first region of code may have been optimized by a compiler using an optimization technique such as, but not limited to, automatic vectorization.
  • Helper thread 110, synchronized with main thread 108, leaves the busy-wait state and performs a check for alias dependencies in an un-optimized version of code (step 206).
  • main thread 108 executes larx/stcx instructions to have helper thread 110 leave the busy-wait state, wherein helper thread 110 performs a check for alias dependencies in an un-optimized first region of code corresponding to the optimized first region of code.
  • Helper thread 110 can perform a conventional alias check such as would be apparent to one of ordinary skill in the art.
  • helper thread 110 can provide notification of the results of the check to main thread 108 .
  • helper thread 110 provides notification via larx/stcx instructions which are visible to main thread 108 .
  • Responsive to determining, within a configurable, predetermined amount of time (step 208, YES branch), that no alias dependencies are present in the un-optimized first region of code (step 210, NO branch), main thread 108 commits the transaction to memory (step 216). In other words, the results of executing the program instructions associated with the optimized first region of code are written to memory. For example, main thread 108 can check periodically for larx/stcx instructions executed by helper thread 110, indicating that no alias dependencies were found.
  • Main thread 108 can begin keeping track of the time helper thread 110 is taking for an alias check, for example, once main thread 108 has sent helper thread 110 instructions to leave the busy-wait state and perform the alias check (step 206 ). Subsequent to completing the alias check, helper thread 110 can return to the busy-wait state, and main thread 108 can continue execution of program instructions in a non-transactional memory state until arriving at another region of code which has been optimized. Helper thread 110 remains in the busy-wait state until receiving instructions, e.g., from main thread 108 , to leave the busy-wait state and perform another alias check for another region of code.
  • Responsive to a determination that the alias check results were not completed within the predetermined amount of time (step 208, NO branch), main thread 108 performs a rollback of the transaction, wherein the speculatively computed program instructions associated with the optimized first region of code are not committed to memory (step 212). Main thread 108 branches into the un-optimized version of code and executes the un-optimized first region of code (step 214).
  • If main thread 108 determines that alias check results were completed in the predetermined amount of time (step 208, YES branch) but one or more alias dependencies were found (step 210, YES branch), main thread 108 performs a rollback of the transaction (step 212), branches into the un-optimized version of code, and executes the un-optimized first region of code (step 214).
  • FIG. 3 depicts a block diagram 300 of components of computer system 102 in accordance with an illustrative embodiment of the present invention. It should be appreciated that FIG. 3 provides only an illustration of one implementation and does not imply any limitations with regard to the environments in which different embodiments may be implemented. Many modifications to the depicted environment may be made.
  • Computer system 102 includes communications fabric 302 , which provides communications between cache 316 , memory 306 , persistent storage 308 , communications unit 310 , and input/output (I/O) interface(s) 312 .
  • Communications fabric 302 can be implemented with any architecture designed for passing data and/or control information between processors (such as microprocessors, communications and network processors, etc.), system memory, peripheral devices, and any other hardware components within a system.
  • Communications fabric 302 can be implemented with one or more buses or a crossbar switch.
  • Memory 306 and persistent storage 308 are computer readable storage media.
  • memory 306 includes random access memory (RAM).
  • memory 306 can include any suitable volatile or non-volatile computer readable storage media.
  • Cache 316 is a fast memory that enhances the performance of computer processor(s) 304 by holding recently accessed data, and data near accessed data, from memory 306 .
  • Dependency checker 104 can be stored in persistent storage 308 and in memory 306 for execution by one or more of the respective computer processors 304 via cache 316 .
  • persistent storage 308 includes a magnetic hard disk drive.
  • persistent storage 308 can include a solid state hard drive, a semiconductor storage device, read-only memory (ROM), erasable programmable read-only memory (EPROM), flash memory, or any other computer readable storage media that is capable of storing program instructions or digital information.
  • the media used by persistent storage 308 can also be removable.
  • a removable hard drive can be used for persistent storage 308 .
  • Other examples include optical and magnetic disks, thumb drives, and smart cards that are inserted into a drive for transfer onto another computer readable storage medium that is also part of persistent storage 308 .
  • Communications unit 310, in these examples, provides for communications with other data processing systems or devices.
  • communications unit 310 includes one or more network interface cards.
  • Communications unit 310 can provide communications through the use of either or both physical and wireless communications links.
  • Dependency checker 104 can be downloaded to persistent storage 308 through communications unit 310 .
  • I/O interface(s) 312 allows for input and output of data with other devices that can be connected to computer system 102 .
  • I/O interface 312 can provide a connection to external devices 318 such as a keyboard, keypad, a touch screen, and/or some other suitable input device.
  • External devices 318 can also include portable computer readable storage media such as, for example, thumb drives, portable optical or magnetic disks, and memory cards.
  • Software and data used to practice embodiments of the present invention, e.g., dependency checker 104, can be stored on such portable computer readable storage media and can be loaded onto persistent storage 308 via I/O interface(s) 312.
  • I/O interface(s) 312 also connect to a display 320 .
  • Display 320 provides a mechanism to display data to a user and can be, for example, a computer monitor.
  • the present invention can be a system, a method, and/or a computer program product at any possible technical detail level of integration.
  • the computer program product can include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.
  • the computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device.
  • the computer readable storage medium can be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing.
  • a non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing.
  • a computer readable storage medium is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
  • Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network.
  • the network can comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers.
  • a network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
  • Computer readable program instructions for carrying out operations of the present invention can be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages.
  • the computer readable program instructions can execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server.
  • the remote computer can be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider).
  • electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.
  • These computer readable program instructions can be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • These computer readable program instructions can also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
  • the computer readable program instructions can also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
  • each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s).
  • the functions noted in the block can occur out of the order noted in the figures.
  • two blocks shown in succession can, in fact, be executed substantially concurrently, or the blocks can sometimes be executed in the reverse order, depending upon the functionality involved.
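Returning to the synchronization between main thread 108 and helper thread 110 described earlier in this section, the busy-wait handshake can be sketched as follows. This is a minimal illustration in Python, not the patented implementation. Python has no larx/stcx instructions, so the hypothetical `SharedWord` class uses a lock-protected conditional store as a stand-in, and the state names, `helper_loop`, and `request_check` are likewise assumptions made for the example.

```python
import threading
import time

# Hypothetical states of the shared synchronization word.
IDLE, CHECK, NO_ALIAS, ALIAS, STOP = range(5)


class SharedWord:
    """Assumed stand-in for a memory word updated with a larx/stcx pair."""

    def __init__(self, value):
        self._value = value
        self._lock = threading.Lock()

    def load(self):
        with self._lock:
            return self._value

    def store(self, new):
        with self._lock:
            self._value = new

    def store_conditional(self, expected, new):
        """Store new only if the word still holds expected; report success."""
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True


def helper_loop(word, request):
    """Helper side: busy-wait, run an address range check on request, publish the result."""
    while True:
        state = word.load()
        if state == STOP:
            return
        if state == CHECK:
            start_a, len_a, start_b, len_b = request["ranges"]
            overlap = start_a < start_b + len_b and start_b < start_a + len_a
            word.store_conditional(CHECK, ALIAS if overlap else NO_ALIAS)
        time.sleep(0)  # yield the processor while spinning


def request_check(word, request, ranges, timeout_s=1.0):
    """Main side: wake the helper, then poll for its answer until the deadline."""
    request["ranges"] = ranges
    word.store_conditional(IDLE, CHECK)
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        state = word.load()
        if state in (NO_ALIAS, ALIAS):
            word.store_conditional(state, IDLE)  # helper goes back to busy-waiting
            return state
        time.sleep(0)
    return None  # no result within the allotted time


word = SharedWord(IDLE)
request = {}
helper = threading.Thread(target=helper_loop, args=(word, request))
helper.start()
first = request_check(word, request, (0, 4, 4, 4))   # disjoint address ranges
second = request_check(word, request, (0, 4, 1, 4))  # overlapping address ranges
word.store(STOP)                                     # let the helper thread exit
helper.join()
```

In this sketch a return value of `None` from `request_check` corresponds to the case where no result arrives in the predetermined amount of time, which the text above treats the same way as a detected alias dependency.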

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Executing Machine-Instructions (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

An approach to dynamic run-time alias checking comprising: creating a main thread and a helper thread; computing an optimized first region of code in a rollback-only transactional memory associated with the main thread; checking for one or more alias dependencies in an un-optimized first region of code; responsive to a determination in a predetermined amount of time that no alias dependencies are present in the un-optimized first region of code, committing a transaction; and responsive to at least one of a failure to determine results of the check for one or more alias dependencies in the predetermined amount of time and a determination in the predetermined amount of time that alias dependencies are present in the un-optimized first region of code, performing a rollback of the transaction and executing the un-optimized first region of code.

Description

BACKGROUND
The present invention relates generally to the field of compiler optimizations, and more particularly to aliasing.
In computing, compilation is a process in which a compiler transforms source code written in a programming language (e.g., Java, C++, etc.) into machine instructions for creating an executable program. To save time and other resources, a compiler may optimize the code to be compiled. For example, a compiler may minimize or transform certain attributes and/or segments of the code. One such example of a compiler optimization technique is automatic vectorization.
Aliasing is a situation in computer programming where a data object associated with a defined location in computer memory can be accessed through a plurality of different pointers in code. Therefore, changing the data object via any one of the plurality of pointers implicitly changes the data object for all the other associated pointers. This situation may also be referred to as an alias dependence.
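As a minimal illustration of this definition, the following Python sketch uses two `memoryview` objects over one `bytearray` to play the role of two pointers to the same memory location. Python is not a language named in this document, and `buffer`, `view_a`, and `view_b` are hypothetical names chosen for the example.

```python
# Two views onto the same underlying storage behave like aliased pointers.
buffer = bytearray([10, 20, 30, 40])
view_a = memoryview(buffer)        # first "pointer" to the data object
view_b = memoryview(buffer)[1:]    # second "pointer", offset by one element

view_a[1] = 99                     # change the data object through the first view

# The change is implicitly visible through the second view as well,
# because both views refer to the same location in memory.
aliased_value = view_b[0]
```

Here `view_a[1]` and `view_b[0]` name the same byte, so a write through either one is observed through the other.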
SUMMARY
According to one embodiment of the present invention, a method for dynamic run-time alias checking is provided, the method comprising creating, by a dependency checker, a main thread and a helper thread; computing, by the dependency checker, an optimized first region of code in a rollback-only transactional memory associated with the main thread; checking, by the dependency checker, for one or more alias dependencies in an un-optimized first region of code; responsive to a determination in a predetermined amount of time that no alias dependencies are present in the un-optimized first region of code, committing, by the dependency checker, a transaction; and responsive to at least one of a failure to determine results of the check for one or more alias dependencies in the predetermined amount of time and a determination in the predetermined amount of time that alias dependencies are present in the un-optimized first region of code, performing, by the dependency checker, a rollback of the transaction and executing the un-optimized first region of code. A corresponding computer program product and computer system are also disclosed herein.
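The sequence of steps in this summary can be sketched as follows. This is an illustrative Python model only, under several assumptions: a private copy of a list stands in for the rollback-only transaction, a worker from `ThreadPoolExecutor` stands in for the helper thread, and `run_region`, `ranges_overlap`, `timeout_s`, and `check_delay_s` are hypothetical names that do not come from the patent.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeout


def ranges_overlap(start_a, len_a, start_b, len_b):
    """Alias check: do the two half-open address ranges intersect?"""
    return start_a < start_b + len_b and start_b < start_a + len_a


def optimized_region(mem, dst, src, n):
    """Optimized code: bulk copy, only correct when dst and src do not alias."""
    mem[dst:dst + n] = mem[src:src + n]


def unoptimized_region(mem, dst, src, n):
    """Un-optimized code: element-by-element copy in program order."""
    for i in range(n):
        mem[dst + i] = mem[src + i]


def run_region(mem, dst, src, n, timeout_s=0.5, check_delay_s=0.0):
    """Update mem in place and return 'commit' or 'rollback'."""
    def helper_check():
        time.sleep(check_delay_s)  # assumed knob to simulate a slow alias check
        return ranges_overlap(dst, n, src, n)

    # Note: leaving the with-block waits for the helper to finish.
    with ThreadPoolExecutor(max_workers=1) as pool:
        pending = pool.submit(helper_check)   # helper starts the alias check
        speculative = list(mem)               # private copy models the transaction
        optimized_region(speculative, dst, src, n)
        try:
            alias_found = pending.result(timeout=timeout_s)
        except FutureTimeout:
            alias_found = True                # no answer in time: treat as unsafe

    if not alias_found:
        mem[:] = speculative                  # commit the speculative results
        return "commit"
    unoptimized_region(mem, dst, src, n)      # rollback, then run the safe version
    return "rollback"


mem_disjoint = [1, 2, 3, 4, 0, 0, 0, 0]
outcome_disjoint = run_region(mem_disjoint, dst=4, src=0, n=4)

mem_overlap = [1, 2, 3, 4, 5]
outcome_overlap = run_region(mem_overlap, dst=1, src=0, n=4)

mem_slow = [1, 2, 3, 4, 0, 0, 0, 0]
outcome_slow = run_region(mem_slow, dst=4, src=0, n=4,
                          timeout_s=0.05, check_delay_s=0.3)
```

The three calls exercise the three outcomes named in the summary: a timely check with no alias (commit), a timely check that finds an alias (rollback), and a check that does not finish in time (rollback).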
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a functional block diagram illustrating a distributed data processing environment, in accordance with an embodiment of the present invention;
FIG. 2 is a flowchart depicting operational steps of a dependency checker on a computer system within the data processing environment of FIG. 1, in accordance with an embodiment of the present invention; and
FIG. 3 is a block diagram of components of the computer system executing the dependency checker, in accordance with an embodiment of the present invention.
DETAILED DESCRIPTION
Embodiments of the present invention recognize that with alias dependencies present, an execution of an optimized version of code can lead to incorrect results. The memory locations of data objects associated with pointers are not known at the time of compilation. Therefore, unforeseen alias dependencies may cause optimized executable programs to function incorrectly or differently than expected. With this in mind, embodiments of the present invention provide a solution for dynamic run-time alias checking for the prevention of a transaction to memory associated with an execution of optimized code containing alias dependencies.
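As a hedged illustration of how an optimized version of code can diverge from its un-optimized counterpart when an alias dependence is present, the sketch below emulates a scalar loop and a vectorized loop in Python. The function names and the list-based model of memory are assumptions made for this example; the "vectorized" routine merely imitates a vector unit that performs all loads before any store.

```python
def scalar_add_one(buf, dst, src, n):
    """Un-optimized form: each element is loaded and stored in order."""
    for i in range(n):
        buf[dst + i] = buf[src + i] + 1


def vectorized_add_one(buf, dst, src, n):
    """Emulated optimized form: all loads happen before any store."""
    loaded = buf[src:src + n]                    # one wide load
    buf[dst:dst + n] = [x + 1 for x in loaded]   # one wide store


# Source and destination ranges overlap, so an alias dependence exists.
aliased_scalar = [0, 0, 0, 0, 0]
aliased_vector = [0, 0, 0, 0, 0]
scalar_add_one(aliased_scalar, dst=1, src=0, n=4)
vectorized_add_one(aliased_vector, dst=1, src=0, n=4)
results_differ = aliased_scalar != aliased_vector
```

In the scalar form each iteration reads the value written by the previous one, whereas the emulated vector form reads only the original values, so the two versions disagree whenever the ranges overlap in this way.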
In describing embodiments in detail with reference to the figures, it should be noted that references in the specification to “an embodiment,” “other embodiments,” etc., indicate that the embodiment described may include a particular feature, structure, or characteristic, but every embodiment may not necessarily include the particular feature, structure, or characteristic. Moreover, such phrases are not necessarily referring to the same embodiment. Further, describing a particular feature, structure or characteristic in connection with an embodiment, one skilled in the art has the knowledge to affect such feature, structure or characteristic in connection with other embodiments whether or not explicitly described.
The present invention will now be described in detail with reference to the figures. FIG. 1 is a functional block diagram illustrating a distributed data processing environment 100, in accordance with one embodiment of the present invention. Distributed data processing environment 100 comprises computer system 102 and computing device 120, interconnected over network 140.
Computer system 102 can be a laptop computer, tablet computer, netbook computer, personal computer (PC), a desktop computer, a personal digital assistant (PDA), a smart phone, or any programmable electronic device capable of communicating with computing device 120 via network 140. Computer system 102 comprises dependency checker 104, a component for dynamically checking program instructions for alias dependencies. Computer system 102 may include internal and external hardware components, as depicted and described in further detail with respect to FIG. 3.
Dependency checker 104 comprises thread creator 106. Thread creator 106 can create a plurality of threads, e.g., main thread 108 and helper thread 110, which can reside on a single processor. In some embodiments, the plurality of threads can reside on a plurality of separate computer processors. Main thread 108 can perform a speculative computation of an optimized version of code in a form of transactional memory such as, but not limited to, a rollback-only transaction. Helper thread 110 is synchronized with main thread 108 and performs address range computations between any pointers which are suspected to be aliased in the optimized region of code that has been speculatively computed by main thread 108. In this way, helper thread 110 can dynamically check for alias dependencies that may be present in the optimized code in parallel with main thread 108 and without having to write any data to memory. Helper thread 110 can communicate results of an alias dependency check to main thread 108 via a synchronization mechanism such as, but not limited to, a pairing of larx (Load and Reserve) and stcx (Store Conditional) instructions.
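One possible form of the address range computation performed by helper thread 110 is sketched below. This is an assumption for illustration, not the method of the disclosure: each memory access is modeled as a hypothetical tuple `(start_address, length_in_bytes, is_write)`, and two accesses are treated as alias dependent when their half-open byte ranges intersect and at least one of them is a write.

```python
from itertools import combinations


def ranges_overlap(start_a, len_a, start_b, len_b):
    """True if [start_a, start_a+len_a) and [start_b, start_b+len_b) intersect."""
    if len_a <= 0 or len_b <= 0:
        return False          # an empty range overlaps nothing
    return start_a < start_b + len_b and start_b < start_a + len_a


def has_alias_dependence(accesses):
    """accesses: iterable of (start_address, length_in_bytes, is_write)."""
    for (sa, la, wa), (sb, lb, wb) in combinations(accesses, 2):
        # Two reads of the same bytes are harmless; a write makes it a dependence.
        if (wa or wb) and ranges_overlap(sa, la, sb, lb):
            return True
    return False
```

For example, `has_alias_dependence([(0x1000, 32, False), (0x1010, 32, True)])` reports a dependence because the written range begins inside the range that is read. The check only compares addresses and lengths, which is consistent with the helper thread not writing any data to memory.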
It should be noted that in the context of this disclosure, a speculative computation can refer to an execution of optimized code, e.g., a region of optimized code, by main thread 108 wherein it is assumed that no alias dependencies are present. An advantage of performing a speculative computation in a transactional memory context is that main thread 108 can be configured not to commit the transaction to memory, e.g., computer memory comprising computer system 102, until it is confirmed that no alias dependencies are present, as will be discussed subsequently in greater detail.
Embodiments of the present invention can utilize rollback-only transactions for performing speculative computations of optimized code. Rollback-only transactions are a form of transactional memory with several advantages over conventional transactions, i.e., atomic transactions. Rollback-only transactions are not used to manipulate shared data and therefore do not adhere to memory coherence protocols for synchronization and conflict tracking required by atomic transactions. Memory barriers associated with instructions for performing atomic transactions add overhead in terms of computing resources consumed. Rollback-only transactions provide greater efficiency in that they do not require memory barriers nor the serialization needed to synchronize with other concurrent (conventional) transactions. Main thread 108 and helper thread 110 are therefore able to run in parallel without conflicting with one another's data.
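The commit and rollback behavior of a rollback-only transaction can be approximated in software as follows. This is a simplified emulation under stated assumptions: the class name `RollbackOnlyTransaction` and its methods are hypothetical, memory is modeled as a Python list, and no hardware transactional memory facility is invoked. Speculative stores are held in a private write set and reach memory only on commit.

```python
class RollbackOnlyTransaction:
    """Software emulation: buffer stores privately until commit."""

    def __init__(self, memory):
        self._memory = memory     # the shared memory (a list)
        self._writes = {}         # speculative stores: address -> value

    def load(self, addr):
        # A load sees the transaction's own speculative store, if any.
        return self._writes.get(addr, self._memory[addr])

    def store(self, addr, value):
        self._writes[addr] = value

    def commit(self):
        for addr, value in self._writes.items():
            self._memory[addr] = value
        self._writes.clear()

    def rollback(self):
        self._writes.clear()      # discard; memory was never touched


memory = [10, 20, 30]
tx = RollbackOnlyTransaction(memory)
tx.store(0, 11)
tx.rollback()
after_rollback = list(memory)     # [10, 20, 30]
tx.store(1, 21)
tx.commit()
after_commit = list(memory)       # [10, 21, 30]
```

The emulation performs no conflict tracking against other threads, mirroring the statement above that rollback-only transactions are not used to manipulate shared data.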
In various embodiments of the present invention, computing device 120 can be a laptop computer, tablet computer, netbook computer, personal computer (PC), a desktop computer, a personal digital assistant (PDA), a smart phone, or any programmable electronic device capable of communicating with computer system 102 via network 140. Computing device 120 can be generally representative of any number of such computing devices. In some embodiments, main thread 108 and/or helper thread 110 can reside on computer processors comprising computing device 120.
Network 140 can be, for example, a local area network (LAN), a wide area network (WAN) such as the Internet, or a combination of the two, and can include wired, wireless, or fiber optic connections. In general, network 140 can be any combination of connections and protocols that will support communications between computer system 102 and computing device 120.
FIG. 2 is a flowchart 200 depicting operational steps of dependency checker 104 for dynamically checking program instructions for alias dependencies, in accordance with an embodiment of the present invention. The illustrative example of FIG. 2 is provided to facilitate discussion of aspects of the present invention, and it should be appreciated that FIG. 2 provides only an illustration of an embodiment of the present invention and does not imply any limitations with regard to the variations or configurations in which different embodiments may be implemented.
Thread creator 106 creates main thread 108 and helper thread 110 for executing program instructions and checking for alias dependencies (step 202). In one embodiment, main thread 108 and helper thread 110 reside in a process associated with a computer processor comprising computer system 102. Main thread 108 can execute program instructions, e.g., machine code, in a form of transactional memory such as a rollback-only transaction. Helper thread 110 is synchronized with main thread 108 and placed in a busy-wait state until receiving further instructions. For example, main thread 108 executes larx/stcx instructions to tell the helper thread 110 to exit the busy-wait state.
In a form of transactional memory, e.g., a rollback-only transaction, main thread 108 performs a speculative computation of an optimized first region of code, associated with an optimized version of code (step 204). The optimized first region of code may have been optimized by a compiler using an optimization technique such as, but not limited to, automatic vectorization.
Helper thread 110, synchronized with main thread 108, leaves the busy-wait state and performs a check for alias dependencies in an un-optimized version of code (step 206). For example, main thread 108 executes larx/stcx instructions to have helper thread 110 leave the busy-wait state, wherein helper thread 110 performs a check for alias dependencies in an un-optimized first region of code corresponding to the optimized first region of code. Helper thread 110 can perform a conventional alias check such as would be apparent to one of ordinary skill in the art. When helper thread 110 has completed the alias check, helper thread 110 can provide notification of the results of the check to main thread 108. For example, helper thread 110 provides notification via larx/stcx instructions which are visible to main thread 108.
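The handshake between main thread 108 and helper thread 110 can be sketched as a small state machine on a shared word. The following is an emulation only: Python has no larx/stcx instructions, so a hypothetical `SyncWord` class uses a lock to provide the atomic conditional update, which behaves like a compare-and-swap rather than a true reservation. The state names are likewise assumptions for this example.

```python
import threading

IDLE, CHECK, NO_ALIAS, ALIAS = range(4)   # hypothetical states of the word


class SyncWord:
    """Shared word; the lock emulates the atomicity of a larx/stcx pair."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = IDLE

    def load(self):
        with self._lock:
            return self._value

    def store_if(self, expected, new):
        """Store `new` only if the word still holds `expected`."""
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True


def helper(word, alias_check):
    while word.load() != CHECK:           # busy-wait state
        pass
    found = alias_check()                 # step 206: perform the alias check
    word.store_if(CHECK, ALIAS if found else NO_ALIAS)   # notify main thread


word = SyncWord()
t = threading.Thread(target=helper, args=(word, lambda: False))
t.start()
word.store_if(IDLE, CHECK)                # main thread wakes the helper
t.join()
final_state = word.load()
```

Here the alias check is a stub that reports no dependence, so the helper leaves the word in the `NO_ALIAS` state for the main thread to observe.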
Responsive to determining in a configurable, predetermined amount of time (step 208, YES branch) that no alias dependencies are present in the un-optimized first region of code (step 210, NO branch), main thread 108 commits the transaction to memory (step 216). In other words, the results of executing the program instructions associated with the optimized first region of code are written to memory. For example, main thread 108 can check periodically for larx/stcx instructions executed by helper thread 110, indicating that no alias dependencies were found. Main thread 108 can begin keeping track of the time helper thread 110 is taking for an alias check, for example, once main thread 108 has sent helper thread 110 instructions to leave the busy-wait state and perform the alias check (step 206). Subsequent to completing the alias check, helper thread 110 can return to the busy-wait state, and main thread 108 can continue execution of program instructions in a non-transactional memory state until arriving at another region of code which has been optimized. Helper thread 110 remains in the busy-wait state until receiving instructions, e.g., from main thread 108, to leave the busy-wait state and perform another alias check for another region of code.
Responsive to a failure to obtain completed alias check results within the predetermined amount of time (step 208, NO branch), main thread 108 performs a rollback of the transaction, wherein the speculatively computed program instructions associated with the optimized first region of code are not committed to memory (step 212). Main thread 108 then branches into the un-optimized version of code and executes the un-optimized first region of code (step 214). Alternatively, if main thread 108 determines that alias check results were completed in the predetermined amount of time (step 208, YES branch) but one or more alias dependencies were found (step 210, YES branch), main thread 108 performs a rollback of the transaction (step 212), branches into the un-optimized version of code and executes the un-optimized first region of code (step 214).
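The decision logic of steps 204 through 216 can be summarized in a short software sketch. It is offered under several assumptions: the function `run_region` and its parameters are hypothetical, a `threading.Event` stands in for the larx/stcx notification, and the transaction is emulated by running the optimized region on a private copy of memory that is either copied back (commit) or discarded (rollback).

```python
import threading


def run_region(memory, optimized, unoptimized, alias_check, timeout_s):
    result = {}                          # filled in by the helper thread
    done = threading.Event()             # stand-in for the helper's notification

    def helper():
        result["alias"] = alias_check()  # step 206: run-time alias check
        done.set()

    threading.Thread(target=helper, daemon=True).start()

    speculative = list(memory)           # step 204: speculate on a private copy
    optimized(speculative)

    finished = done.wait(timeout_s)      # step 208: result within the time limit?
    if finished and not result["alias"]: # step 210: no alias dependencies found
        memory[:] = speculative          # step 216: commit
        return "committed"
    unoptimized(memory)                  # steps 212 and 214: discard, then re-run
    return "rolled back"


def scalar(buf):                         # un-optimized: in-order loads and stores
    for i in range(len(buf) - 1):
        buf[i + 1] = buf[i] + 1


def vector(buf):                         # emulated optimized form: loads first
    loaded = buf[:-1]
    buf[1:] = [x + 1 for x in loaded]


mem = [0, 0, 0, 0]
outcome = run_region(mem, vector, scalar, alias_check=lambda: True, timeout_s=1.0)
```

In this usage the alias check is a stub that reports a dependence, so the speculative result is discarded and `mem` holds the output of the un-optimized loop. A check that fails to finish within `timeout_s` takes the same rollback path.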
FIG. 3 depicts a block diagram 300 of components of computer system 102 in accordance with an illustrative embodiment of the present invention. It should be appreciated that FIG. 3 provides only an illustration of one implementation and does not imply any limitations with regard to the environments in which different embodiments may be implemented. Many modifications to the depicted environment may be made.
Computer system 102 includes communications fabric 302, which provides communications between cache 316, memory 306, persistent storage 308, communications unit 310, and input/output (I/O) interface(s) 312. Communications fabric 302 can be implemented with any architecture designed for passing data and/or control information between processors (such as microprocessors, communications and network processors, etc.), system memory, peripheral devices, and any other hardware components within a system. For example, communications fabric 302 can be implemented with one or more buses or a crossbar switch.
Memory 306 and persistent storage 308 are computer readable storage media. In this embodiment, memory 306 includes random access memory (RAM). In general, memory 306 can include any suitable volatile or non-volatile computer readable storage media. Cache 316 is a fast memory that enhances the performance of computer processor(s) 304 by holding recently accessed data, and data near accessed data, from memory 306.
Dependency checker 104 can be stored in persistent storage 308 and in memory 306 for execution by one or more of the respective computer processors 304 via cache 316. In an embodiment, persistent storage 308 includes a magnetic hard disk drive. Alternatively, or in addition to a magnetic hard disk drive, persistent storage 308 can include a solid state hard drive, a semiconductor storage device, read-only memory (ROM), erasable programmable read-only memory (EPROM), flash memory, or any other computer readable storage media that is capable of storing program instructions or digital information.
The media used by persistent storage 308 can also be removable. For example, a removable hard drive can be used for persistent storage 308. Other examples include optical and magnetic disks, thumb drives, and smart cards that are inserted into a drive for transfer onto another computer readable storage medium that is also part of persistent storage 308.
Communications unit 310, in these examples, provides for communications with other data processing systems or devices. In these examples, communications unit 310 includes one or more network interface cards. Communications unit 310 can provide communications through the use of either or both physical and wireless communications links. Dependency checker 104 can be downloaded to persistent storage 308 through communications unit 310.
I/O interface(s) 312 allows for input and output of data with other devices that can be connected to computer system 102. For example, I/O interface 312 can provide a connection to external devices 318 such as a keyboard, keypad, a touch screen, and/or some other suitable input device. External devices 318 can also include portable computer readable storage media such as, for example, thumb drives, portable optical or magnetic disks, and memory cards. Software and data used to practice embodiments of the present invention, e.g., dependency checker 104, can be stored on such portable computer readable storage media and can be loaded onto persistent storage 308 via I/O interface(s) 312. I/O interface(s) 312 also connect to a display 320. Display 320 provides a mechanism to display data to a user and can be, for example, a computer monitor.
The programs described herein are identified based upon the application for which they are implemented in a specific embodiment of the invention. However, it should be appreciated that any particular program nomenclature herein is used merely for convenience, and thus the invention should not be limited to use solely in any specific application identified and/or implied by such nomenclature.
The present invention can be a system, a method, and/or a computer program product at any possible technical detail level of integration. The computer program product can include a computer readable storage medium (or media) having computer readable program instructions thereon for causing a processor to carry out aspects of the present invention.
The computer readable storage medium can be a tangible device that can retain and store instructions for use by an instruction execution device. The computer readable storage medium can be, for example, but is not limited to, an electronic storage device, a magnetic storage device, an optical storage device, an electromagnetic storage device, a semiconductor storage device, or any suitable combination of the foregoing. A non-exhaustive list of more specific examples of the computer readable storage medium includes the following: a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or Flash memory), a static random access memory (SRAM), a portable compact disc read-only memory (CD-ROM), a digital versatile disk (DVD), a memory stick, a floppy disk, a mechanically encoded device such as punch-cards or raised structures in a groove having instructions recorded thereon, and any suitable combination of the foregoing. A computer readable storage medium, as used herein, is not to be construed as being transitory signals per se, such as radio waves or other freely propagating electromagnetic waves, electromagnetic waves propagating through a waveguide or other transmission media (e.g., light pulses passing through a fiber-optic cable), or electrical signals transmitted through a wire.
Computer readable program instructions described herein can be downloaded to respective computing/processing devices from a computer readable storage medium or to an external computer or external storage device via a network, for example, the Internet, a local area network, a wide area network and/or a wireless network. The network can comprise copper transmission cables, optical transmission fibers, wireless transmission, routers, firewalls, switches, gateway computers and/or edge servers. A network adapter card or network interface in each computing/processing device receives computer readable program instructions from the network and forwards the computer readable program instructions for storage in a computer readable storage medium within the respective computing/processing device.
Computer readable program instructions for carrying out operations of the present invention can be assembler instructions, instruction-set-architecture (ISA) instructions, machine instructions, machine dependent instructions, microcode, firmware instructions, state-setting data, or either source code or object code written in any combination of one or more programming languages, including an object oriented programming language such as Smalltalk, C++ or the like, and conventional procedural programming languages, such as the “C” programming language or similar programming languages. The computer readable program instructions can execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer or entirely on the remote computer or server. In the latter scenario, the remote computer can be connected to the user's computer through any type of network, including a local area network (LAN) or a wide area network (WAN), or the connection may be made to an external computer (for example, through the Internet using an Internet Service Provider). In some embodiments, electronic circuitry including, for example, programmable logic circuitry, field-programmable gate arrays (FPGA), or programmable logic arrays (PLA) may execute the computer readable program instructions by utilizing state information of the computer readable program instructions to personalize the electronic circuitry, in order to perform aspects of the present invention.
Aspects of the present invention are described herein with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each block of the flowchart illustrations and/or block diagrams, and combinations of blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer readable program instructions.
These computer readable program instructions can be provided to a processor of a general purpose computer, special purpose computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions/acts specified in the flowchart and/or block diagram block or blocks. These computer readable program instructions can also be stored in a computer readable storage medium that can direct a computer, a programmable data processing apparatus, and/or other devices to function in a particular manner, such that the computer readable storage medium having instructions stored therein comprises an article of manufacture including instructions which implement aspects of the function/act specified in the flowchart and/or block diagram block or blocks.
The computer readable program instructions can also be loaded onto a computer, other programmable data processing apparatus, or other device to cause a series of operational steps to be performed on the computer, other programmable apparatus or other device to produce a computer implemented process, such that the instructions which execute on the computer, other programmable apparatus, or other device implement the functions/acts specified in the flowchart and/or block diagram block or blocks.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods, and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of instructions, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block can occur out of the order noted in the figures. For example, two blocks shown in succession can, in fact, be executed substantially concurrently, or the blocks can sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems that perform the specified functions or acts or carry out combinations of special purpose hardware and computer instructions.
The descriptions of the various embodiments of the present invention have been presented for purposes of illustration, but are not intended to be exhaustive or limited to the embodiments disclosed. Many modifications and variations will be apparent to those of ordinary skill in the art without departing from the scope and spirit of the invention. The terminology used herein was chosen to best explain the principles of the embodiment, the practical application or technical improvement over technologies found in the marketplace, or to enable others of ordinary skill in the art to understand the embodiments disclosed herein.

Claims (1)

What is claimed is:
1. A method for alias dependence checking during speculative performance compiler optimizations using transactional memory rollbacks, the method comprising:
creating, by a dependency checker, a main thread and a helper thread of a transaction, wherein the main thread and the helper thread are executed in parallel, and wherein a Load and Reserve/Store Conditional (larx/stcx) instruction enables a synchronization mechanism between the main thread and the helper thread;
performing, by the dependency checker, a speculative computation of an optimized version of a first region of code in a rollback-only transactional memory associated with the main thread, and the helper thread maintains a busy-wait mode; wherein the optimized version of the first region of code is a compiler optimization performed on a region of code corresponding to an un-optimized first region of code, and the helper thread;
generating, by the dependency checker, a first larx/stcx instruction to the helper thread to leave a busy-wait state;
performing, by the dependency checker, a run-time first check, associated with the helper thread, for one or more alias dependencies in the un-optimized version of the first region of code, wherein the helper thread executes a second larx/stcx instructions indicating results of the run-time first check for one or more alias dependencies;
responsive to a determination of an absence of alias dependencies present in the un-optimized version of the first region of code within a predetermined amount of time, committing, by the dependency checker, the transaction, wherein committing the transaction comprises writing an execution of the optimized first region of code to computer memory; and
responsive to at least one of: a failure to determine results of the run-time first check for one or more alias dependencies in the predetermined amount of time, and a determination within the predetermined amount of time that alias dependencies are present in the un-optimized version of the first region of code, performing, by the dependency checker, a rollback of the optimized version of the first region of code of the transaction, and executing the un-optimized first region of code.
US15/850,668 2016-09-27 2017-12-21 Dynamic alias checking with transactional memory Expired - Fee Related US10061568B2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/850,668 US10061568B2 (en) 2016-09-27 2017-12-21 Dynamic alias checking with transactional memory

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US15/276,952 US10216496B2 (en) 2016-09-27 2016-09-27 Dynamic alias checking with transactional memory
US15/850,668 US10061568B2 (en) 2016-09-27 2017-12-21 Dynamic alias checking with transactional memory

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US15/276,952 Continuation US10216496B2 (en) 2016-09-27 2016-09-27 Dynamic alias checking with transactional memory

Publications (2)

Publication Number Publication Date
US20180095736A1 US20180095736A1 (en) 2018-04-05
US10061568B2 true US10061568B2 (en) 2018-08-28

Family

ID=61686306

Family Applications (2)

Application Number Title Priority Date Filing Date
US15/276,952 Expired - Fee Related US10216496B2 (en) 2016-09-27 2016-09-27 Dynamic alias checking with transactional memory
US15/850,668 Expired - Fee Related US10061568B2 (en) 2016-09-27 2017-12-21 Dynamic alias checking with transactional memory

Country Status (1)

Country Link
US (2) US10216496B2 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109117260B (en) * 2018-08-30 2021-01-01 百度在线网络技术(北京)有限公司 Task scheduling method, device, equipment and medium
CN110795106B (en) * 2019-10-30 2022-10-04 中国人民解放军战略支援部队信息工程大学 Dynamic and static combined memory alias analysis processing method and device in program vectorization process
CN112685149A (en) * 2020-12-18 2021-04-20 宝能(广州)汽车研究院有限公司 Starting method of android application, storage medium and electronic device
US11775337B2 (en) 2021-09-02 2023-10-03 International Business Machines Corporation Prioritization of threads in a simultaneous multithreading processor core

Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5926832A (en) * 1996-09-26 1999-07-20 Transmeta Corporation Method and apparatus for aliasing memory data in an advanced microprocessor
US6173444B1 (en) * 1997-03-24 2001-01-09 International Business Machines Corporation Optimizing compilation of pointer variables in the presence of indirect function calls
US20090150890A1 (en) * 2007-12-10 2009-06-11 Yourst Matt T Strand-based computing hardware and dynamically optimizing strandware for a high performance microprocessor system
US7634635B1 (en) * 1999-06-14 2009-12-15 Brian Holscher Systems and methods for reordering processor instructions
US7752613B2 (en) * 2006-12-05 2010-07-06 Intel Corporation Disambiguation in dynamic binary translation
US20110219208A1 (en) 2010-01-08 2011-09-08 International Business Machines Corporation Multi-petascale highly efficient parallel supercomputer
US20110289303A1 (en) * 2010-05-19 2011-11-24 International Business Machines Corporation Setjmp/longjmp for speculative execution frameworks
US8151252B2 (en) 2008-02-22 2012-04-03 Oracle America, Inc. Compiler framework for speculative automatic parallelization with transactional memory
US20120198428A1 (en) * 2011-01-28 2012-08-02 International Business Machines Corporation Using Aliasing Information for Dynamic Binary Optimization
US8495607B2 (en) 2010-03-01 2013-07-23 International Business Machines Corporation Performing aggressive code optimization with an ability to rollback changes made by the aggressive optimizations
US8677337B2 (en) 2008-05-01 2014-03-18 Oracle America, Inc. Static profitability control for speculative automatic parallelization
US8832415B2 (en) * 2010-01-08 2014-09-09 International Business Machines Corporation Mapping virtual addresses to different physical addresses for value disambiguation for thread memory access requests
US20140331030A1 (en) 2002-07-09 2014-11-06 Bluerisc Inc. Statically speculative compilation and execution
US20150058832A1 (en) * 2010-09-23 2015-02-26 Apple Inc. Auto multi-threading in macroscalar compilers
US9009689B2 (en) 2010-11-09 2015-04-14 Intel Corporation Speculative compilation to generate advice messages
US20160179586A1 (en) * 2014-12-17 2016-06-23 Intel Corporation Lightweight restricted transactional memory for speculative compiler optimization

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6332214B1 (en) * 1998-05-08 2001-12-18 Intel Corporation Accurate invalidation profiling for cost effective data speculation
US9170812B2 (en) * 2002-03-21 2015-10-27 Pact Xpp Technologies Ag Data processing system having integrated pipelined array data processor
US8291197B2 (en) * 2007-02-12 2012-10-16 Oracle America, Inc. Aggressive loop parallelization using speculative execution mechanisms

Non-Patent Citations (7)

Title
"Version 2.07", IBM Power ISA™-Book II, May 3, 2013, 1 page.
Ahn et al., "DeAliaser: Alias Speculation Using Atomic Region Support", ASPLOS'13, Mar. 16-20, 2013, Houston, Texas, USA. Copyright © 2013 ACM 978-1-4503-1870-9/13/03, pp. 167-180.
Bhattacharyya et al., "Automatic Speculative Parallelization of Loops Using Polyhedral Dependence Analysis", COSMIC '13 Shenzhen, China, Copyright 2013 ACM 978-1-4503-1971-3/13/02, 9 pages.
Cain et al., "Robust Architectural Support for Transactional Memory in the Power Architecture", ISCA '13 Tel-Aviv, Israel Copyright 2013 ACM 978-1-4503-2079-5/13/06, pp. 225-236.
Gao, et al., Dynamic Alias Checking With Transactional Memory, U.S. Appl. No. 15/276,952, filed Sep. 27, 2016 (a copy is not provided as this application is available to the Examiner).
List of IBM Patents or Patent Applications Treated as Related, Appendix P, Filed Herewith, 2 pages.

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021068102A1 (en) 2019-10-08 2021-04-15 Intel Corporation Reducing compiler type check costs through thread speculation and hardware transactional memory
EP4042273A4 (en) * 2019-10-08 2023-06-21 INTEL Corporation Reducing compiler type check costs through thread speculation and hardware transactional memory
US11880669B2 (en) 2019-10-08 2024-01-23 Intel Corporation Reducing compiler type check costs through thread speculation and hardware transactional memory

Also Published As

Publication number Publication date
US20180095736A1 (en) 2018-04-05
US20180088917A1 (en) 2018-03-29
US10216496B2 (en) 2019-02-26

Similar Documents

Publication Publication Date Title
US10061568B2 (en) Dynamic alias checking with transactional memory
US8495607B2 (en) Performing aggressive code optimization with an ability to rollback changes made by the aggressive optimizations
US9886371B2 (en) Test machine management
JP5906325B2 (en) Programs and computing devices with exceptions for code specialization in computer architectures that support transactions
US9652169B2 (en) Adaptive concurrency control using hardware transactional memory and locking mechanism
US9940139B2 (en) Split-level history buffer in a computer processing unit
US20170052726A1 (en) Execution of program region with transactional memory
US10013249B2 (en) Identifying user managed software modules
US10248534B2 (en) Template-based methodology for validating hardware features
US10579441B2 (en) Detecting deadlocks involving inter-processor interrupts
US9535608B1 (en) Memory access request for a memory protocol
US9720713B2 (en) Using hardware transactional memory for implementation of queue operations
US12118355B2 (en) Cache coherence validation using delayed fulfillment of L2 requests
US10956284B2 (en) Using hardware transactional memory to optimize reference counting
US20160210150A1 (en) Accelerated execution of target of execute instruction
US20160179952A1 (en) Application multi-versioning in a traditional language environment
US10061609B2 (en) Method and system using exceptions for code specialization in a computer architecture that supports transactions

Legal Events

Date Code Title Description
AS Assignment

Owner name: INTERNATIONAL BUSINESS MACHINES CORPORATION, NEW Y

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:GAO, YAOQING;O'FARRELL, WILLIAM G.;PALMEIRO, DENIS;REEL/FRAME:044463/0758

Effective date: 20160926

FEPP Fee payment procedure

Free format text: ENTITY STATUS SET TO UNDISCOUNTED (ORIGINAL EVENT CODE: BIG.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCF Information on status: patent grant

Free format text: PATENTED CASE

FEPP Fee payment procedure

Free format text: MAINTENANCE FEE REMINDER MAILED (ORIGINAL EVENT CODE: REM.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

LAPS Lapse for failure to pay maintenance fees

Free format text: PATENT EXPIRED FOR FAILURE TO PAY MAINTENANCE FEES (ORIGINAL EVENT CODE: EXP.); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY

STCH Information on status: patent discontinuation

Free format text: PATENT EXPIRED DUE TO NONPAYMENT OF MAINTENANCE FEES UNDER 37 CFR 1.362

FP Lapsed due to failure to pay maintenance fee

Effective date: 20220828