WO2005096136A1 - Stack caching with code sharing - Google Patents

Stack caching with code sharing

Info

Publication number
WO2005096136A1
Authority
WO
WIPO (PCT)
Prior art keywords
stack
instruction
operand
executed
operand stack
Prior art date
Application number
PCT/CN2004/000290
Other languages
English (en)
Inventor
Jinzhan Peng
Gansha Wu
Guei-Yuan Lueh
Original Assignee
Intel Corporation
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Intel Corporation filed Critical Intel Corporation
Priority to PCT/CN2004/000290 priority Critical patent/WO2005096136A1/fr
Priority to CNB2004800425684A priority patent/CN100461090C/zh
Publication of WO2005096136A1 publication Critical patent/WO2005096136A1/fr

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/448Execution paradigms, e.g. implementations of programming paradigms
    • G06F9/4482Procedural
    • G06F9/4484Executing subprograms
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/45Caching of specific data in cache memory
    • G06F2212/451Stack data

Definitions

  • VMs, runtime systems, and other high level language processors incorporate a stack caching scheme to virtually map bytecode, for example, to an operand stack.
  • stack caching scheme a mixed stack
  • interpreter plays an important role in many runtime systems.
  • Many modern programming languages, such as Java, Forth, Perl, and Python, still employ various interpreters as their execution engines when they are programmed and run on memory/computation constrained devices, for example.
  • stack caching may be an efficient approach to eliminate most of the accesses to the operand stack and is able to speed up interpretation.
  • Stack caching may promote top-of-stack operands to registers, which may reduce the number of memory accesses and result in higher instructions per cycle.
  • a stack-caching interpreter may maintain many copies of execution code for each VM instruction. Such a design incurs code explosion, which may consume excessive memory and introduce maintenance complexity.
  • Figure 1 depicts an exemplary embodiment of a mixed stack according to an embodiment of the invention
  • Figure 2 depicts an exemplary embodiment of a system according to an embodiment of the invention
  • Figure 3A depicts an exemplary embodiment of a method according to an embodiment of the invention
  • Figure 3B depicts an exemplary embodiment of a method according to an embodiment of the invention
  • Figure 4 depicts an exemplary embodiment of a method according to an embodiment of the invention
  • Figure 5 depicts an exemplary embodiment of a code layout according to an embodiment of the invention
  • Figure 6A depicts an exemplary embodiment of a code layout according to an embodiment of the invention
  • Figure 6B depicts an exemplary embodiment of a code layout according to an embodiment of the invention
  • Figure 6C depicts an exemplary embodiment of a code layout according to an embodiment of the invention
  • Figure 6D depicts an exemplary embodiment of a code layout according to an embodiment of the invention
  • Figure 6E depicts an exemplary embodiment of a code layout according to an embodiment of the invention
  • Embodiments of the present invention may provide a code sharing mechanism for stack caching that avoids code duplication.
  • a stack caching scheme may use a mixed register-stack model, i.e. a mixed stack, that virtually maps to a bytecode (e.g., Java or CLI) operand stack.
  • the mixed stack may consist of two parts: a register stack and a memory stack.
  • the register stack may be comprised of physical registers that may hold several top elements of the operand stack.
  • the memory stack may be a contiguous memory region that may hold the rest of the elements of the operand stack.
  • Figure 1 depicts an exemplary embodiment of a mixed stack 100.
  • mixed stack 100 may include a register stack 101 and a memory stack 102.
  • Figure 2 depicts an exemplary embodiment of a virtual machine architecture 200.
  • Virtual machine architecture 200 may include interpreter 201, loader 202, garbage collector 203, thread 204, and native module 205.
  • interpreter 201 may include an arithmetic logic unit (ALU) (not shown), a stack (not shown), and memory (not shown).
  • interpreter 201 may use the aforementioned components to decode instructions and call appropriate functional units to carry out instructions.
  • ALU arithmetic logic unit
  • Loader 202 may be responsible for loading class files into memory, parsing the class files, and preparing bytecode instructions for interpreter 201.
  • Interpreter 201 may be the execution engine of a VM, and may interpret instructions one at a time, for example.
  • Garbage collector 203 may allocate new objects and reclaim useless objects.
  • thread 204 may support an application programming interface (API) and native module 205 may support the API for native library functions, for example.
  • API application programming interface
  • machine instructions may take operands from an operand stack, operate on them, and return results to the stack.
  • a stack may be a 32-bit stack, for example, that may be used to pass parameters to methods and receive method results, as well as to supply parameters for operations and save operation results.
  • a stack may be a mixed stack as is described above.
  • an interpreter such as interpreter 201 may keep most, if not all, bytecode instructions to be operated on in a register stack instead of a memory stack. Doing so may reduce memory accesses and execution time of the instruction.
  • the interpreter may need to perform shift operations to maintain the top-of-stack elements of the operand stack in the register stack.
  • register R1 may be removed from the register stack; the value of R2 may then need to be shifted to the top, and R3 to R2. Because the memory stack is not empty (register stack underflow), the value in slot 1 may also be shifted to R3 so as to keep the register stack fully loaded with values.
  • the memory stack pointer (sp) may also be updated to sp' after slot 1 is drained.
  • Figure 4 illustrates how an interpreter may work with a mixed stack in an exemplary embodiment of the invention.
  • an instruction such as a bytecode instruction may undergo a stack-state aware translation into threaded code, which may indicate an entry point into shared execution code for executing the instruction.
  • Figure 4 depicts an exemplary embodiment of a transition 400 of an instruction from bytecode, for example, to shared execution code.
  • a bytecode instruction 401 may be passed to or interpreted by a stack-state-aware translator 402.
  • the stack-state-aware translator 402 may produce threaded code 403.
  • the instruction 401 may be dispatched according to the operand stack state of the instruction.
  • the entry point into shared execution code 404 may be determined and the instruction may be executed from that entry point.
  • the stack state may be embodied by the number of shift operations that are needed after the execution of the instruction.
  • ⁇(i) denotes the number of shift operations that are needed after the execution of instruction i.
  • the integer add instruction, iadd, may be used as an example to explain exemplary embodiments of the code-sharing mechanism.
  • a register stack may consist of 2 registers, for example, that include a top-of-stack (tos) register and a next-top-of-stack (nos) register.
  • because the instruction iadd consumes two operands (i.e., tos and nos, respectively) and produces one (new tos), there may only be one shift operation required to move the top item on the memory stack to the register stack as the new nos.
  • the iadd instruction may then be dispatched to line 1 of the IADD_S1 case (as shown in Figure 4).
  • Line 2 of the IADD_S1 case may pop the top element of the memory stack to a temp register, for example.
  • the execution may then fall through to the IADD_S0 case, in which the register-wise add operation (line 4) may interpret the integer add operation.
  • Line 5 may refill nos by moving temp to nos to keep the top two elements of the operand stack in registers, for example.
  • the combination of lines 2 and 5 may constitute the shift operation.
  • ⁇(iadd) = 0 may occur when the operand stack has only two elements (both are in the register stack). In such a case, no shift operation may be needed because there may only be one element left as the result of the add operation. iadd may be dispatched to IADD_S0 (tos will be the only stack item after execution).
  • line 4 may interpret the integer add operation. Execution of the refilling statement (line 5) may then become useless and redundant, but may not affect the correctness of the program because only tos may be a legitimate item after execution of IADD.
  • IADD_S0 and IADD_S1 may share the same execution code to avoid excessive code duplication.
  • execution code and instruction dispatching for various stack states may be reused with a comprehensively designed layout.
  • the stack state for each instruction may be inferred, and then the instruction may be directly dispatched to the appropriate execution entry without a runtime table lookup, for example.
  • the translation phase may perform some optimizations to improve the sequence of interpretation.
  • Figure 5 depicts an exemplary code layout 500 according to an exemplary embodiment of the invention.
  • the general code layout of all VM instructions may be illustrated as is shown in Figure 5, for example.
  • SOk is the code that corresponds to the shift operation for OP_Sk.
  • the shifted elements may be moved to the register stack (RO) after execution of the operation.
  • OP_Sk may also execute all the code of its subsequent entries, OP_S0 to OP_S(k-1).
  • the code of OP_S0 to OP_S(k-1) may be shared.
  • ID is the code that calls the next instruction.
  • register stack size M = 2 (i.e., there are 2 registers in the register stack as described above).
  • the property of an instruction i may be defined as [X(i), Y(i)], where X(i) denotes the number of operands that i consumes and Y(i) denotes the number of stack items that i produces.
  • Figures 6A-6G enumerate all possible code layouts for 0 ≤ X(i) ≤ M and 0 ≤ Y(i) ≤ M.
  • each code layout represents a particular category
  • the stack-state-aware translation phase may complement the code layout design.
  • the stack-state-aware translation may happen before the instruction is executed.
  • the translator may walk through the bytecode of the instruction in a pseudo-execution manner, for example, and generate the appropriate threaded bytecode entry for each instruction.
  • the translator may be aware of the operand stack state and [X(i), Y(i)]
  • the translator may infer ⁇(i) based on a static table lookup or on a calculation result of a comprehensive formula, such as f(Depth(opstack), M, X(i), Y(i)).
  • the correctness of the stack-state-aware translation may be based on the fact that the stack depth before and after each bytecode instruction can be determined statically (runtime invariant). Such translation may only need one pass for a majority of bytecode instructions. Such embodiments may enable more optimization opportunities that are exposed during the translation.
  • Figure 7 depicts an exemplary embodiment of a computer and/or communications system as may be used to incorporate several components of the system in an exemplary embodiment of the present invention.
  • Figure 7 depicts an exemplary embodiment of a computer 700 as may be used for several computing devices in exemplary embodiments of the present invention.
  • Computer 700 may include, but is not limited to: e.g., any computer device or communications device including, e.g., a personal computer (PC), a workstation, a mobile device, a phone, a handheld PC, a personal digital assistant (PDA), a thin client, a fat client, a network appliance, an Internet browser, a paging or alert device, a television, an interactive television, a receiver, a tuner, a high definition (HD) television, an HD receiver, a video-on-demand (VOD) system, a server, or other device.
  • PC personal computer
  • PDA personal digital assistant
  • HD high definition
  • VOD video-on-demand
  • Computer 700 may comprise a central processing unit (CPU) or processor 704, which may be coupled to a bus 702.
  • Processor 704 may, e.g., access main memory 706 via bus 702.
  • Computer 700 may be coupled to an Input/Output (I/O) subsystem such as, e.g., a network interface card (NIC) 722, or a modem 724 for access to network 726.
  • I/O Input/Output
  • Computer 700 may also be coupled to a secondary memory 708 directly via bus 702, or via main memory 706, for example.
  • Secondary memory 708 may include, e.g., a disk storage unit 710 or other storage medium.
  • Exemplary disk storage units 710 may include, but are not limited to, a magnetic storage device such as, e.g., a hard disk, an optical storage device such as, e.g., a write once read many (WORM) drive, or a compact disc (CD), or a magneto optical device.
  • Another type of secondary memory 708 may include a removable disk storage device 712, which can be used in conjunction with a removable storage medium 714, such as, e.g., a CD-ROM, or a floppy diskette.
  • the disk storage unit 710 may store an application program for operating the computer system, referred to commonly as an operating system.
  • the disk storage unit 710 may also store documents of a database (not shown).
  • the computer 700 may interact with the I/O subsystems and disk storage unit 710 via bus 702.
  • the bus 702 may also be coupled to a display 720 for output, and input devices such as, but not limited to, a keyboard 718 and a mouse or other pointing/selection device 716.
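As an illustration of the mixed stack and the shift operation described above, the following is a minimal Python sketch, not the patent's implementation. The class name `MixedStack`, its methods, and the overflow spill in `push` are assumptions made for this example; the text itself only describes the refill on register stack underflow.

```python
class MixedStack:
    """Operand stack split into a register stack and a memory stack (sketch)."""

    def __init__(self, num_registers=3):
        self.m = num_registers
        self.regs = []    # regs[0] plays the role of R1, the top of stack
        self.memory = []  # memory[-1] is the slot just below the registers

    def push(self, value):
        # The new value becomes R1; the others move down one register.
        self.regs.insert(0, value)
        if len(self.regs) > self.m:
            # Assumed overflow handling: spill the deepest register to memory.
            self.memory.append(self.regs.pop())

    def pop(self):
        # Removing R1 shifts R2 to the top and R3 to R2.
        value = self.regs.pop(0)
        if self.memory:
            # Memory stack not empty: move its top slot into the last
            # register so the register stack stays fully loaded.
            self.regs.append(self.memory.pop())
        return value
```

With three registers, pushing 1 through 5 leaves 5, 4, 3 in registers and 1, 2 in memory; a single `pop` then pulls 2 back into the last register.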
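The stack-state-aware translation and the shared IADD_S1/IADD_S0 execution code described above can be sketched as follows. This is a hedged illustration only: the `VM` class, the `iconst` instruction and its handler, the `entry` parameter that emulates fall-through entry points, and the formula inside `shifts_needed` are assumptions for this example and are not taken from the patent.

```python
M = 2  # register stack size: tos and nos, as in the iadd example


class VM:
    def __init__(self):
        self.tos = None   # top-of-stack register
        self.nos = None   # next-top-of-stack register
        self.memory = []  # memory stack; memory[-1] is its top


def shifts_needed(depth, consumed, produced, m=M):
    """Assumed stand-in for f(Depth(opstack), M, X(i), Y(i))."""
    in_memory_before = max(depth - m, 0)
    in_memory_after = max(depth - consumed + produced - m, 0)
    return in_memory_before - in_memory_after


def exec_iadd(vm, entry):
    temp = None
    if entry >= 1:                # IADD_S1 entry: first half of the shift
        temp = vm.memory.pop()
    vm.tos = vm.nos + vm.tos      # IADD_S0 entry: shared register-wise add
    vm.nos = temp                 # shared refill; redundant when entry == 0


def exec_iconst(vm, entry, value):
    if entry >= 1:                # both registers in use: spill nos first
        vm.memory.append(vm.nos)
    vm.nos = vm.tos               # shared code for every stack state
    vm.tos = value


def translate(bytecode):
    """Pseudo-execute the bytecode, tracking only the stack depth."""
    depth = 0
    threaded = []
    for op, *args in bytecode:
        if op == "iconst":
            entry = 1 if depth >= M else 0
            threaded.append((exec_iconst, entry, args))
            depth += 1
        elif op == "iadd":
            entry = shifts_needed(depth, consumed=2, produced=1)
            threaded.append((exec_iadd, entry, args))
            depth -= 1
        else:
            raise ValueError(f"unknown instruction: {op}")
    return threaded


def run(threaded):
    vm = VM()
    for handler, entry, args in threaded:
        handler(vm, entry, *args)  # entry was fixed at translation time
    return vm
```

Translating `iconst 1; iconst 2; iconst 3; iadd; iadd` dispatches the first `iadd` to the S1 entry and the second to the S0 entry, so no stack-state lookup is needed at run time.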

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Devices For Executing Special Programs (AREA)
  • Executing Machine-Instructions (AREA)

Abstract

Execution of an instruction on an operand stack, including performing a stack-state-aware translation of the instruction to threaded code to determine an operand stack state for the instruction, dispatching the instruction according to the operand stack state for the instruction, and executing the instruction.
PCT/CN2004/000290 2004-03-31 2004-03-31 Stack caching with code sharing WO2005096136A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2004/000290 WO2005096136A1 (fr) 2004-03-31 2004-03-31 Stack caching with code sharing
CNB2004800425684A CN100461090C (zh) 2004-03-31 2004-03-31 System, method and apparatus for stack caching using code sharing

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2004/000290 WO2005096136A1 (fr) 2004-03-31 2004-03-31 Stack caching with code sharing

Publications (1)

Publication Number Publication Date
WO2005096136A1 (fr) 2005-10-13

Family

ID=35063965

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2004/000290 WO2005096136A1 (fr) 2004-03-31 2004-03-31 Stack caching with code sharing

Country Status (2)

Country Link
CN (1) CN100461090C (fr)
WO (1) WO2005096136A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10265311B2 (en) 2009-07-22 2019-04-23 PureTech Health LLC Methods and compositions for treatment of disorders ameliorated by muscarinic receptor activation

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9436475B2 (en) * 2012-11-05 2016-09-06 Nvidia Corporation System and method for executing sequential code using a group of threads and single-instruction, multiple-thread processor incorporating the same
CN115237475B (zh) * 2022-06-23 2023-04-07 Yunnan University Forth multi-core stack processor and instruction set

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1440528A (zh) * 2000-10-05 2003-09-03 ARM Limited Storing stack operands in registers

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE3587591T2 (de) * 1984-11-21 1994-04-28 Harris Corp Microprocessor for a Forth-like language.
JPS6273335A (ja) * 1985-09-27 1987-04-04 Toshiba Corp Stack management system
US6131144A (en) * 1997-04-01 2000-10-10 Sun Microsystems, Inc. Stack caching method with overflow/underflow control using pointers
US6654871B1 (en) * 1999-11-09 2003-11-25 Motorola, Inc. Device and a method for performing stack operations in a processing system

Patent Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1440528A (zh) * 2000-10-05 2003-09-03 ARM Limited Storing stack operands in registers

Also Published As

Publication number Publication date
CN1926509A (zh) 2007-03-07
CN100461090C (zh) 2009-02-11

Similar Documents

Publication Publication Date Title
US7502910B2 (en) Sideband scout thread processor for reducing latency associated with a main processor
US6332215B1 (en) Java virtual machine hardware for RISC and CISC processors
Klaiber The technology behind Crusoe processors
Ertl Stack caching for interpreters
CN1134731C (zh) Method for compiling instructions in a computer system
US7574588B2 (en) Time-multiplexed speculative multi-threading to support single-threaded applications
TW479198B (en) Method and apparatus for implementing execution predicates in a computer processing system
CN101261577B (zh) Microprocessor and method for storing data in a microprocessor
US6907519B2 (en) Systems and methods for integrating emulated and native code
CN105408859B (zh) Method and system for instruction scheduling
Craig Virtual machines
US9495136B2 (en) Using aliasing information for dynamic binary optimization
KR100368166B1 (ko) Method for changing stack references in a computer processing system
US20050240915A1 (en) Java hardware accelerator using microcode engine
US7665070B2 (en) Method and apparatus for a computing system using meta program representation
Vuletić et al. Virtual memory window for application-specific reconfigurable coprocessors
US6820254B2 (en) Method and system for optimizing code using an optimizing coprocessor
US9817669B2 (en) Computer processor employing explicit operations that support execution of software pipelined loops and a compiler that utilizes such operations for scheduling software pipelined loops
US7424596B2 (en) Code interpretation using stack state information
Souza et al. ISAMAP: instruction mapping driven by dynamic binary translation
WO2005096136A1 (fr) Stack caching with code sharing
US7266811B2 (en) Methods, systems, and computer program products for translating machine code associated with a first processor for execution on a second processor
Kim et al. Demand paging techniques for flash memory using compiler post-pass optimizations
Peng et al. Code sharing among states for stack-caching interpreter
Li et al. A hardware/software codesigned virtual machine to support multiple ISAS

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): BW GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
WWE Wipo information: entry into national phase

Ref document number: 200480042568.4

Country of ref document: CN

NENP Non-entry into the national phase

Ref country code: DE

WWW Wipo information: withdrawn in national office

Country of ref document: DE

122 Ep: pct application non-entry in european phase