WO2002065284A1 - An optimized dynamic bytecode interpreter - Google Patents

An optimized dynamic bytecode interpreter

Info

Publication number
WO2002065284A1
Authority
WO
WIPO (PCT)
Prior art keywords
bytecodes
sequence
frequently executed
executed
bytecode
Prior art date
Application number
PCT/US2002/003716
Other languages
French (fr)
Other versions
WO2002065284A8 (en)
Inventor
Julius Vanderspek
Original Assignee
Trimedia Technologies, Inc.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Trimedia Technologies, Inc.
Priority to JP2002564736A (JP2004529413A)
Priority to EP02706200A (EP1360584A1)
Publication of WO2002065284A1
Publication of WO2002065284A8

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44: Arrangements for executing specific programs
    • G06F9/455: Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45504: Abstract machines for programme code execution, e.g. Java virtual machine [JVM], interpreters, emulators
    • G06F9/45508: Runtime interpretation or emulation, e.g. emulator loops, bytecode interpretation

Definitions

  • The compiler concatenates the machine code blocks 84 from each bytecode in the frequently executed trace to form the machine language block for the trace. That machine code block will contain return codes from the individual machine code blocks corresponding to each bytecode in the trace. The return codes are easily identifiable.
  • The compiler identifies the return codes 86.
  • The compiler strips the machine code block for the trace of all but the last return code 86.
  • The compiler optimizes the machine code block using a set of optimization rules that are standard in compiler technology. These optimization techniques are possible because the compiler/optimizer has concatenated the machine code blocks for several bytecodes.
  • Figure 8 illustrates the jump table 40 without any extension by the compiler/optimizer. As the interpreter inspects each bytecode, it looks in the jump table 40 for a reference to the bytecode 42 and the corresponding machine code 44. There is a certain portion of the table that is not used by the interpreter 46. That portion of the jump table 40 would remain unused or unmapped without the compiler/optimizer mapping a new bytecode entry.
  • Referring to FIG. 9, there is shown the jump table that has been extended by the compiler/optimizer.
  • As the interpreter inspects each bytecode, it looks in the jump table 40 for a reference to the bytecode 42. If the bytecode is the new bytecode referencing the first bytecode of a frequently executed trace 48 that has already been inserted into the table, then the interpreter will look up that bytecode reference 48 similarly to any other bytecode reference 42. If the bytecode is the start of a frequently executed trace and has not been inserted into the jump table 40, then the compiler/optimizer compiles the bytecodes in the trace and extends the jump table 40. The compiler/optimizer adds a new entry into the jump table 40 in a previously unused or unmapped portion 46 of the jump table 40.
  • The invention disclosed herein can be used to interpret any language that makes use of bytecodes.
  • One example of a language that uses bytecodes is Java™.
  • The invention can be used in a Java™ virtual machine.

Abstract

The present invention relates to bytecode interpretation. The interpreter selects frequently executed bytecodes and translates them into corresponding machine code. The interpreter also extends a jump table (40) used by the interpreter to index the bytecodes with the machine code (44). The extension includes a reference to the frequently executed bytecodes as well as the corresponding machine code. Thus interpretation is dynamically profiled and optimized.

Description

AN OPTIMIZED DYNAMIC BYTECODE INTERPRETER
BACKGROUND
A. Technical Field
The present invention relates generally to code interpretation, and more particularly, to bytecode interpretation.
B. Background of the Invention
Some programming languages have programs that execute on a virtual machine instead of on a specific hardware platform. A virtual machine executes a bytecode program. Program execution on a virtual machine is divided into two steps. In the first step, the virtual machine determines the virtual machine instructions needed to execute the program. These virtual machine instructions are called bytecodes. The second step executes the virtual machine instructions. A bytecode interpreter in the virtual machine steps through each byte in the bytecode and carries out the instruction(s) involved.
A bytecode interpreter uses a jump table to translate a particular bytecode to a corresponding native machine code sequence. A bytecode generally translates to a small sequential native program. This process can be inefficient, however, because the interpreter must use the jump table to look up every bytecode that is executed and then execute the corresponding machine code. Also, the machine code may be inefficient. It may perform unnecessary steps in light of the next set of machine code instructions.
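The per-bytecode lookup described above can be illustrated with a minimal sketch. The three opcodes, the handler functions, and the `lookups` counter are hypothetical and are not taken from the patent; Python functions stand in for the native machine code blocks that a real jump table would point to.

```python
# Hypothetical three-instruction set; the patent does not define one.
PUSH1, ADD, HALT = 0, 1, 2

def op_push1(stack):
    stack.append(1)

def op_add(stack):
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)

# The jump table maps each bytecode to its handler
# (a stand-in for a native machine code block).
JUMP_TABLE = {PUSH1: op_push1, ADD: op_add}

def interpret(program):
    stack = []
    pc = 0
    lookups = 0
    while program[pc] != HALT:
        handler = JUMP_TABLE[program[pc]]  # one table lookup per executed bytecode
        lookups += 1
        handler(stack)
        pc += 1
    return stack, lookups

stack, lookups = interpret([PUSH1, PUSH1, ADD, HALT])
```

Every executed bytecode costs one lookup here, which is the overhead the Background identifies.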
Accordingly, it is desirable to provide a method of interpreting bytecodes that does not involve looking up every executed bytecode in the jump table and that can optimize the machine code for a bytecode in anticipation of the machine code instructions that follow.
SUMMARY OF THE INVENTION
The present invention relates to an optimized dynamic bytecode interpreter. In a described embodiment, the interpreter operates in two stages, a profiler stage and a compiler/optimizer stage. The profiler dynamically profiles the bytecodes to select sequences of frequently executed bytecodes. The compiler/optimizer translates the selected sequences of frequently executed bytecodes into machine code.
The profiler itself operates in two stages: a method profiling stage and a branch target profiling stage. A method is a function or procedure in object oriented programming. In the method profiling stage, the profiler determines frequently executed methods in the program while it is executing. Then, the branch target profiling stage determines frequently executed sequences of bytecodes for every method found by the method profiling stage.
The compiler/optimizer translates the selected sequences of bytecodes into machine code. The compiler/optimizer selects a new, available bytecode and extends a jump table in the virtual machine to include an entry for the sequence of frequently executed bytecodes and the corresponding machine code. The first bytecode in the translated sequence is then replaced with the new bytecode. On a subsequent execution of the sequence of frequently executed bytecodes, the virtual machine will encounter the new bytecode and use the jump table to jump directly to the machine code and there will be no need to individually interpret each bytecode in the sequence of frequently executed bytecodes.
Thus, the described embodiment of the present invention improves the efficiency and execution time of the interpreter.
BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 is an illustration of a system in accordance with a preferred embodiment of the present invention.
Fig. 2 is a block diagram illustrating an addition to a bytecode interpreter.
Fig. 3 is a block diagram illustrating two stages in a dynamic profiler.
Fig. 4 is a flowchart illustrating the operation of a dynamic profiler.
Fig. 5 is a block diagram illustrating an array of counters in the dynamic profiler.
Fig. 6 is a flowchart illustrating virtual machine operation.
Fig. 7 is a flowchart illustrating the compiler/optimizer operation of compiling a trace.
Fig. 8 is a block diagram illustrating a virtual machine jump table.
Fig. 9 is a block diagram illustrating a virtual machine extended jump table.
DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS
Now referring to Figure 1, there is shown a virtual machine 4. The virtual machine 4 receives a bytecode in a sequence of bytecodes as an input. The virtual machine processes the bytecode and executes corresponding machine code 12.
Now referring to Figure 2, there is shown a bytecode interpreter in virtual machine 4 that includes a dynamic profiler 6 and a compiler/optimizer 10. The dynamic profiler 6 operates while the bytecodes 2 are being executed. The dynamic profiler 6 determines a set of frequently executed bytecodes 8. That set of frequently executed bytecodes 8 is passed to the compiler/optimizer to be compiled into machine code and used to extend the jump table in the virtual machine. The dynamic profiler 6 receives bytecodes 2 and operates to determine a set of frequently executed bytecodes 8. The dynamic profiler 6 operates at runtime. Thus, the bytecodes are profiled while they are being executed.
Now referring to Figure 3, there is shown a block diagram illustrating the operation of the dynamic profiler 6 of Figure 2. The dynamic profiler has two stages: a method profiling stage 13 and a branch target profiling stage 16. In one embodiment, the method profiling stage 13 determines a frequently executed method 14. A method is a function or procedure in object oriented programming. The method profiling stage 13 uses two counters, one global and one for each method, to determine a frequently executed method 14. One counter, JCOUNT[m], increments when a branch target is executed within method m. The global counter, GJCOUNT, increments when a branch target is executed anywhere in the entire program. A method that executes a significant portion of the branch targets in the program is a frequently executed method 14.
Once a frequently executed method has been determined, that method enters the branch target profiling stage 16, which determines sequences of frequently executed bytecodes 8 within a frequently executed method 14. Each sequence of frequently executed bytecodes 8 is then used by the compiler/optimizer 10 to extend the jump table. The compiler/optimizer 10 compiles each sequence of frequently executed bytecodes into its equivalent machine code 12 in the jump table. The machine code 12 is then executed each time the sequence of frequently executed bytecodes 8 is interpreted.
In one embodiment of the invention, a sequence of frequently executed bytecodes is a trace. A trace is a sequence of bytecodes that is executed if its first bytecode is executed. Thus, a trace is a sequence of bytecodes that contains no branches. A trace is different from a basic block known in compiler theory. A basic block is a sequence of all bytecodes that are executed if and only if its first bytecode is executed. In a basic block only the first bytecode in the sequence is a branch target. In a trace, the first bytecode in the sequence is a branch target, but other bytecodes in the sequence may also be branch targets. Thus, there may be a trace within a trace. In another embodiment of the invention, the set of frequently executed bytecodes is a subtrace. A subtrace may be implemented when it is convenient to end a trace prior to the end of the trace. In another embodiment, the set of frequently executed bytecodes may be a basic block.
Now referring to Figure 4, there is shown a flowchart illustrating the operation of the dynamic profiler 6. The dynamic profiler 6 begins in the method profiling stage 13 to determine a frequently executed method 14. There are two counters, one global and one for each method, in the method profiling stage 13. One counter keeps track of the number of branch targets executed in that particular method 18. In one embodiment of the invention, this counter is labeled JCOUNT. A second counter keeps track of the total number of branch targets executed in the entire bytecode program 20. In one embodiment of the invention, this counter is labeled GJCOUNT.
A periodic comparison is made between the two counters 22. In one embodiment of the invention, the periodic comparison is made whenever one of the counters is incremented. When JCOUNT > N × GJCOUNT, that method enters the next stage of profiling, branch target profiling. N is a predetermined threshold value. In one embodiment of the invention, N is equal to 1/500. The value of N sets a limit on the number of methods that will be chosen as frequently executed methods. Any value of N such that the efficiency of the interpretation is not unduly compromised can be used.
Once a method has been determined to be frequently executed in the method profiling stage, that method enters the branch target profiling stage. For other methods, the method profiling stage may continue to attempt to detect other frequently executed methods.
The branch target profiling stage uses many counters to keep track of the number of times each branch target has been executed. In the branch target profiling stage, each bytecode is inspected sequentially 24 as each bytecode is being executed by the bytecode interpreter. The branch target profiling stage determines whether the bytecode is a branch 26. If the bytecode is not a branch, then the next bytecode is inspected as it is executed. If the bytecode is a branch, then a counter for the corresponding branch target is incremented 28. A counter also keeps track of the total number of branch targets executed in that method 30. In one embodiment of the invention, that counter is JCOUNT.
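The method profiling comparison JCOUNT > N × GJCOUNT described above can be sketched as follows. The class and method names are hypothetical, and the comparison is exposed as an on-demand check rather than being run on every increment; this is only one way to realize the "periodic comparison", not necessarily the patented implementation. `Fraction` is used so that N = 1/500 is represented exactly.

```python
from collections import defaultdict
from fractions import Fraction

N = Fraction(1, 500)  # threshold value given for one embodiment

class MethodProfiler:
    """Hypothetical sketch of the method profiling stage."""

    def __init__(self, n=N):
        self.n = n
        self.gjcount = 0                # GJCOUNT: branch targets executed program-wide
        self.jcount = defaultdict(int)  # JCOUNT[m]: branch targets executed in method m

    def record_branch_target(self, method):
        self.jcount[method] += 1
        self.gjcount += 1

    def is_frequently_executed(self, method):
        # The comparison from the text: JCOUNT > N x GJCOUNT.
        return self.jcount[method] > self.n * self.gjcount

profiler = MethodProfiler()
for _ in range(1000):
    profiler.record_branch_target("loop")  # hypothetical hot method
profiler.record_branch_target("init")      # hypothetical cold method

loop_is_hot = profiler.is_frequently_executed("loop")  # 1000 > 1001/500
init_is_hot = profiler.is_frequently_executed("init")  # 1 > 1001/500 is false
```

Note that if the comparison is evaluated from the very first increment, almost any method satisfies it while GJCOUNT is still small, so the moment at which the comparison is made affects which methods are selected.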
The counter for the particular bytecode branch target is compared to the counter for the total number of branch targets in the method 32. If the particular bytecode branch target is responsible for more than a fraction M of the total number of branch targets executed in the method, then the particular bytecode branch target is the start of a set of frequently executed bytecodes 34. M is a predetermined threshold value. In one embodiment of the invention, M is equal to 1/5000. Similarly to the value of N, any value of M can be used that will not compromise the desired efficiency of the interpretation.
In one embodiment of the invention, the set of frequently executed bytecodes is a trace and the particular bytecode branch target is the start of a trace.
Now referring to Figure 5, there is shown a hitmap. The branch target profiling stage creates a hitmap to keep track of the various branch targets. A hitmap is an array of b counters 38 where b is equal to the number of bytecodes in the method. The offset of the bytecode within its method also indexes the corresponding counter in the hitmap. Thus, the hitmap counters are efficient counters. There is also a counter that increments every time a branch is executed within the frequently executed method to keep track of the total number of branch targets executed. In one embodiment of the invention, that counter is JCOUNT. The hitmap counters are incremented only while interpreting branch bytecodes, which is only a fraction of the interpreted bytecodes. When a branch bytecode is interpreted, the counter corresponding to the target of the interpreted branch is incremented. Locating and incrementing the counters is simple in the hitmap format. Thus, this stage of profiling is efficient.
When the profiler determines that a particular branch target is responsible for more than M of the total number of branch targets in the frequently executed method, it passes that particular branch target with its following trace to the compiler/optimizer. The compiler/optimizer compiles each bytecode in the trace and extends the jump table as described below.
Now referring to Figure 6, there is shown the interpreter operation. The interpreter inspects each bytecode in sequence 60. If the bytecode is not a branch, then the interpreter operation continues as it would absent the profiler and the compiler, in an interpret mode. The interpreter looks up in the jump table to find the machine code instruction 64 equivalent to the bytecode instructions. The interpreter executes the machine code 66. At the end of the machine code block there is often a return instruction to return from the machine code to the interpreter where the next bytecode gets inspected.
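The hitmap of Figure 5 described above can be sketched as a plain array indexed by bytecode offset. The class name, its methods, and the demo values are hypothetical. The demo uses M = 1/3 instead of the embodiment's 1/5000 only so that a run of seven branches visibly filters out cold targets.

```python
from fractions import Fraction

class Hitmap:
    """Hypothetical sketch: b counters, one per bytecode offset in a method."""

    def __init__(self, num_bytecodes, m):
        self.counters = [0] * num_bytecodes  # indexed directly by bytecode offset
        self.jcount = 0                      # JCOUNT: branch targets executed in the method
        self.m = m                           # threshold M

    def record_branch(self, target_offset):
        # Called only when a branch bytecode is interpreted.
        self.counters[target_offset] += 1
        self.jcount += 1

    def trace_starts(self):
        # Offsets responsible for more than a fraction M of the branch targets.
        limit = self.m * self.jcount
        return [off for off, hits in enumerate(self.counters) if hits > limit]

hitmap = Hitmap(num_bytecodes=8, m=Fraction(1, 3))  # illustrative M, not 1/5000
for target in [2, 2, 2, 5, 6, 2, 7]:
    hitmap.record_branch(target)

starts = hitmap.trace_starts()  # offset 2 has 4 of 7 hits; the others have 1 each
```

Because the offset is the index, locating a counter is a single array access, which is the efficiency the text attributes to the hitmap format.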
However, in the described embodiment of the invention, the interpreter also should increment the appropriate counters in the hitmap used by the branch target profiling. The interpreter should determine if the bytecode is a branch 68. If the bytecode is not a branch, then the next bytecode is inspected 60. If the bytecode is a branch, GJCOUNT (the counter for counting the total number of branches in a program) is incremented 67. The appropriate counters in the hitmap corresponding to the target of the branch should be incremented 70. The appropriate counters in the hitmap are the counter corresponding to the particular branch target inspected and the counter for the total number of branch targets executed. Periodically the comparisons required for profiling must be performed to determine a frequently executed method and a frequently executed sequence of bytecodes 69. In one embodiment of the present invention, these comparisons are performed every x updates of GJCOUNT, where x is any value that produces the desired efficiency. In one embodiment, x is equal to one, such that every time GJCOUNT is updated, indicating that a branch target is interpreted, the profiling counter comparisons are performed. In this way the profiling stages are efficiently interleaved with the other interpretation stages.
If the target of the branch is the start of a frequently executed trace 62, then the interpreter treats it differently from the other bytecodes it inspects 72-76. To achieve more efficient bytecode interpretation, the interpreter enters a compilation mode 72 and then extends the jump table 74, as described in detail below. Thus, the efficiency is improved because, on any subsequent execution of the frequently executed trace, the trace is treated as one single bytecode, as described in detail below. Rather than interpreting each bytecode separately, the entire trace can be interpreted by the same process used to interpret a single bytecode, looking up in the jump table 64-68.
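How the counter updates and the periodic comparisons interleave with interpretation can be sketched as a hook that the interpreter calls after executing each bytecode. Everything named here (`Profiler`, `after_execute`, the `comparisons` counter, the nested-dictionary hitmap) is a hypothetical simplification; in particular, the sketch keeps hitmap counters for every method, whereas the text creates a hitmap only for frequently executed methods.

```python
from collections import defaultdict
from fractions import Fraction

N = Fraction(1, 500)   # method threshold from one embodiment
M = Fraction(1, 5000)  # branch target threshold from one embodiment

class Profiler:
    """Hypothetical hook called by the interpreter after each executed bytecode."""

    def __init__(self, x=1):
        self.x = x                      # compare every x updates of GJCOUNT
        self.gjcount = 0
        self.jcount = defaultdict(int)
        self.hitmap = defaultdict(lambda: defaultdict(int))  # method -> offset -> hits
        self.comparisons = 0
        self.trace_starts = set()

    def after_execute(self, method, is_branch, target_offset=None):
        if not is_branch:
            return  # not a branch: the interpreter inspects the next bytecode
        self.gjcount += 1
        self.jcount[method] += 1
        self.hitmap[method][target_offset] += 1
        if self.gjcount % self.x == 0:  # periodic profiling comparisons
            self.comparisons += 1
            hot_method = self.jcount[method] > N * self.gjcount
            hot_target = self.hitmap[method][target_offset] > M * self.jcount[method]
            if hot_method and hot_target:
                self.trace_starts.add((method, target_offset))

profiler = Profiler(x=4)
for _ in range(8):
    profiler.after_execute("m", is_branch=True, target_offset=3)
profiler.after_execute("m", is_branch=False)
```

With x = 4, eight branches trigger two rounds of comparisons; with x = 1 (the value given for one embodiment) every branch would trigger one.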
If the bytecode is the start of a frequently executed trace as determined by the dynamic profiler 62, then the interpreter enters compilation mode 72. In compilation mode, the compiler/optimizer compiles the bytecode trace 72 by concatenation of the machine code blocks. After compilation of the trace, the compiler/optimizer extends the jump table at an index of a new bytecode entry with its corresponding machine code location 74. The compiler/optimizer also updates the bytecode program to replace the first bytecode in the trace with the new bytecode indexed in the jump table 76. After extending the jump table and replacing the first bytecode in the frequently executed trace, the interpreter exits compilation mode and returns to interpret mode.
The interpreter treats the new bytecode that replaced the first bytecode in the trace the same as any other bytecode, not as a first bytecode in a frequently executed trace. Each bytecode advances a program counter and the new bytecode advances it to skip the remaining bytecodes of the trace. Thus, the next time that trace is interpreted by the bytecode interpreter, the interpreter will not interpret each bytecode in the trace individually. Instead the interpreter will inspect the new bytecode 60. The new bytecode will not be treated as the start of a frequently executed trace 62. The interpreter will look up the new bytecode in the jump table 64 and execute the corresponding machine code for the entire trace 66.
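The replacement of the first bytecode of a trace by a new bytecode may be illustrated with the following sketch. It assumes, for illustration only, that the jump table is a dictionary from opcode to a Python handler and that "machine code" is modeled by ordinary functions. All names (`make_jump_table`, `install_trace`, `run_program`) are hypothetical.

```python
def make_jump_table():
    # Each handler takes (state, pc) and returns the next program counter.
    def op_inc(state, pc):
        state["acc"] += 1
        return pc + 1

    def op_double(state, pc):
        state["acc"] *= 2
        return pc + 1

    return {0: op_inc, 1: op_double}


def install_trace(jump_table, program, start, length):
    """Fuse `length` bytecodes beginning at `start` into one new bytecode."""
    handlers = [jump_table[program[start + i]] for i in range(length)]

    def fused(state, pc):
        for handler in handlers:
            handler(state, pc)   # per-bytecode pc updates are ignored
        return pc + length       # skip the remaining bytecodes of the trace

    new_opcode = max(jump_table) + 1   # assumption: next opcode number is free
    jump_table[new_opcode] = fused     # extend the jump table
    program[start] = new_opcode        # replace the first bytecode of the trace
    return new_opcode


def run_program(jump_table, program):
    """Interpret the program; return (accumulator, number of dispatches)."""
    state = {"acc": 0}
    pc = 0
    dispatches = 0
    while pc < len(program):
        pc = jump_table[program[pc]](state, pc)
        dispatches += 1
    return state["acc"], dispatches
```

In this sketch the remaining bytecodes of the trace stay in the program but are never dispatched, because the fused handler advances the program counter past them. The result is unchanged while the number of jump table lookups drops.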
Now referring to Figure 7, there is shown a flow chart describing in further detail the operation of the compiler/optimizer while it compiles a trace 72. The compiler/optimizer receives the first bytecode of a frequently executed trace. It uses the jump table to look up the machine code block corresponding to the first bytecode in the frequently executed trace 78. It also looks up in the jump table the machine code blocks corresponding to every other bytecode in the trace 80.
The compiler/optimizer determines the last bytecode in the trace in several ways 82. Often at the end of a machine code block there will be a return code instruction. The return code returns from the machine code block to the interpreter to interpret the next bytecode. Occasionally, the return code will not be an instruction to return to the interpreter; instead, it will be an instruction that goes deeper into the virtual machine, for example, a print instruction. If there is an instruction to go further into the virtual machine rather than back to the interpreter to inspect the next bytecode, then the compiler/optimizer will end the trace. Also, if there is a branch instruction that causes the interpreter to branch to another location in the bytecode program, the compiler/optimizer will end the trace. Otherwise, if the machine code block contains a return code instruction to return to the interpreter, the compiler will not end the trace.
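One possible reading of the trace termination rules above is sketched below. The classification of bytecodes into `"plain"`, `"branch"`, and `"vm_call"` kinds and the function name are hypothetical. The sketch also assumes that the terminating bytecode is itself included in the trace, which the description does not state explicitly.

```python
def trace_length(program, start, kinds):
    """Count the bytecodes in the trace that begins at offset `start`.

    `kinds` maps each opcode to "plain" (its machine code block returns to
    the interpreter), "branch", or "vm_call" (goes deeper into the VM).
    Assumption: the bytecode that ends the trace is counted as part of it.
    """
    length = 0
    for opcode in program[start:]:
        length += 1
        if kinds[opcode] != "plain":
            break   # a branch or a call into the VM ends the trace
    return length
```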
The compiler concatenates the machine code blocks 84 from each bytecode in the frequently executed trace to form the machine language block for the trace. That machine code block will contain return codes from the individual machine code blocks corresponding to each bytecode in the trace. The return codes are easily identifiable. The compiler identifies the return codes 86. The compiler strips the machine code block for the trace of all but the last return code 86. The compiler optimizes the machine code block using a set of optimization rules that are standard in compiler technology. These optimization techniques are possible because the compiler/optimizer has concatenated the machine code blocks for several bytecodes.
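The concatenation and return stripping steps may be sketched as follows, modeling each machine code block as a list of instruction strings. The instruction mnemonics and the marker `"RET"` are placeholders chosen for this sketch, and each block is assumed to end with a single return code.

```python
RETURN_CODE = "RET"   # hypothetical marker for a return to the interpreter


def concatenate_blocks(blocks):
    """Join per-bytecode machine code blocks into one block for the trace.

    Assumption: every block ends with exactly one RETURN_CODE. All return
    codes except the one ending the last block are stripped.
    """
    fused = []
    for index, block in enumerate(blocks):
        is_last = index == len(blocks) - 1
        if not is_last and block and block[-1] == RETURN_CODE:
            fused.extend(block[:-1])   # drop the intermediate return code
        else:
            fused.extend(block)
    return fused
```

With the intermediate return codes removed, control falls straight through from one bytecode's machine code to the next, and the fused block is exposed to ordinary optimization passes as a single unit.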
Now referring to Figure 8, there is shown the jump table used by the bytecode interpreter to index the bytecodes and corresponding machine code instructions. Figure 8 illustrates the jump table 40 without any extension by the compiler/optimizer. As the interpreter inspects each bytecode, it looks in the jump table 40 for a reference to the bytecode 42 and the corresponding machine code 44. There is a certain portion of the table that is not used by the interpreter 46. That portion of the jump table 40 would remain unused or unmapped without the compiler/optimizer mapping a new bytecode entry.
Now referring to Figure 9, there is shown the jump table that has been extended by the compiler/optimizer. As the interpreter inspects each bytecode, it looks up in the jump table 40 for a reference to the bytecode 42. If the bytecode is the new bytecode referencing the first bytecode of a frequently executed trace 48 that has already been inserted into the table, then the interpreter will look up that bytecode reference 48 similarly to any other bytecode reference 42. If the bytecode is the start of a frequently executed trace and has not been inserted into the jump table 40, then the compiler/optimizer compiles the bytecodes in the trace and extends the jump table 40. The compiler/optimizer adds a new entry into the jump table 40 in a previously unused or unmapped portion 46 of the jump table 40.
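The use of the previously unused portion of the jump table may be sketched as follows. The table size of 256 entries (one entry per value of a one-byte opcode) and the function names are assumptions made for illustration.

```python
TABLE_SIZE = 256   # assumption: one entry per possible one-byte opcode


def new_jump_table(base_handlers):
    """Build a fixed-size table; entries without a handler stay unmapped."""
    table = [None] * TABLE_SIZE
    for opcode, handler in base_handlers.items():
        table[opcode] = handler
    return table


def allocate_entry(table, handler):
    """Map `handler` into the first unused entry and return its opcode."""
    for opcode, entry in enumerate(table):
        if entry is None:
            table[opcode] = handler
            return opcode
    raise RuntimeError("no unused jump table entries left")
```

The error raised when the table is full reflects the fact that only a finite number of traces can be mapped at once.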
There are a finite number of unused entries in the jump table 40. Therefore, it is useful to determine if the frequently executed traces with entries in the unused portion of the jump table are currently frequently executed traces. Interpreted programs usually contain stages that use different methods of different bytecode traces. As the bytecodes continue to be interpreted, some traces that have not been executed recently may occupy memory that would be better served with different traces. It also may become increasingly hard for new frequently executed methods to exceed the predetermined threshold values, N and M, because GJCOUNT increases over time. A solution to both of these problems is to periodically halve every counter in the profiler. This halving operation can be performed every J branch targets, where J is a predetermined number. This action reduces the effect of branch targets taken in the past and also has the effect of removing methods and traces that no longer meet the criteria for a frequently executed method and a frequently executed set of bytecodes, respectively.
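The periodic halving may be sketched as follows. The class name, the parameter `halve_every` (standing in for J), and the use of integer division are assumptions made for this sketch.

```python
class DecayingCounters:
    """Branch counters that are all halved every `halve_every` branches."""

    def __init__(self, size, halve_every):
        self.hitmap = [0] * size
        self.gjcount = 0
        self.halve_every = halve_every   # J in the description
        self.since_halving = 0

    def record_branch(self, target):
        self.hitmap[target] += 1
        self.gjcount += 1
        self.since_halving += 1
        if self.since_halving >= self.halve_every:
            # Halve every counter so that old activity fades away.
            # Assumption: integer division, rounding toward zero.
            self.hitmap = [count // 2 for count in self.hitmap]
            self.gjcount //= 2
            self.since_halving = 0
```

Each halving reduces the weight of earlier branches by half, so a target that stops being executed decays toward zero while the total count stays bounded, keeping the thresholds reachable for newly active code.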
The invention disclosed herein can be used to interpret any language that makes use of bytecodes. One example of a language that uses bytecodes is Java™. Thus, the invention can be used in a Java™ virtual machine.
From the above description, it will be apparent that the invention disclosed herein provides a novel and advantageous system and method of dynamic optimized bytecode interpretation.

Claims

What is claimed is:
1. A method of dynamically profiling a sequence of bytecodes in a program, the method comprising: determining at runtime a number of branches executed in a method of the program; determining at runtime a total number of branches executed in the program; comparing at runtime the number of branches executed in the method to the total number of branches executed in the program; and determining a frequently executed method based on the comparison of the number of branches executed in the method to the total number of branches executed in the entire program.
2. The method of claim 1, wherein the frequently executed method is one where the number of branches executed in the method is greater than 1/500 of the number of branches executed in the entire program.
3. The method of claim 1, further comprising determining a frequently executed sequence of bytecodes within the frequently executed method.
4. The method of claim 3, wherein the sequence of frequently executed bytecodes is a trace.
5. The method of claim 3, further comprising using the sequence of frequently executed bytecodes to optimize interpretation of the program.
6. The method of claim 3, wherein the bytecodes are Java™ bytecodes.
7. A method of interpreting bytecodes, comprising: determining a sequence of frequently executed bytecodes; compiling the sequence of frequently executed bytecodes into corresponding machine code; creating a new entry in a jump table and labeling the new entry in the jump table with a new bytecode; associating the new bytecode with the corresponding compiled machine code; and replacing a first bytecode in the sequence of frequently executed bytecodes with the new bytecode.
8. The method of claim 7, wherein the determination of the sequence of frequently executed bytecodes is done at runtime.
9. The method of claim 7, wherein the sequence of frequently executed bytecodes is a trace.
10. The method of claim 1, wherein the new jump table entry is in a previously unused entry in the jump table.
11. The method of claim 7, wherein the bytecodes are Java™ bytecodes.
12. The method of claim 7, wherein the determination of the sequence of frequently executed bytecodes further comprises determining a frequently executed method and determining a sequence of frequently executed bytecodes within the frequently executed method.
13. The method of claim 7, wherein the bytecodes are Java™ bytecodes.
14. A method of virtual machine operation to execute a plurality of bytecodes, the method comprising: inspecting a bytecode in the plurality of bytecodes; looking up in a jump table a machine code corresponding to the inspected bytecode, wherein the jump table includes a new bytecode representing a sequence of frequently executed bytecodes and associating the new bytecode to a corresponding machine code; executing the machine code looked up in the jump table; and replacing a first bytecode in the sequence of frequently executed bytecodes with the new bytecode included in the jump table.
15. The method of claim 14, wherein the bytecodes are Java™ bytecodes.
16. The method of claim 14, wherein the determining at runtime the sequence of frequently executed bytecodes comprises determining at runtime a number of branches executed in each method.
17. The method of claim 16, wherein the determining at runtime the sequence of frequently executed bytecodes further comprises determining at runtime a total number of branches executed.
18. The method of claim 17, wherein the determining at runtime the sequence of frequently executed bytecodes further comprises comparing the number of branches executed in each method to the total number of branches executed to determine a frequently executed method.
19. The method of claim 18, wherein the determining at runtime the sequence of frequently executed bytecodes further comprises determining at runtime a number of branches in the frequently executed method.
20. The method of claim 19, wherein the determining at runtime the sequence of frequently executed bytecodes further comprises determining at runtime a number of times the inspected bytecode was executed.
21. The method of claim 20, wherein the determining at runtime the sequence of frequently executed bytecodes further comprises comparing the number of times the inspected bytecode was executed to the number of branches in the method to determine a frequently executed sequence of bytecodes.
22. The method of claim 14, wherein the sequence of frequently executed bytecodes is a trace.
23. A virtual machine receiving a sequence of bytecodes and executing sequences of machine code corresponding to the sequence of bytecodes, comprising: a dynamic profiler having the sequence of bytecodes as an input and a sequence of frequently executed bytecodes as an output; and a jump table including an inserted entry corresponding to the sequence of frequently executed bytecodes and the corresponding machine code.
24. The virtual machine of claim 23, wherein the bytecodes are Java™ bytecodes.
25. The virtual machine of claim 23, wherein a first bytecode in the sequence of frequently executed bytecodes is replaced by the inserted entry in the bytecode table.
26. The virtual machine of claim 23, wherein the dynamic profiler further comprises a first detector to detect a frequently executed method and a second detector to detect the sequence of frequently executed bytecodes within the frequently executed method.
27. A machine readable medium storing a set of instructions for interpreting bytecode, the set of instructions comprising: dynamically profiling a bytecode sequence to determine a sequence of frequently executed bytecodes; extending a jump table to include a new bytecode entry representing the sequence of frequently executed bytecodes and machine code equivalent to the sequence of frequently executed bytecodes; replacing a first bytecode in the sequence of frequently executed bytecodes with the new bytecode entry; and looking up in the jump table the machine code equivalent of the sequence of frequently executed bytecodes.
28. The machine readable medium of claim 27 wherein the bytecodes are Java™ bytecodes.
29. The machine readable medium of claim 27 wherein the sequence of frequently executed bytecodes is a trace.
30. The machine readable medium of claim 27 wherein the dynamic profiling further comprises determining a frequently executed method and determining the sequence of frequently executed bytecodes within the frequently executed method.
PCT/US2002/003716 2001-02-12 2002-02-08 An optimized dynamic bytecode interpreter WO2002065284A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
JP2002564736A JP2004529413A (en) 2001-02-12 2002-02-08 Optimized dynamic bytecode interpreter
EP02706200A EP1360584A1 (en) 2001-02-12 2002-02-08 An optimized dynamic bytecode interpreter

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US78234401A 2001-02-12 2001-02-12
US09/782,344 2001-02-12

Publications (2)

Publication Number Publication Date
WO2002065284A1 true WO2002065284A1 (en) 2002-08-22
WO2002065284A8 WO2002065284A8 (en) 2003-11-06

Family

ID=25125754

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2002/003716 WO2002065284A1 (en) 2001-02-12 2002-02-08 An optimized dynamic bytecode interpreter

Country Status (3)

Country Link
EP (1) EP1360584A1 (en)
JP (1) JP2004529413A (en)
WO (1) WO2002065284A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2013257916A (en) * 2004-03-05 2013-12-26 Oracle America Inc Method and device for determining frequency of executing method compiled in virtual machine
US11392357B2 (en) * 2019-12-13 2022-07-19 Sap Se Delegating bytecode runtime compilation to serverless environment

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2010140233A (en) * 2008-12-11 2010-06-24 Nec Computertechno Ltd Emulation system and emulation method
KR20120083803A (en) * 2011-01-18 2012-07-26 삼성전자주식회사 Extra code generating apparatus and method for virtual machine

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5784553A (en) * 1996-01-16 1998-07-21 Parasoft Corporation Method and system for generating a computer program test suite using dynamic symbolic execution of JAVA programs
US5966536A (en) * 1997-05-28 1999-10-12 Sun Microsystems, Inc. Method and apparatus for generating an optimized target executable computer program using an optimized source executable
US6256752B1 (en) * 1998-07-24 2001-07-03 International Business Machines Corporation Method and apparatus for dynamic swappable bytecode loop in java virtual machines
US6298477B1 (en) * 1998-10-30 2001-10-02 Sun Microsystems, Inc. Method and apparatus for selecting ways to compile at runtime
US6321375B1 (en) * 1998-05-14 2001-11-20 International Business Machines Corporation Method and apparatus for determining most recently used method
US6327699B1 (en) * 1999-04-30 2001-12-04 Microsoft Corporation Whole program path profiling

Also Published As

Publication number Publication date
JP2004529413A (en) 2004-09-24
EP1360584A1 (en) 2003-11-12
WO2002065284A8 (en) 2003-11-06

Legal Events

Date Code Title Description
AK Designated states

Kind code of ref document: A1

Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NO NZ OM PL PT RO RU SD SE SG SI SK SL TJ TM TR TT TZ UA UG UZ VN YU ZA ZW

AL Designated countries for regional patents

Kind code of ref document: A1

Designated state(s): GH GM KE LS MW MZ SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE CH CY DE DK ES FI FR GB GR IE IT LU MC NL PT SE TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG

121 Ep: the epo has been informed by wipo that ep was designated in this application
DFPE Request for preliminary examination filed prior to expiration of 19th month from priority date (pct application filed before 20040101)
WWE Wipo information: entry into national phase

Ref document number: 2002564736

Country of ref document: JP

WWE Wipo information: entry into national phase

Ref document number: 2002706200

Country of ref document: EP

CFP Corrected version of a pamphlet front page
CR1 Correction of entry in section i

Free format text: IN PCT GAZETTE 34/2002 DUE TO A TECHNICAL PROBLEM AT THE TIME OF INTERNATIONAL PUBLICATION, SOME INFORMATION WAS MISSING (81). THE MISSING INFORMATION NOW APPEARS IN THE CORRECTED VERSION.

WWP Wipo information: published in national office

Ref document number: 2002706200

Country of ref document: EP

REG Reference to national code

Ref country code: DE

Ref legal event code: 8642

WWW Wipo information: withdrawn in national office

Ref document number: 2002706200

Country of ref document: EP