WO2021036173A1 - Method and apparatus for interpreting and executing a bytecode instruction stream - Google Patents

Method and apparatus for interpreting and executing a bytecode instruction stream

Info

Publication number
WO2021036173A1
Authority
WO
WIPO (PCT)
Prior art keywords: instruction, current, register, memory, value
Prior art date
Application number: PCT/CN2020/071560
Other languages: English (en), French (fr)
Inventor: 刘晓建
Original Assignee: 创新先进技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 创新先进技术有限公司
Priority to US16/786,856 (US10802854B2)
Publication of WO2021036173A1


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38: Concurrent instruction execution, e.g. pipeline or look ahead
    • G06F9/3802: Instruction prefetching
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00: Arrangements for program control, e.g. control units
    • G06F9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30: Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/32: Address formation of the next instruction, e.g. by incrementing the instruction counter
    • G06F9/321: Program or instruction counter, e.g. incrementing

Definitions

  • One or more embodiments of this specification relate to the computer field, and more particularly to methods and devices for interpreting and executing bytecode instruction streams.
  • A virtual machine (VM) is a complete computer system, simulated by software, that has complete hardware system functionality and runs in an entirely isolated environment. Because the virtual machine isolates upper-layer applications from the influence of the underlying hardware platform and operating system, it greatly benefits the development of upper-layer applications: developers need not attend to the details of the underlying platform, only to the specific business logic.
  • the upper layer application is run by the virtual machine, which is responsible for converting the application code into code suitable for execution on the underlying platform.
  • upper-layer applications are written and developed by developers using high-level languages, and then compiled into bytecode by a compiler.
  • Bytecode is a kind of intermediate code: a binary file, consisting of a sequence of opcode (operation code)/data pairs, that contains an executable program. The interpreter in the virtual machine then interprets and executes the instruction stream represented by the bytecode.
  • A virtual machine is deployed in each node of a blockchain network.
  • For example, in the Ethereum network, the Ethereum virtual machine (EVM) is deployed in each node.
  • Users can write a smart contract in a high-level language, compile it into bytecode with a compiler, include the bytecode in the transaction that creates the smart contract, and publish the transaction to the blockchain network, thereby deploying the bytecode to each node of the network.
  • the virtual machine EVM in each node interprets and executes the bytecode.
  • In such scenarios, the speed at which the virtual machine interpreter interprets and executes bytecode is critical to the performance of the entire system. An improved scheme for further raising the execution efficiency of the bytecode instruction stream is therefore desirable.
  • One or more embodiments of this specification describe a method and device for interpreting and executing bytecode instruction streams, in which the simulation function address of the next instruction is prefetched while the current instruction is being executed and stored in a register, thereby improving the execution efficiency of the bytecode instruction stream.
  • According to a first aspect, a method for interpreting and executing a bytecode instruction stream is provided, which is executed by a virtual machine interpreter and includes:
  • reading the first value stored in a first register and judging whether the first value is a valid value;
  • storing the first value in a second register when the first value is a valid value, where the second register is used to store the current simulation function address corresponding to the current instruction in the bytecode instruction stream;
  • the above method further includes, when the first value is not a valid value, obtaining the current simulation function address corresponding to the current instruction from the memory, and storing it in the second register.
  • the address of the next simulation function is obtained in the following manner:
  • the mapping table stored in the memory is queried to obtain the simulation function address corresponding to the opcode.
  • the operation code corresponding to the next instruction is determined by the following method: the PC value of the program counter is incremented by a predetermined byte length to obtain the position number of the next instruction; the instruction sequence table stored in the memory is then queried by this position number to obtain the opcode corresponding to the next instruction.
  • alternatively, the operation code corresponding to the next instruction is determined by: determining the instruction length of the current instruction; incrementing the PC value of the program counter by that instruction length to obtain the position number of the next instruction; and querying the instruction sequence table stored in the memory by this position number to obtain the operation code corresponding to the next instruction.
  • the foregoing memory may be a cache or main memory; querying the mapping table stored in the memory and/or querying the instruction sequence table stored in the memory includes: querying the cache first and, in the case of a cache miss, performing the query in main memory.
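The PC-increment-and-lookup step described above can be sketched as follows. This is a hedged illustration: the 8-byte stride and the table contents mirror the Table A example given later in the text, and are not normative values from the claims.

```python
# Hypothetical sketch of the fixed-length case: increment the PC by a
# predetermined byte length, then look up the next instruction's opcode
# in the instruction sequence table by its position number.

INSTRUCTION_LENGTH = 8  # assumed fixed byte length per instruction

# Instruction sequence table ("Table A"): position number -> opcode
instruction_table = {100: 0x20, 108: 0x31, 116: 0x60}

def next_opcode(pc):
    """Increment the PC by the fixed instruction length, then look up
    the opcode of the next instruction by its position number."""
    next_pos = pc + INSTRUCTION_LENGTH
    return next_pos, instruction_table[next_pos]

# next_opcode(100) -> (108, 0x31): position 108 holds opcode 31
```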
  • executing the current instruction specifically includes:
  • the method further includes: judging, according to the next simulation function address, whether the next simulation function corresponding to the next instruction has been loaded into the cache; if not, loading the next simulation function into the cache.
  • the bytecode instruction stream is a bytecode instruction stream compiled from a smart contract;
  • the virtual machine is a WASM virtual machine or a Solidity virtual machine.
  • a device for interpreting and executing bytecode instruction streams is provided, which is deployed in an interpreter of a virtual machine and includes:
  • a reading unit configured to read the first value stored in the first register
  • the storage unit is configured to store the first value in a second register when the first value is a valid value, the second register being used to store the current simulation function address corresponding to the current instruction in the bytecode instruction stream;
  • the prefetch and execution unit is configured to obtain, from the memory, the next simulation function address corresponding to the instruction following the current instruction, store the next simulation function address in the first register, and execute the current instruction according to the current simulation function address read from the second register.
  • a computer-readable storage medium having a computer program stored thereon, and when the computer program is executed in a computer, the computer is caused to execute the method of the first aspect.
  • a computing device including a memory and a processor, characterized in that executable code is stored in the memory, and when the processor executes the executable code, the method of the first aspect is implemented.
  • Through the method and device of the above aspects, the simulation function address corresponding to the next instruction is prefetched and stored in a register while the simulation function of the current instruction is being executed. After the current instruction finishes, the simulation function address required by the next instruction can be read directly from the register for execution. Because the CPU accesses registers extremely fast, this approach can greatly reduce the time consumed in executing the bytecode instruction stream and improve execution efficiency.
  • Figure 1 shows a schematic diagram of an application scenario in an embodiment;
  • Figure 2 shows a schematic diagram of the process of interpreting and executing a bytecode instruction stream in an embodiment;
  • Fig. 3 shows a flow chart of a method for interpreting and executing a bytecode instruction stream according to an embodiment;
  • Figure 4 shows a schematic flow diagram of the complete steps of interpreting and executing a bytecode instruction stream according to an embodiment;
  • Fig. 5 shows a schematic block diagram of an apparatus for interpreting and executing a bytecode instruction stream according to an embodiment.
  • Fig. 1 shows a schematic diagram of an application scenario in an embodiment.
  • A program written in a high-level language is compiled into a bytecode file by a compiler, and the instruction stream represented by the bytecode is interpreted and executed by the virtual machine interpreter so that it runs on the CPU.
  • Figure 2 shows a schematic diagram of the process of interpreting and executing the bytecode instruction stream in an embodiment.
  • Before executing the bytecode file, the virtual machine first loads the bytecode file into memory and obtains the instruction sequence table shown as Table A in FIG. 2.
  • In Table A, 100, 108, etc. on the left side are position numbers, while 20, 31, etc. on the right side exemplarily represent opcodes. Each opcode is a two-digit hexadecimal number one byte in length, which is why such code is called bytecode.
  • the virtual machine uses a program counter (Program Counter) to record the position number of the currently executed instruction, and the value of the program counter is also called the PC value. Therefore, according to Table A, when the PC value is 100, it means that the operation code 20 is currently to be executed.
  • The virtual machine also maintains a mapping table, shown as Table B in FIG. 2, which indicates the simulation function corresponding to each operation code.
  • opcode 10 corresponds to the Move function
  • opcode 20 corresponds to the Add function
  • opcode 31 corresponds to the JMP function
  • the mapping table of Table B is sometimes called an instruction set, which is used to record the meaning of the operation instruction corresponding to each operation code (reflected by the simulation function).
  • the mapping table does not record the instruction code itself included in the simulation function, but records the address where the instruction code of the simulation function is stored.
  • the address of the simulation function corresponding to the operation code currently to be executed can be obtained, and the instruction code of the simulation function can be accessed through the address.
  • the instruction code of the simulation function can be in a form suitable for machine execution. Therefore, the instruction represented by each operation code can be executed by executing the instruction code of the simulation function.
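As an illustration of how executing a simulation function carries out the instruction its opcode names, here is a minimal Python sketch. The stack-based semantics of the Add function are an assumption for illustration only; the patent does not specify the bodies of the simulation functions.

```python
# Hypothetical simulation function: Python stands in for the machine-level
# instruction code that a real interpreter would execute.

stack = []  # operand stack of the toy virtual machine

def add_sim():
    """Simulation function for an Add opcode: pop two operands and
    push their sum, thereby carrying out the instruction the opcode names."""
    b = stack.pop()
    a = stack.pop()
    stack.append(a + b)

stack.extend([2, 3])
add_sim()   # executing the simulation function executes the instruction
# stack is now [5]
```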
  • the current PC value is 100
  • the current operation code to be executed in the bytecode instruction stream is 20.
  • look up table B to obtain the address of the simulation function corresponding to the operation code 20, and execute the code in the simulation function according to the address, thereby executing the current instruction.
  • Since the bytecode file is an instruction stream that is executed sequentially, the PC value is accumulated sequentially, except for individual jump instructions.
  • the PC value is accumulated to 108, pointing to the next instruction as the current instruction to be executed.
  • the operation code in the instruction currently to be executed at position number 108 is 31.
  • the simulation function address corresponding to operation code 31 is obtained, and the code therein is executed, thereby executing the current instruction.
  • During the interpretation and execution of the bytecode instruction stream, Table A and Table B are stored at least in memory. Furthermore, considering the size of the cache in the CPU and the frequency with which Table A and Table B are accessed, Table A and/or Table B may also be stored in the cache. When executing instructions, the CPU queries the cache first and accesses memory only in the case of a cache miss. For most current CPUs, accessing the L1 cache takes more than 10 clock cycles, accessing the L2 cache takes more than 20 clock cycles, and accessing memory takes more than 200 clock cycles.
  • In view of this, an address prefetching scheme is proposed in the embodiments of this specification: the simulation function address corresponding to the next instruction is prefetched and stored in a register while the simulation function of the current instruction is being executed. After the current instruction finishes, the simulation function address required by the next instruction can be read directly from the register for execution. Because the CPU accesses registers very fast (about 1 clock cycle), this approach greatly reduces the time consumed in executing the bytecode instruction stream and improves execution efficiency. Implementations of this inventive concept are described below.
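The scheme just described can be sketched as a minimal interpreter loop. This is a hedged illustration: Python variables stand in for the first and second registers, plain functions stand in for simulation-function addresses, the dict-based tables mirror the Table A/Table B example, and the 8-byte stride is an assumed fixed instruction length.

```python
def add_handler():
    return "Add executed"

def jmp_handler():
    return "JMP executed"

# "Table B": opcode -> simulation function (stands in for a function address)
mapping_table = {0x20: add_handler, 0x31: jmp_handler}
# "Table A": position number -> opcode
instruction_table = {100: 0x20, 108: 0x31}

STRIDE = 8  # assumed fixed instruction length between position numbers

def run(pc, steps):
    first_reg = None   # first register: holds the prefetched next address
    results = []
    for _ in range(steps):
        if first_reg is not None:
            second_reg = first_reg            # prefetch hit: promote to current
        else:
            # fallback: conventional lookup through both tables
            second_reg = mapping_table[instruction_table[pc]]
        pc += STRIDE                          # position number of next instruction
        # prefetch the next simulation-function address for the next round
        first_reg = mapping_table.get(instruction_table.get(pc))
        results.append(second_reg())          # execute the current simulation fn
    return results

# run(100, 2) -> ["Add executed", "JMP executed"]
```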
  • Fig. 3 shows a flow chart of a method for interpreting and executing a bytecode instruction stream according to an embodiment. It can be understood that this method is executed by a virtual machine interpreter, where the virtual machine can be deployed in any apparatus, device, platform, or device cluster with computing and processing capabilities.
  • The virtual machine may be a general-purpose virtual machine, such as a Java virtual machine or a Python virtual machine, used to interpret and execute the bytecode files of various applications.
  • The virtual machine may also be a virtual machine used to execute smart contracts in a blockchain network, such as an EVM virtual machine, a WASM virtual machine, or a Solidity virtual machine. These virtual machines use their interpreters to interpret and execute the bytecode instruction stream generated by compiling a smart contract.
  • In the embodiments of this specification, two registers are used for prefetching simulation function addresses.
  • These two registers are called the first register and the second register.
  • The first register is set to store the simulation function address corresponding to the next instruction;
  • the second register is set to store the simulation function address corresponding to the current instruction.
  • Figure 3 specifically shows the method steps implemented in any one instruction execution loop. For clarity and simplicity of description, the description proceeds with the case where the nth instruction is to be executed in the nth loop, describing the steps executed in that loop.
  • the nth instruction is called the current instruction, where n>1.
  • the current instruction and the next instruction referred to in the description with reference to Table A in FIG. 2 and FIG. 3 are all opcode instructions to distinguish them from machine instructions directly executed by the processor.
  • In step 31, the first value stored in the first register is read, and the validity of the first value is judged.
  • As mentioned above, the first register is used to store the simulation function address corresponding to the next instruction. Therefore, if the value stored in the first register is invalid, the prefetch of the address required by the nth instruction failed while the (n-1)th instruction was being executed; if the value is valid, that prefetch succeeded, and the first value is the function address required by the nth instruction. Specifically, in one example, a first value of zero is an invalid value and any other value is valid. In another example, validity can be indicated in other ways, for example by reserving a bit of the register as a status bit that records whether the previous round of prefetching succeeded.
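The two validity conventions mentioned above can be sketched as follows; the choice of the top bit as the status bit is a hypothetical assumption for illustration.

```python
# Two illustrative conventions for marking the first register's value
# valid or invalid.

def is_valid_by_zero(reg):
    """Convention 1: a value of zero means the previous prefetch failed."""
    return reg != 0

STATUS_BIT = 1 << 63  # hypothetical: reserve the top bit as a status bit

def is_valid_by_status(reg):
    """Convention 2: a dedicated status bit records prefetch success."""
    return bool(reg & STATUS_BIT)
```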
  • If the first value is a valid value, the previous round of prefetching succeeded. Therefore, in step 33, the first value is stored in the second register; the second register then holds the simulation function address corresponding to the nth instruction, that is, the current simulation function address of the current instruction. In this way, the instruction execution round is advanced.
  • the first register can be used to store the function address required by the next instruction again, and the second register stores the function address of the current instruction.
  • In step 35, the next simulation function address corresponding to the instruction following the current nth instruction (the (n+1)th instruction) is obtained by accessing the memory and stored in the above-mentioned first register; and, according to the current simulation function address read from the second register, the current instruction is executed.
  • step 35 includes the operation of prefetching the function address of the next instruction and the operation of executing the current instruction, which will be described separately below.
  • the operation of prefetching the function address of the next instruction is described. This operation is similar to the conventional process of obtaining the analog function address, and is still implemented by querying Table A and Table B shown in FIG. 2. Specifically, the operation of prefetching the function address of the next instruction may include the following process.
  • each instruction is a fixed-length instruction of a predetermined length.
  • For example, each instruction corresponds to one opcode, with a fixed length of one byte.
  • In another embodiment, the lengths of the instructions included in the instruction stream differ.
  • For example, the bytecode is encoded with LEB128, which is a variable-length encoding, so the lengths of individual instructions are not uniform.
  • In this case, first determine the instruction length of the current instruction, add it to the PC value of the program counter to obtain the position number of the next instruction, and then query the instruction sequence table by this position number to determine the operation code corresponding to the next instruction.
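For the LEB128 case, the byte length of an encoded value can be found by scanning for the first byte whose continuation (high) bit is clear. A minimal sketch follows; the test value 624485 is a standard LEB128 example, not a value from the patent.

```python
def leb128_length(buf, start):
    """Byte length of the LEB128-encoded value beginning at buf[start]:
    each byte's high bit is a continuation flag, so scan until a byte
    with the high bit clear is reached."""
    n = 0
    while buf[start + n] & 0x80:   # high bit set -> more bytes follow
        n += 1
    return n + 1                   # include the terminating byte

# 624485 encodes as e5 8e 26 in unsigned LEB128: three bytes
encoded = bytes([0xE5, 0x8E, 0x26])
pc = 100
pc += leb128_length(encoded, 0)    # next position number: 103
```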
  • In one embodiment, after the address of the next simulation function is obtained, it is determined according to that address whether the corresponding next simulation function has been loaded into the cache; if not, the next simulation function is loaded into the cache. In this way, when the next simulation function is not yet in the cache, it can be loaded in advance, avoiding time-consuming memory access when the next instruction is executed and further speeding up instruction execution.
  • On the other hand, in step 35, the current simulation function address is read from the second register, and the current simulation function is obtained according to that address.
  • the instruction code to realize the simulation function is obtained according to the address.
  • In one embodiment, it is first judged according to the simulation function whether the current instruction will change the execution order of the instruction stream, that is, whether the current instruction is a jump instruction; if it is, the next instruction may not be the instruction determined by updating the PC value in the sequential order above.
  • In that case, the prefetched instruction address may be wrong, so the first register is set to an invalid value: for example, it is reset to zero, or its status bit is set to invalid.
  • the instruction code of the simulation function corresponding to the current instruction is executed. If the current instruction is not a jump instruction, the instruction code of the simulation function is directly executed.
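The jump-handling rule above can be sketched as follows. This is an illustration only: the register value 0xA0 and the boolean jump flag are hypothetical stand-ins for a real interpreter's state.

```python
INVALID = 0  # zero marks the first register's content as invalid

def execute_current(is_jump, first_register, sim_function):
    """Before running a simulation function that may change control flow,
    discard the prefetched address: the speculatively chosen next
    instruction may be wrong."""
    if is_jump:
        first_register = INVALID
    sim_function()           # run the current instruction's simulation function
    return first_register

# Non-jump: the prefetched address (0xA0, illustrative) survives.
# Jump: the prefetched address is invalidated.
```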
  • the instruction code is a machine code suitable for direct execution by a CPU.
  • the machine instruction can be directly executed according to the machine code, thereby executing the current operation code instruction.
  • In another embodiment, the instruction code is in a form closer to machine instructions than bytecode, such as assembly code. In this case, the instruction code can be converted into machine code, and the current opcode instruction is executed by executing that machine code.
  • step 35 includes the prefetch operation of prefetching the function address of the next instruction, and the execution operation of executing the current instruction.
  • For the prefetch operation, since the simulation function address must be obtained by querying the mapping table, which is stored in the cache or memory, prefetching the next instruction's address still requires a memory access (cache or main memory). Assume that prefetching the simulation function address of the next instruction takes time T1.
  • The execution of the current instruction is realized by executing the corresponding simulation function, which generally corresponds to an instruction stream composed of many machine instructions, so executing the current instruction generally takes more time, denoted T2.
  • The execution time T2 of the current instruction is unavoidable. It is therefore desirable to "hide" the time T1 required to prefetch the next instruction's address, for example within T2, so that the prefetch operation takes up as little time of its own as possible and overall execution efficiency improves.
  • In one embodiment, when step 35 is performed, the prefetch operation and the execution of the current instruction can be executed in parallel in different execution units (i.e., processors) of the CPU, so that the prefetch operation takes up no additional time at all.
  • In another embodiment, a more general solution is adopted: at the logic level of the interpreter, the prefetch operation is performed first and the current simulation function is executed afterwards, while parallel execution at the machine-instruction level hides the execution time of the prefetch operation.
  • Specifically, most current processors adopt a pipeline method, splitting machine instructions into smaller sub-processes that can be executed in parallel, thereby supporting the parallel execution of multiple machine instructions.
  • At the machine level, both operations are converted into streams of machine instructions. Since the prefetch operation and the execution of the current simulation function have no dependency between them, when some machine instructions of the prefetch need to wait (for example, while accessing the cache or memory), the processor can execute subsequent machine instructions out of order even though the prefetch has not completed; that is, the current simulation function is executed in the meantime, and the results of the out-of-order execution are then reordered to obtain the execution result of the current simulation function. In this way, parallel execution of the two is effectively realized, and the time T1 of the prefetch operation is hidden within the execution time T2 of the current simulation function.
  • After step 35, in most cases the address of the next simulation function has been successfully prefetched and stored in the first register. Therefore, by executing the method of FIG. 3 again, the next round of instruction execution is entered, in which the required address can be obtained directly from the register and the simulation function executed. In this way, in each round of instruction execution, the time to obtain the simulation function address from memory is hidden or eliminated, and instruction execution is accelerated.
  • As mentioned above, however, when the current instruction is a jump instruction, the jump target depends on the execution result of the current instruction, so the next instruction determined in advance may be wrong; in that case, the first register is set to an invalid value.
  • This corresponds to step 34 in FIG. 3: when the value in the first register is not a valid value, the current simulation function address corresponding to the current instruction is obtained from the memory and stored in the second register. This is equivalent to the previous round's speculative prefetch having failed; in this round, the current simulation function address is obtained normally in the conventional manner and stored in the second register. Then, step 35 is still performed: the address of the next instruction is prefetched, and the simulation function of the current instruction is executed.
  • FIG. 4 shows a schematic flow diagram of the complete steps of interpreting and executing a bytecode instruction stream according to an embodiment. The following describes the process shown in FIG. 4 in conjunction with the instruction sequence table shown as Table A in FIG. 2.
  • the operation code of the initial instruction is obtained.
  • the operation code of the initial instruction is obtained according to the preset entry position of the bytecode instruction stream, that is, the entry PC value. In the example in Figure 2, assuming that the entry PC value is 100, then the operation code of the initial instruction is 20.
  • In step 402, the simulation function address corresponding to the initial instruction is obtained and stored in the first register.
  • Specifically, by querying the mapping table, the address of the corresponding simulation function Add can be obtained; assuming it is address A, address A is stored in the first register.
  • step 403 the position number of the next instruction is determined.
  • the PC value is updated to 108 as the position number of the next instruction.
  • step 404 it is determined whether an end condition is encountered.
  • the end condition can have various settings, such as the completion of the instruction stream, overflow, and so on.
  • Step 405 determines whether the first register holds a valid value. Since the valid address A is currently stored in the first register, the flow proceeds to step 406, and the value in the first register is stored in the second register; thus, address A is stored in the second register.
  • step 407 the next instruction is obtained according to the position number of the next instruction. Since in step 403, the PC value is updated to 108, the next instruction can be obtained as operation code 31 according to the position number.
  • In step 408, the simulation function address corresponding to the next instruction is obtained and stored in the first register.
  • Specifically, by querying the mapping table, it can be obtained that the opcode 31 of the next instruction corresponds to the JMP function, and accordingly the address of the JMP function, address J, is stored in the first register. At this point, the value in the first register has been updated.
  • step 409 the position number of the next instruction is updated again.
  • the PC value is updated to 116.
  • step 410 the simulation function corresponding to the current instruction is executed according to the address in the second register. That is, according to the aforementioned address A, the Add function is executed.
  • Since steps 407-408 do not depend on step 410, the execution time of steps 407-408 is hidden within the execution of step 410 through the parallel execution of machine instructions.
  • After the Add function corresponding to the first instruction is executed, the flow returns to step 404. Since the end condition is not met, the flow proceeds to step 405 to determine whether the first register holds a valid value. At this time, the valid address J is stored in the first register, so the flow proceeds to step 406, and the value in the first register is stored in the second register.
  • step 407 the next instruction is obtained according to the position number of the next instruction. Since in step 409 of the previous round, the PC value is updated to 116, the next instruction can be obtained as opcode 60 according to the position number.
  • In step 408, the simulation function address corresponding to the next instruction is obtained and stored in the first register. Similarly, by querying the mapping table, it can be obtained that the opcode 60 of the next instruction corresponds to the Push function, and accordingly the address of the Push function, address P, is stored in the first register.
  • step 409 the position number of the next instruction is updated again.
  • the PC value is updated to 124.
  • step 410 according to the address J in the second register, the simulation function JMP corresponding to the current instruction is executed.
  • the function is a conditional jump function
  • the jump target is determined according to the value of a certain parameter in the execution.
  • Before this simulation function is executed, the value stored in the first register is first set to invalid, for example by clearing the first register to zero, or by setting its status bit to invalid, and so on.
  • By executing the JMP function, the instruction to be executed next can be determined. Suppose that executing this function determines a jump to 132; the PC value is then reset to 132.
  • Then, in step 405, it is judged whether the first register holds a valid value. At this time, the first register has been cleared or marked invalid, so the judgment in step 405 is negative, and the other branch, step 411, is entered.
  • step 411 the operation code of the current instruction is obtained.
  • the PC is reset to 132, so the operation code 10 at 132 can be obtained as the operation code of the current instruction.
  • In step 412, the current simulation function address of the current instruction is obtained and stored in the second register. By querying the mapping table, the Move function corresponding to opcode 10 is obtained, and its address M is stored in the second register.
  • In step 413, the position number of the next instruction is obtained; accordingly, the PC value is accumulated further and updated to 140.
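The complete Figure 4 walkthrough can be reproduced as a runnable sketch. The position numbers, opcodes, and the jump to 132 follow the example above; the function bodies, dict-based tables, and the fixed 8-byte stride are illustrative assumptions.

```python
# "Table A": position number -> opcode, per the example in the text
instruction_table = {100: 0x20, 108: 0x31, 116: 0x60, 132: 0x10}

trace = []
state = {"pc": 100, "first": None}   # "first" stands in for the first register

def add():
    trace.append("Add")

def push():
    trace.append("Push")

def move():
    trace.append("Move")

def jmp():
    trace.append("JMP")
    state["pc"] = 132      # jump target determined during execution
    state["first"] = None  # invalidate the speculative prefetch

# "Table B": opcode -> simulation function
mapping_table = {0x20: add, 0x31: jmp, 0x60: push, 0x10: move}

for _ in range(3):  # executes Add, then JMP, then Move
    if state["first"] is not None:            # step 406: use prefetched address
        current = state["first"]
    else:                                     # steps 411-412: conventional lookup
        current = mapping_table[instruction_table[state["pc"]]]
    next_pc = state["pc"] + 8                 # steps 403/409/413: next position
    state["first"] = mapping_table.get(instruction_table.get(next_pc))
    state["pc"] = next_pc
    current()                                 # step 410: run the simulation fn

# trace == ["Add", "JMP", "Move"]: the speculatively prefetched Push at 116
# is discarded when the JMP redirects execution to 132 (Move).
```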
  • a device for interpreting and executing bytecode instruction streams is provided.
  • the device is deployed in an interpreter of a virtual machine.
  • the virtual machine can be installed on any apparatus, platform, or device cluster with computing and processing capabilities.
  • Fig. 5 shows a schematic block diagram of an apparatus for interpreting and executing a bytecode instruction stream according to an embodiment. As shown in FIG. 5, the device 500 includes:
  • the reading unit 51 is configured to read the first value stored in the first register
  • the storage unit 53 is configured to store the first value in a second register when the first value is a valid value, and the second register is used to store the corresponding instruction in the bytecode instruction stream.
  • the prefetch and execution unit 55 is configured to obtain the next simulated function address corresponding to the next instruction of the current instruction from the memory, and store the next simulated function address in the first register, and according to the slave The current simulation function address read in the second register executes the current instruction.
  • the device 500 further includes an address obtaining unit 54 configured to obtain the current simulation function address corresponding to the current instruction from the memory when the first value is not a valid value, and store it in the In the second register.
  • the prefetching and executing unit 55 includes a prefetching module 551, and the prefetching module 551 is configured to:
  • mapping table stored in the memory is queried to obtain the analog function address corresponding to the opcode.
  • the prefetching module 551 is further configured to:
  • the instruction sequence table stored in the memory is queried to obtain the operation code corresponding to the next instruction.
  • the prefetch module 551 is further configured to:
  • the instruction sequence table stored in the memory is queried to obtain the operation code corresponding to the next instruction.
  • the memory may be a cache or a memory; accordingly, the prefetch module 551 is configured to: query the mapping table and/or the instruction sequence table in the cache; In this case, the query is performed in memory.
  • the prefetching and executing unit 55 includes an execution module 552, and the execution module 552 is configured to:
  • the prefetching and executing unit 55 is further configured to:
  • next simulation function it is determined whether the next simulation function corresponding to the next designation is loaded into the cache; if not, the next simulation function is loaded into the cache.
  • the bytecode instruction stream is a bytecode instruction stream compiled by a smart contract
  • the virtual machine is a WASM virtual machine or a Solidity virtual machine.
  • a computer-readable storage medium having a computer program stored thereon, and when the computer program is executed in a computer, the computer is caused to execute the method described in conjunction with FIG. 3 and FIG. 4.
  • a computing device including a memory and a processor, the memory is stored with executable code, and when the processor executes the executable code, a combination of FIGS. 3 and 4 is implemented. The method described.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

Embodiments of this specification provide a method and apparatus for interpreting and executing a bytecode instruction stream, implemented by the interpreter of a virtual machine, in which a first register stores the simulation-function address of the next instruction and a second register stores the simulation-function address of the current instruction. In the method, the first value stored in the first register is read; when the first value is valid, it is stored into the second register as the current simulation-function address corresponding to the current instruction in the bytecode instruction stream. Then, the next simulation-function address corresponding to the instruction following the current instruction is obtained from memory and stored in the first register, and the current instruction is executed according to the current simulation-function address read from the second register.

Description

Method and apparatus for interpreting and executing a bytecode instruction stream. Technical Field
One or more embodiments of this specification relate to the computer field, and in particular to methods and apparatus for interpreting and executing bytecode instruction streams.
Background
A virtual machine is a complete computer system, simulated in software, that has full hardware-system functionality and runs in a fully isolated environment. Because a virtual machine shields upper-layer applications from the influence of the underlying hardware platform and operating system, it greatly facilitates application development: during development there is no need to attend to the details of the underlying platform, only to the business logic. Once development is complete, the virtual machine runs the application and converts its code into code suitable for execution on the underlying platform. Specifically, in many scenarios an application is written in a high-level language and then compiled into bytecode, an intermediate binary representation of an executable program consisting of a sequence of opcode/data pairs. The interpreter in the virtual machine then interprets and executes the instruction stream represented by the bytecode.
For example, in blockchain scenarios that support smart contracts, a virtual machine can be deployed on every node of the blockchain network. In Ethereum, for instance, each node runs the Ethereum Virtual Machine (EVM). A user can write a smart contract in a high-level language, compile it into bytecode, include the bytecode in a contract-creation transaction, and publish it to the blockchain network, thereby deploying it to every node. When the smart contract needs to be executed, the EVM on each node interprets and executes that bytecode.
In application scenarios including but not limited to blockchain, the speed at which the virtual machine's interpreter executes bytecode is critical to the performance of the whole system. An improved scheme that further increases the execution efficiency of bytecode instruction streams is therefore desirable.
Summary
One or more embodiments of this specification describe a method and apparatus for interpreting and executing a bytecode instruction stream, in which the function address of the next instruction is prefetched while the current instruction executes and stored in a register, thereby speeding up execution of the bytecode instruction stream.
According to a first aspect, a method for interpreting and executing a bytecode instruction stream is provided, performed by a virtual-machine interpreter and comprising:
reading a first value stored in a first register;
when the first value is valid, storing the first value into a second register, the second register being used to store the current simulation-function address corresponding to the current instruction in the bytecode instruction stream;
obtaining from memory the next simulation-function address corresponding to the instruction following the current instruction, storing that next simulation-function address in the first register, and executing the current instruction according to the current simulation-function address read from the second register.
According to one implementation, the method further comprises: when the first value is not valid, obtaining from memory the current simulation-function address corresponding to the current instruction and storing it into the second register.
In one embodiment, the next simulation-function address is obtained as follows:
determining the opcode corresponding to the next instruction;
querying a mapping table stored in the memory to obtain the simulation-function address corresponding to that opcode.
Further, in one embodiment, the opcode corresponding to the next instruction is determined by: adding a predetermined byte length to the PC value of the program counter to obtain the position number of the next instruction; and querying an instruction sequence table stored in the memory with that position number to obtain the opcode corresponding to the next instruction.
In another embodiment, the opcode corresponding to the next instruction is determined by: determining the instruction length of the current instruction; adding that instruction length to the PC value of the program counter to obtain the position number of the next instruction; and querying the instruction sequence table stored in the memory with that position number to obtain the opcode corresponding to the next instruction.
In different embodiments, the memory may be a cache or main memory; querying the mapping table and/or the instruction sequence table stored in the memory comprises: querying the cache first and, on a miss, querying main memory.
According to one embodiment, executing the current instruction specifically comprises:
judging whether the current instruction may change the order of the instruction stream;
if so, setting the first register to an invalid value and executing the current simulation function corresponding to the current instruction;
if not, directly executing the current simulation function corresponding to the current instruction.
In one embodiment, after the next simulation-function address corresponding to the instruction following the current instruction is obtained from memory, the method further comprises: judging, according to that address, whether the next simulation function corresponding to the next instruction has been loaded into the cache, and if not, loading it into the cache.
According to one embodiment, the bytecode instruction stream is the compiled bytecode instruction stream of a smart contract, and the virtual machine is a WASM virtual machine or a Solidity virtual machine.
According to a second aspect, an apparatus for interpreting and executing a bytecode instruction stream is provided, deployed in the interpreter of a virtual machine and comprising:
a reading unit configured to read a first value stored in a first register;
a storage unit configured to store the first value into a second register when the first value is valid, the second register being used to store the current simulation-function address corresponding to the current instruction in the bytecode instruction stream;
a prefetch-and-execution unit configured to obtain from memory the next simulation-function address corresponding to the instruction following the current instruction, store that next simulation-function address in the first register, and execute the current instruction according to the current simulation-function address read from the second register.
According to a third aspect, a computer-readable storage medium is provided, on which a computer program is stored; when the computer program is executed in a computer, the computer is caused to perform the method of the first aspect.
According to a fourth aspect, a computing device is provided, comprising a memory and a processor, the memory storing executable code; when the processor executes the executable code, the method of the first aspect is implemented.
According to the methods and apparatus provided in the embodiments of this specification, the simulation-function address corresponding to the next instruction is prefetched in advance and stored in a register while the simulation function of the current instruction executes. After the current instruction finishes, the simulation-function address needed by the next instruction can be read directly from the register for execution. Because the CPU accesses registers extremely quickly, this greatly reduces the time spent executing the bytecode instruction stream and improves execution efficiency.
Brief Description of the Drawings
To explain the technical solutions of the embodiments of this invention more clearly, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings described below are merely some embodiments of the invention; a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic diagram of an application scenario in one embodiment;
Fig. 2 is a schematic diagram of the process of interpreting and executing a bytecode instruction stream in one embodiment;
Fig. 3 is a flowchart of a method for interpreting and executing a bytecode instruction stream according to one embodiment;
Fig. 4 is a flowchart of the complete steps of interpreting and executing a segment of a bytecode instruction stream according to one embodiment;
Fig. 5 is a schematic block diagram of an apparatus for interpreting and executing a bytecode instruction stream according to one embodiment.
Detailed Description
The solutions provided in this specification are described below with reference to the drawings.
Fig. 1 shows a schematic diagram of an application scenario in one embodiment. As shown in Fig. 1, in many application scenarios a program written in a high-level language is compiled by a compiler into a bytecode file, and the virtual-machine interpreter interprets the instruction stream represented by the bytecode so that it executes on the CPU.
Fig. 2 shows the process of interpreting and executing a bytecode instruction stream in one embodiment. As known to those skilled in the art, before executing a bytecode file the virtual machine first loads it into memory, producing the instruction sequence table shown as table A in Fig. 2. In table A, the entries on the left (100, 108, and so on) are position numbers, while the entries on the right (20, 31, and so on) exemplarily represent opcodes. Each opcode is a two-digit hexadecimal number one byte long, which is exactly why it is called bytecode.
The virtual machine uses a program counter to record the position number of the instruction currently being executed; the value of the program counter is called the PC value. Thus, according to table A, a PC value of 100 means that opcode 20 is currently to be executed.
To execute the instruction indicated by an opcode, the mapping table shown as table B must be consulted; it gives the simulation function corresponding to each opcode. For example, according to table B, opcode 10 corresponds to the Move function, opcode 20 to the Add function, opcode 31 to the JMP function, and so on. The mapping table of table B is sometimes also called the instruction set: it records the meaning of the operation corresponding to each opcode (embodied by a simulation function). More specifically, however, the mapping table does not record the instruction code of the simulation function itself, but the address at which that instruction code is stored. Thus, by querying table B, the simulation-function address corresponding to the opcode currently to be executed is obtained, and through that address the simulation function's instruction code can be accessed. The instruction code may be in a form suitable for machine execution, so that by executing the simulation function's instruction code, the instruction represented by each opcode is executed.
As a concrete example, suppose the current PC value is 100. Table A then shows that the opcode currently to be executed in the bytecode instruction stream is 20. Table B is queried to obtain the simulation-function address corresponding to opcode 20, and according to that address the code in the simulation function is executed, thereby executing the current instruction. Because a bytecode file is a sequentially executed instruction stream, the PC value accumulates sequentially except for occasional jump instructions. After the instruction at position 100 finishes, the PC value accumulates to 108, pointing to the next instruction as the current one. Correspondingly, table A shows that the opcode of the instruction at position 108 is 31; similarly, table B is queried for the simulation-function address corresponding to opcode 31, its code is executed, and so on.
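The conventional dispatch just described (table A maps a position number to an opcode; table B maps an opcode to a simulation-function address) can be sketched in C as below. This is an illustrative model only, not code from the patent: the opcode values and the Move/Add/JMP names follow the example tables, the handler bodies are placeholders that merely record a trace, and a plain byte array stands in for table A.

```c
#include <stdint.h>
#include <stddef.h>

static char trace[8];            /* records which handlers ran, in order */
static int  trace_len;

static void op_move(void) { trace[trace_len++] = 'M'; }
static void op_add(void)  { trace[trace_len++] = 'A'; }
static void op_jmp(void)  { trace[trace_len++] = 'J'; }

typedef void (*sim_fn)(void);

/* Table B: opcode -> simulation-function address (C99 designated init). */
static sim_fn mapping_table[256] = {
    [0x10] = op_move,            /* opcode 10: Move */
    [0x20] = op_add,             /* opcode 20: Add  */
    [0x31] = op_jmp,             /* opcode 31: JMP  */
};

/* Conventional interpretation: every instruction costs two lookups,
 * table A (here: the bytecode array itself) and then table B, before
 * the simulation function can run.  One-byte fixed-length opcodes. */
static void interpret_conventional(const uint8_t *code, size_t len) {
    for (size_t pc = 0; pc < len; pc++) {
        uint8_t opcode = code[pc];              /* lookup 1: table A */
        sim_fn  fn     = mapping_table[opcode]; /* lookup 2: table B */
        fn();
    }
}
```

Both lookups here land in cache or main memory, which is precisely the per-instruction cost the description goes on to quantify.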
As the above process shows, executing each instruction requires at least two table lookups: table A is first queried with the PC value to obtain the opcode to be executed; then table B is queried with that opcode to obtain the corresponding simulation-function address, and only then can the simulation function run.
Through the virtual machine's loading of the bytecode file, tables A and B are stored at least in main memory. Further, depending on the capacity of the CPU cache and how frequently tables A and B are accessed, table A and/or table B may also be stored in the cache. While executing instructions, the CPU queries the cache first when looking up tables A and B, and accesses main memory only on a cache miss. On most current CPUs, an L1 cache access takes over 10 clock cycles, an L2 access over 20 cycles, and a main-memory access over 200 cycles. Therefore, even if both tables reside in the L1 cache, the two lookups cost at least 20-odd clock cycles; and on a cache miss when accessing table A or table B, a lookup in main memory costs several hundred cycles.
Given the time consumed by these lookups, the embodiments of this specification propose an address-prefetch scheme: the simulation-function address corresponding to the next instruction is prefetched in advance and stored in a register while the simulation function of the current instruction executes. After the current instruction finishes, the simulation-function address needed by the next instruction can be read directly from the register for execution. Because the CPU accesses registers extremely quickly (a single clock cycle), this greatly reduces the time spent executing the bytecode instruction stream and improves execution efficiency. Implementations of this inventive concept are described below.
Fig. 3 shows a flowchart of a method for interpreting and executing a bytecode instruction stream according to one embodiment. It should be understood that the method can be performed by a virtual-machine interpreter, where the virtual machine may be deployed on any apparatus, device, platform, or device cluster with computing and processing capability. In one embodiment, the virtual machine may be a general-purpose virtual machine, such as a Java virtual machine or a Python virtual machine, used to interpret and execute the bytecode files of various applications. In another embodiment, the virtual machine may be one used to execute smart contracts in a blockchain network, such as an EVM, WASM, or Solidity virtual machine; such virtual machines can use their interpreters to interpret and execute the bytecode instruction stream generated by compiling a smart contract.
Following the above concept, the method of Fig. 3 uses two registers for prefetching simulation-function addresses; for convenience of description they are called the first register and the second register, where the first register may be designated to store the simulation-function address corresponding to the next instruction, and the second register to store the simulation-function address corresponding to the current instruction. Fig. 3 specifically shows the method steps performed in any one instruction-execution cycle. For clarity and simplicity, the description follows the case of the n-th cycle in which the n-th instruction is to be executed; the n-th instruction to be executed in that cycle is called the current instruction, with n > 1. Note also that the "current instruction" and "next instruction" referred to in table A of Fig. 2 and in the description of Fig. 3 are both opcode instructions, as distinct from the machine instructions directly executed by the processor.
As shown in Fig. 3, first in step 31 the first value stored in the first register is read and its validity is judged. As stated, the first register is used to store the simulation-function address corresponding to the next instruction. Therefore, if the value stored in the first register is invalid, the prefetch of the address needed by the n-th instruction, performed while executing the preceding (n-1)-th instruction, failed; if the stored value is valid, that prefetch succeeded, and the first value is exactly the function address needed by the n-th instruction. Specifically, in one example, the first value stored in the first register is invalid if it is zero and valid otherwise. In another example, validity can be indicated in other ways, for instance by designating one bit of the register as a status bit that shows whether the previous round's prefetch succeeded.
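The two validity conventions mentioned (zero means invalid, or a dedicated status bit) can be sketched as below. The helper names and the choice of bit 0 as the flag are hypothetical illustrations, not part of the patent's method:

```c
#include <stdint.h>
#include <stdbool.h>

/* Convention (a): the all-zero value is reserved as "invalid", since
 * 0 is never a valid simulation-function address. */
static inline bool valid_by_zero(uintptr_t reg1) { return reg1 != 0; }

/* Convention (b): one register bit serves as a status bit.  Here bit 0
 * hypothetically means "prefetch succeeded", with the address kept in
 * the remaining bits (function addresses are at least 2-byte aligned,
 * so bit 0 is otherwise unused). */
static inline bool valid_by_flag(uintptr_t reg1) { return (reg1 & 1u) != 0; }
static inline uintptr_t addr_of(uintptr_t reg1)  { return reg1 & ~(uintptr_t)1; }
```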
When the first value is valid, the previous round's prefetch succeeded; therefore, in step 33, the first value is stored into the second register. The second register then holds the first value, i.e., the simulation-function address corresponding to the n-th instruction, which is the current simulation-function address corresponding to the current instruction. This advances the instruction-execution round or cycle: the first register is again available to store the function address needed by the next instruction, while the second register holds the current instruction's function address.
Next, in step 35, the next simulation-function address corresponding to the instruction following the current n-th instruction (the (n+1)-th instruction) is obtained by accessing memory and stored in the above first register, and the current instruction is executed according to the current simulation-function address read from the second register.
As can be seen, step 35 comprises an operation of prefetching the next instruction's function address and an operation of executing the current instruction; these are described separately below.
First, the operation of prefetching the next instruction's function address. This operation resembles the conventional acquisition of a simulation-function address and is still realized by querying tables A and B shown in Fig. 2. Specifically, the prefetch may include the following process.
First, the opcode corresponding to the next instruction is determined.
In one embodiment, the instructions are fixed-length instructions of a predetermined length; for example, as shown in Fig. 2, each instruction corresponds to one opcode with a fixed length of one byte. In this case, adding the predetermined byte length to the PC value of the program counter yields the position number of the next instruction; then, querying the instruction sequence table (table A) with that position number yields the opcode corresponding to the next instruction.
In another embodiment, the instructions in the stream vary in length. For example, WASM bytecode is encoded with LEB128, which is a variable-length encoding, so the lengths of individual instructions differ. In this case, the instruction length of the current instruction must first be determined; adding that length to the PC value of the program counter yields the position number of the next instruction. Then, the instruction sequence table is queried with that position number to determine the opcode corresponding to the next instruction.
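In the variable-length case, the byte length of an encoded value has to be decoded before the PC can be advanced. LEB128, as used by WASM, stores 7 payload bits per byte and uses the high bit as a continuation flag; a standard unsigned-LEB128 decoder (generic ULEB128, not code from the patent) can be sketched as:

```c
#include <stdint.h>
#include <stddef.h>

/* Decode an unsigned LEB128 value starting at p.  Writes the decoded
 * value to *out and returns the number of bytes consumed, which is the
 * quantity an interpreter adds to the PC to reach the next item. */
static size_t uleb128_decode(const uint8_t *p, uint64_t *out) {
    uint64_t value = 0;
    unsigned shift = 0;
    size_t   n = 0;
    uint8_t  byte;
    do {
        byte = p[n++];
        value |= (uint64_t)(byte & 0x7f) << shift;  /* 7 payload bits  */
        shift += 7;
    } while (byte & 0x80);                          /* high bit: more  */
    *out = value;
    return n;
}
```

For example, the three bytes 0xE5 0x8E 0x26 decode to 624485 with three bytes consumed.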
After the opcode corresponding to the next instruction is determined, the mapping table shown as table B is queried to obtain the simulation-function address corresponding to that opcode, which serves as the next simulation-function address.
When querying instruction sequence table A and mapping table B above, the cache is always queried first; on a miss, main memory is accessed for the query. Through these two lookups, the next simulation-function address corresponding to the next instruction is obtained.
In one embodiment, after the next simulation-function address is obtained, it is used to judge whether the corresponding next simulation function has been loaded into the cache. If not, the next simulation function is loaded into the cache. In this way, when the next simulation function is not yet cached, it is loaded into the cache in advance, avoiding the cost of a memory access when the next instruction executes and further speeding up instruction execution.
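Portable software cannot directly ask whether a given function body already sits in the cache, but once the next handler's address is known it can issue a best-effort prefetch hint in the spirit of this step. A minimal sketch using the GCC/Clang `__builtin_prefetch` builtin (the helper name is illustrative, and this is one possible realization rather than the patent's mechanism):

```c
/* Hint the CPU to pull the first cache line of the next simulation
 * function toward the caches while the current instruction is still
 * executing.  __builtin_prefetch is advisory: it never faults and may
 * be ignored by the hardware, so correctness never depends on it. */
static inline void prefetch_next_handler(const void *next_fn_addr) {
#if defined(__GNUC__) || defined(__clang__)
    __builtin_prefetch(next_fn_addr, 0 /* read */, 3 /* keep resident */);
#else
    (void)next_fn_addr;   /* no-op on compilers without the builtin */
#endif
}
```

Only the first cache line of the handler is hinted here; a real interpreter might prefetch more lines for large handlers.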
Next, the operation of executing the current instruction. Specifically, since the current simulation-function address was already transferred to the second register in step 33, in step 35 the current simulation-function address is read from the second register, and according to that address the current simulation function and the instruction code implementing it are obtained.
It is first judged from the simulation function whether the current instruction may change the execution order of the instruction stream, i.e., whether the current instruction is a jump instruction. If so, the next instruction may not be the one determined above by sequentially updating the PC value, and the prefetch of the next instruction's address may be wrong. Therefore, the first register is set to an invalid value, for example reset to zero or with its status bit set to invalid, and then the instruction code of the simulation function corresponding to the current instruction is executed. If the current instruction is not a jump instruction, the simulation function's instruction code is executed directly.
In one example, the instruction code is machine code suitable for direct CPU execution; in that case, the machine instructions can be executed directly according to the machine code, thereby executing the current opcode instruction. In another example, the instruction code is in a form closer to machine instructions than bytecode, such as assembly code; in that case, the instruction code can be converted into machine code, and the current opcode instruction is executed by executing the machine code.
As described above, step 35 comprises a prefetch operation that prefetches the next instruction's function address and an execution operation that executes the current instruction. In the prefetch operation, the simulation-function address can only be obtained by querying the mapping table, which is stored in the cache or main memory, so prefetching the next instruction's address still requires accessing storage (cache or memory). Suppose prefetching the next instruction's simulation-function address takes time T1. On the other hand, the current instruction is executed by running its corresponding simulation function, and a simulation function generally corresponds to an instruction stream of many machine instructions, so executing the current instruction generally takes more time; denote this time T2.
It should be understood that the time T2 for executing the current instruction is unavoidable. It is therefore desirable to "hide" the time T1 needed to prefetch the next instruction's address, for example within T2, so that the prefetch operation occupies as little standalone time as possible, improving overall execution efficiency.
In one embodiment, when performing step 35, the prefetch operation and the execution of the current instruction can run in parallel on different execution units (i.e., processors) of the CPU, so the prefetch occupies no standalone time at all. However, this applies only to CPUs with multiple execution units, such as some multi-core CPUs.
Another embodiment adopts a more universal scheme: at the logical level of the interpreter, the prefetch operation is performed first and the current simulation function is executed afterwards; but at the processor's execution level, the prefetch's execution time is hidden through the parallel execution of multiple machine instructions.
Specifically, most current processors use pipelining to split machine instructions into smaller sub-stages that can execute in parallel, thereby supporting parallel execution of multiple machine instructions. Both the prefetch operation and the execution of the current simulation function are translated into streams of machine instructions. Since the prefetch and the current simulation function's execution do not depend on each other, while the prefetch is not yet complete, during the gaps in which some of its machine instructions must wait (for example, waiting on a cache or main-memory access), the processor executes subsequent machine instructions out of order in parallel, i.e., executes the current simulation function, and afterwards reorders the out-of-order results to obtain the current simulation function's execution result. In effect, the two execute in parallel, and the prefetch time T1 is hidden within the current simulation function's execution time T2.
Thus, when the current simulation function has finished executing, the address of the next simulation function has already been successfully prefetched and stored in the first register. By executing the method of Fig. 3 again, the next instruction-execution round is entered, in which the needed address can be obtained directly from the register and the simulation function executed. In this way, in every round of instruction execution, the time to obtain the simulation-function address from storage is hidden or eliminated, accelerating instruction execution.
There are, however, cases where the address prefetch fails. For example, as noted above, when the current instruction is a jump instruction, the jump target depends on the current instruction's execution result, so the next instruction determined in advance may be inaccurate; in that case the first register is set to an invalid value. This corresponds to step 34 of Fig. 3. That is, when the value in the first register is not valid, the current simulation-function address corresponding to the current instruction is obtained from memory and stored into the second register. This amounts to the previous round's speculative prefetch having failed; in this round, the current instruction's current simulation-function address is obtained normally in the conventional way and stored into the second register. Then, step 35 above is still performed: the prefetch of the next instruction's address is attempted, and the current instruction's simulation function is executed.
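The two-register method of steps 31-35, including the fallback path after a jump, can be sketched as below. All names, opcode values, and handlers are hypothetical illustrations; plain C variables stand in for the CPU registers. `reg1` models the first register (the prefetched next-handler address, with 0 meaning invalid) and `reg2` models the second register (the current handler's address); a jump handler rewrites the PC and clears `reg1`, forcing the slow path once.

```c
#include <stdint.h>
#include <stddef.h>

static char trace_p[8];      /* records which handlers ran, in order */
static int  trace_p_len;
static int  running;

typedef void (*sim_fn)(size_t *pc, uintptr_t *reg1);

static void op_add (size_t *pc, uintptr_t *r) { (void)pc; (void)r; trace_p[trace_p_len++] = 'A'; }
static void op_move(size_t *pc, uintptr_t *r) { (void)pc; (void)r; trace_p[trace_p_len++] = 'M'; }
static void op_halt(size_t *pc, uintptr_t *r) { (void)pc; (void)r; trace_p[trace_p_len++] = 'H'; running = 0; }
static void op_jmp (size_t *pc, uintptr_t *r) {
    trace_p[trace_p_len++] = 'J';
    *pc = 3;   /* jump target known only at run time ...               */
    *r  = 0;   /* ... so the speculatively prefetched address is void. */
}

/* Table B: opcode -> simulation-function address. */
static sim_fn table_b[256] = {
    [0x20] = op_add, [0x31] = op_jmp, [0x10] = op_move, [0x00] = op_halt,
};

static void interpret_prefetch(const uint8_t *code, size_t len, size_t pc) {
    uintptr_t reg1 = 0;                 /* nothing prefetched yet */
    running = 1;
    while (running && pc < len) {
        uintptr_t reg2;
        if (reg1 != 0)
            reg2 = reg1;                /* step 33: 1-cycle register read  */
        else
            reg2 = (uintptr_t)table_b[code[pc]];  /* step 34: slow lookup */
        size_t next = pc + 1;           /* fixed one-byte instructions */
        /* Step 35: prefetch the next handler's address before running the
         * current handler; the jump handler may discard it just below. */
        reg1 = next < len ? (uintptr_t)table_b[code[next]] : 0;
        pc = next;
        ((sim_fn)reg2)(&pc, &reg1);     /* execute the current instruction */
    }
}
```

On straight-line code only the fast `reg2 = reg1` path runs; the double table lookup survives only at the entry point and immediately after each jump, matching the description's claim that roughly 80% of instructions take the prefetched path.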
Fig. 4 shows a flowchart of the complete steps of interpreting and executing a segment of a bytecode instruction stream according to one embodiment. The process of Fig. 4 is described below, again with reference to the instruction sequence table of table A in Fig. 2.
First, in step 401, the opcode of the starting instruction is obtained. Generally, it is obtained according to the preset entry position of the bytecode instruction stream, i.e., the entry PC value. In the example of Fig. 2, supposing the entry PC value is 100, the opcode of the starting instruction is 20.
Next, in step 402, the simulation-function address corresponding to the starting instruction is obtained and stored into the first register. In the example of Fig. 2, querying table B with the starting opcode 20 gives the address of the corresponding simulation function, the Add function; suppose it is address A, which is stored into the first register.
Then, in step 403, the position number of the next instruction is determined. In the fixed-length case, suppose the PC value is updated to 108 as the position number of the next instruction.
In step 404, it is judged whether an end condition is met. End conditions can be set in various ways, for example the instruction stream having finished executing, an overflow, and so on.
Since no end condition is met, the process continues to step 405, which judges whether the first register holds a valid value. As the first register currently stores the valid address A, the flow proceeds to step 406, where the value in the first register is stored into the second register. The second register then holds address A.
Next, in step 407, the next instruction is obtained according to its position number. Since the PC value was updated to 108 in step 403, that position number yields opcode 31 as the next instruction.
Then, in step 408, the simulation-function address corresponding to the next instruction is obtained and stored into the first register. Continuing the example, querying the mapping table shows that the next instruction's opcode 31 corresponds to the JMP function; accordingly, the JMP function's address, address J, is stored in the first register. The value of the first register is thereby updated.
Next, in step 409, the position number of the next instruction is updated again; the PC value becomes 116.
And in step 410, the simulation function corresponding to the current instruction is executed according to the address in the second register; that is, the Add function is executed according to address A.
As described earlier, during CPU execution, steps 407-408 do not depend on step 410, so through the parallel execution of machine instructions, the execution time of steps 407-408 is hidden within the execution of step 410.
After the Add function corresponding to the first instruction finishes, the flow returns to step 404. Since no end condition is met, it continues to step 405 to judge whether the first register holds a valid value. The first register now stores the valid address J, so the flow proceeds to step 406, storing the first register's value into the second register.
Next, in step 407, the next instruction is obtained according to its position number. Since the PC value was updated to 116 in step 409 of the previous round, that position number yields opcode 60 as the next instruction.
Then, in step 408, the next instruction's simulation-function address is obtained and stored into the first register. Similarly, querying the mapping table shows that the next instruction's opcode 60 corresponds to the Push function; accordingly the Push function's address, address P, is stored in the first register.
Next, in step 409, the position number of the next instruction is updated again; the PC value becomes 124.
And in step 410, the simulation function JMP corresponding to the current instruction is executed according to address J in the second register. Suppose this function is a conditional jump whose target is determined by the value of some parameter during execution. Then, when executing this simulation function, the value stored in the first register is first invalidated, for example cleared to zero, or its status bit set to invalid, and so on. In addition, executing the JMP function determines the instruction to execute next. Suppose executing it determines a jump to the instruction at 132; the PC value is then reset to 132.
Returning again to step 404 and then entering step 405, it is judged whether the first register holds a valid value. The first register has now been cleared or invalidated, so the judgment of step 405 is negative, and the flow enters step 411 of the other branch.
In step 411, the opcode of the current instruction is obtained. Since the jump reset the PC to 132, opcode 10 at position 132 is obtained as the opcode of the current instruction.
Next, in step 412, the current simulation-function address of the current instruction is obtained and stored into the second register: querying the mapping table gives the Move function corresponding to opcode 10, and its address M is stored into the second register.
Then, in step 413, the position number of the next instruction is obtained; the PC value is accumulated further and updated to 140.
Steps 407-408 are then performed again to prefetch the next instruction's simulation-function address into the first register; the PC value continues to be updated in step 409, and in step 410 the current simulation function, Move, is executed according to the address M stored in the second register.
Execution continues in this way until an end condition is met.
As the above process shows, when instructions execute sequentially, the branch of step 406 is taken; in that case, for the current instruction, reading the second register suffices to obtain the function address to execute, and a register read (a single clock cycle) is far cheaper than accessing the cache or main memory (tens or even hundreds of cycles). Meanwhile, as described above, the cost of prefetching the next instruction's function address is hidden within the execution of the current instruction, adding almost no extra time, so the whole execution process is accelerated. Only when a jump instruction changes the execution order of the instruction stream must the branch of step 411 be taken. In general, however, jump instructions account for only a small fraction of all instructions (around 20%), so most instructions can have their execution sped up by prefetching, and the execution efficiency of the whole instruction stream is improved.
According to an embodiment of another aspect, an apparatus for interpreting and executing a bytecode instruction stream is provided, deployed in the interpreter of a virtual machine; the virtual machine may be installed on any device, platform, or device cluster with computing and processing capability. Fig. 5 shows a schematic block diagram of an apparatus for interpreting and executing a bytecode instruction stream according to one embodiment. As shown in Fig. 5, the apparatus 500 includes:
a reading unit 51 configured to read a first value stored in a first register;
a storage unit 53 configured to store the first value into a second register when the first value is valid, the second register being used to store the current simulation-function address corresponding to the current instruction in the bytecode instruction stream;
a prefetch-and-execution unit 55 configured to obtain from memory the next simulation-function address corresponding to the instruction following the current instruction, store that next simulation-function address in the first register, and execute the current instruction according to the current simulation-function address read from the second register.
In one embodiment, the apparatus 500 further includes an address obtaining unit 54 configured to, when the first value is not valid, obtain from memory the current simulation-function address corresponding to the current instruction and store it into the second register.
According to one embodiment, the prefetch-and-execution unit 55 includes a prefetch module 551 configured to:
determine the opcode corresponding to the next instruction;
query the mapping table stored in the memory to obtain the simulation-function address corresponding to that opcode.
Further, in one embodiment, the prefetch module 551 is further configured to:
add a predetermined byte length to the PC value of the program counter to obtain the position number of the next instruction;
query the instruction sequence table stored in the memory with that position number to obtain the opcode corresponding to the next instruction.
In another embodiment, the prefetch module 551 is further configured to:
determine the instruction length of the current instruction;
add that instruction length to the PC value of the program counter to obtain the position number of the next instruction;
query the instruction sequence table stored in the memory with that position number to obtain the opcode corresponding to the next instruction.
In different embodiments, the memory may be a cache or main memory; accordingly, the prefetch module 551 is configured to: query the mapping table and/or the instruction sequence table in the cache, and, on a miss, perform the query in main memory.
According to one embodiment, the prefetch-and-execution unit 55 includes an execution module 552 configured to:
judge whether the current instruction may change the order of the instruction stream;
if so, set the first register to an invalid value and execute the current simulation function corresponding to the current instruction;
if not, directly execute the current simulation function corresponding to the current instruction.
In one embodiment, the prefetch-and-execution unit 55 is further configured to:
judge, according to the next simulation-function address, whether the next simulation function corresponding to the next instruction has been loaded into the cache; if not, load the next simulation function into the cache.
In one embodiment, the bytecode instruction stream is the compiled bytecode instruction stream of a smart contract, and the virtual machine is a WASM virtual machine or a Solidity virtual machine.
According to an embodiment of another aspect, a computer-readable storage medium is further provided, on which a computer program is stored; when the computer program is executed in a computer, the computer is caused to perform the method described with reference to Figs. 3 and 4.
According to an embodiment of yet another aspect, a computing device is further provided, comprising a memory and a processor, the memory storing executable code; when the processor executes the executable code, the method described with reference to Figs. 3 and 4 is implemented.
Those skilled in the art should appreciate that, in one or more of the above examples, the functions described in this invention may be implemented in hardware, software, firmware, or any combination thereof. When implemented in software, these functions may be stored in a computer-readable medium or transmitted as one or more instructions or code on a computer-readable medium.
The specific embodiments described above further elaborate the objectives, technical solutions, and beneficial effects of the invention. It should be understood that the above are merely specific embodiments of the invention and are not intended to limit its scope of protection; any modification, equivalent replacement, or improvement made on the basis of the technical solutions of the invention shall fall within its scope of protection.

Claims (20)

  1. A method for interpreting and executing a bytecode instruction stream, performed by the interpreter of a virtual machine, comprising:
    reading a first value stored in a first register;
    when the first value is a valid value, storing the first value into a second register, the second register being used to store a current simulation-function address corresponding to a current instruction in the bytecode instruction stream;
    obtaining from a memory a next simulation-function address corresponding to an instruction following the current instruction, storing the next simulation-function address in the first register, and executing the current instruction according to the current simulation-function address read from the second register.
  2. The method of claim 1, further comprising: when the first value is not a valid value, obtaining from the memory the current simulation-function address corresponding to the current instruction and storing it into the second register.
  3. The method of claim 1 or 2, wherein obtaining from the memory the next simulation-function address corresponding to the instruction following the current instruction comprises:
    determining an opcode corresponding to the next instruction;
    querying a mapping table stored in the memory to obtain the simulation-function address corresponding to the opcode.
  4. The method of claim 3, wherein determining the opcode corresponding to the next instruction comprises:
    adding a predetermined byte length to a PC value of a program counter to obtain a position number of the next instruction;
    querying an instruction sequence table stored in the memory with the position number to obtain the opcode corresponding to the next instruction.
  5. The method of claim 3, wherein determining the opcode corresponding to the next instruction comprises:
    determining an instruction length of the current instruction;
    adding the instruction length to a PC value of a program counter to obtain a position number of the next instruction;
    querying an instruction sequence table stored in the memory with the position number to obtain the opcode corresponding to the next instruction.
  6. The method of claim 4 or 5, wherein the memory is a cache or a main memory; and
    querying the mapping table stored in the memory, and/or querying the instruction sequence table stored in the memory, comprises:
    querying the cache, and, on a miss, querying the main memory.
  7. The method of claim 1 or 2, wherein executing the current instruction comprises:
    judging whether the current instruction may change the order of the instruction stream;
    if so, setting the first register to an invalid value and executing the current simulation function corresponding to the current instruction;
    if not, directly executing the current simulation function corresponding to the current instruction.
  8. The method of claim 1 or 2, further comprising, after obtaining from the memory the next simulation-function address corresponding to the instruction following the current instruction:
    judging, according to the next simulation-function address, whether the next simulation function corresponding to the next instruction has been loaded into the cache; and if not, loading the next simulation function into the cache.
  9. The method of claim 1, wherein the bytecode instruction stream is the compiled bytecode instruction stream of a smart contract, and the virtual machine is a WASM virtual machine or a Solidity virtual machine.
  10. An apparatus for interpreting and executing a bytecode instruction stream, deployed in the interpreter of a virtual machine, comprising:
    a reading unit configured to read a first value stored in a first register;
    a storage unit configured to store the first value into a second register when the first value is a valid value, the second register being used to store a current simulation-function address corresponding to a current instruction in the bytecode instruction stream;
    a prefetch-and-execution unit configured to obtain from a memory a next simulation-function address corresponding to an instruction following the current instruction, store the next simulation-function address in the first register, and execute the current instruction according to the current simulation-function address read from the second register.
  11. The apparatus of claim 10, further comprising an address obtaining unit configured to, when the first value is not a valid value, obtain from the memory the current simulation-function address corresponding to the current instruction and store it into the second register.
  12. The apparatus of claim 10 or 11, wherein the prefetch-and-execution unit comprises a prefetch module configured to:
    determine an opcode corresponding to the next instruction;
    query a mapping table stored in the memory to obtain the simulation-function address corresponding to the opcode.
  13. The apparatus of claim 12, wherein the prefetch module is configured to:
    add a predetermined byte length to a PC value of a program counter to obtain a position number of the next instruction;
    query an instruction sequence table stored in the memory with the position number to obtain the opcode corresponding to the next instruction.
  14. The apparatus of claim 12, wherein the prefetch module is configured to:
    determine an instruction length of the current instruction;
    add the instruction length to a PC value of a program counter to obtain a position number of the next instruction;
    query an instruction sequence table stored in the memory with the position number to obtain the opcode corresponding to the next instruction.
  15. The apparatus of claim 13 or 14, wherein the memory is a cache or a main memory; and
    the prefetch module is configured to:
    query the mapping table and/or the instruction sequence table in the cache, and, on a miss, perform the query in the main memory.
  16. The apparatus of claim 10 or 11, wherein the prefetch-and-execution unit comprises an execution module configured to:
    judge whether the current instruction may change the order of the instruction stream;
    if so, set the first register to an invalid value and execute the current simulation function corresponding to the current instruction;
    if not, directly execute the current simulation function corresponding to the current instruction.
  17. The apparatus of claim 10 or 11, wherein the prefetch-and-execution unit is further configured to:
    judge, according to the next simulation-function address, whether the next simulation function corresponding to the next instruction has been loaded into the cache; and if not, load the next simulation function into the cache.
  18. The apparatus of claim 10, wherein the bytecode instruction stream is the compiled bytecode instruction stream of a smart contract, and the virtual machine is a WASM virtual machine or a Solidity virtual machine.
  19. A computer-readable storage medium on which a computer program is stored, wherein, when the computer program is executed in a computer, the computer is caused to perform the method of any one of claims 1-9.
  20. A computing device comprising a memory and a processor, wherein the memory stores executable code, and when the processor executes the executable code, the method of any one of claims 1-9 is implemented.
PCT/CN2020/071560 2019-08-30 2020-01-11 Method and apparatus for interpreting and executing a bytecode instruction stream WO2021036173A1 (zh)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US16/786,856 US10802854B2 (en) 2019-08-30 2020-02-10 Method and apparatus for interpreting bytecode instruction stream

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910818266.X 2019-08-30
CN201910818266.XA CN110704108B (zh) 2019-08-30 2019-08-30 Method and apparatus for interpreting and executing a bytecode instruction stream

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/786,856 Continuation US10802854B2 (en) 2019-08-30 2020-02-10 Method and apparatus for interpreting bytecode instruction stream

Publications (1)

Publication Number Publication Date
WO2021036173A1 true WO2021036173A1 (zh) 2021-03-04

Family

ID=69194002

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2020/071560 WO2021036173A1 (zh) 2019-08-30 2020-01-11 解释执行字节码指令流的方法及装置

Country Status (3)

Country Link
CN (1) CN110704108B (zh)
TW (1) TWI743698B (zh)
WO (1) WO2021036173A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117971722A (zh) * 2024-03-28 2024-05-03 北京微核芯科技有限公司 Method and apparatus for executing a data-fetch instruction

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111399990B (zh) * 2020-05-29 2020-09-22 支付宝(杭州)信息技术有限公司 Method and apparatus for interpreting and executing smart contract instructions

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1222985A (zh) * 1996-05-03 1999-07-14 艾利森电话股份有限公司 Method for handling conditional jumps in a multi-stage pipeline structure
CN102893260A (zh) * 2010-05-24 2013-01-23 高通股份有限公司 System and method for evaluating data values as instructions
CN104679481A (zh) * 2013-11-27 2015-06-03 上海芯豪微电子有限公司 Instruction set conversion system and method
US20150317163A1 (en) * 2014-05-01 2015-11-05 Netronome Systems, Inc. Table fetch processor instruction using table number to base address translation
CN109416632A (zh) * 2016-06-22 2019-03-01 Arm有限公司 Register restoring branch instruction

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6256784B1 (en) * 1998-08-14 2001-07-03 Ati International Srl Interpreter with reduced memory access and improved jump-through-register handling
GB2367651B (en) * 2000-10-05 2004-12-29 Advanced Risc Mach Ltd Hardware instruction translation within a processor pipeline
CN101295239A (zh) * 2007-04-26 2008-10-29 东信和平智能卡股份有限公司 Instruction execution method for a Java Card virtual machine
CN102292705B (zh) * 2010-08-30 2013-12-18 华为技术有限公司 Instruction processing method of a network processor, and network processor
GB2564130B (en) * 2017-07-04 2020-10-07 Advanced Risc Mach Ltd An apparatus and method for controlling execution of instructions
CN108984392B (zh) * 2018-06-12 2021-07-16 珠海市杰理科技股份有限公司 Single-step debugging method and debugger


Also Published As

Publication number Publication date
TWI743698B (zh) 2021-10-21
CN110704108B (zh) 2020-08-14
CN110704108A (zh) 2020-01-17
TW202109288A (zh) 2021-03-01

Similar Documents

Publication Publication Date Title
JP5681473B2 (ja) Program optimization apparatus, optimization method, and optimization program
US9696966B2 (en) Software development tool to automatically generate an optimized executable
US9201635B2 (en) Just-in-time dynamic translation for translation, compilation, and execution of non-native instructions
KR101081090B1 (ko) 명령어 스트림의 효율적인 에뮬레이션을 가능하게 하기 위한 레지스터 기반의 명령어 최적화
CN111399990B (zh) Method and apparatus for interpreting and executing smart contract instructions
US9213563B2 (en) Implementing a jump instruction in a dynamic translator that uses instruction code translation and just-in-time compilation
KR100498272B1 (ko) 변환된 명령들을 실행하는 동안 문맥을 보존하기 위한 방법 및 장치
JPH1091455A (ja) キャッシュ・ヒット/ミスにおける分岐
US9524178B2 (en) Defining an instruction path to be compiled by a just-in-time (JIT) compiler
US9529610B2 (en) Updating compiled native instruction paths
US8291393B2 (en) Just-in-time compiler support for interruptible code
US9183018B2 (en) Dynamic on/off just-in-time compilation in a dynamic translator using instruction code translation
WO2021036173A1 (zh) Method and apparatus for interpreting and executing a bytecode instruction stream
JP4684571B2 (ja) Direct instructions for executing emulation computer technology
US8359589B2 (en) Helper thread for pre-fetching data
US10802854B2 (en) Method and apparatus for interpreting bytecode instruction stream
JP2004062908A (ja) Method and system for controlling the immediate delay of control-speculative loads using dynamic delay computation information
US11016771B2 (en) Processor and instruction operation method
US11966619B2 (en) Background processing during remote memory access
CN117270972B (zh) Instruction processing method, apparatus, device, and medium
US20150186168A1 (en) Dedicating processing resources to just-in-time compilers and instruction processors in a dynamic translator
JP2024030940A (ja) Source code conversion program and source code conversion method
AU2022226485A1 (en) Hybrid just in time load module compiler with performance optimizations

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 20859577

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 20859577

Country of ref document: EP

Kind code of ref document: A1