CN114691200A - Instruction simulation device and method thereof - Google Patents


Publication number
CN114691200A
Authority
CN
China
Prior art keywords
instruction
simulation
processor
executed
augmentation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011588921.6A
Other languages
Chinese (zh)
Inventor
王惟林
管应炳
杨梦晨
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Zhaoxin Semiconductor Co Ltd
Original Assignee
VIA Alliance Semiconductor Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by VIA Alliance Semiconductor Co Ltd
Priority to CN202011588921.6A (CN114691200A)
Priority to US17/471,170 (US11816487B2)
Priority to US17/471,167 (US11803381B2)
Publication of CN114691200A
Priority to US18/465,189 (US20240004658A1)
Priority to US18/474,207 (US20240012649A1)
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30098 Register arrangements
    • G06F9/30145 Instruction analysis, e.g. decoding, instruction word fields
    • G06F9/3017 Runtime instruction translation, e.g. macros
    • G06F9/30181 Instruction operation extension or modification

Abstract

The invention relates to an instruction simulation device and a method thereof. The simulation apparatus includes a monitor. The monitor determines whether an instruction to be executed is an augmentation instruction, that is, an instruction in a new instruction set or an extended instruction set of the same architecture as the instruction set of the processor. If the instruction to be executed is an augmentation instruction, it is converted into a simulation program composed of a compatible instruction sequence of native or compatible instructions of the processor, and the execution result of the augmentation instruction is simulated by executing the simulation program. The service life of electronic equipment using the simulation device can thereby be prolonged.

Description

Instruction simulation device and method thereof
Technical Field
The present invention relates to instruction execution technology of computer devices, and more particularly, to an instruction simulation device and method thereof.
Background
As computer instruction set technology develops, the instruction sets of the various instruction set architectures are updated version by version, so that older processors cannot fully support newer or extended instruction sets of their own instruction set architecture. If an augmentation instruction from a new or extended instruction set is run on an old processor, the old processor may fail to execute it correctly or may raise an error. In other words, the instructions a processor can support are fixed at manufacture, and later augmentation instructions may not work properly on the old processor, which creates a compatibility problem for the processor instruction set.
Thus, when a series or model of processors cannot support an updated version of an instruction set, those processors often have to be retired, which wastes resources and in effect shortens the life of electronic devices that use them.
Disclosure of Invention
The invention provides an instruction simulation device and a method thereof, which are used for solving the problem of compatibility of a processor instruction set, thereby prolonging the service life of electronic equipment using the simulation device.
The simulation apparatus of the present invention includes a monitor. The monitor determines whether an instruction that the processor is currently to execute is a compatible instruction or an augmentation instruction, where an augmentation instruction is an instruction in a new instruction set or an extended instruction set of the same architecture as the instruction set the processor currently has. If the instruction to be executed is determined to be an augmentation instruction, it is converted into a simulation program constructed from a compatible instruction sequence, which consists of native instructions of the processor or compatible instructions, and the execution result of the instruction to be executed is simulated by executing the simulation program.
The conversion method of the present invention includes the following steps. Whether an instruction to be executed is a compatible instruction or an augmentation instruction is determined, where an augmentation instruction is an instruction in a new instruction set or an extended instruction set of the same architecture as the instruction set of a processor. If the instruction to be executed is determined to be an augmentation instruction, it is converted into a simulation program constructed from a compatible instruction sequence, and the execution result of the instruction to be executed is simulated by executing the simulation program.
Based on the above, the simulation apparatus and method of the embodiments of the invention use the monitor to determine whether an instruction to be executed in an application program is a compatible instruction or an augmentation instruction for the processor. When the instruction to be executed is determined to be an augmentation instruction, it is converted into a simulation program constructed from a compatible instruction sequence, and the simulation program is then executed to simulate the execution result of the instruction. The invention can therefore correctly execute augmentation instructions of a new or extended instruction set on an old processor with only slight changes to the hardware structure of that processor, which solves the compatibility problem of the processor instruction set and prolongs the service life of electronic equipment using the simulation device.
Drawings
FIG. 1 is a schematic diagram of an electronic device and an instruction simulation apparatus in the electronic device according to an embodiment of the invention.
FIG. 2A is a schematic diagram of an electronic device and an instruction simulation apparatus in the electronic device according to another embodiment of the invention.
FIG. 2B is a schematic diagram of an electronic device and an instruction simulation apparatus in the electronic device according to still another embodiment of the invention.
FIG. 3 is an architecture diagram of the dedicated hardware of one embodiment of the present invention.
FIG. 4A is a detailed block diagram of a processor in an electronic device according to an embodiment of the invention.
FIG. 4B is a detailed block diagram of a processor in an electronic device according to another embodiment of the invention.
FIG. 5A is a diagram illustrating an emulation module performing a conversion for an augmentation instruction, in accordance with one embodiment of the present invention.
FIG. 5B is a diagram illustrating an emulation module performing a conversion for an augmentation instruction according to another embodiment of the present invention.
FIGS. 6A/6B are schematic diagrams of a simulation process according to an embodiment of the invention.
FIG. 7 is a flow chart of a simulation method in accordance with an embodiment of the present invention.
FIG. 8 is a flow diagram of a translation method in accordance with an embodiment of the present invention.
Wherein the symbols in the drawings are briefly described as follows:
100: electronic device; 110: processor; 120: operating system; 130: application program; 132: instruction to be executed; 112: instruction decoder; 1122: instruction parsing unit; 1124: micro instruction sequence calling unit; 114: monitor; 116: dedicated hardware; 116A: processor current state pointer register; 116B: translation information pointer register; 116C: simulation execution result pointer register; 116D: private register; 116E: simulation register file; 116F: destination register file; 122: simulation module; 124: save area; 1242: processor state save area; 1244: conversion information storage area; 1246: simulation execution result storage area; 160: executor; 1602: renaming unit; 1604: reservation station; 1606: execution unit; 1608: memory access unit; 171, 172, 173, 174, 175, 176: arrows; 280: conversion buffer; 410: translation lookaside buffer; 420: instruction cache; 430: branch predictor; 440: reorder buffer; 4402: instruction commit unit; 4404: micro instruction cache area; 450: microcode memory; 460: microcode control unit; 470: micro instruction sequence storage unit; 702A, 702B: control unit; 704A, 704B: augmentation-instruction-to-simulation-program conversion table; 7042A, 7042B: augmentation instruction tag; 70422A, 70422B: simulation program sequence pointer of the hit augmentation instruction tag; 7044A, 7044B: simulation program sequence pointer; 706A, 706B: simulation program sequence list; 7062A, 7062B: simulation program sequence; 70622A, 70622B: simulation program required by the hit augmentation instruction tag; 708A, 708B: dashed arrows; S702, S704, S706, S708, S710, S712, S714, S716: steps; S802, S804, S806, S808, S810, S812, S814, S816: steps.
Detailed Description
The definitions of certain terms used in the specification and claims are as follows:
Compatible instructions are instructions that, relative to a given series or model of processor, are native instructions or can be recognized and interpreted as native instructions and executed.
Incompatible instructions are instructions of three types: instructions that belong to the same instruction set architecture as a given series or model of processor but cannot be correctly recognized because they belong to a new instruction set or an extended instruction set; erroneous instructions; and instructions that belong to an instruction set architecture different from that of the processor.
Augmentation instructions are those incompatible instructions that belong to the same instruction set architecture as a given series or model of processor but cannot be correctly recognized because they belong to a new instruction set or an extended instruction set. For example, relative to the Pentium M processor, an instruction in AVX/AVX-512 (e.g., VADDSD or VADDPD) is an augmentation instruction.
Non-convertible instructions are the other two types of incompatible instructions: erroneous instructions, and instructions that belong to an instruction set architecture different from that of the processor (e.g., for an X86 processor, instructions of an ARM instruction set architecture or a RISC instruction set architecture).
A compatible instruction sequence is an instruction sequence composed of at least one native instruction or compatible instruction of the processor, whose execution result is the same as the execution result of an augmentation instruction.
A simulation program is a program obtained by converting an augmentation instruction into a compatible instruction sequence composed of native instructions or compatible instructions of the processor; the processor can execute it to simulate the execution result of the augmentation instruction.
The immediate simulation mode is the process in which, when the instruction the processor is currently to execute is an augmentation instruction, that instruction is converted into a simulation program constructed from a compatible instruction sequence and the simulation program is then executed to simulate the execution result of the augmentation instruction. The application program that sends the instruction to the processor does not sense the existence of the immediate simulation mode.
It should be emphasized that the terms above (compatible instruction, incompatible instruction, augmentation instruction, non-convertible instruction, compatible instruction sequence, simulation program, and so on) are all defined relative to a given series or model of processor. That processor may be, for example, a reduced instruction set computing (RISC) processor with an ARM Cortex series instruction set, a complex instruction set computing (CISC) processor with the X86 instruction set of Intel/AMD, a processor supporting the MIPS (Microprocessor without Interlocked Pipeline Stages) or RISC-V (RISC-Five) instruction set architecture, a processor supporting both the ARM and X86 instruction set architectures, or a processor with an instruction set architecture other than RISC/CISC; the present invention does not limit the type of instruction set architecture supported by the processor. One skilled in the art will appreciate that the manufacturer of an integrated circuit may adjust the content of the instruction set architecture supported by the processor according to its needs, and the present invention is not limited thereto.
The terms defined above are self-defined terms: a person skilled in the art may define different terms for the same technical ideas of the present invention. Such terms should be understood from the viewpoint of technical implementation rather than distinguished by name, and the present invention is not limited thereto. As one skilled in the art will appreciate, manufacturers may define a particular concept or refer to a particular component by different names. The present specification and claims do not distinguish between components that differ in name but not in function. In the following description and in the claims, the terms "include" and "comprise" are used in an open-ended fashion and should be interpreted to mean "include, but not limited to". In addition, the term "coupled" is used herein to encompass any direct or indirect electrical connection. Thus, if a first device is coupled to a second device, that connection may be a direct electrical connection or an indirect electrical connection via other devices and connections. It will be appreciated by those skilled in the art that changes may be made in these embodiments without departing from the principles and spirit of the invention, the scope of which is defined in the appended claims.
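As an illustration of how the terms defined above partition the instructions seen by one processor model, the following minimal Python sketch classifies an instruction as compatible, augmentation, or non-convertible. The names `InstrClass`, `NATIVE`, `NEWER_SAME_ISA`, and `classify`, as well as the contents of the two sets, are assumptions made for this sketch only; an actual monitor would work from decoded format information rather than from mnemonic strings.

```python
from enum import Enum


class InstrClass(Enum):
    COMPATIBLE = "compatible"
    AUGMENTATION = "augmentation"
    NON_CONVERTIBLE = "non-convertible"


# Hypothetical instruction sets for one processor model (illustrative only).
NATIVE = {"ADD", "MOV", "ADDSD"}
# Instructions of the same architecture that come from a newer or extended set.
NEWER_SAME_ISA = {"VADDSD", "VADDPD"}


def classify(mnemonic: str) -> InstrClass:
    """Classify a mnemonic relative to the hypothetical processor model."""
    if mnemonic in NATIVE:
        return InstrClass.COMPATIBLE
    if mnemonic in NEWER_SAME_ISA:
        return InstrClass.AUGMENTATION
    # Everything else stands for an erroneous instruction or an instruction
    # of a different instruction set architecture.
    return InstrClass.NON_CONVERTIBLE
```

Under these assumptions, `classify("VADDPD")` yields the augmentation class, while a mnemonic from another architecture falls through to the non-convertible class.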
FIG. 1 is a schematic diagram of an electronic device 100 and a simulation apparatus in the electronic device 100 according to an embodiment of the invention. The simulation apparatus is applied to an electronic device 100 having a processor. The electronic device 100 is, for example, a tablet computer, a smart phone, a computer, a server, or another consumer electronic device.
Referring to FIG. 1, the electronic device 100 includes a processor 110, which runs an operating system 120 and an application 130. The operating system 120 runs on the processor 110 and manages the running of the application 130. The application 130 runs on top of the operating system 120 and, through the operating system 120, uses the functions provided by the processor 110 and other hardware (not shown in FIG. 1, such as a hard disk or a network card). When the electronic device 100 is powered on, a basic input/output system (BIOS) may perform self-checking and initialization, after which the processor 110 runs the operating system 120 and the drivers for the main components. The application 130 consists of a plurality of instructions that the processor 110 executes to implement the application. In detail, after the to-be-executed instruction 132 of the application 130 is read from a storage medium (e.g., a hard disk, not shown) and stored in a dynamic random access memory (e.g., a system memory, not shown) of the electronic device 100, the processor 110 executes the to-be-executed instruction 132 in program order. When the processor 110 executes the to-be-executed instruction 132, the instruction decoder 112 analyzes it and generates its format information (e.g., divides the instruction into fields with different functions), and then decodes the to-be-executed instruction 132 according to the format information. Meanwhile, the monitor 114 determines, from the format information generated by the instruction decoder 112, whether the to-be-executed instruction 132 is a compatible instruction (e.g., a native instruction) or an augmentation instruction (as indicated by the arrow 172 in FIG. 1).
If the to-be-executed instruction 132 is a compatible instruction, the processor 110 executes it and returns the result to the application 130 (not shown). The execution of compatible instructions is well known to those skilled in the art and is not described in detail here. If instead the monitor 114 determines that the to-be-executed instruction 132 is an augmentation instruction, the simulation module 122 is called with the to-be-executed instruction 132 as a parameter (as shown by the arrow 173 in FIG. 1). The simulation module 122 converts the to-be-executed instruction 132 (here, the augmentation instruction) into a simulation program constructed from a compatible instruction sequence, executes the compatible instruction sequence to simulate the execution result of the augmentation instruction, and finally returns the execution result to the application 130 (as shown by the arrow 174 in FIG. 1). If the to-be-executed instruction 132 cannot be recognized and interpreted by the instruction decoder 112 and is determined by the monitor 114 to be a non-convertible instruction, the processor 110 reports an error or an execution exception to the application 130 (not shown). The handling of non-convertible instructions is known to those skilled in the art, is not the focus of the present invention, and is not described here. Once called, the simulation module 122 reads the to-be-executed instruction 132 (here, an augmentation instruction) and determines whether a simulation program corresponding to the augmentation instruction can be found.
The simulation module 122 queries a simulation program list and tries to find in it the simulation program corresponding to the augmentation instruction. The simulation program list is a table of simulation programs that the processor designer has built in advance by composing compatible instructions into a compatible instruction sequence according to the operation indicated by each augmentation instruction; the list can be searched by database lookup, address lookup, and the like. When the simulation program corresponding to the augmentation instruction is in the simulation program list, the simulation module 122 calls the simulation program, executes it to generate a simulation execution result, and then ends the call to the simulation module 122 and returns the execution result to the application 130. If the simulation program corresponding to the augmentation instruction is not found in the simulation program list, the simulation module 122 provides a failure result and notifies the processor 110 to end the calling process.
Note that the simulation module 122 is called only when the processor 110 executes a to-be-executed instruction 132 from the application 130 that is an augmentation instruction, and it stops operating after it has executed the corresponding simulation program and generated the simulation execution result. The application 130 therefore does not sense the conversion of the augmentation instruction or the execution of the simulation program by the simulation module 122 (the period during which the simulation module 122 operates is the period during which the immediate simulation mode is on); that is, all operations performed by the simulation module 122 are transparent to the application 130. How the simulation module 122 calls and executes the simulation program is described in greater detail below.
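The lookup in the simulation program list described above can be sketched as follows. This is a minimal Python model under stated assumptions, not the patented implementation: `SIMULATION_PROGRAM_LIST` and `call_simulation_module` are hypothetical names, the list holds a single entry, and the entry models the BMI1 `ANDN` operation (NOT of the first source, ANDed with the second) as a stand-in for a compatible sequence of NOT and AND on 32-bit values.

```python
from typing import Callable, Dict, Optional, Tuple

MASK32 = 0xFFFFFFFF


def _andn32(src1: int, src2: int) -> int:
    # Stand-in for a compatible instruction sequence: NOT, then AND,
    # both restricted to 32-bit values.
    return (~src1 & MASK32) & (src2 & MASK32)


# Hypothetical simulation program list, keyed by augmentation instruction.
SIMULATION_PROGRAM_LIST: Dict[str, Callable[[int, int], int]] = {
    "ANDN": _andn32,
}


def call_simulation_module(
    mnemonic: str, src1: int, src2: int
) -> Tuple[bool, Optional[int]]:
    """Return (found, result); found is False when no program is listed."""
    program = SIMULATION_PROGRAM_LIST.get(mnemonic)
    if program is None:
        # Failure result: the caller is expected to end the calling process.
        return False, None
    return True, program(src1, src2)
```

A dictionary lookup is used here only because it is the simplest form of table search; the text leaves the search method open (database lookup, address lookup, and the like).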
The processor 110 of FIG. 1 may be a central processing unit (CPU), a microprocessor, another programmable processing unit, a digital signal processor (DSP), a programmable controller, an application-specific integrated circuit (ASIC), a programmable logic device (PLD), or the like. The to-be-executed instruction 132 of the application 130 originates from a program that the developer of the application 130 writes in a mid- or high-level programming language (e.g., C, C++, C#, Java, or Python) and/or a low-level language (e.g., assembly language); a compiler compiles and links the program into executable code (e.g., machine code or binary code) that the processor 110 can execute. The to-be-executed instruction 132 transmitted to the processor 110 via the arrow 171 in FIG. 1 is therefore machine code, that is, a machine instruction. For ease of description, the to-be-executed instruction 132 is treated as a machine instruction that the processor can recognize and execute, and the embodiments and claims of this specification do not distinguish between its mid- or high-level language form and its machine-instruction form.
The monitor 114 of FIG. 1 is located within the processor 110 and is implemented as a hardware module. Note that those skilled in the art may, according to design requirements, implement the monitor's function of determining whether the to-be-executed instruction 132 is a compatible instruction or an augmentation instruction with other circuit designs or with corresponding firmware or software. For example, the monitor 114 may be delivered through a driver update. Suppose an old processor is to support a new instruction set. Because the old processor has no hardware resembling the monitor 114, it can neither interpret the instructions of the new instruction set nor, by setting the emulation flag EF, call the simulation module 122 to convert an augmentation instruction into a simulation program and execute it. If, however, the function of the monitor 114 is written as software and made part of a driver, then when the old processor raises an undefined-instruction exception (e.g., #UD), the interrupt service routine for #UD can invoke the software monitor 114 through a system call (for example, the software monitor 114 can be a callback function for the operating system 120 to call). When the monitor 114 determines that the current to-be-executed instruction 132 is an augmentation instruction and therefore needs conversion, the simulation module 122 calls the simulation program corresponding to the augmentation instruction, executes it, and returns the execution result.
The driver including the monitor 114 can be delivered by a live update: after writing the simulation program for each augmentation instruction using the native instructions of the old processor, the processor designer can update the driver in real time and thereby promptly provide users of the old processor with the capability of supporting the new or extended instruction set. In general, any unit module, software program, hardware unit, or combination of software and hardware that can determine whether the to-be-executed instruction 132 is a compatible instruction or an augmentation instruction should be regarded as a variation of the monitor 114, and the invention is not limited thereto.
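The software-monitor variant described above, in which an undefined-instruction exception leads to a callback that performs the simulation, can be modeled roughly as follows. This is a Python sketch under stated assumptions: the exception class, the two tables, and the function names are hypothetical, the Python exception merely stands in for #UD and its service routine, and `POPCNT` is used only as an example of an instruction an older processor might lack.

```python
class UndefinedInstruction(Exception):
    """Models the undefined-instruction exception (#UD) of an old processor."""


# Operations the hypothetical old processor executes natively.
NATIVE_OPS = {"ADD": lambda a, b: a + b}

# Hypothetical simulation programs shipped with the updated driver.
DRIVER_SIM_PROGRAMS = {"POPCNT": lambda x: bin(x & 0xFFFFFFFF).count("1")}


def old_processor_execute(mnemonic, *operands):
    op = NATIVE_OPS.get(mnemonic)
    if op is None:
        raise UndefinedInstruction(mnemonic)
    return op(*operands)


def software_monitor(mnemonic, *operands):
    """Callback reached from the service routine for the exception."""
    program = DRIVER_SIM_PROGRAMS.get(mnemonic)
    if program is None:
        # Not an augmentation instruction the driver knows: report the error.
        raise UndefinedInstruction(mnemonic)
    return program(*operands)


def run(mnemonic, *operands):
    try:
        return old_processor_execute(mnemonic, *operands)
    except UndefinedInstruction:
        return software_monitor(mnemonic, *operands)
```

In this model the caller of `run` sees only a result or an error, which mirrors the point that the simulation is transparent to the application.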
In one embodiment, after the electronic device 100 is powered on, the operating system 120 enables the immediate simulation mode and sets up save areas in the memory of the electronic device for storing the processor state and the information used during conversion and simulated execution. Thus, when the monitor 114 determines that the to-be-executed instruction 132 is an augmentation instruction, the save areas already exist, and the simulation module 122 can be called directly to convert the augmentation instruction and execute the simulation program to generate the execution result. Those skilled in the art may, however, choose the timing and conditions for enabling the immediate simulation mode and calling the simulation module 122 according to design requirements. For example, the monitor 114 may set a flag signal when it determines that the to-be-executed instruction 132 is an augmentation instruction and enable the immediate simulation mode accordingly; in such an embodiment, whether the flag signal is set can be checked before the simulation module 122 is called. In another embodiment, it is first determined whether the application 130 that wants to call the simulation module 122 is a legitimate program recognized by the operating system 120; if so, a password is allowed to be read from the driver code, and the simulation module 122 is called after the password comparison succeeds. The password may be stored in a driver of the processor 110, so that when the processor 110 needs to call the simulation module 122, the driver obtains the password for comparison, and the simulation module 122 can be called only after the comparison succeeds.
In another embodiment, multiple authentication or encryption steps may be applied to the call to the simulation module 122 to secure the simulation of the to-be-executed instruction 132. For example, the application 130 requesting the password is first confirmed to be a legitimate program recognized by the operating system 120, the encrypted password is then read, and the simulation module 122 is called only after the password is decrypted correctly. Controlling whether the immediate simulation mode is enabled (i.e., whether the simulation module 122 may be called) ensures that the simulation module 122 can be called successfully only when immediate simulation of the to-be-executed instruction 132 is required, which prevents unauthorized users from intruding into the simulation module 122 and prevents unauthorized changes to or tampering with the simulation process. The point at which calling the simulation module 122 is allowed may vary with the design, and the invention is not limited in this regard. For the configuration of the corresponding save areas after the simulation module 122 is called, refer to the descriptions of FIG. 2 and FIG. 4.
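The checks described above can be combined into a single gate in front of the simulation module, sketched below in Python. Chaining the flag check, the legitimacy check, and the password comparison in one function is an assumption of this sketch (the text presents them as separate embodiments), and `may_call_simulation_module` and its parameters are hypothetical names. The comparison uses `hmac.compare_digest` from the Python standard library, which compares in constant time.

```python
import hmac


def may_call_simulation_module(
    flag_set: bool,
    app_is_legitimate: bool,
    presented_password: bytes,
    stored_password: bytes,
) -> bool:
    """Return True only if every gate in front of the simulation module passes."""
    if not flag_set:
        # The monitor has not marked the instruction as an augmentation
        # instruction, so immediate simulation is not needed.
        return False
    if not app_is_legitimate:
        # The calling program is not recognized by the operating system.
        return False
    # Constant-time comparison of the password obtained by the driver.
    return hmac.compare_digest(presented_password, stored_password)
```

For example, `may_call_simulation_module(True, True, b"k", b"k")` allows the call, while a cleared flag or a mismatched password blocks it.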
Based on the above, assume that the processor 110 has the X86 instruction set. Because the X86 instruction set of the processor 110 is fixed after manufacture, and its hardware structure cannot be changed to support the augmentation instructions of a new or extended X86 instruction set, the processor 110 cannot recognize or correctly execute those augmentation instructions even though they still belong to the X86 instruction set architecture. The embodiment of the present invention therefore uses the monitor 114 of the simulation apparatus to determine whether the instruction the application is to execute (i.e., the to-be-executed instruction 132) is a compatible instruction of the X86 instruction set within the processor 110 or an augmentation instruction of the new or extended X86 instruction set, and then decides how to process it. If the to-be-executed instruction 132 is an X86 augmentation instruction, it is converted into a simulation program constructed from an X86 compatible instruction sequence in the instruction set the X86 processor 110 currently has; the simulation module 122 then executes the simulation program to simulate the execution result of the to-be-executed instruction 132 and returns the execution result to the application 130.
Thus, with the aid of the simulation module 122, the embodiment of the present invention enables the processor 110 with the older version of the X86 instruction set to convert the X86 augmented instruction in the new X86 instruction set or the X86 augmented instruction set to obtain a simulation program (as shown above, the simulation program is constructed by an X86 compatible instruction sequence), and simulate the execution result of the instruction 132 to be executed by executing the simulation program. In another embodiment, the processor 110 is an ARM processor having an ARM instruction set architecture, the to-be-executed instruction 132 belongs to an augmented instruction under a newer ARM instruction set or an extended instruction set relative to a current instruction set architecture of the ARM processor 110, and the ARM compatible instruction sequence in the emulator is composed of native instructions or compatible instructions under the current instruction set of the ARM processor, so that the augmented instruction belonging to the ARM instruction set architecture can emulate an execution result of the to-be-executed instruction 132 by executing the emulator. As described above, the instruction to be executed, the compatible instruction, and the augmented instruction in the embodiment are all instructions of the same instruction set architecture, and are not limited to the instruction of the X86 instruction set architecture (or CISC instruction set architecture), but may also be instructions of an ARM instruction set architecture (or RISC instruction set architecture), a processor supporting an MIPS or RISC-V instruction set architecture, or other instruction set architectures. 
It is noted that the processor 110 of the present invention can support the operation of the relatively new instruction set architecture by using the relatively old instruction set architecture, which not only extends the lifetime of the electronic device including the old processor, but also allows the processor designer to support the instructions of the new instruction set or the extended instruction set by the old processor with only a few hardware changes. For example, the purpose can be achieved by adding hardware related to the monitor 114 to the processor 110, establishing a connection for transmitting signals among the hardware added based on the requirement of the simulated augmented instructions, such as the instruction decoder 112, the monitor 114, and the dedicated hardware 116 …, and then configuring the simulation module 122 and the simulation program list with software, without modifying the structures of the subsequent pipeline stage circuits, the instruction prediction branch …, and the like of the processor, so that the processor meeting the requirement can be quickly designed.
Furthermore, in an embodiment, the simulation module 122 is stored in the basic input/output system (BIOS) of the electronic device 100, and the BIOS loads the simulation module 122 into the operating system 120 when the system including the processor 110 is powered on. In another embodiment, the simulation module 122 may be provided in the driver software of the processor 110 and loaded into the system memory when the driver is run by the operating system 120. In yet another embodiment, the simulation module 122 may be compiled into the kernel of the operating system 120 and wait to be called once the operating system 120 is running. In yet another embodiment, the simulation module 122 may notify the operating system 120 to stop responding to other interrupts (e.g., unrelated hardware interrupts) while the conversion operation is in progress, so that the conversion can proceed without interference. It will be appreciated by those skilled in the art that changes to the embodiments described above may be made without departing from the spirit of the invention and are intended to be encompassed by the appended claims.
FIG. 2A is a schematic diagram of an electronic device 100 and a simulation device in the electronic device 100 according to another embodiment of the invention. The simulation device of the electronic device 100 may further configure a save area 124 in an associated access medium (e.g., a memory) of the electronic device 100. The save area 124 contains at least a processor state save area 1242, a conversion information save area 1244, and a simulation execution result save area 1246. The processor state save area 1242 stores the current operating environment state parameters of the processor 110. The conversion information save area 1244 temporarily stores the information used when an augmentation instruction is converted in order to call the required simulation program (such as comparison information generated while the simulation program is being located, or pointers to the simulation programs in the simulation program list). The simulation execution result save area 1246 stores temporary information produced while the simulation program runs (such as the storage for variables defined in the simulation program, or temporary data generated during execution) as well as the execution result of the simulation program. As shown in FIG. 2A, the simulation module 122 can temporarily store the related state data in the save area 124 (as indicated by the arrow 175), and the processor 110 can read the corresponding execution result from the save area 124 (as indicated by the arrow 176).
It should be noted that, in an embodiment, the parameters of the save area 124 (such as the size and the base pointer of each of the processor state save area 1242, the conversion information save area 1244, and the simulation execution result save area 1246) are obtained from the associated access medium of the electronic device 100 through the BIOS for configuration; that is, the save area 124 can be configured by the operating system 120. The present invention does not limit the manner in which the save area 124 is configured. The manner in which the save area 124 is used will be described in more detail below.
FIG. 2B is a schematic diagram of an electronic device 100 and a simulation device in the electronic device 100 according to still another embodiment of the invention. The processor 110 of this embodiment further includes dedicated hardware 116, which is storage space reserved in the processor 110 for the information required to convert the augmentation instruction into the simulation program and to execute the simulation program to produce the simulated execution result. The manner in which the dedicated hardware 116 is used will be described in greater detail below.
FIG. 3 is a block diagram of the dedicated hardware 116 according to an embodiment of the present invention. Referring to fig. 2A, 2B and 3, as shown in fig. 3, the dedicated hardware 116 of the processor 110 includes a processor current state pointer register 116A, a conversion information pointer register 116B, a simulation execution result pointer register 116C, a private register 116D, and a simulation register file 116E for mapping the register of the augmented instruction architecture, and the processor 110 can read the state data information of the storage area 124 in the main storage based on the pointer of the register. The address of the processor current state pointer register 116A points to the main storage for storing the information related to the current state of the processor 110, such as various register states of the current operating environment of the processor 110, or the address of the next instruction to be executed of the instruction 132 to be executed; the address of the conversion information pointer register 116B also points to the main storage, but the storage space indicated by the address is used as a temporary storage space required by the augmentation instruction during the conversion process, or is used to store information required by the conversion process, such as format information of the augmentation instruction, pointers to various simulation programs, and the like; the address of the emulation execution result pointer register 116C also points to the main storage, but the storage space indicated by the address is used as a temporary storage space required by the emulation program corresponding to the augmentation instruction during the execution process, or is used to store information required by the execution process (e.g., intermediate execution results of the augmentation instruction, etc.), and the execution result of the emulation program. 
The private register 116D may include an emulation flag (EF) and a register for caching the augmentation instruction (neither shown). The emulation flag EF indicates whether the instruction 132 to be executed is an augmentation instruction that can be converted and simulated. For example, when its value is set to 1, the current instruction 132 to be executed is an augmentation instruction, and the simulation module 122 therefore needs to be invoked to perform the conversion and simulated execution of the augmentation instruction. The register in the private register 116D that caches the instruction 132 to be executed serves as the temporary storage from which the instruction 132 to be executed is passed as a parameter to the simulation module 122 when the simulation module 122 is called (i.e., when the instruction 132 to be executed is an augmentation instruction). The simulation register file 116E includes N (N is a natural number greater than 1) 256-bit registers Ereg0, Ereg1 … Ereg-1, Ereg, which support specific micro-operations of the processor 110. For example, two sets of 256-bit registers may map the upper 256 bits and the lower 256 bits of a 512-bit register, respectively, so that the processor 110 can use the 256-bit registers in the simulation register file 116E to map 512-bit registers that it does not itself support, such as Treg0, Treg1 … Tregn-1, and Tregm of the target register file 116F shown by a dotted line in FIG. 3. Implementing the mapping between registers is a routine matter for those skilled in the art and is not described in detail herein. It should be noted that, although this specification uses 256-bit registers to simulate 512-bit registers, the present invention is not limited to simulation between these two register widths.
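The register mapping described above can be pictured with a short software model. The following is only a minimal sketch under stated assumptions: the class name `SimulationRegisterFile`, the pairing rule (target register i occupies simulation registers 2i and 2i+1, low half first), and the use of Python integers to stand for register contents are all hypothetical and are not taken from the embodiment.

```python
MASK_256 = (1 << 256) - 1  # mask selecting the low 256 bits of a value


class SimulationRegisterFile:
    """Software model of N 256-bit simulation registers (hypothetical layout)."""

    def __init__(self, n):
        self.eregs = [0] * n  # each entry models one 256-bit register

    def write_512(self, t_index, value):
        # Assumed mapping: target register i -> Ereg[2i] (low half), Ereg[2i+1] (high half)
        self.eregs[2 * t_index] = value & MASK_256
        self.eregs[2 * t_index + 1] = (value >> 256) & MASK_256

    def read_512(self, t_index):
        # Recombine the two 256-bit halves into one 512-bit value
        low = self.eregs[2 * t_index]
        high = self.eregs[2 * t_index + 1]
        return (high << 256) | low


rf = SimulationRegisterFile(n=8)
value = (0xABCD << 300) | 0x1234   # a value wider than 256 bits
rf.write_512(1, value)
assert rf.read_512(1) == value     # the 512-bit value round-trips through two registers
```

A fixed pairing rule keeps the mapping stateless; an actual design could equally use a lookup table between target and simulation registers.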
For example, in another embodiment, emulation register file 116E may also be used for other specific registers not supported by the processor 110's existing hardware, such as base address or state control registers dedicated to emulating a particular operating mode. On the other hand, the dedicated hardware 116 may also be a Register file (Register file) provided in the processor 110, and some registers may be specially designated as the dedicated hardware 116 required for executing the simulated augmentation instruction, which is not limited in the present invention.
FIG. 4A is an internal block diagram of the processor 110 according to an embodiment of the present invention. In addition to the instruction decoder 112, the monitor 114, and the dedicated hardware 116, the processor 110 includes an instruction translation lookaside buffer (ITLB) 410, an instruction cache 420, a branch predictor 430, a reorder buffer 440, a microcode memory 450, a microcode control unit 460, and a micro instruction sequence storage unit 470. The instruction translation lookaside buffer 410 may be used to retrieve the instructions to be executed, such as the instructions that implement the functions requested by an application (i.e., the instruction 132 to be executed). The instruction cache 420 obtains the instructions to be executed from the instruction translation lookaside buffer 410 by way of a page table cache or a translation bypass cache. The branch predictor 430 operates in conjunction with the instruction cache 420: it predicts whether an instruction will branch and, when a branch is predicted to be taken, stores the branch instruction in the instruction cache 420. As mentioned above, the private register 116D includes an emulation flag EF, which indicates whether the currently executed instruction 132 is an augmentation instruction to be simulated, and a storage space for caching the currently executed instruction 132. The use of the emulation flag EF and the storage of the augmentation instruction will be described in detail later. Furthermore, the executor 160 includes a renaming unit 1602, a reservation station 1604, an execution unit 1606, and a memory access unit 1608.
The instruction decoder 112 further comprises an instruction parsing unit 1122 and a micro instruction sequence calling unit 1124, wherein the instruction parsing unit 1122 is coupled to the micro instruction sequence calling unit 1124 and the monitor 114, the monitor 114 is further coupled to the private register 116D, and the micro instruction sequence calling unit 1124 is coupled to the micro instruction sequence storage unit 470.
When the instruction 132 to be executed is provided from the instruction cache 420 to the monitor 114, the instruction parsing unit 1122 in the instruction decoder 112 analyzes the format of the instruction 132 to be executed and extracts format information such as the prefix (PRE), the escape opcode (EOP), the opcode (MOP), and other decoding information (ODI). It then provides this format information (PRE/EOP/MOP/ODI) to the micro instruction sequence calling unit 1124 and the monitor 114. The micro instruction sequence calling unit 1124 of the instruction decoder 112 decodes the format information to determine the operation indicated by the instruction 132 to be executed, calls the corresponding micro instruction (μop) sequence from the micro instruction sequence storage unit 470, and combines it with the operand-related information of the instruction 132 to be executed (e.g., operand addressing information) to generate the micro instructions, which are then sent to the executor 160 (e.g., to the renaming unit 1602). After the operands have been renamed, the micro instruction sequence is sent to the reservation station 1604 and the reorder buffer 440, and the reservation station 1604 forwards the micro instruction sequence to the execution unit 1606 or the memory access unit 1608 for further processing according to its type. The reorder buffer 440 includes an instruction retire unit 4402 and a micro instruction cache 4404. The micro instruction cache 4404 includes a plurality of instruction entries for storing the micro instruction sequences sent from the renaming unit 1602, and the instruction retire unit 4402 retires the micro instructions in the original program order after they have been executed by the execution unit 1606 or the memory access unit 1608.
The following describes the processing performed when the instruction 132 to be executed by the processor 110 is an augmentation instruction. The monitor 114 determines whether the instruction 132 to be executed is an augmentation instruction according to the format information (the PRE/EOP/MOP/ODI obtained by the instruction parsing unit 1122 from the format of the instruction 132 to be executed). If it is, the monitor 114 sets the emulation flag EF and instructs the private register 116D to store the instruction 132 to be executed. Meanwhile, as mentioned above, when the instruction 132 to be executed is an augmentation instruction, the micro instruction sequence calling unit 1124 cannot correctly interpret the format information and generates a no-operation (NOP) instruction. When this NOP instruction becomes the oldest instruction in the reorder buffer 440, the instruction retire unit 4402 checks the emulation flag EF, finds that it is set, and therefore triggers the interrupt service routine that calls the simulation module 122, so that the simulation module 122 converts and simulates the augmentation instruction. How the simulation program corresponding to the augmentation instruction is called is described below with reference to FIG. 5A/5B, and the simulation program itself is described with reference to the example of FIG. 6A/6B. In one embodiment, the interrupt service routine used to invoke the simulation module 122 may be implemented by modifying the interrupt service routine corresponding to #UD, which is invoked on an instruction interpretation error, or by defining a new interrupt service routine.
For example, when the NOP instruction that raises #UD is retired and the interrupt service routine corresponding to #UD is called, the #UD interrupt service routine may be modified to check the state of the emulation flag EF first: when the emulation flag EF is set, a conversion request is issued to the simulation module 122 through the operating system 120; when the emulation flag EF is not set, the conventional exception handler for instruction interpretation errors is called. In one embodiment, the conventional #UD interrupt service routine may be kept separate from the simulation-mode-specific #UD interrupt service routine of the present invention, which calls the simulation module 122 via #UD, and the two may be called selectively according to the state of the emulation flag EF: the conventional #UD interrupt service routine is called when the emulation flag EF is not set, and the simulation-mode-specific #UD interrupt service routine is called when the emulation flag EF is set. In another embodiment, when the NOP instruction resulting from the instruction interpretation error is retired, the instruction retire unit 4402 checks the state of the emulation flag EF and, when the emulation flag EF is set, calls the simulation module 122 through the operating system 120 by means of a self-defined interrupt service routine. For example, the designer of the processor 110 selects a vector number (e.g., 20H) from the user-definable numbers of the interrupt vector table and defines an interrupt vector #NE (NE is an abbreviation of Non-support interrupt indicator), whose interrupt service routine calls the simulation module 122.
It should be noted that the interrupt service routine must pass the instruction 132 to be executed (here, an augmentation instruction) as a parameter to the simulation module 122, for example by passing the simulation module 122 a pointer to the register in the private register 116D that stores the instruction 132 to be executed. The simulation module 122 then converts the augmentation instruction and executes the corresponding simulation program. The execution result of the simulation program is stored in the simulation execution result save area 1246, after which the call to the simulation module 122 ends (i.e., the immediate simulation mode is exited). When the processor 110 has read back from the simulation module 122 the simulated execution result of the instruction 132 to be executed that was determined to be an augmentation instruction, the emulation flag EF in the private register 116D needs to be cleared to indicate that the simulation of the instruction 132 to be executed is complete. If a subsequent instruction 132 to be executed is again an augmentation instruction, the emulation flag EF is set again, the simulation module 122 is called again, and the conversion and execution of the simulation program corresponding to that augmentation instruction begin.
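The decision made at retirement, as described in the preceding paragraphs, can be sketched in software. This is an illustrative model only: the opcode sets, the string return values, and the function names `monitor`, `decode`, and `retire` are assumptions introduced here, and real hardware would implement the same decision in logic rather than in code.

```python
from dataclasses import dataclass

# Hypothetical opcode sets, chosen only for illustration
NATIVE_OPCODES = {0x01, 0x89}
AUGMENTED_OPCODES = {0xF5}


@dataclass
class PrivateRegister:
    ef: bool = False                  # emulation flag EF
    cached_instruction: bytes = b""   # cached augmentation instruction


def monitor(opcode, raw, priv):
    """Set EF and cache the instruction when it is an augmentation instruction."""
    if opcode in AUGMENTED_OPCODES:
        priv.ef = True
        priv.cached_instruction = raw


def decode(opcode):
    """The decoder emits a NOP for anything it cannot interpret."""
    return "uop" if opcode in NATIVE_OPCODES else "NOP"


def retire(uop, priv):
    """Choose the handling path when the instruction is retired."""
    if uop != "NOP":
        return "retired"
    if priv.ef:
        return "call_simulation_module"   # simulation-specific service routine
    return "raise_UD"                     # conventional undefined-instruction handling


priv = PrivateRegister()
monitor(0xF5, b"\xf5", priv)
outcome = retire(decode(0xF5), priv)   # takes the simulation path because EF is set
```

Keeping the EF check at retirement, rather than at decode, matches the text: the NOP travels through the pipeline and the exception path is chosen only when it becomes the oldest instruction.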
In one embodiment, the interrupt service routine used to call the simulation module 122 (i.e., the interrupt service routine corresponding to #UD or the self-defined interrupt service routine #NE) may be microcode stored in the microcode memory 450 and called by the microcode control unit 460 (which may be built from a state machine and combinational logic). In another embodiment, the operation that invokes the interrupt service routine may be implemented as an independent interrupt control unit or module (e.g., an interrupt control unit under a RISC/RISC-V architecture). In yet another embodiment, the call may be made through an address indicated by microcode stored in the microcode memory 450. In yet another embodiment, an interrupt pre-processing unit (e.g., the microcode control unit 460 configured as an interrupt pre-processing unit, or the interrupt control unit under a RISC/RISC-V architecture modified into an interrupt pre-processing unit) may call the corresponding interrupt service routine to invoke the simulation module 122 when the NOP instruction corresponding to the instruction 132 to be executed (here, an augmentation instruction) is retired. In one embodiment, when the simulation module 122 is called by an interrupt request to convert the augmentation instruction, the operating system 120 may invoke the simulation module 122 through a system call. For example, the simulation module 122 is used as a callback function, the instruction 132 to be executed (or its format information) is passed to the callback function as a parameter, and after the callback function has converted the instruction 132 to be executed and executed the corresponding simulation program, the execution result is stored in the simulation execution result save area 1246 and the processor 110 is notified.
In addition, the simulation module 122 can also be called through an internal interrupt or a trap. For example, the designer of the processor 110 defines an interrupt vector #NE and, through a system call, has the kernel of the operating system call the simulation module 122; this is not described in detail herein. In yet another embodiment, each instruction entry of the reorder buffer 440 also includes an emulation flag field (not shown) for holding the emulation flag EF of the micro instruction. Thus, when the instruction 132 to be executed is an augmentation instruction, so that the micro instruction sequence calling unit 1124 of the instruction decoder 112 fails to interpret it and generates a NOP instruction, the monitor 114 determines that the instruction 132 to be executed is an augmentation instruction and sets the emulation flag EF, and the set emulation flag EF accompanies the NOP instruction to the renaming unit 1602 and the reorder buffer 440. When the instruction retire unit 4402 of the reorder buffer 440 retires the NOP instruction, it finds that the emulation flag EF accompanying the NOP instruction is set, and therefore calls the corresponding interrupt service routine, which calls the simulation module 122 to convert the instruction 132 to be executed (an augmentation instruction) into a simulation program and execute it. It should be noted that if the emulation flag EF accompanying the NOP instruction is not set, the instruction retire unit 4402 calls the corresponding interrupt service routine through the interrupt vector #UD to handle the instruction interpretation error. This is conventional exception handling and is not described again.
In one embodiment, the processor 110 may further comprise a conversion register 280 coupled to the micro instruction sequence calling unit 1124 and the monitor 114. When the emulation flag EF is set (e.g., set to 1), the conversion register 280 stores the micro instruction (μop) sequence of the simulation program sent by the micro instruction sequence calling unit 1124, so that when the same augmentation instruction is encountered later, the micro instruction sequence can be fetched directly from the conversion register 280 without repeating the conversion and simulation of the augmentation instruction. Therefore, when the instruction 132 to be executed is an augmentation instruction and the micro instruction sequence of the corresponding simulation program is already stored in the conversion register 280, the conversion register 280 sends a clear signal to the private register 116D to clear or disable the emulation flag EF and the cached instruction 132 to be executed in the private register 116D. This indicates that the micro instruction sequence of the simulation program has been obtained and that the simulation module 122 need not be called for conversion and simulation. In addition, to make it easy to identify the augmentation instruction that corresponds to each simulation program, the instruction format information of the augmentation instruction (e.g., PRE/EOP/MOP/ODI) may be used as the tag of the corresponding simulation program. Thus, when the emulation flag EF is 1, the tags in the conversion register 280 can be compared with the instruction format information of the augmentation instruction (e.g., PRE/EOP/MOP/ODI), and on a hit the associated micro instruction sequence of the simulation program is called.
It is noted that the micro instruction sequences stored in the conversion register 280 may be accessed or changed only when the emulation flag EF is set and remain unchanged when the emulation flag EF is not set. As a result, a simulation program stored in the conversion register 280 is not cleared, reset, overwritten, or otherwise altered when the processor 110 performs a context switch to execute other programs. If the processor 110 later encounters the same augmentation instruction to be simulated, the required simulation program can be retrieved from the conversion register 280 without calling the simulation module 122 and repeating the conversion and simulated execution. In yet another embodiment, the conversion register 280 may be located in a non-core (uncore) region of a multi-core processor, such as in an L3 cache, so that a simulation program that has already been used for simulation can be shared with other processor cores. It should be noted that a simulation program stored in the uncore region of the multi-core processor should be stored in the form of macro instructions, so that a processor core that needs the simulation program can decode it and generate the required micro instruction sequence before sending that sequence to the subsequent pipeline stage (e.g., the executor 160) for execution.
In one embodiment, as shown in FIG. 4B, the micro instruction sequences in the conversion register 280 and the value of the emulation flag EF are provided to the micro instruction sequence calling unit 1124 of the processor 110. In the configuration of the processor 110 in FIG. 4B, the current instruction 132 to be executed is an augmentation instruction that has been simulated before, so the micro instruction sequence of the associated simulation program is stored in the conversion register 280. When this instruction 132 is to be executed again, the micro instruction sequence calling unit 1124 of the instruction decoder 112 still cannot interpret it correctly, so the monitor 114 still determines that the instruction 132 to be executed is an augmentation instruction and sets the emulation flag EF. Therefore, after failing to decode the instruction 132 to be executed, the micro instruction sequence calling unit 1124 checks the emulation flag EF, finds that it is set, and queries the conversion register 280. On finding the micro instruction sequence of the simulation program corresponding to the augmentation instruction, it combines that sequence with the operand-related information of the instruction 132 to be executed (e.g., addressing information, since the operands of the current instruction 132 to be executed may differ from those of the previously simulated one) to generate the micro instruction sequence for the current instruction to be executed, sends that sequence to the subsequent executor 160, and then clears the emulation flag EF and the register caching the augmentation instruction in the private register 116D.
Of course, if the emulation flag EF is set but the search of the conversion register 280 does not find a micro instruction sequence of a simulation program corresponding to the augmentation instruction, the micro instruction sequence calling unit 1124 still issues a NOP instruction and sends it to the renaming unit 1602. The augmentation instruction is then converted and simulated through the corresponding interrupt service routine when the NOP instruction is retired, as described above, and this is not described again.
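The hit and miss paths of the conversion register can be illustrated with a small model. This is a sketch under assumptions: the class `ConversionRegister`, the function `issue`, and the string form of the micro instructions are hypothetical, and a dictionary stands in for the tag-matching hardware.

```python
class ConversionRegister:
    """Maps an instruction-format tag to a cached simulation micro instruction sequence."""

    def __init__(self):
        self._entries = {}

    def store(self, tag, uops, ef_set):
        # Contents may change only while the emulation flag EF is set
        if ef_set:
            self._entries[tag] = list(uops)

    def lookup(self, tag):
        return self._entries.get(tag)


def issue(tag, operands, cache, ef_set):
    """Return the micro instructions sent to the executor for one instruction."""
    cached = cache.lookup(tag) if ef_set else None
    if cached is None:
        # Miss: a NOP is issued and the interrupt path handles it at retirement
        return ["NOP"]
    # Hit: merge the current operand information into the cached sequence
    return [f"{uop} {operands}" for uop in cached]


cache = ConversionRegister()
tag = ("PRE_J", "EOP_J", "MOP_J", "ODI_J")
first = issue(tag, "r1, r2", cache, ef_set=True)          # miss on first encounter
cache.store(tag, ["load", "add", "store"], ef_set=True)   # filled after simulation
second = issue(tag, "r3, r4", cache, ef_set=True)         # hit with new operands
```

Because the tag is the format information rather than the full instruction, a later instance with different operands still hits, and only the operand information is merged in at issue time.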
It should be noted that since the augmentation instruction is generally known public information with fixed format content, the processor 110 designer may analyze the format information of the augmentation instruction and then use a combinational logic circuit or other similar design method to construct the instruction analyzing unit 1122 to determine the augmentation instruction, which is not limited by the present invention.
In one embodiment, the instruction parsing unit 1122 included in the instruction decoder 112 in fig. 4A or fig. 4B may be copied to the monitor 114, and the copied instruction parsing unit 1122 still receives the instruction 132 to be executed and specifically determines whether the instruction 132 to be executed is an augmentation instruction for the monitor 114. In this configuration, since the instruction decoder 112 and the monitor 114 are implemented as two separate modules (and together receive the instructions 132 to be executed), the processor 110 may be implemented with two separate modules.
Fig. 5A is a schematic diagram illustrating the simulation module 122 performing conversion on the augmentation instruction. The simulation module 122 of FIG. 5A includes a control unit 702A, an augmentation instruction to simulation program conversion table 704A, and a simulation program sequence table 706A (which is the simulation program sequence table mentioned above). The control unit 702A is responsible for converting the augmentation instruction by the simulation module 122 to obtain a simulation program corresponding to the augmentation instruction (i.e., the instruction 132 to be executed), which will be described in detail later. The augmentation instruction to simulation program conversion table 704A further includes an augmentation instruction tag 7042A and a simulation program sequence pointer 7044A, which are respectively used to store format information of the augmentation instruction and a storage address of the simulation program corresponding to the augmentation instruction in the simulation program sequence table 706A. The sequence list 706A stores the sequence 7062A of all the instructions to be amplified and is called by the sequence pointer 7044A. In one embodiment, each of the emulator sequences 7062A stored in the emulator sequence table 706A that corresponds to an augmented instruction is edited into a compatible instruction sequence by the processor 110 in advance by compatible instructions (e.g., native instructions) of the processor 110 for each new instruction set or each newly added augmented instruction in the augmented instruction set, and then is further edited into an emulator. Thus, the simulation programs can be called through the structure shown in FIG. 5A to generate simulation execution results when the corresponding augmentation instructions are executed (i.e., when the current instruction 132 to be executed is an augmentation instruction).
The control unit 702A can compare the format information of the amplification command, including the Prefix (PRE), the escape code (EOP), the operation code (MOP), and other information (ODI) required for interpretation of each amplification command, with the amplification command tag 7042A in the simulation program conversion table 704A, and go to the simulation program sequence table 706A to call the required simulation program according to the simulation program sequence pointer 7044A corresponding to the successfully-compared amplification command tag 7042A if the comparison is successful (i.e., the format information of the amplification command and a Hit occurs in a certain amplification command tag 7042A). For example, if the information of the to-be-executed instruction 132 transmitted by the processor 110 includes prefix/translation code/operation code/other interpretation information PRE _ J/EOP _ J/MOP _ J/ODI _ J (J is an integer between 1 and N), the control unit 702A compares each augmented instruction tag in the augmented instruction tag 7042A according to the augmented instruction format information of PRE _ J/EOP _ J/MOP _ J/ODI _ J. As shown in FIG. 5A, since the format information of the augmentation instruction is stored in the augmentation instruction to Simulation program translation table 704A, a hit (as shown by reference numeral 70422A) occurs, so that the corresponding Simulation program sequence Pointer 70442A (i.e., SimProJ _ Pointer) can be obtained from the hit augmentation instruction tag 70422A, and then the Simulation program sequence Pointer 70442A is used to go to the Simulation program sequence table 706A to find the required Simulation program, i.e., the Simulation program 70622A required for finding the Simulation program sequence Pointer 70442A (i.e., Simuling _ ProgramJ indicated by SimProgramJ _ Pointer in FIG. 5A) according to the dotted arrow 708A. 
Finally, the control unit 702A calls the simulation program (i.e., Simulation_ProgramJ) to complete the conversion of the augmentation instruction. The simulation module 122 can then execute Simulation_ProgramJ to generate the simulated execution result of the augmentation instruction (whose format information is PRE_J/EOP_J/MOP_J/ODI_J) and provide it to the processor 110. The execution of the simulation program is described later using the program example of FIG. 6A/6B.
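The two-table lookup of FIG. 5A can be sketched in Python as follows. This is only an illustrative model, not the patented implementation: the names `Tag`, `conversion_table`, `sequence_table`, and `convert`, the tag values, and the pointer value 0x100 are all hypothetical, and the "simulation program" is reduced to an ordinary function.

```python
from typing import Callable, Dict, NamedTuple, Optional


class Tag(NamedTuple):
    """Format information of an augmentation instruction (hypothetical layout)."""
    pre: int  # prefix (PRE)
    eop: int  # escape code (EOP)
    mop: int  # operation code (MOP)
    odi: int  # other information required for interpretation (ODI)


SimulationProgram = Callable[[dict], dict]


def simulation_program_j(operands: dict) -> dict:
    """Stand-in for Simulation_ProgramJ: here it simply adds two operands."""
    return {"result": operands["a"] + operands["b"]}


# Model of the simulation program sequence table (706A): pointer -> program.
sequence_table: Dict[int, SimulationProgram] = {0x100: simulation_program_j}

# Model of the conversion table (704A): tag -> simulation program sequence pointer.
conversion_table: Dict[Tag, int] = {Tag(0x62, 0x0F, 0x58, 0x01): 0x100}


def convert(tag: Tag) -> Optional[SimulationProgram]:
    """Return the simulation program for a tag, or None when no tag is hit."""
    pointer = conversion_table.get(tag)
    if pointer is None:
        return None  # miss: no simulation program exists for this instruction
    return sequence_table[pointer]
```

Keeping the tag-to-pointer map separate from the pointer-to-program map mirrors the figure: either table can be updated without rewriting the other.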
In one embodiment, in addition to transmitting the augmentation instruction (or only its format information) to the simulation module 122, the processor 110 may also transmit the current operating environment information of the processor 110 and the operating environment information of the augmentation instruction, so that the simulation module 122 can determine whether the augmentation instruction can be executed in the current operating state of the processor 110. For example, when the control unit 702A determines that the augmentation instruction cannot (or should not) be executed in the current operating environment of the processor 110 (e.g., the augmentation instruction is to be executed in protected mode, but the processor is currently in real mode), it may call the corresponding interrupt service routine to notify the operating system 120/application 130 of the conversion/execution exception. In another embodiment, the simulation module 122 may perform the comparison using only part of the augmentation instruction, such as the prefix/escape code/operation code (PRE/EOP/MOP), to obtain the simulation program.
It should be noted that, in one embodiment, the augmentation instruction tag 7042A in the conversion table 704A may be a code obtained by further processing the prefix (PRE), the escape code (EOP), the operation code (MOP), and the other information required for interpretation (ODI), for example by encrypting or hashing PRE/EOP/MOP/ODI, in order to protect the conversion process of the augmentation instruction. Such processing should be well known to those skilled in the art and is not described further. In another embodiment, augmentation instructions and their corresponding simulation programs may be added to, deleted from, or updated in the conversion table 704A and the simulation program sequence table 706A as required. For example, the augmentation instruction format information PRE_N+1/EOP_N+1/MOP_N+1/ODI_N+1 and the corresponding conversion instruction sequence InstSeqN+1_NatInst1 … InstSeqN+1_NatInstM (N and M are both integers greater than 1) may be added by a firmware update, with the simulation program pointer InstSeqN+1_Pointer pointing to that simulation program; existing table contents may likewise be modified and then overwritten by a firmware update (this format information, simulation program, and simulation program pointer are not shown in FIG. 5A). In another embodiment, the conversion table 704A and the simulation program sequence table 706A may be modified by a live update; the present invention is not limited in this respect.
In one embodiment, the simulation module 122 may further include an event processing module (not shown). When an abnormality or exception occurs in the simulation module 122 during the conversion process (e.g., the simulation program does not exist, or the augmentation instruction being converted cannot, or should not, be executed in the current operating environment of the processor 110), the event processing module 154 generates an abnormality/exception result, notifies the application 130 and the operating system 120 of that result, and performs the corresponding remedial steps, thereby preventing the abnormality or exception from crashing the entire electronic device 100. For example, the abnormality/exception result may be a numeric exception flag that is returned to the application 130. In another embodiment, the application 130 or the operating system 120 may skip the instruction that caused the abnormality/exception, indicate that the function requested by the instruction cannot be executed, or report an error.
FIG. 5B is a schematic diagram of another embodiment in which the simulation module 122 converts an augmentation instruction. As in the embodiment of FIG. 5A, the simulation module 122 in FIG. 5B includes a control unit 702B, an augmentation-instruction-to-simulation-program conversion table 704B, and a simulation program sequence table 706B. The control unit 702B is responsible for the conversion performed by the simulation module 122, that is, for obtaining the simulation program corresponding to the augmentation instruction (i.e., the instruction 132 to be executed). The conversion table 704B includes augmentation instruction tags 7042B and simulation program sequence pointers 7044B, which respectively store the format information of each augmentation instruction and the storage address, within the simulation program sequence table 706B, of the simulation program corresponding to that augmentation instruction. The simulation program sequence table 706B stores the simulation program sequences 7062B of all augmentation instructions, each of which is located through its simulation program sequence pointer 7044B. As in FIG. 5A, each simulation program sequence 7062B is a compatible instruction sequence composed in advance from compatible instructions (e.g., native instructions) of the processor 110 and then written into a simulation program. In another embodiment, the conversion table 704B and the simulation program sequence table 706B may be modified by a live update; the present invention is not limited in this respect.
Unlike FIG. 5A, the embodiment of FIG. 5B allows one augmentation instruction to correspond to more than one simulation program. For example, if the format information (prefix/escape code/operation code/other interpretation information) of the instruction 132 to be executed is PRE_J/EOP_J/MOP_J/ODI_J, the conversion table 704B may contain three corresponding augmentation instruction tags and simulation program pointers, such as the tags PRE_J/EOP_J/MOP_J/ODI_J-1, PRE_J/EOP_J/MOP_J/ODI_J-2, and PRE_J/EOP_J/MOP_J/ODI_J-3 in FIG. 5B, whose simulation program pointers are SimProJ-1_Pointer, SimProJ-2_Pointer, and SimProJ-3_Pointer, respectively. The reason is that augmentation instructions often need to support operands of different lengths, so a single micro-instruction sequence may have to include conditional checks or loops, which can cause branches during execution on the processor 110 and even reduce its execution efficiency. By determining the operand length in advance and building a micro-instruction sequence with those checks and loops removed, branches in the instruction stream of the processor pipeline may be reduced or avoided, improving the execution efficiency of the processor 110.
For example, suppose the augmentation instruction tags PRE_J/EOP_J/MOP_J/ODI_J-1, PRE_J/EOP_J/MOP_J/ODI_J-2, and PRE_J/EOP_J/MOP_J/ODI_J-3 correspond to operand lengths of 128 bits, 256 bits, and 512 bits, respectively. After analyzing the format information (e.g., the ODI) of the augmentation instruction, the control unit 702B knows the operand length of the current augmentation instruction and can therefore call the required simulation program more precisely (for example, if the maximum operand length is 512 bits, the simulation program indicated by PRE_J/EOP_J/MOP_J/ODI_J-3 is called). It should be noted that this example is only illustrative; those skilled in the art will appreciate that the simulation programs of an augmentation instruction may also be divided according to criteria other than operand length, and the present invention is not limited thereto.
The operation of FIG. 5B is described next. The control unit 702B compares the format information of the augmentation instruction, which includes the prefix (PRE), the escape code (EOP), the operation code (MOP), and the other information required for interpretation (ODI), with the augmentation instruction tags 7042B in the conversion table 704B. If the comparison succeeds (i.e., the format information hits one of the augmentation instruction tags 7042B), the control unit 702B uses the simulation program sequence pointer 7044B associated with the matching tag to call the required simulation program 7062B from the simulation program sequence table 706B. For example, suppose the information of the instruction 132 to be executed transmitted by the processor 110 includes the prefix/escape code/operation code/other interpretation information PRE_J/EOP_J/MOP_J/ODI_J (J is an integer between 1 and N), and the control unit 702B, after further analyzing PRE_J/EOP_J/MOP_J/ODI_J (e.g., analyzing the operand length), finds that it should be treated as PRE_J/EOP_J/MOP_J/ODI_J-1. The comparison with PRE_J/EOP_J/MOP_J/ODI_J-1 in the conversion table 704B then succeeds (reference numeral 70422B), and the control unit 702B obtains the corresponding simulation program sequence pointer 70442B (SimProJ-1_Pointer) from the augmentation instruction tag 70422B. The address indicated by the pointer 70442B is then used, as shown by the dashed arrow 708B, to find the required simulation program 70622B in the simulation program sequence table 706B, i.e., Simulation_ProgramJ-1 in FIG. 5B.
Finally, the control unit 702B calls the simulation program (i.e., Simulation_ProgramJ-1 indicated by the simulation program sequence pointer 70442B), and the simulation module 122 executes the called simulation program to generate an execution result, which is provided to the processor 110.
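The variant selection of FIG. 5B can be sketched in Python as follows. The sketch assumes that the variant is chosen from the largest operand length, as in the 128/256/512-bit example above; the names `TAG_J`, `VARIANT_BY_LENGTH`, and `select_program`, and the tag values, are hypothetical, and each simulation program is reduced to a function that returns its own name.

```python
from typing import Callable, Dict, Sequence, Tuple

BaseTag = Tuple[int, int, int, int]  # (PRE, EOP, MOP, ODI), hypothetical values

TAG_J: BaseTag = (0x62, 0x0F, 0x58, 0x00)


def simulation_program_j1() -> str:
    return "Simulation_ProgramJ-1"  # 128-bit variant


def simulation_program_j2() -> str:
    return "Simulation_ProgramJ-2"  # 256-bit variant


def simulation_program_j3() -> str:
    return "Simulation_ProgramJ-3"  # 512-bit variant


# Operand length in bits -> tag suffix (J-1, J-2, J-3), per the example above.
VARIANT_BY_LENGTH: Dict[int, int] = {128: 1, 256: 2, 512: 3}

# Model of conversion table 704B plus sequence table 706B, collapsed into one map.
conversion_table: Dict[Tuple[BaseTag, int], Callable[[], str]] = {
    (TAG_J, 1): simulation_program_j1,
    (TAG_J, 2): simulation_program_j2,
    (TAG_J, 3): simulation_program_j3,
}


def select_program(tag: BaseTag, operand_bits: Sequence[int]) -> Callable[[], str]:
    """Pick the length-specific simulation program for the longest operand."""
    variant = VARIANT_BY_LENGTH[max(operand_bits)]
    return conversion_table[(tag, variant)]
```

Because the length decision is made once at lookup time, each selected program can be written without the per-length checks a single generic program would need.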
In one embodiment, in addition to transmitting the augmentation instruction (or only its format information) to the control unit 702B, the processor 110 may also transmit the current operating environment information of the processor 110 and the operating environment information of the augmentation instruction, so that the control unit 702B can determine whether the augmentation instruction can be executed in the current operating state of the processor 110. This is the same as the operation of FIG. 5A and is not described again. In addition, in one embodiment, the augmentation instruction tag 7042B in the conversion table 704B may be a code obtained by further processing the prefix (PRE), the escape code (EOP), the operation code (MOP), and the other information required for interpretation (ODI), for example by encrypting or hashing PRE/EOP/MOP/ODI, in order to protect the conversion process of the augmentation instruction; this should be well known to those skilled in the art and is not described again. In another embodiment, augmentation instructions and their corresponding simulation programs may be added to, deleted from, or updated in the conversion table 704B and the simulation program sequence table 706B as required. This is also the same as the operation of FIG. 5A and is not described again.
It should be noted that, in the conversion operation of either FIG. 5A or FIG. 5B, temporary information generated by the conversion (for example, intermediate results of comparing the augmentation instruction tags 7042A/7042B, all tables and pointers in the examples of FIG. 5A/5B, and the program code that the control unit 702A/702B needs during operation) may be stored in the conversion information storage area 1244 of the main memory at the address indicated by the conversion information pointer register 116B. The called simulation program may also be stored temporarily in the conversion information storage area 1244 to await execution by the processor 110 (for example, a pointer to the simulation program sequence, such as SimProJ-1_Pointer, is transmitted to the processor 110). In one embodiment, all called simulation programs may be flagged or recorded for reference by the simulation module 122 or by the processor designer.
In another embodiment, the conversion operation of FIG. 5A/5B can be written as a separate conversion module (not shown), which can be placed in the simulation module 122 or serve as a callback function that the simulation module 122 calls (e.g., via a system call). In that case the processor 110 may need to change the contents of its state registers because the operating environment is switched. To meet this requirement, a state stack (not shown) may be provided in the processor current state save area 1242 to store the operating environment information of the processor 110. For example, when the processor 110 enters the simulation module 122, the current operating environment information of the processor 110 is stored in the first layer of the state stack. When the simulation module 122 then calls the conversion module, the simulation module 122 stores its current operating environment parameters in the second layer of the state stack (above the first layer) and switches to the conversion module to convert the augmentation instruction. When the conversion module has completed the conversion and successfully called the corresponding simulation program, and the simulation module 122 needs to return to the operating environment that existed before the conversion module was called, the previously stored parameters are retrieved from the second layer of the state stack to restore that environment. Finally, after the simulation module 122 finishes executing the simulation program corresponding to the augmentation instruction, the parameters stored in the first layer of the state stack are read out to restore the operating environment that existed when the instruction 132 to be executed was executed.
Although the storage of the operating environment is described above using a stack, those skilled in the art should understand that any other suitable way of storing the operating environment parameters may be substituted without departing from the spirit of the invention, and such alternatives are intended to be covered by the claims.
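The two-layer save and restore described above can be sketched in Python as follows. This is a minimal model under stated assumptions: the operating environment is reduced to a dictionary with a single "mode" entry, and `StateStack` and `run_simulation` are hypothetical names that do not appear in the figures.

```python
from typing import Dict, List


class StateStack:
    """Last-in, first-out store for operating environment parameters."""

    def __init__(self) -> None:
        self._layers: List[Dict[str, str]] = []

    def push(self, env: Dict[str, str]) -> None:
        self._layers.append(dict(env))  # copy so later changes do not leak in

    def pop(self) -> Dict[str, str]:
        return self._layers.pop()

    def depth(self) -> int:
        return len(self._layers)


def run_simulation(processor_env: Dict[str, str]) -> List[str]:
    """Return the sequence of environments that become active, in order."""
    stack = StateStack()
    trace: List[str] = []

    stack.push(processor_env)          # first layer: on entering the simulation module
    env = {"mode": "simulation"}

    stack.push(env)                    # second layer: before calling the conversion module
    env = {"mode": "conversion"}
    trace.append(env["mode"])          # conversion of the augmentation instruction runs here

    env = stack.pop()                  # restore the simulation module's environment
    trace.append(env["mode"])          # the simulation program runs here

    env = stack.pop()                  # restore the interrupted program's environment
    trace.append(env["mode"])
    assert stack.depth() == 0
    return trace
```

The stack discipline guarantees that environments are restored in the reverse of the order in which they were saved, which is what nested calls require.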
Finally, when the simulation module 122 executes the simulation program, temporary information generated during the simulation (including the data structures and variables defined by the simulation program and other temporary data of the simulation process) may be stored in the simulation execution result storage area 1246 of the main memory at the address indicated by the simulation execution result pointer register 116C. The execution result of the simulation program may also be stored in the storage area 1246 for use by the processor 110 or for reference by subsequent augmentation instructions. This is advantageous when, for example, the application 130 sends consecutive instructions to the processor and those instructions have execution dependencies: if the simulation result of a preceding augmentation instruction (or of several) is retained for direct reference by a following augmentation instruction, the simulation efficiency of the augmentation instructions can be improved. It should be noted that, because the processor state save area 1242, the conversion information storage area 1244, and the simulation execution result storage area 1246 in the storage area 124 can all be accessed while the simulation module 122 is being called (i.e., while the immediate simulation mode is on), all information stored in the storage area 124 can be freely used by the simulation module 122. Those skilled in the art can adjust this for different applications, and the invention is not limited in this respect. In one embodiment, the part that executes the simulation program may further be written as an execution module (not shown), which is called to execute the simulation program after the conversion module has called the simulation program. How to write and call such an execution module is well known to those skilled in the art and is not described in detail.
The contents of a simulation program corresponding to an augmentation instruction are described next with reference to the simulation program illustrated in FIG. 6A/6B. FIG. 6A/6B shows a simulation program for the VADDSD instruction of the AVX-512 instruction set (AVX is an abbreviation for Advanced Vector Extensions). Because VADDSD cannot be correctly interpreted by the instruction decoder 112 but can be recognized by the monitor 114, the monitor 114 sets the simulation flag EF, the simulation module 122 is then called through the corresponding interrupt service routine, and the VADDSD instruction is provided to the simulation module 122 as a parameter. The AVX-512 specification defines VADDSD as follows:
VADDSD XMM0{K1}{Z},XMM1,XMM2
The operation defined by VADDSD adds the low 64 bits (a double-precision floating-point value) of the source operand XMM2 to the low 64 bits of the source operand XMM1 and stores the result in the destination operand XMM0. VADDSD also supports masking: the addition is performed when the mask specified by {K1} is 1 (K1 is the second of the eight mask registers, i.e., register 1 of registers 0 to 7). {Z} determines whether the final result is zeroed or merged with the original result. For the rest of the VADDSD specification, refer to the AVX-512 specification; it is not repeated here.
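The masked scalar addition described above can be sketched in Python as follows. This is a simplified model rather than the program of FIG. 6A/6B: each XMM register is represented as a list of two floats (low lane first), rounding control and exceptions are ignored, and the function name `vaddsd` and its parameters are hypothetical. Copying the upper 64 bits from the first source follows the published AVX-512 definition of VADDSD rather than the text above.

```python
from typing import List


def vaddsd(dst: List[float], src1: List[float], src2: List[float],
           k1_bit0: int = 1, z: int = 0) -> List[float]:
    """Model of VADDSD dst{k1}{z}, src1, src2 on [low, high] 64-bit lanes."""
    if k1_bit0:
        low = src1[0] + src2[0]  # mask bit set: perform the addition
    elif z:
        low = 0.0                # masked off with zeroing: clear the low lane
    else:
        low = dst[0]             # masked off with merging: keep the old low lane
    return [low, src1[1]]        # upper lane is copied from the first source
```

For example, `vaddsd([9.0, 9.0], [1.5, 7.0], [2.25, 8.0])` adds the low lanes and yields `[3.75, 7.0]`, while the same call with `k1_bit0=0` leaves the low lane of the destination at 9.0.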
The basic information of the simulation program emulate_addsd_512 in FIG. 6A/6B is outlined as follows:
(1) inst: a data structure that contains all the information that can be decoded from the machine code (e.g., the instruction 132 to be executed that has been identified as an augmentation instruction), where:
a. dst is the decoded destination register encoding;
b. src is the decoded source operand register encoding;
c. evex.b indicates the operation modes supported by the current instruction;
(2) max_vl: the maximum length of the current vector registers; for the AVX-512 specification, max_vl is 512;
(3) ProcessorContext: the machine state of the processor saved when the interrupt (e.g., the #NE interrupt) occurred;
(4) DedicateHW: a data structure representing the hardware resources of the dedicated hardware 116 in the simulation.
The program code of FIG. 6A/6B is outlined below:
(1) lines 3-6: initialize the storage locations required by the simulation;
(2) lines 7-12: determine whether the register encoding of the current dst belongs to the registers that need to be simulated in the dedicated hardware 116 (indicated by dedicateHW); if not, read the data from the simulation execution result storage area 1246 according to the indication of ctx (indicated by ProcessorContext);
(3) lines 14-25: read the source operands (designated src1/src2, respectively) in the same way as described above for the destination register encoding (dst): first determine whether the register exists in the dedicated hardware 116; if it does not, the register is one supported by the processor 110, and it is read from the processor state save area 1242 according to the indication of ctx (indicated by ProcessorContext);
(4) lines 27-49: program code written in accordance with the operation that the AVX-512 specification defines for VADDSD. For example:
a. line 27: determine the operation mode (Broadcast/RC/SAE context) supported by the current instruction; when evex.b is 1, static rounding control is enabled for the operands according to the specification;
b. lines 34-49: determine whether the current VADDSD instruction is controlled by a mask register; for example, when k1{0} is 1, the corresponding src1+src2 operation is performed, and otherwise it is not performed;
c. the final result is determined according to {Z}: {Z} equal to 1 indicates zeroing, and {Z} equal to 0 indicates merging with the original dst result;
(5) lines 51-56: determine whether the destination register is a register supported by the processor 110 (for example, a 512-bit register is not supported by the current processor 110, whose registers have only 256 bits). If the register is not supported by the processor 110, it is simulated by the simulation registers (located in the simulation register file 116E) through the mapping of the simulation execution result pointer register 116C, and the contents of those simulation registers are also stored into the simulation execution result storage area 1246 at the address indicated by the simulation execution result pointer register 116C; if the register is supported by the processor 110, the result is saved into the processor running state saved for the #NE interrupt;
(6) lines 58-60: after confirming the maximum vector length (e.g., 256) currently supported by the hardware of the processor 110, update the calculated result to all vector registers already supported by the hardware;
(7) line 62: return the execution result.
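The operand read of lines 14-25 and the result write-back of lines 51-56 can be sketched in Python as follows. This is an assumption-laden model, not the code of FIG. 6A/6B: registers are dictionary entries keyed by name, a register counts as supported by the processor when it appears in the saved context `ctx`, and the function names `read_operand` and `write_result` are hypothetical.

```python
from typing import Dict


def read_operand(code: str, dedicated_hw: Dict[str, int], ctx: Dict[str, int]) -> int:
    """Read a source register: dedicated hardware first, saved context otherwise."""
    if code in dedicated_hw:
        return dedicated_hw[code]  # register is simulated in dedicated hardware
    return ctx[code]               # register is supported by the processor itself


def write_result(code: str, value: int, dedicated_hw: Dict[str, int],
                 ctx: Dict[str, int], result_area: Dict[str, int]) -> None:
    """Write the destination register back to where it lives."""
    if code in ctx:
        ctx[code] = value          # natively supported: update the saved processor state
    else:
        dedicated_hw[code] = value  # unsupported: keep it in the simulated register
        result_area[code] = value   # and mirror it into the result storage area
```

Writing supported registers into the saved context means the value becomes visible to the interrupted program once that context is restored.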
It is noted that lines 36, 48, and 49 of FIG. 6A/6B are the encoded version of VADDSD defined in the AVX-512 specification. Because the source operands of this instruction are 128 bits (XMM1) and 64 bits (XMM2), VADDSD can be simulated with a single simulation program. However, other instructions that support 512-bit source/destination registers may need to be split into multiple simulation programs. For example, the ADDPS (or VADDPS) instruction in the AVX-512 specification may use 512-bit registers, such as:
ADDPS ZMM0{k1}{z},ZMM1,ZMM2
ZMM0/ZMM1/ZMM2 are 512-bit registers. If the simulation follows the operation defined by the AVX-512 specification, many steps such as if/else checks and for loops are required, whereas the encoded version does not need these steps. Therefore, if the definition of the encoded version is used directly, operands of different lengths are written into different simulation programs (for example, distinguished as 128/256/512 bits), and the simulation program is selected according to the longest of the destination/source operands, the execution efficiency of the processor 110 can be improved. For example, following the definition in the AVX-512 specification, the simulation of ADDPS can be divided into three simulation programs for 128, 256, and 512 bits. When the instruction 132 to be executed is an ADDPS instruction, the maximum length of its source/destination registers is determined and the matching simulation program is called; for example, if the longest operand is YMM2, which is 256 bits, the simulation program dedicated to ADDPS with 256-bit registers is called. Calling the corresponding simulation program for ADDPS in this way can be done with the architecture illustrated in FIG. 5B, so the description is not repeated.
FIG. 7 is a flowchart of a simulation method according to an embodiment of the present invention; the method of FIG. 7 is applied to the processor shown in FIG. 4A/4B. As shown in FIG. 7, in step S702, when the instruction 132 to be executed by the processor 110 is an augmentation instruction, the hardware triggers an interrupt (which may be #UD or #NE) and the corresponding interrupt service routine is entered. For how the interrupt service routine is entered, refer to the description of FIG. 4A/4B. Other hardware interrupts are also disabled during the conversion and simulation processes, both to prevent unrelated hardware interrupt signals from interfering with conversion/simulation and to protect the conversion/simulation process from attack. In step S704, the simulation execution environment of the simulation module 122 is constructed; the environment parameters of the simulation module 122 can be written in advance into the BIOS, the processor driver, or the operating system kernel and called when the simulation module 122 needs to be constructed. The current execution environment parameters of the processor 110 are also saved, for example by storing them in the processor state save area 1242 in main memory (e.g., pushing them onto the state stack) at the address indicated by the processor current state pointer register 116A in the dedicated hardware 116. In step S706, after the simulation module 122 reads the format information of the instruction 132 to be executed byte by byte according to the pointer provided by the interrupt service routine (decoding it to obtain the instruction information; refer to FIG. 4A/4B), it compares that information with the current operating state of the processor 110 to determine whether the instruction 132 to be executed can run in the current operating mode of the processor 110.
If the instruction 132 to be executed cannot run in the current operating environment of the processor (e.g., the instruction 132, an augmentation instruction, is to be executed in protected mode, but the processor 110 is currently operating in real mode), the determination is no and the flow proceeds from step S706 to step S708 to restore the saved context of the interrupted program. For example, the execution environment parameters of the processor 110 that were stored in the processor state save area 1242 before the simulation module 122 was called are read from the address indicated by the processor current state pointer register 116A (e.g., popped from the state stack) to restore the execution state of the processor 110 prior to calling the simulation module 122; the simulation module 122 is then exited and the conversion/simulation of the instruction to be executed ends. If it is determined in step S706 that the instruction 132 to be executed can run in the current operating environment of the processor, the flow proceeds from step S706 to step S710, in which the simulation module 122 calls the corresponding simulation program according to the format information of the instruction 132 to be executed. The flow then proceeds to step S712, in which the memory operands or register operands are read as indicated by the instruction 132 to be executed. If an architectural register not supported by the hardware of the current processor 110 is encountered, it is looked up (and subsequently stored to or modified) through the mapping of the simulation register file 116E of the dedicated hardware 116. As described above, during the calling of the simulation program in step S712, all information of the conversion process can be stored in the conversion information storage area 1244 at the address indicated by the conversion information pointer register 116B.
In step S714, the simulation module 122 executes the simulation program. As described above, during the execution of the simulation program in step S714, all information of the execution process may be stored in the simulation execution result storage area 1246 at the address indicated by the simulation execution result pointer register 116C. It is noted that the simulation results generated by the simulation program are also cached in the simulation execution result storage area 1246. In step S716, simulation completion information is written to the simulation execution result storage area 1246, and the saved context of the interrupted program is then restored (i.e., the operating environment parameters of the processor 110 that were stored in the state stack of the processor state save area 1242 before the simulation module 122 was called are read from the address indicated by the processor current state pointer register 116A). The simulation module 122 is then exited, which ends the simulation of the augmentation instruction. Finally, the processor 110 may use the address indicated by the simulation execution result pointer register 116C to read the simulated execution result of the instruction 132 to be executed.
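The flow of steps S702 to S716 can be sketched in Python as follows. This is a simplified software model only: the processor state is a dictionary, the mode check stands in for the operating environment comparison of step S706, and the names `simulate`, `required_mode`, `tag`, and `result_area` are hypothetical.

```python
from typing import Callable, Dict, Optional


def simulate(instr: dict, processor: dict,
             programs: Dict[str, Callable[..., float]],
             result_area: dict) -> Optional[float]:
    """Simulate one augmentation instruction; return None if it cannot run."""
    saved = dict(processor)                     # S704: save the execution environment
    processor["interrupts_enabled"] = False     # S702: mask other hardware interrupts
    try:
        if instr["required_mode"] != saved["mode"]:   # S706: can it run in this mode?
            return None                               # S708: give up, context restored below
        program = programs[instr["tag"]]        # S710: call up the simulation program
        operands = instr["operands"]            # S712: read the operands
        value = program(*operands)              # S714: execute the simulation program
        result_area["value"] = value            # cache the simulated result
        result_area["done"] = True              # S716: record completion
        return value
    finally:
        processor.clear()                       # S708/S716: restore the saved context
        processor.update(saved)
```

Placing the restore in a `finally` clause reflects that both exit paths of the flowchart end by restoring the interrupted program's context.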
FIG. 8 is a flowchart of determining whether an instruction to be executed is an augmentation instruction, according to an embodiment of the invention. As shown in FIG. 8, in step S802, the processor 110 receives an instruction 132 to be executed. In step S804, the processor 110 decodes the instruction 132 to be executed. In step S806, the instruction decoder 112 determines whether the instruction 132 to be executed is a compatible instruction (e.g., a native instruction). If it is a compatible instruction (i.e., the determination result is yes), the flow proceeds from step S806 to step S808, in which the processor 110 executes the compatible instruction and returns the execution result. If it is not a compatible instruction (i.e., the determination result in step S806 is no), the flow proceeds from step S806 to step S810, in which the monitor 114 determines whether the instruction 132 to be executed is an augmentation instruction. If the monitor 114 determines that it is an augmentation instruction (i.e., the determination result is yes), the flow proceeds to step S814, in which the monitor 114 sets the simulation flag EF. When the instruction 132 to be executed retires (as shown in FIG. 4A/4B, when the no-operation instruction corresponding to the instruction 132 to be executed retires), the interrupt vector table in the processor is queried (e.g., for #NE) to find the corresponding interrupt service routine, and the simulation module 122 is called through that interrupt service routine.
The interrupt service routine may call the simulation module 122 through preset hardware, pre-written software, or an interface formed by a combination of software and hardware (for example, a microcode control unit executing microcode). The call to the simulation module 122 proceeds only after it has been determined that the current instruction 132 to be executed is allowed to call the simulation module 122 (for example, after the application 130 has been successfully authenticated). Finally, the flow proceeds to step S816, in which the simulation module 122 converts the instruction 132 to be executed into a simulation program, simulates the execution of the instruction 132 to be executed (here, an augmentation instruction) by executing the simulation program, and returns the execution result.
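The decision flow of FIG. 8 can be sketched in Python as follows. This is a rough model: the instruction sets are plain Python sets of mnemonics, the function name `dispatch` is hypothetical, and the "undefined" outcome for an instruction that is neither compatible nor an augmentation instruction is an assumption, since the text above does not describe that branch.

```python
from typing import Set, Tuple


def dispatch(opcode: str, native_ops: Set[str],
             augmentation_ops: Set[str]) -> Tuple[str, bool]:
    """Return (action, EF), where EF models the simulation flag."""
    if opcode in native_ops:            # S806 yes -> S808: execute directly
        return "execute", False
    if opcode in augmentation_ops:      # S810 yes -> S814: set EF, then S816
        return "simulate", True
    return "undefined", False           # assumed outcome, not described in the text
```

With `native_ops={"ADD"}` and `augmentation_ops={"VADDSD"}`, the opcode "VADDSD" is routed to simulation with the flag set, while "ADD" is executed directly.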
In summary, the instruction simulation apparatus and method according to the embodiments of the invention use the monitor to determine whether an instruction to be executed in an application program is a compatible instruction or an augmentation instruction of the processor. If the instruction to be executed is determined to be an augmentation instruction, the processor converts it into a simulation program that the processor can run, and executes that simulation program. This resolves instruction set compatibility problems for the processor and extends the service life of electronic equipment that uses the instruction simulation apparatus.
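To make the notion of a simulation program concrete, the following Python sketch emulates a hypothetical "count set bits" augmentation instruction using only three operations that the processor is assumed to support natively. The instruction, the helper names and the 32-bit operand width are all assumptions made for illustration; the source does not name any specific augmentation instruction.

```python
# Stand-ins for compatible (native) instructions of the processor.
def native_and(a, b):
    return a & b


def native_shr(a, n):
    return a >> n


def native_add(a, b):
    return a + b


def simulate_bitcount(value, width=32):
    """Simulation program for a hypothetical bit-count instruction.

    Uses only the native operations above: test the lowest bit,
    accumulate it, shift right, and repeat for each bit position.
    """
    count = 0
    for _ in range(width):
        count = native_add(count, native_and(value, 1))
        value = native_shr(value, 1)
    return count
```

For example, `simulate_bitcount(0b1011)` returns 3. The result matches what a dedicated instruction would produce, at the cost of executing many native instructions in its place.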
The above describes only preferred embodiments of the present invention and is not intended to limit its scope. Any person skilled in the art may make further modifications and variations without departing from the spirit and scope of the present invention; therefore, the scope of the present invention is defined by the claims of the present application.

Claims (36)

1. An instruction simulation apparatus, comprising:
a monitor for determining whether an instruction to be executed by a processor is a compatible instruction or an augmentation instruction, wherein the compatible instruction is an instruction of the current instruction set of the processor, and the augmentation instruction is not an instruction of the current instruction set of the processor but is an instruction of a new instruction set or an augmentation instruction set corresponding to the current instruction set architecture of the processor;
wherein if the instruction to be executed is determined to be the augmentation instruction in the new instruction set or the augmentation instruction set, the processor converts the instruction to be executed into a simulation program corresponding to the augmentation instruction, and simulates an execution result of the instruction to be executed by executing the simulation program; and
if the instruction to be executed is determined to be the compatible instruction, the instruction to be executed is executed by the processor,
wherein the simulation program is a program written in advance by the designer of the processor, using compatible instructions of the processor, for the operation indicated by the augmentation instruction belonging to the new instruction set or the augmentation instruction set.
2. The instruction simulation apparatus of claim 1, further comprising at least:
a processor current state saving area for saving the current working environment state of the processor;
a conversion mode state saving area for saving temporary information generated while converting the instruction to be executed into the corresponding simulation program; and
an execution result storage area for storing the execution result of the simulation program.
3. The instruction simulation apparatus of claim 1, wherein, when the instruction to be executed is determined to be the augmentation instruction, the processor sets a transition flag so as to obtain the simulation program corresponding to the augmentation instruction through an interrupt service routine.
4. The instruction simulation apparatus of claim 1, wherein the processor comprises:
a plurality of registers, including at least a register for caching the current state of the processor, a register for caching intermediate conversion results when the simulation program corresponding to the augmentation instruction is called, a register for caching a simulation execution result, a simulation register for mapping a destination register indicated by the augmentation instruction, and a register for caching an immediate conversion mode state saving area.
5. The instruction simulation apparatus according to claim 1, wherein an interrupt service routine calls a simulation module to query whether a simulation program corresponding to the augmentation instruction exists, and when the simulation program corresponding to the augmentation instruction is found, the simulation module executes the simulation program to obtain an execution result that simulates the instruction to be executed.
6. The instruction simulation apparatus of claim 5, wherein the simulation module caches the simulation execution result after it is generated and then stops operating, and, when no simulation program corresponding to the augmentation instruction is found, provides a failure result to notify the processor and then stops operating.
7. The instruction simulation apparatus according to claim 6, wherein, after the simulation module stops operating, the processor reads the cached simulation execution result, or, after the simulation module stops operating, the processor learns of the failure result and notifies the application program corresponding to the instruction to be executed of the failure result.
8. The instruction simulation apparatus according to claim 6, wherein the execution result of the simulation program is retained after the simulation module stops operating, so that when a subsequent instruction to be executed is determined to be an augmentation instruction and is converted into its corresponding simulation program, the retained execution result serves as reference information for executing the simulation program of the subsequent instruction to be executed.
9. The instruction simulation apparatus according to claim 5, wherein the simulation module is provided in a driver of the processor, in a kernel of an operating system running on the processor, or in a BIOS of a system including the processor.
10. The instruction simulation apparatus of claim 1, wherein the compatible instruction and the augmentation instruction are both instructions under an x86 instruction set architecture, both instructions under an ARM instruction set architecture, both instructions under a MIPS instruction set architecture, or both instructions under a RISC-V instruction set architecture.
11. An instruction simulation method, comprising:
determining whether an instruction to be executed by a processor is a compatible instruction or an augmentation instruction, wherein the compatible instruction is an instruction of the current instruction set of the processor, and the augmentation instruction is not an instruction of the current instruction set of the processor but is an instruction of a new instruction set or an augmentation instruction set relative to the current instruction set architecture of the processor;
if the instruction to be executed is determined to be the augmentation instruction in the new instruction set or the augmentation instruction set, converting the instruction to be executed into a simulation program corresponding to the augmentation instruction, and simulating an execution result of the instruction to be executed by executing the simulation program; and
if the instruction to be executed is determined to be the compatible instruction, executing the instruction to be executed by the processor,
wherein the simulation program is a program written in advance by the designer of the processor, using compatible instructions of the processor, for the operation indicated by the augmentation instruction belonging to the new instruction set or the augmentation instruction set.
12. The instruction simulation method of claim 11, further comprising providing at least:
a processor current state saving area for saving the current working environment state of the processor;
a conversion mode state saving area for saving temporary information generated while converting the instruction to be executed into the corresponding simulation program; and
an execution result storage area for storing the execution result of the simulation program.
13. The instruction simulation method of claim 11, wherein, when the instruction to be executed is determined to be the augmentation instruction, the processor sets a transition flag so as to obtain the simulation program corresponding to the augmentation instruction through an interrupt service routine.
14. The instruction simulation method of claim 11, further comprising:
calling, by an interrupt service routine, a simulation module to read the instruction to be executed;
querying, by the simulation module, whether a simulation program corresponding to the augmentation instruction exists; and
when the simulation program corresponding to the augmentation instruction is found, executing the simulation program by the simulation module to obtain an execution result that simulates the instruction to be executed.
15. The instruction simulation method of claim 14, further comprising:
stopping operation of the simulation module after the simulation module caches the simulation execution result; and
when no simulation program corresponding to the augmentation instruction is found, providing a failure result by the simulation module to notify the processor, and stopping operation of the simulation module.
16. The instruction simulation method of claim 15, further comprising:
after the operation of the simulation module is stopped, the processor reads the cached simulation execution result, or after the operation of the simulation module is stopped, the processor learns the failure result and notifies the failure result to the application program corresponding to the instruction to be executed.
17. The instruction simulation method of claim 16, wherein the execution result of the simulation program is retained after the simulation module stops operating, so that when a subsequent instruction to be executed is determined to be an augmentation instruction and is converted into its corresponding simulation program, the retained execution result serves as reference information for executing the simulation program of the subsequent instruction to be executed.
18. The instruction simulation method of claim 15, wherein the simulation module is disposed in a driver of the processor, in a kernel of an operating system running on the processor, or in a BIOS of a system including the processor.
19. The instruction simulation method of claim 11, wherein the compatible instruction and the augmentation instruction are both instructions under an x86 instruction set architecture, both instructions under an ARM instruction set architecture, both instructions under a MIPS instruction set architecture, or both instructions under a RISC-V instruction set architecture.
20. A method for processor instruction emulation, comprising:
when the instruction to be executed by the processor is an augmentation instruction, calling a simulation module through an interrupt service routine to obtain a simulation program corresponding to the augmentation instruction, wherein the augmentation instruction does not belong to the current instruction set of the processor but is an instruction of a new instruction set or an augmentation instruction set, and the new instruction set or the augmentation instruction set has the same instruction set architecture as the current instruction set;
disabling responses to other hardware interrupts during operation of the simulation module; and
executing the simulation program to simulate the execution result of the instruction to be executed.
21. The method of claim 20, wherein the instruction to be executed is a parameter of the simulation module.
22. The method of claim 21, wherein the simulation module is executed by the processor through a system call to an operating system running on the processor.
23. The method of claim 22, wherein the simulation module is a callback function located in a kernel of the operating system running on the processor or in a driver of the processor.
24. The method as claimed in claim 20, wherein the simulation module and the simulation program are stored in a BIOS of a system including the processor, and the simulation module and the simulation program are loaded into a memory of the system including the processor when the system is powered on.
25. The processor instruction emulation method of claim 20, further comprising:
caching the simulation execution result of the simulation program in a memory of a system including the processor, and then stopping the operation of the simulation module.
26. The method as claimed in claim 20, wherein the execution result of the simulation program is retained after the simulation module stops operating, so that when a subsequent instruction to be executed is determined to be an augmentation instruction and is converted into its corresponding simulation program, the retained execution result serves as reference information for executing the simulation program of the subsequent instruction to be executed.
27. The method as claimed in claim 20, wherein the simulation program is a program written in advance by a designer of the processor, using compatible instructions of the processor, for the operation indicated by the augmentation instruction belonging to the new instruction set or the augmentation instruction set.
28. The method of claim 20, wherein the compatible instruction and the augmentation instruction are both instructions under an x86 instruction set architecture, both instructions under an ARM instruction set architecture, both instructions under a MIPS instruction set architecture, or both instructions under a RISC-V instruction set architecture.
29. A method for processor instruction emulation, comprising:
when the instruction to be executed by the processor is an augmentation instruction, obtaining a simulation program corresponding to the augmentation instruction through an interrupt service routine, wherein the augmentation instruction does not belong to the current instruction set of the processor but is an instruction of a new instruction set or an augmentation instruction set, and the new instruction set or the augmentation instruction set has the same instruction set architecture as the current instruction set; and
executing the simulation program to generate a simulation execution result corresponding to the augmentation instruction,
wherein responses to other hardware interrupts are disabled during the acquisition and execution of the simulation program.
30. The method of claim 29, wherein the interrupt service routine is invoked by a system call to obtain the simulation program corresponding to the augmentation instruction, and the instruction to be executed is a parameter of the system call.
31. The method of claim 30, wherein the system call calls a simulation module located in a kernel of an operating system running on the processor or in a driver of the processor to obtain the simulation program.
32. The method of claim 31, wherein the simulation module is a callback function.
33. The method as claimed in claim 31, wherein the simulation module terminates the operation of the simulation module after obtaining the simulation program and executing the simulation program to generate the simulation execution result corresponding to the augmentation instruction.
34. The method as claimed in claim 33, wherein the execution result of the simulation program is retained after the simulation module stops operating, so that when a subsequent instruction to be executed is determined to be an augmentation instruction and is converted into its corresponding simulation program, the retained execution result serves as reference information for executing the simulation program of the subsequent instruction to be executed.
35. The method as claimed in claim 29, wherein the simulation program is a program written in advance by a designer of the processor, using compatible instructions of the processor, for the operation indicated by the augmentation instruction belonging to the new instruction set or the augmentation instruction set.
36. The method of claim 29, wherein the compatible instructions that can be executed correctly on the processor and the augmentation instruction are both instructions under an x86 instruction set architecture, both instructions under an ARM instruction set architecture, both instructions under a MIPS instruction set architecture, or both instructions under a RISC-V instruction set architecture.
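As a non-limiting illustration of the behavior recited in the claims, the Python sketch below models a simulation module that looks up a simulation program, reports a failure result when none exists, stores the outcome in an execution result storage area, and runs with hardware interrupts disabled. The dictionary-based `cpu_state` and `result_area`, the `status` and `value` keys, and the function names are all hypothetical and are not taken from the disclosure.

```python
from contextlib import contextmanager


@contextmanager
def hardware_interrupts_disabled(cpu_state):
    """Mask interrupts for the duration of the block, then restore them."""
    previous = cpu_state["interrupts_enabled"]
    cpu_state["interrupts_enabled"] = False
    try:
        yield
    finally:
        cpu_state["interrupts_enabled"] = previous


def simulation_module(instruction, programs, cpu_state, result_area):
    """Look up and run the simulation program for one instruction.

    `instruction` is an (opcode, operands) pair passed in as a parameter.
    The outcome is written to `result_area`, which the caller reads after
    this function returns, that is, after the module has stopped operating.
    """
    opcode, operands = instruction
    with hardware_interrupts_disabled(cpu_state):
        program = programs.get(opcode)
        if program is None:
            # No simulation program found: record a failure result.
            result_area.update(status="failure", value=None)
        else:
            result_area.update(status="ok", value=program(*operands))
    return result_area
```

A caller would inspect `result_area["status"]` after the call and either consume the cached value or forward the failure result to the application that issued the instruction.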

Priority Applications (5)

Application Number Priority Date Filing Date Title
CN202011588921.6A CN114691200A (en) 2020-12-29 2020-12-29 Instruction simulation device and method thereof
US17/471,170 US11816487B2 (en) 2020-12-29 2021-09-10 Method of converting extended instructions based on an emulation flag and retirement of corresponding microinstructions, device and system using the same
US17/471,167 US11803381B2 (en) 2020-12-29 2021-09-10 Instruction simulation device and method thereof
US18/465,189 US20240004658A1 (en) 2020-12-29 2023-09-12 Instruction simulation device and method thereof
US18/474,207 US20240012649A1 (en) 2020-12-29 2023-09-25 Instruction conversion method, instruction conversion system, and processor

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011588921.6A CN114691200A (en) 2020-12-29 2020-12-29 Instruction simulation device and method thereof

Publications (1)

Publication Number Publication Date
CN114691200A true CN114691200A (en) 2022-07-01

Family

ID=82133272

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011588921.6A Pending CN114691200A (en) 2020-12-29 2020-12-29 Instruction simulation device and method thereof

Country Status (1)

Country Link
CN (1) CN114691200A (en)

Similar Documents

Publication Publication Date Title
US9003422B2 (en) Microprocessor architecture having extendible logic
KR101761498B1 (en) Method and apparatus for guest return address stack emulation supporting speculation
KR101213821B1 (en) Proactive computer malware protection through dynamic translation
US11914997B2 (en) Method and system for executing new instructions
US11669328B2 (en) Method and system for converting instructions
US11803383B2 (en) Method and system for executing new instructions
US11803387B2 (en) System for executing new instructions and method for executing new instructions
US11604643B2 (en) System for executing new instructions and method for executing new instructions
US7725879B2 (en) Method and apparatus for executing instructions of java virtual machine and transforming bytecode
JP2001519956A (en) A memory controller that detects the failure of thinking of the addressed component
US20070294675A1 (en) Method and apparatus for handling exceptions during binding to native code
JP2014194770A (en) Instruction emulation processors, methods, and systems
US11625247B2 (en) System for executing new instructions and method for executing new instructions
US20080301653A1 (en) Method and apparatus for increasing task-execution speed
JPH11327918A (en) Dynamic conversion system
US5784607A (en) Apparatus and method for exception handling during micro code string instructions
KR100864891B1 (en) Unhandled operation handling in multiple instruction set systems
US7219337B2 (en) Direct instructions rendering emulation computer technique
JP2001519955A (en) Translation memory protector for advanced processors
CN114691200A (en) Instruction simulation device and method thereof
JP3723019B2 (en) Apparatus and method for performing branch prediction of instruction equivalent to subroutine return
US20240004658A1 (en) Instruction simulation device and method thereof
JP2001519954A (en) A host microprocessor having a device for temporarily maintaining the state of a target processor
JP2710994B2 (en) Data processing device
CN114691199A (en) Instruction conversion device, instruction conversion method, instruction conversion system and processor

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 301, 2537 Jinke Road, Zhangjiang High Tech Park, Pudong New Area, Shanghai 201203

Applicant after: Shanghai Zhaoxin Semiconductor Co.,Ltd.

Address before: Room 301, 2537 Jinke Road, Zhangjiang High Tech Park, Pudong New Area, Shanghai 201203

Applicant before: VIA ALLIANCE SEMICONDUCTOR Co.,Ltd.