CN1328655C - Inhibitor and method for storing check - Google Patents

Publication number
CN1328655C
Authority
CN
China
Prior art keywords
instruction
extension
logic device
storage
microprocessor
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Lifetime
Application number
CNB031222803A
Other languages
Chinese (zh)
Other versions
CN1453702A (en
Inventor
G·葛兰·亨利
罗德·E·胡克
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
INTELLIGENCE FIRST CO
Original Assignee
INTELLIGENCE FIRST CO
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US10/283,397 (patent US7302551B2)
Application filed by INTELLIGENCE FIRST CO
Publication of CN1453702A
Application granted
Publication of CN1328655C
Anticipated expiration
Expired - Lifetime

Landscapes

  • Executing Machine-Instructions (AREA)

Abstract

An apparatus and method are provided for extending a microprocessor instruction set to allow for selective suppression of store checking at the instruction level. The apparatus includes fetch logic and translation logic. The fetch logic receives an extended instruction. The extended instruction has an extended prefix and an extended prefix tag. The extended prefix specifies that store checking be suppressed for the extended instruction. The extended prefix tag is an otherwise architecturally specified opcode within an existing instruction set. The fetch logic precludes store checking for pending store events associated with the extended instruction. The translation logic is coupled to the fetch logic. The translation logic translates the extended instruction into a micro instruction sequence that directs the microprocessor to exclude store checking during execution of a prescribed operation.

Description

Apparatus and method for suppressing store checking
Technical field
The present invention relates to the field of microelectronics, and more particularly to a technique for incorporating instruction-level selective suppression of store checking into an existing microprocessor instruction set architecture.
Background art
Since their introduction in the early 1970s, the use of microprocessors has grown exponentially. From their earliest applications in scientific and technical fields, microprocessors have moved from those specialized fields into commercial consumer products such as desktop and laptop computers, video game controllers, and many other common household and commercial devices.
Along with this explosive growth in use, the art has experienced a corresponding technological advancement characterized by ever-increasing demands for: faster speed, greater addressing capability, faster memory accesses, larger operands, more types of general-purpose operations (such as floating point, single-instruction multiple data (SIMD), and conditional moves), and additional special-purpose operations (such as digital signal processing functions and other multimedia operations). This has produced remarkable technical advances in the field, all of which have been applied to microprocessor design: extensive pipelining, superscalar architecture, cache structures, out-of-order processing, burst access mechanisms, branch prediction, and speculative execution. Put plainly, a present-day microprocessor is remarkably complex and capable compared with its predecessors of 30 years ago.
Unlike many other products, however, there is another very important factor that has constrained, and continues to constrain, the evolution of microprocessor architecture. This factor, legacy software compatibility, accounts in large part for the complexity of present-day microprocessors. For market-driven reasons, many manufacturers have chosen to incorporate new architectural features into updated microprocessor designs while retaining in those designs all of the capabilities required to remain compatible with older, so-called legacy, application programs.
Nowhere is this burden of legacy software compatibility more apparent than in the development history of x86-compatible microprocessors. It is well known that a present-day 32/16-bit virtual-mode x86 microprocessor can still execute 8-bit real-mode application programs written in the 1980s. Those skilled in the art will also acknowledge that a significant amount of architectural "baggage" is carried in the x86 architecture solely to support compatibility with legacy applications and operating modes. In the past, developers could add newly developed architectural features to an existing instruction set architecture, but the means by which those features are exercised, namely programmable instructions, have now become scarce. Put more simply, in certain important instruction sets there are no "spare" instructions left with which designers can incorporate newer features into the existing architecture.
In the x86 instruction set architecture, for example, no undefined one-byte opcode states remain unused. In the primary one-byte x86 opcode map, all 256 opcode states are occupied by existing instructions. As a result, x86 microprocessor designers must now choose between providing new features and preserving legacy software compatibility. If new programmable features are to be provided, opcode states must be assigned to them. If the existing instruction set architecture has no spare opcode states, then some existing opcode states must be redefined to make room for the new features. Legacy software compatibility is thus sacrificed in order to provide the new features.
There are features that programmers would like to see in a modern microprocessor but that, for the reasons above, have not been realizable until now. One such feature is instruction-level control over whether store checking is suppressed.
Because virtually all microprocessors employ a multi-stage pipeline architecture, it is possible (in fact, quite probable) that an instruction fetched into the pipeline is the target of a pending store operation, that is, a store operation that has proceeded to a later stage of the pipeline but has not yet completed. In other words, the data to be stored to the destination location has not yet been written to memory (external memory or internal cache). This can occur under many different circumstances. For example, the store instruction may only have proceeded to an early pipeline stage that does not write to memory. Alternatively, the data may have been placed in a store buffer that is waiting for an appropriate time to write it to memory, while the store instruction itself has been allowed to leave the pipeline. Those skilled in the art will appreciate that pipeline architectures pose various challenges for microprocessor designers, relating to the synchronization of sequential instruction execution when some of those instructions are in fact executed in parallel in pipelined fashion.
Store checking is an inherent feature of all pipelined microprocessors. It ensures that every instruction in the microprocessor pipeline is the instruction that the application programmer intended to execute. Various devices and mechanisms are provided in the pipeline to check every instruction entering the pipeline against pending store events that have not yet been written to memory and, when a store instruction executes, to check every instruction in earlier pipeline stages against the target address of that store instruction. If a pending store event is detected whose target address corresponds (generally at cache-line granularity) to the location of an instruction entering the pipeline, the pipeline is stalled and the store event is allowed to write to memory. While the pipeline is stalled, the instructions in each stage stop advancing until the stall is released. After the data has been written, the instruction entering the pipeline is fetched again from its original location and allowed to proceed through the pipeline. Likewise, when a store instruction executes, if an instruction is detected in an earlier pipeline stage whose location (that is, its instruction pointer (IP)) corresponds to the target address of the store instruction, synchronization logic in the microprocessor stalls the pipeline and flushes all pipeline stages preceding that earlier stage. After the store instruction has written its data, the pipeline is refilled.
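The two comparisons described above can be illustrated with a minimal Python sketch. It is only a software model of the idea, not the hardware of any particular microprocessor: the 32-byte line size, the `PendingStore` type, and the function names are all assumptions introduced for illustration.

```python
from dataclasses import dataclass

CACHE_LINE_BYTES = 32  # assumed line size; the text does not specify one


def cache_line(address: int) -> int:
    """Return the index of the cache line that contains the address."""
    return address // CACHE_LINE_BYTES


@dataclass
class PendingStore:
    """A store whose data has not yet been written to memory (hypothetical type)."""
    target_address: int


def fetch_must_stall(instruction_pointer: int, pending: list[PendingStore]) -> bool:
    """Fetch-side check: does any pending store target the incoming instruction's line?"""
    line = cache_line(instruction_pointer)
    return any(cache_line(store.target_address) == line for store in pending)


def conflicting_stages(store_target: int, stage_ips: list[int]) -> list[int]:
    """Store-side check: indices of earlier stages whose instruction lies in the stored-to line."""
    line = cache_line(store_target)
    return [i for i, ip in enumerate(stage_ips) if cache_line(ip) == line]
```

Because both checks compare cache-line indices rather than exact addresses, a store to data that merely shares a line with code triggers a conflict, which is the over-synchronization the later paragraphs complain about.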
Store checking is a very cumbersome operation, and the hardware it requires is directly proportional to the number of pipeline stages in the microprocessor. This is why store targets and instruction locations are usually compared only at cache-line granularity, as noted above. Moreover, because translating virtual addresses to physical addresses is inherently complex, store checking is generally performed using virtual addresses rather than physical addresses.
At present, a programmer cannot control the store checking feature of a microprocessor. If a programmer chooses to use self-modifying code techniques, the application program must ensure that subsequent instructions that are the targets of earlier store operations are executed as intended. This can be achieved at the source-code level, although such a programming technique is less than ideal. Microprocessors, however, do not execute source code. An automated compiler generates the instruction stream required by the microprocessor from the supplied program code. Because of the alignment properties of a particular compiler, the generated instruction stream may well contain program code and data interleaved within the same cache line. Consequently, even if the programmer provides means to ensure the coherency of self-modifying source code, unfavorable pipeline synchronization events may still arise from compilation of the program code.
For various performance reasons, a programmer may wish to issue a store operation that modifies the location of a subsequent instruction while still intending that the instruction execute with the contents present before the location is updated. Present-day store checking does not permit such an ordering of events.
Therefore, what is needed is an apparatus and method for incorporating a store-checking suppression feature into an existing microprocessor instruction set architecture whose opcode map is fully occupied by defined opcodes, such that a conforming microprocessor retains the ability to execute legacy application programs while application programmers and compilers gain the ability to control, for any particular instruction, whether store checking is performed.
Summary of the invention
The present invention is directed at the above and other problems and disadvantages of the prior art. It provides a superior technique for extending a microprocessor instruction set beyond its current capabilities to provide instruction-level suppression of store checking. In one embodiment, an apparatus for instruction-level control of store checking within a microprocessor is provided. The apparatus includes fetch logic and translation logic. The fetch logic receives an extended instruction. The extended instruction has an extended prefix and an extended prefix tag. The extended prefix specifies that store checking be suppressed for the extended instruction. The extended prefix tag is an otherwise architecturally specified opcode within an existing instruction set. The fetch logic precludes store checking for pending store events associated with the extended instruction. The translation logic is coupled to the fetch logic and translates the extended instruction into a micro instruction sequence that directs the microprocessor to exclude store checking during execution of a prescribed operation.
One object of the present invention is to provide a microprocessor mechanism that extends an existing instruction set to selectively suppress store checking within a microprocessor pipeline. The mechanism has an extended instruction and a translator. The extended instruction specifies that its associated store checking be suppressed. It comprises a selected opcode of the existing instruction set followed by an n-bit extended prefix. The selected opcode indicates the extended instruction, and the n-bit extended prefix indicates that store checking is to be suppressed. The translator receives the extended instruction and generates a micro instruction sequence directing the microprocessor to execute a prescribed operation and to exclude the associated store checking while the prescribed operation executes.
Another object of the present invention is to provide a module that adds an instruction-level store-checking suppression feature to an existing instruction set. The module includes an escape tag, a store check suppression specifier, translation logic, and extended execution logic. The escape tag is received by fetch logic and indicates that accompanying parts of a corresponding instruction prescribe an operation to be performed, where the escape tag is a first opcode within the existing instruction set. The store check suppression specifier is coupled to the escape tag, is one of the accompanying parts, and specifies that store checking be suppressed while the operation executes. The translation logic is coupled to the fetch logic; it generates a micro instruction sequence that directs the microprocessor to perform the operation and specifies within the micro instruction sequence that store checking be suppressed. The extended execution logic is coupled to the translation logic; it receives the micro instruction sequence and performs the operation without store checking. A further object of the present invention is to provide a method for extending an existing instruction set architecture to suppress store checking at the instruction level. The method includes providing an extended instruction that comprises an extended tag and an extended prefix, where the extended tag is a first opcode entity of the existing instruction set architecture; specifying, via the extended prefix, that store checking be suppressed while the extended instruction executes, where the remainder of the extended instruction prescribes the operation to be performed; and suppressing the store checking associated with the extended instruction.
In accordance with the above, the present invention provides an apparatus for instruction-level control of store checking within a microprocessor, comprising: fetch logic for receiving an extended instruction, where the extended instruction comprises an extended prefix, which specifies that store checking be suppressed for the extended instruction, and an extended prefix tag, which is an otherwise architecturally specified opcode within an existing instruction set, and where the fetch logic precludes store checking for pending store events associated with the extended instruction; and translation logic, coupled to the fetch logic, for translating the extended instruction into a micro instruction sequence that directs the microprocessor to exclude store checking during execution of a prescribed operation. The translation logic comprises: extended instruction detection logic for detecting the extended prefix tag; instruction translation logic for determining the prescribed operation and specifying it within the micro instruction sequence; and extended translation logic, coupled to the extended instruction detection logic and the instruction translation logic, for specifying within the micro instruction sequence that store checking be excluded while the prescribed operation executes.
The present invention also provides a microprocessor apparatus that extends an existing instruction set to selectively suppress store checking within a microprocessor pipeline, comprising a translator configured to receive an extended instruction and to generate a micro instruction sequence directing a microprocessor to execute a prescribed operation and to exclude the associated store checking while the prescribed operation executes. The extended instruction is configured to specify that store checking associated with the extended instruction be suppressed. It comprises a selected opcode of the existing instruction set followed by an n-bit extended prefix, where the selected opcode indicates the extended instruction and the n-bit extended prefix indicates that store checking is to be suppressed. The translator comprises: an extended instruction detector for detecting the selected opcode within the extended instruction; an instruction translator for translating the remainder of the extended instruction to determine the prescribed operation; and an extended prefix translator, coupled to the extended instruction detector and the instruction translator, for translating the n-bit extended prefix and specifying within the micro instruction sequence that store checking be suppressed, where n is a positive integer.
The present invention further provides a module that adds an instruction-level store-checking suppression feature to an existing instruction set, comprising: translation logic, coupled to fetch logic, for generating a micro instruction sequence that directs a microprocessor to perform an operation and for specifying within the micro instruction sequence that store checking be suppressed, where the fetch logic receives an escape tag indicating that accompanying parts of a corresponding instruction prescribe the operation to be performed, the escape tag is a first opcode within the existing instruction set and is coupled to a store check suppression specifier, and the store check suppression specifier is one of the accompanying parts and specifies that store checking be suppressed while the operation executes; and extended execution logic, coupled to the translation logic, for receiving the micro instruction sequence and performing the operation without store checking. The translation logic comprises: escape tag detection logic for detecting the escape tag and directing that the accompanying parts be translated according to extended translation rules; and decoding logic, coupled to the escape tag detection logic, for translating instructions according to the rules of the existing instruction set and for translating the corresponding instruction according to the extended translation rules so that the operation is performed without store checking.
The present invention further provides a method for extending an existing instruction set architecture to suppress store checking at the instruction level, the method comprising: receiving, with fetch logic, an extended instruction that comprises an extended tag and an extended prefix, where the extended tag is a first opcode entity of the existing instruction set architecture; specifying, via the extended prefix, that store checking be suppressed while the extended instruction executes, and precluding, by the fetch logic, store checking for pending store events associated with the extended instruction, where the remainder of the extended instruction prescribes the operation to be performed; and translating the extended instruction into a micro instruction sequence that directs the microprocessor to suppress the store checking associated with the extended instruction while a prescribed operation executes.
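The translation step of the method can be sketched roughly as follows in Python. This is an illustrative model only: the tag value `0xF1`, the choice of an 8-bit prefix, the bit position of the suppression flag, and the `MicroInstruction` type are hypothetical placeholders, since this section does not specify any of them.

```python
from dataclasses import dataclass

ESCAPE_TAG = 0xF1             # hypothetical placeholder for the selected existing opcode
SUPPRESS_STORE_CHECK = 0x01   # hypothetical bit within an assumed 8-bit extended prefix


@dataclass
class MicroInstruction:
    operation: bytes            # remainder of the instruction, left undecoded in this sketch
    suppress_store_check: bool  # tells execution logic to skip store checking


def translate(instruction: bytes) -> MicroInstruction:
    """Translate one instruction, honoring an optional tag + prefix pair."""
    if len(instruction) >= 3 and instruction[0] == ESCAPE_TAG:
        prefix = instruction[1]
        return MicroInstruction(
            operation=instruction[2:],
            suppress_store_check=bool(prefix & SUPPRESS_STORE_CHECK),
        )
    # Instructions without the tag are translated under the existing rules.
    return MicroInstruction(operation=instruction, suppress_store_check=False)
```

An instruction that lacks the tag is passed through unchanged with store checking left on, which mirrors the requirement that legacy programs keep running as before.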
The foregoing objects, features, and advantages of the present invention will be better understood in conjunction with the following description and the accompanying drawings:
Brief description of the drawings
Fig. 1 is a block diagram of a related-art microprocessor instruction format;
Fig. 2 is a table depicting how instructions in an instruction set architecture are mapped to the logic states of the bits of an 8-bit opcode byte within the instruction format of Fig. 1;
Fig. 3 is a block diagram of an extended instruction format according to the present invention;
Fig. 4 is a table showing how extended architectural features are mapped to the logic states of the bits of an 8-bit extended prefix embodiment according to the present invention;
Fig. 5 is a block diagram of a pipelined microprocessor that employs selective store-checking suppression control according to the present invention;
Fig. 6 is a block diagram of one embodiment of an extended prefix according to the present invention for specifying that store checking be excluded in a microprocessor;
Fig. 7 is a block diagram of the fetch stage logic within the microprocessor of Fig. 5;
Fig. 8 is a block diagram of the translate stage logic within the microprocessor of Fig. 5;
Fig. 9 is a block diagram of the execute stage logic within the microprocessor of Fig. 5; and
Fig. 10 is a flow chart of a method according to the present invention for suppressing the store checking associated with an instruction in a microprocessor.
Detailed description
In view of the above background discussion of the techniques used in present-day microprocessors to extend architectural features beyond the capabilities of their associated instruction sets, a related-art example will now be discussed with reference to Figs. 1 and 2. The discussion highlights the instruction set limitation that microprocessor designers continually face: on the one hand, they want to incorporate newly developed architectural features into microprocessor designs; on the other hand, they must retain the ability to execute legacy application programs. In the example of Figs. 1 and 2, a fully occupied opcode map rules out adding new opcodes to the example architecture. Designers are thus forced either to incorporate new features at the expense of some degree of legacy software compatibility, or to forgo architectural advances altogether in order to keep the microprocessor compatible with legacy application programs. Following the related-art discussion, the present invention is discussed with reference to Figs. 3 through 10. By using an existing but unused opcode as a prefix tag for an extended instruction, the present invention allows microprocessor designers to overcome the limitations of a fully occupied instruction set architecture. It enables programmers to selectively suppress store checking for a single instruction or a group of instructions while retaining all of the features required to execute legacy application programs.
Referring to Fig. 1, a block diagram of a related-art microprocessor instruction format 100 is shown. The related-art instruction 100 has a variable number of data entities 101-103, each set to a specific value, that together make up a specific instruction 100 for a microprocessor. The specific instruction 100 directs the microprocessor to perform a specific operation, such as adding two operands, moving an operand from memory to an internal register, or moving an operand from an internal register to memory. In general, an opcode entity 102 within the instruction 100 prescribes the specific operation to be performed, and optional address specifier entities 103 follow the opcode 102 to prescribe additional information about the operation, such as how it is to be performed and where the operands are located. The instruction format 100 also allows a programmer to place prefix entities 101 before an opcode 102. The prefixes 101 direct whether particular architectural features are applied when the operation prescribed by the opcode 102 is performed. In general, these architectural features can be applied to most of the operations prescribed by any opcode 102 in the instruction set. For example, prefixes 101 exist in a number of present-day microprocessors that can perform operations on operands of different sizes (e.g., 8-bit, 16-bit, 32-bit). While many such processors are programmed to a default operand size (say, 32-bit), prefixes 101 are provided in their instruction sets that allow a programmer to selectively override the default operand size on an instruction-by-instruction basis (e.g., to produce a 16-bit operand). Selectable operand size is only one example of an architectural feature that, in many present-day microprocessors, can be applied to numerous operations prescribed by opcodes 102 (e.g., add, subtract, multiply, Boolean logic).
One well-known instance of the instruction format 100 shown in Fig. 1 is the x86 instruction format 100, which is employed by all present-day x86-compatible microprocessors. More specifically, the x86 instruction format 100 (also known as the x86 instruction set architecture 100) uses 8-bit prefixes 101, 8-bit opcodes 102, and 8-bit address specifiers 103. The x86 architecture 100 has several prefixes 101: two override the default address/data sizes of an x86 microprocessor (opcode states 66H and 67H); another directs the microprocessor to interpret the following opcode byte 102 according to different translation rules (prefix value 0FH, which causes translation to be performed according to the so-called 2-byte opcode rules); and others cause particular operations to be repeated until a repetition condition is satisfied (the REP opcodes: F0H, F2H, and F3H).
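The size-override prefixes can be illustrated with a short Python sketch. It models only the 66H (operand size) and 67H (address size) prefixes in a processor whose default size is 16 or 32 bits; the function name and return shape are assumptions made for illustration, and no other prefix is handled.

```python
OPERAND_SIZE_OVERRIDE = 0x66
ADDRESS_SIZE_OVERRIDE = 0x67


def effective_sizes(instruction: bytes, default_bits: int = 32) -> tuple[int, int, int]:
    """Return (operand_bits, address_bits, opcode_offset) after scanning size prefixes."""
    other_bits = 16 if default_bits == 32 else 32
    operand_bits = address_bits = default_bits
    offset = 0
    # Consume leading size-override prefixes; the opcode starts where they end.
    while offset < len(instruction) and instruction[offset] in (
        OPERAND_SIZE_OVERRIDE,
        ADDRESS_SIZE_OVERRIDE,
    ):
        if instruction[offset] == OPERAND_SIZE_OVERRIDE:
            operand_bits = other_bits
        else:
            address_bits = other_bits
        offset += 1
    return operand_bits, address_bits, offset
```

The override applies only to the instruction that carries the prefix, which is what lets a programmer mix operand sizes on an instruction-by-instruction basis.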
Referring now to Fig. 2, a table 200 is shown depicting how instructions 201 of an instruction set architecture are mapped to values of the 8-bit opcode byte 102 in the instruction format of Fig. 1. The table 200 presents an exemplary 8-bit opcode map 200 that associates up to 256 values of an 8-bit opcode entity 102 with corresponding microprocessor opcode instructions 201. The table 200 maps a particular value of the opcode entity 102, say 02H, to a corresponding opcode instruction 201 (i.e., instruction I02 201). In the x86 opcode map, for example, it is well known in the art that opcode value 14H maps to the x86 Add With Carry (ADC) instruction, which adds an 8-bit immediate operand to the contents of architectural register AL. Those skilled in the art will also appreciate that the x86 prefixes 101 mentioned above (i.e., 66H, 67H, 0FH, F0H, F2H, and F3H) are actual opcode values 201 that, in different contexts, specify that particular architectural extensions be applied to the operation prescribed by the following opcode entity 102. For example, preceding opcode 14H (normally the ADC opcode discussed above) with prefix 0FH causes an x86 processor to perform an Unpack and Interleave Low Packed Single-Precision Floating-Point Values operation instead of the ADC operation. Features such as those described in this x86 example are possible in present-day microprocessors in part because the instruction translation logic in a microprocessor interprets the entities 101-103 of an instruction 100 in order. In the past, therefore, using specific opcode values as prefixes 101 in an instruction set architecture allowed microprocessor designers to incorporate many advanced architectural features into legacy-compatible microprocessor designs without adversely affecting the execution of legacy programs that do not use those specific opcode states. For example, a legacy program that never uses x86 opcode 0FH still runs on a present-day x86 microprocessor, while a newer application program, by using x86 opcode 0FH as a prefix 101, can take advantage of many more recently added x86 architectural features, such as single-instruction multiple data (SIMD) operations and conditional move operations.
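The 14H example above amounts to a two-level table lookup, sketched below in Python. The maps contain only the single entry discussed in the text; the dictionary and function names are assumptions for illustration and do not represent a full x86 decoder.

```python
TWO_BYTE_ESCAPE = 0x0F

# Only the entries discussed in the text are filled in.
ONE_BYTE_MAP = {0x14: "ADC AL, imm8"}
TWO_BYTE_MAP = {0x14: "UNPCKLPS"}


def decode_mnemonic(instruction: bytes) -> str:
    """Pick the opcode map based on whether the 0FH escape precedes the opcode byte."""
    if len(instruction) >= 2 and instruction[0] == TWO_BYTE_ESCAPE:
        return TWO_BYTE_MAP.get(instruction[1], "unknown")
    return ONE_BYTE_MAP.get(instruction[0], "unknown")
```

The same byte value 14H selects two unrelated operations depending on the byte in front of it, which works only because the translation logic reads the instruction's entities in order.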
Although architectural features have been provided in the past by designating available (i.e., spare or unassigned) opcode values 201 as prefixes 101 (also known as architectural feature tags/indicators 101 or escape instructions 101), many instruction set architectures 100 can no longer offer functional enhancements for a very straightforward reason: all of the available/spare opcode values have been used up. That is, every opcode value in the opcode map 200 has been architecturally specified. When all of the available values have been assigned as either opcode entities 102 or prefix entities 101, no values remain to serve as a means of incorporating new features. This significant problem exists in many present-day microprocessor architectures and consequently forces designers to choose between adding architectural features and retaining compatibility with legacy programs.
It is noted that the instructions 201 shown in Fig. 2 are depicted generically (i.e., I24, I86) rather than as specific operations (such as Add With Carry, Subtract, or Exclusive-OR). This is because a fully occupied opcode map 200 precludes, in a number of different microprocessor architectures, the incorporation of more recent developments. Although the example of Fig. 2 refers to an 8-bit opcode entity 102, one skilled in the art will appreciate that the specific size of the opcode 102 is immaterial to the problem itself, except as a particular case for discussing the problem posed by a fully occupied opcode structure 200. Accordingly, a fully occupied 6-bit opcode map would have 64 architecturally specified opcodes/prefixes 201 and would likewise provide no available/spare opcode values for expansion.
An alternative approach, rather than discarding the original instruction set entirely and replacing the opcode map 200 with a new format 100, is to substitute new instruction meanings for only a subset of the existing opcodes 201, such as opcodes 40H through 4FH of Fig. 2. Under this hybrid technique, the microprocessor operates exclusively in one of two modes: a legacy mode, in which opcodes 40H-4FH are interpreted according to legacy rules, or an enhanced mode, in which opcodes 40H-4FH are interpreted according to enhanced architectural rules. This technique allows designers to incorporate new features into a design. A disadvantage remains, however: while a legacy-compliant microprocessor is operating in the enhanced mode, it cannot execute any application program that uses opcodes 40H-4FH. Hence, from the standpoint of retaining legacy software compatibility, the legacy/enhanced mode technique is not an optimal choice.
Yet, the instruction set 200 that takies fully for opcode space, and the situation of all executive utilities on the microprocessor that meets old specification is contained in this space, this case inventor has noticed the wherein behaviour in service of operational code 201, though some instruction 202 is structured appointments, be not used for the application program that to be carried out by microprocessor.The described instruction of Fig. 2 IF1 202 i.e. an example of phenomenon for this reason, and in fact, identical operations code value 202 (being F1H) is to map to an effective instruction 202 that is not used for the x86 instruction set architecture.Though this untapped x86 instruction 202 is effective x86 instructions 202, its indication will be carried out the computing of a structured appointment on the x86 microprocessor, and it is not used in any application program that can carry out on modern x86 microprocessor.This special x86 instruction 202 is called as in-circuit emulation breakpoint (In Circuit Emulation Breakpoint) (be ICE BKPT, opcode value is F1H), all is to be used in specially in a kind of non-existent now microprocessor emulator before.ICEBKPT 202 never is used for the application program outside the in-circuit emulator (ICE), and has before used the in-circuit emulation equipment of ICEBKPT 202 not exist.Therefore, under the situation of x86, this case inventor has found the same instrument in an instruction set architecture that takies fully 200, by utilizing an effective but untapped operational code 202, in the design of microprocessor, include advanced architectural feature in permission, and need not sacrifice the compatibility of old software.Therefore in an instruction set architecture that takies fully 200, the present invention utilizes a structured appointment but untapped operational code 202, as an indicator marker, points out a n locative preposition sign indicating number thereafter, allows the microprocessor Design person 
can be with maximum 2 nThe architectural feature of individual recent development is included in the design of microprocessor, keep simultaneously with all old softwares complete, compatibility.
The present invention employs the prefix tag/extended prefix concept by providing an n-bit extended store checking suppression specifier prefix, thereby allowing a programmer to specify that the store checking corresponding to an extended instruction be suppressed throughout its progress from fetch to completion of execution. An alternative embodiment of the present invention excludes the execution of the extended instruction, together with a specified number of following instructions, from the store checking mechanism of the microprocessor. The present invention will now be discussed with reference to Figs. 3 through 10.
Referring now to Fig. 3, a block diagram is presented of an extended instruction format 300 according to the present invention. Very much like the format 100 discussed with reference to Fig. 1, the extended instruction format 300 has a variable number of instruction entities 301-305, each set to a specified value, that together make up a specific instruction 300 for a microprocessor. The specific instruction 300 directs the microprocessor to perform a specific operation, such as adding two operands together or moving an operand from memory to a register within the microprocessor. In general, the opcode entity 302 of the instruction 300 prescribes the specific operation to be performed, and optional address specifier entities 303 follow the opcode 302 to prescribe additional information pertaining to that operation, such as how the operation is to be performed, the registers where operands are located, and direct and indirect data used to compute the memory addresses of source/result operands. The instruction format 300 also allows a programmer to precede an opcode 302 with prefix entities 301. The prefix entities 301 direct whether existing architectural features are to be applied when the specific operation prescribed by the opcode 302 is performed.
The extended instruction 300 according to the present invention, however, is a superset of the instruction format 100 described above with reference to Fig. 1. It has two additional entities 304 and 305 that can optionally be provided as an instruction extension and that precede all of the remaining entities 301-303 of a formatted extended instruction 300. These two additional entities 304 and 305 form part of the extended instruction 300 and allow a programmer to specify whether store checking for the extended instruction 300 is to be suppressed or excluded. The optional entities 304 and 305 are an extended instruction tag 304 and an extended store checking suppression prefix 305. The extended instruction tag 304 is an otherwise architecturally specified opcode within the microprocessor instruction set. In an x86 embodiment, the extended instruction tag 304, or extended tag 304, is opcode state F1H, the formerly used ICE BKPT instruction. The extended tag 304 indicates to the microprocessor logic that the extended prefix 305, or extended features specifier 305, follows, where the extended prefix 305 specifies that store checking for the extended instruction 300 be suppressed. In one embodiment, the extended tag 304 indicates that the accompanying parts 301-303 and 305 of the corresponding extended instruction 300 prescribe an operation to be performed by the microprocessor. The store checking suppression specifier 305, or extended prefix 305, specifies that store checking for the extended instruction 300 be excluded when the operation is performed. Extended execution logic within the microprocessor performs the operation, but does so to the exclusion of any store checking.
The technique of selectively suppressing store checking according to the present invention is summarized here. An extended instruction is configured to prescribe an operation to be performed according to an existing microprocessor instruction set, where store checking is excluded when the extended instruction is executed. The extended instruction includes one of the opcodes/instructions 304 of the existing instruction set together with an n-bit extended prefix 305. The selected opcode/instruction serves as an indicator 304 that the instruction 300 is an extended features instruction 300 (i.e., that it prescribes an extension to the microprocessor architecture), and the n-bit features prefix 305 indicates the store checking that is to be suppressed. In one embodiment, the extended prefix 305 is eight bits in size and can specify that store checking be suppressed for an instruction and up to 255 following instructions, or that store checking be suppressed for the instruction and n following instructions, with the remaining values of the 8-bit extended prefix 305 specifying other extended features. An n-bit prefix embodiment can specify up to 2^n store checking suppressions, or various combinations of the aforementioned store checking suppression and other extended features.
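As a rough illustration of the format summarized above, the following sketch builds and parses an extended instruction as a byte string. The tag value F1H is taken from the x86 embodiment in the text; the function names are hypothetical, and the interpretation of the 8-bit prefix as a plain count of following instructions (0 to 255) is an assumption made only for this example.

```python
EXT_TAG = 0xF1  # ICE BKPT opcode, reused as the extended tag in the x86 embodiment


def make_extended(legacy: bytes, prefix: int) -> bytes:
    """Place the extended tag and an 8-bit extended prefix ahead of a legacy instruction."""
    if not 0 <= prefix <= 0xFF:
        raise ValueError("the extended prefix must fit in eight bits")
    return bytes([EXT_TAG, prefix]) + legacy


def split_extended(instr: bytes):
    """Return (prefix, legacy bytes); prefix is None when no extended tag is present."""
    if len(instr) >= 2 and instr[0] == EXT_TAG:
        return instr[1], instr[2:]
    return None, instr
```

For example, `make_extended(bytes([0x14, 0x05]), 3)` yields the bytes F1H, 03H, 14H, 05H, which under the assumed encoding would exempt the ADC instruction and the three instructions after it from store checking.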
Referring now to Fig. 4, a table 400 is presented showing how store checking suppression specifications are mapped to the logic states of bits within an 8-bit extended prefix embodiment according to the present invention. Similar to the opcode map 200 discussed with reference to Fig. 2, the table 400 of Fig. 4 presents an exemplary 8-bit extended prefix map 400 that associates up to 256 values of an 8-bit extended prefix entity 305 with the store checking suppressions 401 (e.g., E34, E4D) corresponding to instructions in a legacy-compliant microprocessor. In a particular x86 embodiment, the 8-bit extended features prefix 305 according to the present invention provides instruction-level control of the store checking suppressions 401 (i.e., E00-EFF), which are not otherwise provided for by the existing x86 instruction set architecture.
The extended features 401 shown in Fig. 4 are depicted generically rather than as specific features because the technique according to the present invention is applicable to a variety of architectural extensions 401 and specific instruction set architectures. One skilled in the art will appreciate that many different architectural features 401, some of which are noted above, can be incorporated into an existing instruction set according to the extended tag 304/extended prefix 305 technique described herein. The 8-bit prefix embodiment of Fig. 4 provides for up to 256 different features 401, whereas an n-bit prefix embodiment allows up to 2^n different features 401 to be programmed.
Referring now to Fig. 5, a block diagram is presented illustrating a pipeline microprocessor 500 for performing selective store checking suppression operations according to the present invention. The microprocessor 500 has three notable stage categories: fetch, translate, and execute. The fetch stage has fetch logic 501 that retrieves instructions from external memory 503. The fetched instructions are provided to extended prefetch logic 502. The extended prefetch logic 502 performs store checking on the incoming instructions and is configured to detect a store checking suppression sequence consisting of the extended tag and extended prefix described above with reference to Figs. 3 and 4. Instructions that have undergone store checking are synchronized in the manner described above, buffered in an instruction cache 504, and provided to an instruction queue 505 for access by translation logic 506. The translation logic 506 is coupled to a micro instruction queue 508 and includes extended translation logic 507. The execute stage has execution logic 509, which contains extended execution logic 510.
In operation according to the present invention, the fetch logic 501 retrieves formatted instructions from external memory 503 and provides them to the extended prefetch logic 502. The extended prefetch logic 502 performs store checking and initiates synchronization when an incoming instruction is affected by a pending store event in a later pipeline stage. If an extended instruction according to the present invention is detected, the extended prefetch logic 502 allows the extended instruction to proceed to the instruction cache 504 and instruction queue 505 without store checking. The fetched instructions are provided to the instruction queue 505 in execution order. The instructions are then retrieved from the instruction queue 505 and provided to the translation logic 506. The translation logic 506 translates each provided instruction into a corresponding sequence of micro instructions that directs the microprocessor 500 to perform the operation prescribed by that instruction. According to the present invention, the extended translation logic 507 detects instructions having an extended prefix tag and translates the corresponding extended store checking suppression prefix. In an x86 embodiment, the extended translation logic 507 is configured to detect an extended prefix tag of value F1H, which is the x86 ICE BKPT opcode. An extended micro instruction field is provided in the micro instruction queue 508 to specify that store checking be suppressed for the operation prescribed by the accompanying parts of the instruction. In alternative embodiments, the extended translation logic 507 specifies, in the extended micro instruction field, that store checking be suppressed for a first instruction according to the present invention and for a number of following instructions.
Micro instructions are provided from the micro instruction queue 508 to the execution logic 509, in which the extended execution logic 510 is configured to perform the specific operations prescribed by the micro instructions and to check the target addresses of pending store operations against the instruction pointer (IP) locations of all instructions in the preceding pipeline stages. If the target address of a pending store operation matches the IP location of an instruction in a preceding stage, and the extended micro instruction field of that instruction does not specify that store checking be suppressed, then the extended execution logic 510 flushes the pipeline up to that preceding stage and allows the pending store operation to write its data. After the store event has completed, the pipeline is refilled. If, however, the extended micro instruction field of the instruction in the preceding stage specifies that store checking be suppressed, then the extended execution logic 510 precludes the pipeline flush. A tagged instruction is thus allowed to proceed through the pipeline and is not flushed by the synchronizing flush and refill that a store event in a later stage would otherwise cause.
One skilled in the art will appreciate that the microprocessor 500 shown in Fig. 5 is a simplified representation of a modern pipeline microprocessor 500. In fact, a modern pipeline microprocessor 500 can comprise upwards of 20 to 30 different pipeline stages. These stages can nevertheless be generally categorized into the three stage groups shown in the block diagram, and thus the block diagram 500 of Fig. 5 serves to identify the essential elements required to implement the embodiments of the present invention described above. For clarity, extraneous elements of the microprocessor 500 are neither shown nor discussed.
Referring now to Fig. 6, a block diagram is presented of an exemplary embodiment of an extended prefix 600 for specifying, in a microprocessor according to the present invention, that the store checking associated with an extended instruction be suppressed. The store checking suppression extended prefix 600 is eight bits in size and includes a suppress field 601. In one embodiment, the suppress field 601 specifies that the store checking associated with the extended instruction be excluded. An alternative embodiment includes a suppress field that can specify that store checking be excluded for the extended instruction and up to 255 following instructions. The number of instructions for which checking is suppressed is indicated by the suppress field.
Referring now to Fig. 7, a block diagram is presented of fetch stage logic 700 within the microprocessor of Fig. 5. The fetch stage logic 700 includes a prefetch buffer 704 that is coupled to external memory 705 and that provides prefetched instructions to extended prefetch logic 706. The extended prefetch logic 706 has a fetch controller 709 that is coupled to a suppress sequence detector 707 via a disable signal 708. The fetch controller 709 is also coupled to a machine specific register 702 having an extended features field 703. The suppress sequence detector 707 provides a control signal SUPP to pending store evaluation logic 710. The pending store evaluation logic 710 accesses a plurality of registers 711 that contain the target addresses of pending store events. In one embodiment, these registers 711 are referred to as the lower linear instruction pointer (LIP) chain. The registers 711 are updated via bus 715 with target addresses provided by the store buffers in the pipeline (e.g., write combine buffers, write back buffers). The pending store evaluation logic 710 is coupled to pipeline synchronization logic 712 via an SMC HIT signal. The pipeline synchronization logic 712 provides a control signal STALL 713 to pipeline control logic (not shown). The extended prefetch logic 706 provides fetched instructions to a fill buffer 714, which in turn is coupled to an instruction cache 716.
In operation, as cache lines are fetched from memory 705, they are provided to the prefetch buffer 704. The extended prefetch logic 706 retrieves the contents of a cache line and checks the IP addresses of the incoming instructions against the target addresses of pending store operations held in the lower LIP chain registers 711. If the pending store evaluation logic 710 determines that the IP address of a prefetched instruction matches a pending store target within the registers 711, the SMC HIT signal is asserted, which causes the pipeline synchronization logic 712 to assert the STALL signal 713 and initiate a pipeline synchronization event. The pipeline is accordingly stalled until the pending store operations have written their data to memory, after which the prefetched instruction is fetched again and allowed to proceed into the pipeline. If the IP address of the prefetched instruction matches no store target, the extended prefetch logic 706 provides the prefetched instruction to the fill buffer 714 and ultimately to the instruction cache 716.
The suppress sequence detector 707 also evaluates the cache line contents from the prefetch buffer 704 to detect whether they contain an extended tag/extended prefix sequence directing that store checking be suppressed. If such a sequence is detected, the SUPP signal is asserted, directing the pending store evaluation logic 710 not to evaluate the lower LIP chain registers 711 and allowing the corresponding extended instruction to proceed to the fill buffer 714.
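The fetch stage decision described above can be summarized in a short behavioral sketch. This is a simplified model, not the claimed circuit: the function name `fetch_check` and its return values are hypothetical, and an exact address match stands in for whatever comparison granularity the lower LIP chain actually uses.

```python
def fetch_check(ip: int, pending_store_targets: set, supp: bool) -> str:
    """Model the outcome for one prefetched instruction.

    Returns "stall" when a pipeline synchronization event would begin,
    or "fill" when the instruction proceeds to the fill buffer.
    """
    if supp:
        # SUPP asserted: the lower LIP chain is not evaluated at all.
        return "fill"
    if ip in pending_store_targets:
        # SMC HIT: the instruction address matches a pending store target.
        return "stall"
    return "fill"
```

The ordering of the two tests reflects the text: when the suppress sequence is detected, the comparison against pending store targets is skipped rather than performed and ignored.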
During power-up of the microprocessor, the state of the extended features field 703 within the machine specific register 702 is established via a signal power-up state to indicate whether the particular microprocessor is capable of processing extended instructions according to the present invention for suppressing store checking in the microprocessor. In one embodiment, the signal 701 is derived from a feature control register (not shown) that reads a fuse array (not shown) configured during fabrication. The machine specific register 702 provides the state of the extended features field 703 to the fetch controller 709 and to the other logic described below. The fetch controller 709 determines whether cache line data obtained from the prefetch buffer 704 is to be evaluated to detect store checking suppression sequences. Providing such a control feature allows supervisory applications (e.g., the BIOS) to enable or disable the extended execution features of the microprocessor. If the extended features are disabled, instructions having the opcode state selected as the extended features tag are checked against pending store events like any other instruction. The fetch controller 709 directs the suppress sequence detector 707 to disable detection of extended sequences by asserting the disable signal 708.
Referring now to Fig. 8, a block diagram is presented of translate stage logic 800 within the microprocessor of Fig. 5. The translate stage logic 800 has an instruction buffer 804 that, according to the present invention, provides extended instructions to translation logic 805. The translation logic 805 is coupled to a machine specific register 802 having an extended features field 803, as described above with reference to Fig. 7. The translation logic 805 has a translation controller 806 that provides a disable signal 807 to an escape instruction detector 808 and to an extended prefix translator 809. The escape instruction detector 808 is coupled to the extended prefix translator 809 and to an instruction translator 810. The extended prefix translator 809 and the instruction translator 810 access a control read-only memory (ROM) 811, in which template micro instruction sequences corresponding to certain extended instructions are stored. The translation logic 805 also includes a micro instruction buffer 812 having an opcode extension field 813, a micro opcode field 814, a destination field 815, a source field 816, and a displacement field 817.
In operation, during power-up of the microprocessor, the state of the extended features field 803 within the machine specific register 802 is established via a signal power-up state 801 to indicate whether the particular microprocessor is capable of translating and executing extended instructions according to the present invention, as described above with reference to Fig. 7. The machine specific register 802 provides the state of the extended features field 803 to the translation controller 806. The translation controller 806 determines whether instructions obtained from the instruction buffer 804 are to be translated according to extended translation rules or according to conventional translation rules. If the extended features are disabled, instructions having the opcode state selected as the extended features tag are translated according to the conventional translation rules. In a particular x86 embodiment in which opcode state F1H is selected as the tag, encountering F1H under the conventional translation rules causes an illegal instruction exception. If extended translation is disabled, the instruction translator 810 translates all provided instructions and configures all of the fields 813-817 of the micro instruction buffer 812. Under the extended translation rules, however, an occurrence of the tag is detected by the escape instruction detector 808. The escape instruction detector 808 directs the extended prefix translator 809 to translate the extended prefix portion of the extended instruction according to the extended translation rules and to configure the opcode extension field 813 so as to direct that store checking be suppressed for the micro instruction sequence corresponding to the extended instruction. The instruction translator 810 translates the remaining parts of the extended instruction and configures the micro opcode field 814, source field 816, destination field 815, and displacement field 817 of the micro instruction buffer 812. Certain instructions cause an access to the control ROM 811 to obtain the corresponding micro instruction sequence templates. Configured micro instructions are provided to a micro instruction queue (not shown) for subsequent execution by the processor.
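The choice between extended and conventional translation rules can be sketched as follows. This is an illustrative model under stated assumptions: the `MicroInstruction` type carries only two of the five fields named in the text, the micro opcode is simply the legacy opcode byte, any extended prefix value is treated as "suppress", and all names are hypothetical.

```python
from dataclasses import dataclass

EXT_TAG = 0xF1  # tag value of the x86 embodiment (ICE BKPT)


class IllegalInstruction(Exception):
    """Stands in for the illegal instruction exception under conventional rules."""


@dataclass
class MicroInstruction:
    suppress_store_check: bool  # models the opcode extension field 813
    micro_opcode: int           # models the micro opcode field 814


def translate(instr: bytes, extended_enabled: bool) -> MicroInstruction:
    """Translate one instruction; extended_enabled models the extended features field."""
    if instr[0] == EXT_TAG:
        if not extended_enabled:
            # Conventional rules: F1H is not a usable application opcode here.
            raise IllegalInstruction("F1H encountered with extended features disabled")
        # instr[1] is the extended prefix; the remaining bytes are the legacy parts.
        return MicroInstruction(suppress_store_check=True, micro_opcode=instr[2])
    return MicroInstruction(suppress_store_check=False, micro_opcode=instr[0])
```

Keeping the suppression indication in a separate field of the micro instruction, rather than in the micro opcode itself, mirrors the division of labor between the extended prefix translator and the instruction translator described above.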
Referring now to Fig. 9, a block diagram is presented of execute stage logic 900 within the microprocessor of Fig. 5. The execute stage logic 900 has extended store logic 908 that is coupled to a data cache 911 and to a bus unit 912. The bus unit 912 directs memory transactions over a memory bus (not shown). According to the present invention, the extended store logic 908 receives micro instructions from an extended micro instruction buffer 901 in a preceding stage of the microprocessor, a data operand from a data buffer 902, and a target address operand from an address buffer 903. The extended store logic 908 includes store checking logic 909 that is coupled to a plurality of linear IP registers 905, to pipeline synchronization logic 914 (via an IP HIT signal), and to a plurality of store buffers 910. The linear IP registers 905 are also referred to as the upper LIP chain, and each register 905 has an IP field 906 and a store checking suppression field 907. The contents of the upper LIP chain 905 comprise the virtual addresses of the instructions in the preceding pipeline stages, and these contents are provided to the upper LIP chain 905 in order from the preceding pipeline stages via bus 904.
In operation, the extended store logic 908 writes operands to the cache 911 or, through the bus unit 912, to external memory, as directed by the micro instructions in the extended micro instruction buffer 901. For a write/store operation directed by an extended micro instruction, the store checking logic 909 receives the target address information required for the operation from the address buffer 903 and the operand to be stored from the buffer 902. The store checking logic 909 then provides the address and data to the store buffers 910 and, at the same time, evaluates the contents of the upper LIP chain 905 to determine whether there is an instruction in the pipeline whose virtual IP address matches the target address of the pending store event. If a matching virtual IP address is found in the upper LIP chain 905, the store checking logic 909 examines the associated store checking suppression field 907. If the contents of the associated store checking suppression field 907 indicate that store checking is to be suppressed, the store checking logic 909 allows the pipeline to continue operating without interruption. The contents of the store buffers 910 are then written to the cache 911 or, through the bus unit 912, to external memory, in accordance with the memory traits assigned to the store event, where the memory traits are prescribed by processor-specific architectural conventions. If, however, the store checking logic 909 determines that store checking is not suppressed for a matching virtual address, the IP HIT signal is asserted to notify the pipeline synchronization logic 914 to initiate a pipeline flush/refill event, which proceeds up to the pipeline stage in which the matching virtual IP address was detected. The pipeline synchronization logic 914 accordingly initiates the pipeline synchronization event via the FLUSH signal 915. As instructions are processed, the extended micro instructions are provided to a micro instruction register 913 in synchronization with a pipeline clock (not shown).
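The execute stage check against the upper LIP chain can be modeled in a few lines. This is a behavioral sketch only: the chain is represented as a list of (virtual IP, suppress flag) pairs standing in for fields 906 and 907, an exact address match is assumed, and the function name and return convention are hypothetical.

```python
def store_check(target: int, upper_lip_chain: list):
    """Return the index of the stage to flush up to, or None if no flush is needed.

    Each entry of upper_lip_chain is a (virtual_ip, suppress) pair for an
    instruction in a preceding pipeline stage.
    """
    for stage, (virtual_ip, suppress) in enumerate(upper_lip_chain):
        if virtual_ip == target and not suppress:
            # IP HIT: a flush/refill event proceeds up to this stage.
            return stage
    # No match, or every match has store checking suppressed.
    return None
```

With a chain of `[(0x1000, False), (0x1004, True)]`, a store to 0x1004 returns `None` because the matching instruction carries the suppression flag, while a store to 0x1000 returns stage 0.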
The significant technical features of the present invention, as described with reference to Figs. 3 through 9, are summarized here. By employing an architecturally specified yet effectively unused opcode as a tag, the present invention provides a programmable tag-prefix combination within an extended instruction. In one embodiment, the prefix directs a legacy-compliant microprocessor to suppress store checking for the extended instruction only. In an alternative embodiment, the prefix directs the legacy-compliant microprocessor to suppress store checking for the extended instruction and for a number of following instructions. When an extended instruction is fetched, extended prefetch logic according to the present invention detects whether a tag-prefix combination is present and allows the extended instruction to proceed to the instruction cache of the processor without performing any store checking and without comparison against pending store events. Extended translation logic then directs, by way of an opcode extension field within an extended micro instruction according to the present invention, that store checking be suppressed for the micro instruction sequence corresponding to the extended instruction. The contents of the opcode extension field are reflected in the store checking suppression fields of the upper LIP chain, so that when the extended store logic processes a store event, no synchronization event is initiated for an extended instruction that specifies that store checking be suppressed.
The present invention accordingly gives programmers and automated code compilation devices a mechanism for directing a legacy-compliant microprocessor to suppress store checking for a single instruction or for a block of instructions. This overcomes the pipeline synchronization flushes caused by interleaving code and data within the same cache line, and it provides a more flexible means of implementing algorithms that contain self-modifying code.
Referring now to Fig. 10, flow chart 1000 depicts a method according to the present invention for fetching, translating, and executing instructions that enable a programmer to override the store checking process of a microprocessor at the instruction level. Flow begins at block 1002, where a program configured with extended feature instructions is provided to the microprocessor. Flow then proceeds to block 1004.
At block 1004, the next instruction is fetched for entry into the pipeline of the microprocessor. Flow then proceeds to decision block 1008.
At decision block 1008, the instruction fetched in block 1004 is evaluated to determine whether it contains an extended prefix tag followed by an extended prefix. In an x86 embodiment, the evaluation detects opcode value F1 (ICE BKPT) followed by a store check suppression specifier. If the tag and the specifier that follows it are detected, flow proceeds to block 1010. If they are not detected, flow proceeds to block 1006.
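The evaluation at decision block 1008 can be sketched as follows. This is a minimal illustration under stated assumptions: the tag is taken to be the first byte of the instruction, the prefix is taken to be a single byte, and bit 0 of that byte is taken to be the suppression request. Only the tag value F1 comes from the text; `SUPPRESS_BIT` and `detect_extended_prefix` are hypothetical names.

```python
ESCAPE_TAG = 0xF1    # x86 ICE BKPT opcode, named in the text as the tag
SUPPRESS_BIT = 0x01  # assumption: bit 0 of the prefix byte requests suppression


def detect_extended_prefix(instr: bytes):
    """Return (suppress, rest): the suppression flag and the remaining bytes."""
    if len(instr) >= 2 and instr[0] == ESCAPE_TAG:
        prefix = instr[1]
        # Strip the tag and prefix; the rest is an ordinary instruction.
        return bool(prefix & SUPPRESS_BIT), instr[2:]
    # No tag: the instruction is handled by the existing rules unchanged.
    return False, instr


suppress, rest = detect_extended_prefix(bytes([0xF1, 0x01, 0x89, 0xD8]))
print(suppress, rest.hex())
```

An instruction without the tag is returned untouched, which corresponds to the path through block 1006.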
At block 1006, the instruction fetched in block 1004 is synchronized with pending store events. The targets of the pending store events are evaluated to determine whether any of them matches the virtual address of the fetched instruction. If a match is found, the pipeline is stalled and the pending store events are allowed to execute. The instruction is then re-fetched from external memory. Flow then proceeds to block 1012.
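The synchronization at block 1006 can be sketched as shown below. This is an illustrative model only: `PendingStore`, `Frontend`, and the `suppress_check` flag (which stands in for the outcome of decision block 1008) are hypothetical names, and memory is modeled as a flat byte array rather than a cache hierarchy.

```python
from dataclasses import dataclass, field


@dataclass
class PendingStore:
    address: int  # target address of the store
    data: bytes   # bytes waiting to be written


@dataclass
class Frontend:
    memory: bytearray
    pending: list = field(default_factory=list)
    stalls: int = 0

    def fetch(self, addr: int, length: int, suppress_check: bool = False) -> bytes:
        if not suppress_check:
            # Compare each pending store target against the fetched range.
            hit = any(
                s.address < addr + length and addr < s.address + len(s.data)
                for s in self.pending
            )
            if hit:
                # Stall, let the pending stores execute, then re-fetch.
                self.stalls += 1
                for s in self.pending:
                    self.memory[s.address:s.address + len(s.data)] = s.data
                self.pending.clear()
        return bytes(self.memory[addr:addr + length])


fe = Frontend(bytearray(b"\x90" * 8), [PendingStore(2, b"\xcc")])
print(fe.fetch(0, 4, suppress_check=True).hex(), fe.stalls)
print(fe.fetch(0, 4).hex(), fe.stalls)
```

With suppression requested, the fetch returns the bytes as they currently stand and no stall is recorded; without it, the overlapping store is drained first and the fetch sees the updated bytes.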
At block 1010, the extended prefix portion of the extended instruction is decoded to specify that store checking is to be suppressed for the corresponding micro instruction sequence as it passes through the pipeline. Suppression of store checking is specified by configuring an extended micro opcode field according to the present invention. Flow then proceeds to block 1012.
At block 1012, all remaining parts of the instruction are translated to determine the specified operation, the register operand locations, the memory address specifiers, and the use of any existing architectural features specified by prefixes according to the existing microprocessor instruction set. Flow then proceeds to block 1014.
At block 1014, an extended micro instruction sequence is configured to specify the operation together with its corresponding opcode extension field. Flow then proceeds to block 1016.
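Blocks 1010 through 1014 can be pictured as stamping a suppression flag onto every micro instruction generated for the extended instruction. The sketch below is a simplification under stated assumptions: `MicroOp`, `translate`, and the string operation names are hypothetical, and the opcode extension field is reduced to a single boolean.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MicroOp:
    operation: str              # operation determined by ordinary translation
    suppress_store_check: bool  # stands in for the opcode extension field


def translate(operations, suppress: bool):
    """Build a micro instruction sequence carrying the suppression flag."""
    # Every micro instruction of the sequence carries the same flag, so the
    # request stays attached to the operation as it moves down the pipeline.
    return [MicroOp(op, suppress) for op in operations]


for uop in translate(["load", "add", "store"], suppress=True):
    print(uop)
```

Keeping the flag on each micro instruction, rather than in separate state, is one simple way to let later pipeline stages see the request without consulting the original instruction bytes.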
At block 1016, the micro instruction sequences, including the extended micro instruction sequence configured in block 1014, are sent in the order processed by the translator to a micro instruction queue for execution by the microprocessor. Flow then proceeds to decision block 1018.
At decision block 1018, the next micro instruction sequence is retrieved by extended execution logic according to the present invention. The extended execution logic evaluates this micro instruction sequence to determine whether it specifies a store event. If not, flow proceeds to block 1028. If so, flow proceeds to block 1020.
At block 1020, because a store event is specified, the store checking logic queries a higher LIP chain according to the present invention. Flow then proceeds to decision block 1022.
At decision block 1022, an evaluation is made to determine whether the target address of the store event matches the virtual IP address of any instruction that follows the store event in the pipeline. If a matching virtual IP address is found in the LIP chain, flow proceeds to decision block 1024. If not, flow proceeds to block 1028.
At decision block 1024, the store check suppression field associated with the matching virtual IP address is evaluated to determine whether store checking is suppressed for the associated instruction. If so, flow proceeds to block 1028. If not, flow proceeds to block 1026.
At block 1026, the extended execution logic indicates that a pipeline synchronization event is required for the associated instruction. Flow then proceeds to block 1028.
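The decisions at blocks 1020 through 1026 can be sketched as a lookup over the LIP chain. This is an illustrative model, not the disclosed circuit: `LipEntry`, its fields, and `needs_sync` are hypothetical names, and each chain entry is assumed to record an instruction's virtual IP address, its length in bytes, and its store check suppression field.

```python
from dataclasses import dataclass


@dataclass
class LipEntry:
    address: int    # virtual IP address of an instruction in the pipeline
    length: int     # instruction length in bytes (assumed to be recorded)
    suppress: bool  # store check suppression field


def needs_sync(store_target: int, younger: list) -> list:
    """Return the entries that require a pipeline synchronization event."""
    return [
        entry
        for entry in younger
        # Block 1022: does the store target fall inside this instruction?
        if entry.address <= store_target < entry.address + entry.length
        # Block 1024: a matching entry is skipped when suppression is set.
        and not entry.suppress
    ]


chain = [LipEntry(0x100, 2, False), LipEntry(0x102, 3, True)]
print(needs_sync(0x101, chain))
print(needs_sync(0x103, chain))
```

A store that lands on the second entry produces no synchronization event because that entry's suppression field is set, which corresponds to the path from block 1024 directly to block 1028.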
At block 1028, the operation specified by the next micro instruction sequence is executed. Flow then proceeds to block 1030.
At block 1030, the method completes.
Although the present invention and its objects, features, and advantages have been described in detail, other embodiments are also within the scope of the invention. For example, the invention has been described in terms of a technique that uses a single, unused opcode state within a completely full instruction set architecture as a tag to indicate that an extended feature prefix follows. The scope of the present invention, however, is not limited in any respect to completely full instruction set architectures, to unused instructions, or to a single tag. On the contrary, the invention covers instruction sets that are not entirely mapped, embodiments that use opcodes already in use, and embodiments that employ more than one instruction tag. For example, consider an instruction set architecture in which there are no unused opcode states. One embodiment of the present invention comprises selecting an opcode state as the extension tag, where the selection criteria are determined by market factors. Another embodiment comprises using a particular combination of opcodes as the tag, such as consecutive occurrences of opcode state 7FH. The essence of the present invention is therefore the use of a tag sequence followed by an n-bit extended prefix, which allows a programmer or compiler to specify, within an existing microprocessor instruction set, that store checking is to be suppressed for an individual instruction or a block of instructions.
In addition, although a microprocessor has been used above as the example for explaining the present invention and its objects, features, and advantages, one skilled in the art will appreciate that the scope of the present invention is not limited to microprocessor architectures but extends to all forms of programmable devices, such as signal processors, industrial controllers, array processors, and the like.

Claims (18)

1. An apparatus for instruction-level control of store checking in a microprocessor, characterized in that it comprises:
fetch logic, for receiving an extended instruction, wherein the extended instruction comprises:
an extended prefix, for specifying that store checking is to be suppressed for the extended instruction; and
an extended prefix tag, which is an otherwise architectural opcode within an existing instruction set;
wherein the fetch logic precludes store checking for pending store events associated with the extended instruction; and
translation logic, coupled to the fetch logic, for translating the extended instruction into a micro instruction sequence that directs the microprocessor to exclude store checking during execution of a specified operation, wherein the translation logic comprises:
extended instruction detection logic, for detecting the extended prefix tag;
instruction translation logic, for determining the specified operation and specifying it within the micro instruction sequence; and
extended translation logic, coupled to the extended instruction detection logic and the instruction translation logic, for specifying within the micro instruction sequence that store checking is to be excluded during execution of the specified operation.
2. The apparatus as claimed in claim 1, characterized in that the extended instruction further comprises instruction entities of the existing instruction set.
3. The apparatus as claimed in claim 2, characterized in that the instruction entities specify the specified operation to be executed by the microprocessor, wherein store checking would otherwise be performed during execution of the specified operation.
4. The apparatus as claimed in claim 3, characterized in that it further comprises store checking logic, wherein if a store event is detected before a specific instruction completes execution, and a target address of the store event matches the location of the specific instruction, the store checking logic would otherwise flush the specific instruction and all subsequently fetched instructions from the microprocessor and, after the store event completes, begin fetching instructions again from the location of the specific instruction.
5. The apparatus as claimed in claim 1, characterized in that the fetch logic comprises:
extended fetch logic, configured to detect the extended prefix and the extended prefix tag, and to enable the extended instruction to proceed to the translation logic without checking the pending store events.
6. The apparatus as claimed in claim 4, characterized in that it further comprises pending store evaluation logic, for otherwise evaluating instructions at the front of the microprocessor pipeline to detect the pending store events, wherein if a target address of one of the pending store events is found to match the location of a fetched instruction, the store checking logic stalls execution of the fetched instruction to allow the target address to be updated.
7. The apparatus as claimed in claim 1, characterized in that the extended prefix comprises:
a suppression field, for specifying that the store checking associated with the extended instruction is to be suppressed.
8. The apparatus as claimed in claim 1, characterized in that the extended prefix comprises:
a suppression field, for specifying that the store checking associated with the extended instruction and a specified number of following instructions is to be suppressed.
9. A microprocessor apparatus for extending an existing instruction set to selectively suppress store checking in a microprocessor pipeline, characterized in that it comprises:
a translator, configured to receive an extended instruction and to generate a micro instruction sequence directing a microprocessor to execute a specified operation and to exclude the associated store checking during execution of the specified operation, wherein the extended instruction is configured to specify that the store checking associated with the extended instruction is suppressed, and wherein the extended instruction comprises a selected opcode from the existing instruction set followed by an n-bit extended prefix, the selected opcode indicating the extended instruction and the n-bit extended prefix indicating that store checking is to be suppressed,
the translator comprising:
an extended instruction detector, for detecting the selected opcode within the extended instruction;
an instruction translator, for translating the remaining part of the extended instruction to determine the specified operation; and
an extended prefix translator, coupled to the extended instruction detector and the instruction translator, for translating the n-bit extended prefix and specifying within the micro instruction sequence that store checking is to be suppressed,
wherein n is a positive integer.
10. The microprocessor apparatus as claimed in claim 9, characterized in that the extended instruction further comprises remaining instruction entities, configured to specify the specified operation.
11. The microprocessor apparatus as claimed in claim 9, characterized in that the n-bit extended prefix comprises a suppression field, configured to specify that store checking is to be suppressed for the extended instruction.
12. The microprocessor apparatus as claimed in claim 9, characterized in that the n-bit extended prefix comprises a suppression field, configured to specify that store checking is to be suppressed for the extended instruction and a specified number of following instructions.
13. The microprocessor apparatus as claimed in claim 9, characterized in that it further comprises:
extended fetch logic, for receiving the extended instruction from memory, detecting the selected opcode and the n-bit extended prefix, and allowing the extended instruction to proceed to the translator without checking whether a pending store event matches the location of the extended instruction.
14. A module for adding to an existing instruction set a feature for suppressing store checking of instructions, characterized in that it comprises:
translation logic, coupled to fetch logic, for generating a micro instruction sequence that directs a microprocessor to execute an operation, and for specifying within the micro instruction sequence that store checking is to be suppressed, wherein the fetch logic receives an extension tag indicating that an accompanying part of a corresponding instruction specifies the operation to be executed, the extension tag being a first opcode within the existing instruction set and being coupled with a store check suppression specifier, the store check suppression specifier being one of the accompanying parts and specifying that store checking is to be suppressed during execution of the operation; and
extended execution logic, coupled to the translation logic, for receiving the micro instruction sequence and executing the operation without store checking,
wherein the translation logic comprises:
extension tag detection logic, for detecting the extension tag and indicating that translation of the accompanying part is to be performed according to extended translation rules; and
decoding logic, coupled to the extension tag detection logic, for translating instructions according to the rules of the existing instruction set, and for translating the corresponding instruction according to the extended translation rules so that the operation is executed without store checking.
15. A method for extending an existing instruction set architecture to suppress store checking at the instruction level, the method comprising:
receiving an extended instruction using fetch logic, the extended instruction comprising an extension tag and an extended prefix, wherein the extension tag is a first opcode entity of the existing instruction set architecture;
specifying, by the extended prefix, that store checking is to be suppressed during execution of the extended instruction, and precluding, by the fetch logic, store checking for pending store events associated with the extended instruction, wherein the remaining part of the extended instruction specifies the operation to be executed; and
translating the extended instruction into a micro instruction sequence that directs the microprocessor to suppress the store checking associated with the extended instruction during execution of a specified operation.
16. The method as claimed in claim 15, characterized in that the specifying comprises:
using a second opcode entity of the existing instruction set architecture to specify the operation.
17. The method as claimed in claim 15, characterized in that it further comprises:
translating the extended instruction into a micro instruction sequence, the micro instruction sequence directing extended execution logic to execute the operation without store checking.
18. The method as claimed in claim 17, characterized in that the translating of the extended instruction comprises:
detecting the extension tag within translation logic; and
decoding the extended prefix and the remaining part of the extended instruction according to extended translation rules to generate the micro instruction sequence.
CNB031222803A 2002-10-29 2003-04-25 Inhibitor and method for storing check Expired - Lifetime CN1328655C (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US10/283,397 US7302551B2 (en) 2002-04-02 2002-10-29 Suppression of store checking
US10/283,397 2002-10-29

Publications (2)

Publication Number Publication Date
CN1453702A CN1453702A (en) 2003-11-05
CN1328655C true CN1328655C (en) 2007-07-25

Family

ID=29270355

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB031222803A Expired - Lifetime CN1328655C (en) 2002-10-29 2003-04-25 Inhibitor and method for storing check

Country Status (2)

Country Link
CN (1) CN1328655C (en)
TW (1) TWI223773B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5734857A (en) * 1993-11-29 1998-03-31 U.S. Philips Corporation Program memory expansion using a special-function register
CN1335561A (en) * 2000-06-30 2002-02-13 先进数字芯片股份有限公司 Extended instruction word folding equipment
JP2002312176A (en) * 2001-03-30 2002-10-25 Internatl Business Mach Corp <Ibm> Conversion program, compiler, computer device, program converting method, and storage medium

Also Published As

Publication number Publication date
CN1453702A (en) 2003-11-05
TWI223773B (en) 2004-11-11
TW200406702A (en) 2004-05-01

Similar Documents

Publication Publication Date Title
CN1414468B (en) Device and method for extending microprocessor instruction set
CN1218243C (en) Appts. and method of extending microprocessor data mode
CN1327338C (en) Data processing using multiple instruction sets
US5926646A (en) Context-dependent memory-mapped registers for transparent expansion of a register file
US7647478B2 (en) Suppression of store checking
JP2001195250A (en) Instruction translator and instruction memory with translator and data processor using the same
EP1351135B1 (en) Microprocessor and method for selective control of condition code write back
US7089539B2 (en) Program instruction interpretation
US5961632A (en) Microprocessor with circuits, systems, and methods for selecting alternative pipeline instruction paths based on instruction leading codes
KR100864891B1 (en) Unhandled operation handling in multiple instruction set systems
US6516410B1 (en) Method and apparatus for manipulation of MMX registers for use during computer boot-up procedures
CN1328655C (en) Inhibitor and method for storing check
CN100578442C (en) Device and method for selectivity controlling result write back
CN1308813C (en) Control mechanism referenced by non-temporary memory
CN1414464B (en) Mechanism and method for adding number of buffer storage of microprocessor
CN1212566C (en) Device and method for excution condition instruction
CN100590591C (en) Device and method for selectivity controlling condition code write back
CN1211731C (en) Appts. and method of extending address mode
TWI224284B (en) Selective interrupt suppression
CN1237441C (en) Appts. and method of selectively controlling memory attribute
US5903742A (en) Method and circuit for redefining bits in a control register
JPH09507321A (en) Pipelined microinstruction apparatus and method using branch prediction and speculative state change

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CX01 Expiry of patent term

Granted publication date: 20070725