CN106325820A - Heterogeneous processor architecture - Google Patents


Publication number
CN106325820A
Authority
CN
China
Prior art keywords
instruction set
processor
instruction
reduced instruction
set computer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201510372349.2A
Other languages
Chinese (zh)
Inventor
孟凡金
曹君
曹一君
严伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date


Abstract

The invention provides a design method for a heterogeneous processor. A general reduced instruction set computer (RISC) processor and a complex instruction set computer (CISC) processor are combined into one, so that a single architecture supports two completely different instruction sets at the same time. The design thereby integrates the advantages of both processors and optimizes data processing performance. Because the two traditional structures are combined into one, parallel processing capability increases, more application platforms can be served, and duplication of hardware resources is greatly reduced, which better optimizes area and power consumption.

Description

A heterogeneous processor architecture
Technical field
The present invention relates to a novel processor architecture and its implementation.
Background
As processor instruction systems were optimized, two distinct directions emerged: CISC and RISC. CISC stands for Complex Instruction Set Computer; RISC stands for Reduced Instruction Set Computer. The instruction set here means the lowest-level machine instructions of the computer, that is, the instructions the CPU can recognize directly. As processor systems grew more complex, the structure of the instruction system was expected to make the overall performance of the processor more stable. Initially, the optimization approach was to add instructions with complex functions, so that commonly used functions originally implemented in software were implemented by hardware instructions instead, thereby improving the processor's execution speed. Such a processor is called a complex instruction set processor, i.e. a Complex Instruction Set Computer, or CISC for short. The other approach developed in the 1980s. Its basic idea is to simplify instruction functions as far as possible, retaining only instructions that are simple and can execute in a single clock cycle, and implementing more complex functions as subroutines. Such a processor is called a reduced instruction set processor, i.e. a Reduced Instruction Set Computer, or RISC for short. The essence of RISC is that simplifying instruction functions reduces the average execution cycle of an instruction and thereby raises the processor's operating frequency, while heavy use of general-purpose registers speeds up subroutine execution.
RISC and CISC each have advantages and disadvantages; they differ in CPU design philosophy and method. Early CPUs were all CISC architectures, designed to complete a required computing task with as few machine-language instructions as possible. RISC and CISC are the two typical techniques for designing and manufacturing microprocessors. Both try to balance factors such as architecture, operation, hardware and software, compile time and run time in order to achieve high efficiency, but their methods differ, so they differ widely in many respects. RISC designers focus on the frequently used instructions and make them as simple and efficient as possible. Rarely used functions are usually implemented by combining instructions, so a RISC machine may be less efficient when implementing special functions. The instruction system of a CISC computer is richer and has dedicated instructions for specific functions, so it handles special tasks more efficiently. A RISC processor has a simple structure and compact layout, a short design cycle, and readily adopts the latest technology; a CISC microprocessor has a complex structure and a long design cycle.
Summary of the invention
This invention is a new type of processor architecture that combines the respective characteristics of RISC and CISC, exploiting their strengths and avoiding their weaknesses, thereby increasing flexibility and greatly improving the processor's execution efficiency. In the invention, one processor architecture supports two completely different instruction sets at the same time, and the processor can switch freely between them. To support two instruction sets, the instruction bus, the instruction decoding unit and the arithmetic units all differ from those of a conventional single-instruction-set processor.
First, the instruction bus must support instruction fetch for two different instruction sets whose instruction lengths differ: RISC instructions are 32 bits long, while CISC instructions may be 64 or 128 bits long. The RISC and CISC instruction sets switch between each other through a single mixed instruction decoding unit. During decoding, the instruction set flag is checked first. This flag can be set in two ways. The first is to program a system register in software to select the required instruction set. The second is a switching instruction: a switching (SWITCH) instruction is designed into the RISC instruction set and serves as the last instruction before entering the CISC instruction set; it changes the instruction set flag so that the decoder enters CISC mode. Likewise, a switching (SWITCH) instruction is designed into the CISC instruction set and serves as the last instruction before entering the RISC instruction set; it changes the instruction set flag so that the decoder enters RISC mode.
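The flag-driven fetch and decode described above can be sketched as a small Python model. This is only an illustration under stated assumptions: the patent does not give encodings, so the opcode value `SWITCH_OPCODE`, the choice of 64 bits for CISC, the big-endian layout and the names `MixedDecoder`, `set_mode_register` and `run` are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple

RISC, CISC = "RISC", "CISC"

# Instruction widths in bytes. 32-bit RISC comes from the text; the text allows
# 64 or 128 bits for CISC, and 64 is picked here purely for illustration.
WIDTH_BYTES = {RISC: 4, CISC: 8}

# Hypothetical encoding: the top byte of an instruction word is its opcode,
# and 0xFF is reserved for SWITCH in both instruction sets.
SWITCH_OPCODE = 0xFF


@dataclass
class MixedDecoder:
    mode: str = RISC  # the instruction set flag

    def set_mode_register(self, mode: str) -> None:
        """Method 1: software programs the system register directly."""
        if mode not in WIDTH_BYTES:
            raise ValueError(f"unknown instruction set: {mode}")
        self.mode = mode

    def run(self, memory: bytes) -> List[Tuple[str, int]]:
        """Fetch and decode every instruction, honoring SWITCH (method 2)."""
        pc = 0
        trace: List[Tuple[str, int]] = []
        while pc < len(memory):
            width = WIDTH_BYTES[self.mode]  # fetch width follows the flag
            if pc + width > len(memory):
                raise ValueError("truncated instruction at end of memory")
            word = int.from_bytes(memory[pc:pc + width], "big")
            trace.append((self.mode, word))
            pc += width
            opcode = word >> (8 * (width - 1))
            if opcode == SWITCH_OPCODE:
                # SWITCH is the last instruction decoded in the old mode.
                self.mode = CISC if self.mode == RISC else RISC
        return trace
```

In this model a byte stream holding a RISC instruction, a RISC SWITCH, a CISC instruction, a CISC SWITCH and a final RISC instruction is decoded with widths 4, 4, 8, 8 and 4 bytes, which is the mixed-code case the text describes.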
The RISC and CISC instruction sets share the same instruction bus; that is, instruction fetch for both instruction sets passes through one single instruction bus. This bus can fetch instructions from RISC or CISC program code, or from code that mixes RISC and CISC.
The RISC and CISC instruction sets share the same data bus; programs in either instruction set access the system's data storage unit through the same data bus.
The RISC and CISC instruction sets share all arithmetic units, including the arithmetic, logic, floating-point and addressing units. After an instruction has passed through the mixed instruction decoding unit, RISC and CISC instructions are no longer distinguished. The difference is that a CISC instruction may enable several arithmetic units at once so that they operate in parallel, whereas a RISC instruction enables only a single arithmetic unit.
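As a rough sketch of how shared arithmetic units might be enabled after decoding, the Python model below lets a RISC instruction drive exactly one unit and a CISC instruction drive several distinct units in the same cycle. The unit names, the single fixed operation per unit and the function `issue` are assumptions made for illustration; the patent only lists the kinds of units.

```python
import operator
from typing import Callable, Dict, List, Tuple

# Hypothetical shared units, each reduced to one fixed two-operand operation.
UNITS: Dict[str, Callable] = {
    "alu": operator.add,            # arithmetic unit
    "logic": operator.and_,         # logic unit
    "fpu": operator.truediv,        # floating-point unit
    "agu": lambda base, off: base + off,  # addressing unit
}


def issue(mode: str, slots: List[Tuple[str, float, float]]) -> Dict[str, float]:
    """Execute one decoded instruction; each slot is (unit, operand_a, operand_b)."""
    if mode == "RISC" and len(slots) != 1:
        raise ValueError("a RISC instruction enables exactly one unit")
    names = [unit for unit, _, _ in slots]
    if len(set(names)) != len(names):
        raise ValueError("a unit can be enabled only once per cycle")
    # The same unit table serves both instruction sets.
    return {unit: UNITS[unit](a, b) for unit, a, b in slots}
```

For example, `issue("RISC", [("alu", 2, 3)])` uses one unit, while a CISC call with `alu`, `logic` and `fpu` slots uses three units in the same cycle.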
To ensure program stability, when RISC and CISC programs switch between each other, only instructions of one instruction set are valid in any given execution cycle: either reduced-instruction-set instructions or complex-instruction-set instructions.
Brief description of the drawings:
Fig. 1 shows the overall implementation of the heterogeneous processor;
Fig. 2 shows an implementation example of the instruction decoder;
Fig. 3 shows an application example of the present invention.
Detailed description of the invention:
The present invention provides a design method for a heterogeneous processor. It combines a general reduced instruction set (RISC) processor and a complex instruction set (CISC) processor into one, so that a single architecture supports two completely different instruction sets at the same time; see Fig. 1. In specific embodiments, the reduced instruction set may use 16-bit or 32-bit instructions, and the complex instruction set may be a very long instruction word (VLIW) set of 48 or 64 bits, or an instruction set with instructions of unequal length.
The core of this invention is the design of the instruction decoder. Fig. 2 shows an application example of the decoding unit of a mixed-architecture processor. In this example the instruction decoding unit supports both kinds of decoding, RISC and CISC. Two methods can change the processor's operating mode: a direct instruction and a system register. The direct-instruction method relies on a mode switching instruction, one of which is designed into each of the RISC and CISC instruction sets. After an instruction is fetched, it is first pre-decoded. If pre-decoding identifies a switching instruction, the system enters the mode switching state and updates the system mode flag; it then proceeds to the actual instruction decoding stage, where the decoding unit decodes according to the system mode flag. The system-register method changes the system mode by programming the mode flag in a system register. This method is simple to implement and operate, but it is less efficient. When the mode is changed through the system register, the value of the mode register changes in the last stage of instruction processing, namely the data write-back stage. In this case the compiler must insert the corresponding no-operation instructions into the pipeline to ensure that no new instruction enters the processor before the switch takes effect.
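The no-operation padding required by the system-register method can be sketched as a small compiler pass. The patent does not state how many no-operation instructions are needed, so the sketch assumes a conservative count of `pipeline_depth - 1`, enough to keep the pipeline empty behind the register write until it reaches write-back. The mnemonics `WRMODE` and `NOP` and the function `pad_mode_writes` are hypothetical.

```python
from typing import List


def pad_mode_writes(program: List[str], pipeline_depth: int = 5) -> List[str]:
    """Insert NOPs after every mode-register write.

    Assumption: the mode register is updated in the last pipeline stage
    (write-back), so pipeline_depth - 1 NOPs keep any real instruction from
    entering the pipeline before the new mode is visible.
    """
    if pipeline_depth < 1:
        raise ValueError("pipeline_depth must be at least 1")
    padded: List[str] = []
    for instruction in program:
        padded.append(instruction)
        if instruction.startswith("WRMODE"):
            padded.extend(["NOP"] * (pipeline_depth - 1))
    return padded
```

The padding cycles are the efficiency cost the text attributes to the system-register method; the SWITCH instruction avoids them because the flag is updated during pre-decoding.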
Embodiments of this invention can cover many applications. For example, the invention can be used in a multimedia processor to accelerate image or audio processing, making full use of the efficiency of CISC and the easy programming and low power consumption of the reduced instruction set, thereby greatly speeding up audio and video processing. It can also be used in industrial control while providing strong digital signal processing capability.
Fig. 2 shows an implementation of an audio/video processor. In this implementation, the RISC part is an MCU with a 32-bit instruction set that supports the most basic fixed-point operations; the CISC part is a 64-bit VLIW instruction set that supports dedicated audio/video instructions as well as floating-point operations. The processor core supports both instruction sets: RISC can be used for the control part and VLIW for audio/video encoding and decoding. For efficient processing, this example supports mixed-instruction-set programming. To match the VLIW processor, a two-dimensional X/Y system bus is added for two-dimensional data access, which greatly improves digital signal processing capability. Corresponding peripherals are also added, such as a timer, a level-1 cache, a level-2 cache and a Boot ROM. In this example the default mode is the MCU mode with the 32-bit instruction set. The boot program can initialize the mode to the 64-bit VLIW instruction set by programming the corresponding system register, which identifies the processor's operating mode. In mixed programming, the compiler is responsible for inserting switching instructions so that the processor's operating mode changes in real time.
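The compiler's role in mixed programming can be illustrated with a short Python pass that walks a sequence of operations tagged with their instruction set and emits a SWITCH wherever the set changes. This is a sketch only: the tuple representation, the function `insert_switches` and the mnemonics used in the example are assumptions, not part of the patent.

```python
from typing import List, Tuple

Op = Tuple[str, str]  # (instruction set, mnemonic)


def insert_switches(ops: List[Op], start_mode: str = "RISC") -> List[Op]:
    """Emit a SWITCH before each change of instruction set.

    The SWITCH is encoded in the set being left, matching the description of
    SWITCH as the last instruction before the other set is entered.
    start_mode defaults to RISC, the default MCU mode in this example.
    """
    out: List[Op] = []
    mode = start_mode
    for isa, mnemonic in ops:
        if isa != mode:
            out.append((mode, "SWITCH"))
            mode = isa
        out.append((isa, mnemonic))
    return out
```

Given control code in RISC followed by two VLIW operations and a final RISC store, the pass emits one SWITCH in the RISC set before the VLIW block and one in the VLIW set after it.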

Claims (8)

1. An architecture of a heterogeneous processor, wherein the processor supports two instruction sets at the same time: a reduced instruction set (RISC) and a complex instruction set (CISC).
2. The method according to claim 1, further comprising: the processor can support a program written only in the reduced instruction set, a program written only in the complex instruction set, or a mixed program that combines both instruction sets.
3. The method according to claim 2, further comprising:
A reduced-instruction-set program can switch to a complex-instruction-set program in two ways: the first is to design a switching instruction into the reduced instruction set, so that the processor enters complex-instruction-set mode after executing it; the second is to use a programmable general-purpose register to put the processor into complex-instruction-set mode.
4. The method according to claim 2, further comprising:
A complex-instruction-set program can switch to a reduced-instruction-set program in two ways: the first is to design a switching instruction into the complex instruction set, so that the processor enters reduced-instruction-set mode after executing it; the second is to use a programmable general-purpose register to put the processor into reduced-instruction-set mode.
5. The method according to claim 1, further comprising: the reduced instruction set and the complex instruction set share the instruction bus and the data bus.
6. The method according to claim 1, further comprising: the reduced instruction set and the complex instruction set share the instruction decoding unit, i.e. the processor's decoding unit can decode both the reduced instruction set and the complex instruction set.
7. The method according to claim 1, further comprising:
The reduced instruction set and the complex instruction set share the arithmetic units, which include the arithmetic, logic, floating-point and addressing units.
8. The method according to claim 7, further comprising: all arithmetic units process in parallel, and only instructions of one instruction set are valid in any given execution cycle: either reduced-instruction-set instructions or complex-instruction-set instructions.
CN201510372349.2A 2015-06-30 2015-06-30 Heterogeneous processor architecture Pending CN106325820A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510372349.2A CN106325820A (en) 2015-06-30 2015-06-30 Heterogeneous processor architecture

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510372349.2A CN106325820A (en) 2015-06-30 2015-06-30 Heterogeneous processor architecture

Publications (1)

Publication Number Publication Date
CN106325820A true CN106325820A (en) 2017-01-11

Family

ID=57723021

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510372349.2A Pending CN106325820A (en) 2015-06-30 2015-06-30 Heterogeneous processor architecture

Country Status (1)

Country Link
CN (1) CN106325820A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110096308A (en) * 2019-04-24 2019-08-06 北京探境科技有限公司 A kind of parallel memorizing arithmetic unit and its method
CN110688166A (en) * 2019-09-26 2020-01-14 浪潮商用机器有限公司 Server and server starting method

Similar Documents

Publication Publication Date Title
US11687345B2 (en) Out-of-order block-based processors and instruction schedulers using ready state data indexed by instruction position identifiers
US11379229B2 (en) Apparatus and method for adaptable and efficient lane-wise tensor processing
JP6849274B2 (en) Instructions and logic to perform a single fused cycle increment-comparison-jump
JP6373425B2 (en) Instruction to shift multiple bits to the left and pull multiple 1s into multiple lower bits
US20160055004A1 (en) Method and apparatus for non-speculative fetch and execution of control-dependent blocks
US9870226B2 (en) Control of switching between executed mechanisms
WO1997050031A1 (en) Method for increasing performance of binary translated conditional instructions
KR20150019349A (en) Multiple threads execution processor and its operating method
US10915328B2 (en) Apparatus and method for a high throughput parallel co-processor and interconnect with low offload latency
JP5941488B2 (en) Convert conditional short forward branch to computationally equivalent predicate instruction
US10831505B2 (en) Architecture and method for data parallel single program multiple data (SPMD) execution
KR20170036035A (en) Apparatus and method for configuring sets of interrupts
US20160283247A1 (en) Apparatuses and methods to selectively execute a commit instruction
CN110692039A (en) Microprocessor instruction pre-dispatch prior to block commit
US20150277910A1 (en) Method and apparatus for executing instructions using a predicate register
CN106325820A (en) Heterogeneous processor architecture
WO2002057908A2 (en) A superscalar processor having content addressable memory structures for determining dependencies
Putnam et al. Dynamic vectorization in the E2 dynamic multicore architecture
CN101944012B (en) Instruction processing method and super-pure pipeline microprocessor
US20150095616A1 (en) Data processor
US11940945B2 (en) Reconfigurable SIMD engine
CN112559037A (en) Instruction execution method, unit, device and system
US20220206792A1 (en) Methods, systems, and apparatuses to optimize partial flag updating instructions via dynamic two-pass execution in a processor
US20230195456A1 (en) System, apparatus and method for throttling fusion of micro-operations in a processor
Briejer et al. Extending the Cell SPE with energy efficient branch prediction

Legal Events

Date Code Title Description
PB01 Publication
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20170111