CN110427337B - Processor core based on field programmable gate array and operation method thereof - Google Patents


Info

Publication number
CN110427337B
Authority
CN
China
Prior art keywords
register
state
interrupt
input
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910930708.XA
Other languages
Chinese (zh)
Other versions
CN110427337A (en
Inventor
徐庆嵩
刘锴
刘建华
王铜铜
范召
杜金凤
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong High Cloud Semiconductor Technologies Ltd Co
Original Assignee
Guangdong High Cloud Semiconductor Technologies Ltd Co
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong High Cloud Semiconductor Technologies Ltd Co
Priority to CN201910930708.XA
Publication of CN110427337A
Application granted
Publication of CN110427337B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00 Digital computers in general; Data processing equipment in general
    • G06F15/76 Architectures of general purpose stored program computers
    • G06F2015/761 Indexing scheme relating to architectures of general purpose stored programme computers
    • G06F2015/768 Gate array

Abstract

The invention discloses a processor core based on a field programmable gate array (FPGA) and an operation method thereof. The processor core comprises an input/output module, a decoding module and an execution module. The input/output module serves as the sole transmission interface through which the processor core connects to out-of-core devices to transmit data and instructions, and performs transmission operations with an out-of-core device according to the address signal of that device, wherein the out-of-core devices comprise an instruction memory and a data memory. The decoding module decodes the instruction read by the input/output module from the instruction memory to generate a decoding result, and the execution module performs processing operations according to the decoding result. Because the input/output module is the only transmission interface between the processor core and the out-of-core devices, a single set of transmission logic controls it, which reduces the use of logic resources, lowers the difficulty of core design, simplifies the core structure, and reduces core power consumption.

Description

Processor core based on field programmable gate array and operation method thereof
Technical Field
The invention relates to the technical field of processor cores, in particular to a processor core based on a field programmable gate array and an operation method thereof.
Background
With the rapid development of field-programmable gate array (FPGA) technology, FPGA-based processor cores are increasingly widely used. Compared with a traditional processor core, an FPGA-based processor core is highly extensible: designers can freely extend its functions for different application scenarios, which facilitates rapid design and reuse.
An existing FPGA-based processor core comprises an instruction fetching module, a data read/write module, a decoding module and an execution module. The instruction fetching module reads an instruction from an instruction memory outside the core and sends it to the decoding module; the data read/write module reads data from a data memory outside the core and writes it into a general register, or reads data from a general register and writes it into the data memory outside the core; and the execution module performs processing operations according to the decoding result of the decoding module. The prior art thus has two transmission interfaces, the instruction fetching module and the data read/write module, and two sets of transmission logic must be designed to control them separately, so more logic resources are used and the core is harder to design.
Disclosure of Invention
The embodiment of the invention provides a processor core based on a field programmable gate array and an operation method thereof, which can reduce the use of logic resources and reduce the difficulty of core design.
The embodiment of the invention adopts the following technical scheme:
In a first aspect, an embodiment of the present invention provides a processor core based on a field programmable gate array, including an input/output module, a decoding module, and an execution module. The input/output module serves as the sole transmission interface through which the processor core connects to out-of-core devices to transmit data and instructions, and performs transmission operations with an out-of-core device according to the address signal of that device, wherein the out-of-core devices comprise an instruction memory and a data memory and are uniformly numbered and uniformly addressed. The decoding module is used for decoding the instruction read by the input/output module from the instruction memory to generate a decoding result; the execution module is used for performing processing operations according to the decoding result.
The input/output module comprises an address register, a data valid register, an output data register, a data byte valid register and an input data register;
the input/output module is specifically configured to:
when an output operation is executed, write the second value into the data valid register as the output data valid signal, write the address signal of the out-of-core device into the address register, write the output data into the output data register, and write the byte valid signal of the output data into the data byte valid register, so that the out-of-core device corresponding to the address signal reads the output data from the output data register according to the byte valid signal;
when an input operation is executed, write the first value into the data valid register as the output data valid signal, and write the address signal of the out-of-core device into the address register, so that the out-of-core device corresponding to the address signal writes the input instruction or input data into the input data register.
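The register-level behaviour of the input/output module can be sketched in software as follows. This is only an illustrative model, not the patented circuit: the encodings `FIRST_VALUE = 0` and `SECOND_VALUE = 1`, the `Memory` class, and the word-addressed storage are assumptions introduced for the example, and the byte valid mask is stored but ignored by this simplified device.

```python
from dataclasses import dataclass

FIRST_VALUE = 0   # assumed encoding: input (read) operation
SECOND_VALUE = 1  # assumed encoding: output (write) operation


@dataclass
class IOModule:
    """The five registers named in the text; bit widths are not modeled."""
    address: int = 0
    data_valid: int = FIRST_VALUE
    output_data: int = 0
    data_byte_valid: int = 0
    input_data: int = 0

    def output_op(self, address: int, data: int, byte_valid: int) -> None:
        # Output operation: flag the output data as valid, then present the
        # address, the data and the byte valid mask to the addressed device.
        self.data_valid = SECOND_VALUE
        self.address = address
        self.output_data = data
        self.data_byte_valid = byte_valid

    def input_op(self, address: int) -> None:
        # Input operation: output data is not valid; only the address is
        # driven, and the addressed device fills input_data.
        self.data_valid = FIRST_VALUE
        self.address = address


class Memory:
    """Hypothetical out-of-core device: a word-addressed store."""

    def __init__(self) -> None:
        self.words = {}

    def service(self, io: IOModule) -> None:
        if io.data_valid == SECOND_VALUE:
            # Device reads the output data register (byte mask ignored here).
            self.words[io.address] = io.output_data
        else:
            # Device writes the requested word into the input data register.
            io.input_data = self.words.get(io.address, 0)
```

The same `IOModule` instance serves both instruction fetches and data accesses, which is the point of having a single transmission interface: only the address decides which device responds.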
As an optional implementation, the processor core controls the instruction processing flow with a finite state machine, whose state types comprise a read instruction state, a decoding state, an operand and instruction type acquisition state, an execution state and a register write-back state;
when the finite state machine is in the read instruction state, the input/output module is controlled to perform the read instruction operation; in the decoding state, the decoding module is controlled to perform the decoding operation; in the operand and instruction type acquisition state, the execution module is controlled to obtain the operands and the instruction type; in the execution state, the execution module is controlled to perform the execution operation; and in the register write-back state, the execution module is controlled to perform the register write-back operation.
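The five states and the module each one controls can be sketched as a small table-driven state machine. The state names and the mapping to modules come from the text; the strictly cyclic transition order in `NEXT_STATE` is an assumption, since the transition diagram of FIG. 4 is not reproduced in this section.

```python
from enum import Enum, auto


class State(Enum):
    READ_INSTRUCTION = auto()
    DECODE = auto()
    GET_OPERANDS = auto()   # operand and instruction type acquisition
    EXECUTE = auto()
    WRITE_BACK = auto()


# Module that the finite state machine controls in each state (from the text).
ACTIVE_MODULE = {
    State.READ_INSTRUCTION: "input/output module",
    State.DECODE: "decoding module",
    State.GET_OPERANDS: "execution module",
    State.EXECUTE: "execution module",
    State.WRITE_BACK: "execution module",
}

# Assumed transition order: one pass through all five states per instruction.
NEXT_STATE = {
    State.READ_INSTRUCTION: State.DECODE,
    State.DECODE: State.GET_OPERANDS,
    State.GET_OPERANDS: State.EXECUTE,
    State.EXECUTE: State.WRITE_BACK,
    State.WRITE_BACK: State.READ_INSTRUCTION,
}


def run(cycles, start=State.READ_INSTRUCTION):
    """Return the (state, controlled module) pairs visited over `cycles` steps."""
    trace = []
    state = start
    for _ in range(cycles):
        trace.append((state, ACTIVE_MODULE[state]))
        state = NEXT_STATE[state]
    return trace
```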
As an optional implementation, the execution module is specifically configured to receive the instruction type, the source register number, the destination register number and the immediate generated by the decoding module, read the value of the general register designated by the source register number as the source operand, determine the sub-state of the execution state of the finite state machine according to the instruction type, and execute the processing operation corresponding to that sub-state. The sub-states of the execution state comprise an arithmetic logic operation state, a read operation state, a write operation state, a conditional branch state and an unconditional jump state, which correspond to the arithmetic logic operation, the read operation, the write operation, the conditional branch operation and the unconditional jump operation, respectively.
As an optional implementation, the execution module includes a register group management unit, a register write-back unit, an arithmetic logic unit and an input/output management unit, and the register group management unit is provided with a general register and a program pointer register;
executing the processing operation corresponding to the sub-state includes:
when the arithmetic logic operation is executed, the source operand and the immediate are sent to the arithmetic logic unit for operation, and the operation result is stored in the general register designated by the destination register number;
when the conditional branch operation is executed, the source operand and the immediate are sent to the arithmetic logic unit for operation, the next jump offset of the program pointer is determined according to the operation result, the value of the program pointer is read from the program pointer register, the arithmetic logic unit calculates the post-jump value of the program pointer from the current value and the jump offset, and the register write-back unit writes the post-jump value back to the program pointer register;
when the unconditional jump operation is executed, a source operand or the immediate is taken as the next jump offset of the program pointer, the value of the program pointer is read from the program pointer register, the arithmetic logic unit calculates the post-jump value of the program pointer from the current value and the jump offset, and the register write-back unit writes the post-jump value back to the program pointer register;
when the write operation is executed, the input/output management unit writes the second value into the data valid register of the input/output module as the output data valid signal, writes the address signal of the out-of-core device into the address register of the input/output module, writes the output data into the output data register of the input/output module, and writes the byte valid signal of the output data into the data byte valid register of the input/output module, so that the out-of-core device corresponding to the address signal reads the output data from the output data register according to the byte valid signal; the operation result returned by the input/output module is stored in the general register designated by the destination register number;
when the read operation is executed, the address signal of the out-of-core device is written into the address register of the input/output module, so that the out-of-core device corresponding to the address signal writes the input instruction or input data into the input data register of the input/output module; the operation result returned by the input/output module is stored in the general register designated by the destination register number.
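The five sub-state operations listed above can be illustrated with a minimal behavioural sketch. Several details are assumptions made only for the example: the register count of 32, the 32-bit wrap-around, addition as the arithmetic logic operation, "branch if equal" as the branch condition, a fall-through offset of 4 bytes, base-plus-immediate addressing, and a plain dictionary standing in for the out-of-core device behind the input/output module.

```python
from enum import Enum, auto


class SubState(Enum):
    ALU = auto()
    READ = auto()
    WRITE = auto()
    BRANCH = auto()
    JUMP = auto()


MASK32 = 0xFFFFFFFF
INSTR_BYTES = 4  # assumed offset to the next instruction when a branch is not taken


class Core:
    def __init__(self):
        self.regs = [0] * 32  # general registers (count is an assumption)
        self.pc = 0           # program pointer register
        self.memory = {}      # stands in for the device behind the I/O module

    @staticmethod
    def alu_add(a, b):
        return (a + b) & MASK32

    def execute(self, sub, rs1=0, rs2=0, rd=0, imm=0):
        a, b = self.regs[rs1], self.regs[rs2]  # source operands
        if sub is SubState.ALU:
            # Result goes to the general register given by the destination number.
            self.regs[rd] = self.alu_add(a, imm)
        elif sub is SubState.BRANCH:
            # The comparison result selects the jump offset of the program pointer.
            offset = imm if a == b else INSTR_BYTES
            self.pc = self.alu_add(self.pc, offset)
        elif sub is SubState.JUMP:
            # The immediate is used directly as the jump offset.
            self.pc = self.alu_add(self.pc, imm)
        elif sub is SubState.WRITE:
            self.memory[self.alu_add(a, imm)] = b
        elif sub is SubState.READ:
            self.regs[rd] = self.memory.get(self.alu_add(a, imm), 0)
```

In both the branch and the jump case the new program pointer is computed by the same adder and then written back, mirroring the text's use of the arithmetic logic unit followed by the register write-back unit.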
As an optional implementation, the processor core further includes an interrupt management module, which includes an interrupt enable register, an interrupt waiting register, an interrupt return address register and an interrupt idle state register;
the interrupt management module is used for, upon receiving an interrupt request signal, comparing the interrupt request signal with the value in the interrupt enable register to judge whether the interrupt request signal is enabled; if not, masking the interrupt request signal; if so, setting the bit of the interrupt waiting register corresponding to the interrupt request signal to the second value;
if the interrupt idle state register indicates that no interrupt request signal is currently being processed, then after execution of the current instruction finishes, storing the current program pointer into the interrupt return address register, setting the interrupt idle state register to the first value, saving the current values of the general registers in the register group management unit of the execution module to the stack space, jumping the program pointer to the interrupt handler entry, and executing the interrupt function corresponding to the value in the interrupt waiting register; after the interrupt function has executed, restoring the pre-interrupt context from the values saved in the stack space, setting the interrupt idle state register to the second value, and jumping to the instruction that follows the point of interruption according to the program pointer held in the interrupt return address register.
As an optional implementation, the interrupt management module is further configured to, if the interrupt idle state register indicates that an interrupt request signal is currently being processed, wait until that processing ends and then continue with the received interrupt request signal.
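The interrupt flow described above (enable check, pending bit, context save, handler entry, context restore) can be sketched as follows. The sketch assumes `FIRST_VALUE = 0` and `SECOND_VALUE = 1`, one bit per interrupt source, a Python list as the stack space, and that the pending bit is cleared when the handler returns; none of these details is fixed by the text, and the method names are hypothetical.

```python
FIRST_VALUE, SECOND_VALUE = 0, 1  # assumed encodings


class InterruptManager:
    def __init__(self, enable_mask):
        self.enable = enable_mask   # interrupt enable register
        self.pending = 0            # interrupt waiting register
        self.return_address = 0     # interrupt return address register
        self.idle = SECOND_VALUE    # interrupt idle state register (idle)

    def request(self, irq):
        """Record an interrupt request; return False if it is masked."""
        bit = 1 << irq
        if self.enable & bit:
            self.pending |= bit     # set the corresponding bit to the second value
            return True
        return False

    def enter(self, pc, regs, stack, handler_entry):
        """Call once the current instruction has finished; returns the new pc."""
        if self.idle != SECOND_VALUE or self.pending == 0:
            return pc               # busy with another interrupt, or nothing pending
        self.return_address = pc
        self.idle = FIRST_VALUE     # an interrupt is now being processed
        stack.append(list(regs))    # save the general registers
        return handler_entry

    def leave(self, regs, stack, irq):
        """Call when the interrupt function returns; returns the resume pc."""
        regs[:] = stack.pop()       # restore the pre-interrupt context
        self.pending &= ~(1 << irq) # assumed: serviced request is cleared
        self.idle = SECOND_VALUE
        return self.return_address
```

A request that arrives while `idle` holds the first value simply stays in `pending` until `leave` has run, which corresponds to waiting for the current interrupt to finish before processing the next one.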
As an optional implementation manner, the interrupt enable register, the interrupt waiting register, and the interrupt return address register are disposed in a register group management unit of the execution module, and are numbered, addressed, and read and write controlled uniformly with a general register and a program pointer register in the register group management unit.
As an optional implementation manner, the out-of-core device further includes a peripheral register, and the peripheral register, the instruction memory and the data memory are numbered and addressed uniformly.
As an optional implementation, the processor core is a processor core of the fifth-generation reduced instruction set architecture (RISC-V).
In a second aspect, an embodiment of the present invention provides an operation method for a processor core based on a field programmable gate array, where the processor core includes an input/output module, a decoding module, and an execution module;
the input/output module serves as the sole transmission interface through which the processor core connects to out-of-core devices to transmit data and instructions, and performs transmission operations with an out-of-core device according to the address signal of that device, wherein the out-of-core devices comprise an instruction memory and a data memory and are uniformly numbered and uniformly addressed; the decoding module decodes the instruction read by the input/output module from the instruction memory to generate a decoding result; and the execution module performs processing operations according to the decoding result.
As an optional implementation, the input/output module includes an address register, a data valid register, an output data register, a data byte valid register, and an input data register;
performing the transmission operation with the out-of-core device according to the address signal of the out-of-core device includes:
when an output operation is executed, writing the second value into the data valid register as the output data valid signal, writing the address signal of the out-of-core device into the address register, writing the output data into the output data register, and writing the byte valid signal of the output data into the data byte valid register, so that the out-of-core device corresponding to the address signal reads the output data from the output data register according to the byte valid signal;
when an input operation is executed, writing the first value into the data valid register as the output data valid signal, and writing the address signal of the out-of-core device into the address register, so that the out-of-core device corresponding to the address signal writes the input instruction or input data into the input data register. The input operation comprises an instruction read from the instruction memory or a data read from the data memory.
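How an out-of-core device might use the byte valid signal during an output operation can be illustrated with a small helper. The text does not define the mask layout, so this sketch assumes a 32-bit output data register and that bit i of the byte valid signal covers byte lane i (bits 8i to 8i+7); the function name `apply_byte_valid` is hypothetical.

```python
def apply_byte_valid(old_word: int, new_word: int, byte_valid: int) -> int:
    """Merge new_word into old_word, taking only the byte lanes whose mask bit is set."""
    result = old_word
    for lane in range(4):  # assumed 32-bit output data register: four byte lanes
        if byte_valid & (1 << lane):
            lane_mask = 0xFF << (8 * lane)
            result = (result & ~lane_mask) | (new_word & lane_mask)
    return result & 0xFFFFFFFF
```

With this convention a single-byte store sets one mask bit, a half-word store sets two adjacent bits, and a full-word store sets all four.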
As an optional implementation, the processor core controls the instruction processing flow with a finite state machine, whose state types comprise a read instruction state, a decoding state, an operand and instruction type acquisition state, an execution state and a register write-back state;
when the state type of the finite state machine is a read instruction state, controlling the input/output module to carry out read instruction operation;
when the state type of the finite state machine is a decoding state, controlling a decoding module to perform decoding operation;
when the state type of the finite state machine is the state of obtaining the operand and the instruction type, controlling an execution module to carry out operation of obtaining the operand and the instruction type;
when the state type of the finite state machine is an execution state, controlling an execution module to execute an execution operation;
and when the state type of the finite state machine is the register write-back state, controlling the execution module to perform register write-back operation.
As an optional implementation, the execution module performing processing operations according to the decoding result includes: the execution module receives the instruction type, the source register number, the destination register number and the immediate generated by the decoding module, reads the value of the general register designated by the source register number as the source operand, determines the sub-state of the execution state of the finite state machine according to the instruction type, and executes the processing operation corresponding to that sub-state;
the sub-states of the execution state comprise an arithmetic logic operation state, a read operation state, a write operation state, a conditional branch state and an unconditional jump state, which correspond to the arithmetic logic operation, the read operation, the write operation, the conditional branch operation and the unconditional jump operation, respectively.
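For the RISC-V embodiment, the fields that the decoding module hands to the execution module can be illustrated with the standard RV32I base encoding. The opcode values and field positions below are those of the RISC-V base instruction formats; the mapping from opcode to sub-state name is an assumption about how such a core might group instructions, and only the I-type immediate is extracted to keep the sketch short.

```python
# Assumed grouping of RV32I major opcodes into the five execution sub-states.
SUB_STATE_BY_OPCODE = {
    0x33: "arithmetic logic",    # OP (register-register)
    0x13: "arithmetic logic",    # OP-IMM (register-immediate)
    0x03: "read",                # LOAD
    0x23: "write",               # STORE
    0x63: "conditional branch",  # BRANCH
    0x6F: "unconditional jump",  # JAL
    0x67: "unconditional jump",  # JALR
}


def decode(instr: int) -> dict:
    """Split a 32-bit RV32I instruction word into the fields used for execution."""
    opcode = instr & 0x7F
    imm_i = (instr >> 20) & 0xFFF
    if imm_i & 0x800:            # sign-extend the 12-bit I-type immediate
        imm_i -= 0x1000
    return {
        "sub_state": SUB_STATE_BY_OPCODE.get(opcode, "unsupported"),
        "rd": (instr >> 7) & 0x1F,    # destination register number
        "rs1": (instr >> 15) & 0x1F,  # first source register number
        "rs2": (instr >> 20) & 0x1F,  # second source register number
        "imm_i": imm_i,               # meaningful for OP-IMM, LOAD and JALR only
    }
```

For example, `decode(0x00500093)` corresponds to `addi x1, x0, 5` and `decode(0x0080A103)` to `lw x2, 8(x1)`.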
As an optional implementation, the execution module includes a register group management unit, a register write-back unit, an arithmetic logic unit and an input/output management unit, and the register group management unit is provided with a general register and a program pointer register;
executing the processing operation corresponding to the sub-state includes:
when the arithmetic logic operation is executed, the source operand and the immediate are sent to the arithmetic logic unit for operation, and the operation result is stored in the general register designated by the destination register number;
when the conditional branch operation is executed, the source operand and the immediate are sent to the arithmetic logic unit for operation, the next jump offset of the program pointer is determined according to the operation result, the value of the program pointer is read from the program pointer register, the arithmetic logic unit calculates the post-jump value of the program pointer from the current value and the jump offset, and the register write-back unit writes the post-jump value back to the program pointer register;
when the unconditional jump operation is executed, a source operand or the immediate is taken as the next jump offset of the program pointer, the value of the program pointer is read from the program pointer register, the arithmetic logic unit calculates the post-jump value of the program pointer from the current value and the jump offset, and the register write-back unit writes the post-jump value back to the program pointer register;
when the write operation is executed, the input/output management unit writes the second value into the data valid register of the input/output module as the output data valid signal, writes the address signal of the out-of-core device into the address register of the input/output module, writes the output data into the output data register of the input/output module, and writes the byte valid signal of the output data into the data byte valid register of the input/output module, so that the out-of-core device corresponding to the address signal reads the output data from the output data register according to the byte valid signal; the operation result returned by the input/output module is stored in the general register designated by the destination register number;
when the read operation is executed, the address signal of the out-of-core device is written into the address register of the input/output module, so that the out-of-core device corresponding to the address signal writes the input instruction or input data into the input data register of the input/output module; the operation result returned by the input/output module is stored in the general register designated by the destination register number.
As an optional implementation, the processor core further includes an interrupt management module, which includes an interrupt enable register, an interrupt waiting register, an interrupt return address register and an interrupt idle state register;
the operation method further comprises the following steps:
upon receiving an interrupt request signal, comparing the interrupt request signal with the value in the interrupt enable register to judge whether the interrupt request signal is enabled; if not, masking the interrupt request signal; if so, setting the bit of the interrupt waiting register corresponding to the interrupt request signal to the second value;
if the interrupt idle state register indicates that no interrupt request signal is currently being processed, then after execution of the current instruction finishes, storing the current program pointer into the interrupt return address register, setting the interrupt idle state register to the first value, saving the current values of the general registers in the register group management unit of the execution module to the stack space, jumping the program pointer to the interrupt handler entry, and executing the interrupt function corresponding to the value in the interrupt waiting register; after the interrupt function has executed, restoring the pre-interrupt context from the values saved in the stack space, setting the interrupt idle state register to the second value, and jumping to the instruction that follows the point of interruption according to the program pointer held in the interrupt return address register.
As an optional implementation, after setting the bit of the interrupt waiting register corresponding to the interrupt request signal to the second value, the method further includes:
if the interrupt idle state register indicates that an interrupt request signal is currently being processed, waiting until that processing ends and then continuing with the received interrupt request signal.
As an optional implementation manner, the interrupt enable register, the interrupt waiting register, and the interrupt return address register are disposed in a register group management unit of the execution module, and are numbered, addressed, and read and write controlled uniformly with a general register and a program pointer register in the register group management unit.
As an optional implementation manner, the out-of-core device further includes a peripheral register, and the peripheral register, the instruction memory and the data memory are numbered and addressed uniformly.
As an optional implementation, the processor core is a processor core of the fifth-generation reduced instruction set architecture (RISC-V).
Compared with the prior art, the embodiment of the invention has the following beneficial effects:
In this technical scheme, the input/output module serves as the sole transmission interface through which the processor core connects to the out-of-core devices, including the instruction memory and the data memory, to transmit data and instructions, and it performs transmission operations such as reading instructions and reading and writing data on the out-of-core devices according to address signals. The processor core applies a single set of transmission logic to the input/output module, which reduces the use of logic resources and lowers the difficulty of core design; and because only one transmission interface faces outward, the core structure can be simplified, the core area optimized, and the core power consumption reduced.
Drawings
In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.
FIG. 1 is a schematic diagram of a processor core according to an embodiment of the disclosure;
FIG. 2 is a schematic diagram of an input/output module according to an embodiment of the present disclosure;
FIG. 3 is a schematic diagram of a decoding module according to an embodiment of the present invention;
FIG. 4 is a diagram illustrating a state transition of a finite state machine according to an embodiment of the present invention;
FIG. 5 is a schematic diagram of an execution module disclosed in the embodiment of the present invention;
FIG. 6 is a block diagram of an interrupt management module according to an embodiment of the present invention.
Detailed Description
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The FPGA-based processor core is the most important component of the processor; it can be made of monocrystalline silicon and is used for processing various operations.
Referring to fig. 1, a schematic diagram of a processor core according to an embodiment of the disclosure is shown. The processor core based on the field programmable gate array of the embodiment comprises an input/output module 10, a decoding module 20 and an execution module 30; the input/output module 10 is used as a unique transmission interface for uniformly connecting the processor core to the out-core device to transmit data and instructions, and performs transmission operation with the out-core device according to an address signal of the out-core device, where the out-core device includes an instruction memory and a data memory. The transmission interface is uniformly connected with the out-of-core equipment and transmits various data and instructions processed by the execution module 30; the transmission interface does not forward interrupt related signals, which are processed by the interrupt management module 40. The address signal is set in advance for the extra-core device, and the input/output module 10 can interact with the extra-core device of the address signal according to the address signal. The decoding module 20 is configured to perform a decoding operation on the instruction read from the instruction memory by the input/output module 10, so as to generate a decoding result. The execution module 30 is used for performing processing operation according to the decoding result. The execution module 30 performs processing operations corresponding to the decoding results on the various decoding results; in addition, the execution module 30 may perform various operations.
The processor core of the present embodiment may include an interrupt management module 40; as another alternative, the processor core may not include interrupt management module 40, interrupt management module 40 may be located external to the processor core, and interrupt management module 40 sends interrupt related signals from outside the core to program pointer registers 306 and execution module 30 within the core. The interrupt management module 40 is used for processing interrupt related signals, and can be used as an interrupt interface; if the interrupt management module 40 sends an interrupt related signal from outside the core to inside the core, an interrupt interface for receiving the interrupt related signal may be set inside the core. The execution module 30 of the embodiment of the present invention includes a general register 305 and a program pointer register 306, and the general register 305 and the program pointer register 306 are set in the register group management unit 301 of the execution module 30; alternatively, the execution module 30 may not include the general purpose registers 305 and/or the program pointer registers 306, which may be separate modules within the processor core.
An out-of-core device refers to a device external to the processor core; it may be located inside the same processor but outside the processor core, or outside the processor altogether. For example, out-of-core devices include, but are not limited to: instruction memory, data memory, peripheral registers, audio processor, wireless processor, etc. In addition, when the technical scheme is applied to a multi-core processor, the out-of-core devices can also include other processor cores that are arranged in the same processor and exchange data and/or instructions with this processor core. Some of the out-of-core devices may be controlled by the processor to which the processor core belongs, and some may be controlled by other processors.
As an alternative, the instruction memory and the data memory are uniformly numbered and uniformly addressed. As an optional mode, the out-of-core devices further include a peripheral register, and the peripheral register, the instruction memory and the data memory are numbered and addressed in a unified manner. In the prior art, multiple sets of transmission logic control multiple modules that are each connected to different out-of-core devices, and each set of logic operates independently, so the out-of-core devices do not need to be uniformly numbered and addressed. In the present technical scheme, the input/output module 10 is uniformly connected to the out-of-core devices, so the out-of-core devices are uniformly numbered and addressed; this lets the processor core manage them efficiently and improves operating efficiency.
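The unified numbering and addressing described above can be illustrated with a minimal Python sketch. The region names, base addresses and sizes below are hypothetical values chosen for illustration only; the patent does not specify an address map.

```python
from typing import NamedTuple, Optional


class Region(NamedTuple):
    name: str
    base: int
    size: int


# Hypothetical unified address map (assumption: all values are illustrative).
ADDRESS_MAP = [
    Region("instruction_memory", 0x0000_0000, 0x1000),
    Region("data_memory", 0x0000_1000, 0x1000),
    Region("peripheral_registers", 0x0000_2000, 0x0100),
]


def select_device(address: int) -> Optional[str]:
    """Return the out-of-core device that owns the address, or None."""
    for region in ADDRESS_MAP:
        if region.base <= address < region.base + region.size:
            return region.name
    return None
```

Because every device lives in one address space, a single address signal is enough to pick the target device, which is what allows one transmission interface to serve all of them.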
The input/output module 10 serves as the only transmission interface through which the processor core is uniformly connected to the out-of-core devices to transmit data and instructions (or signals), including reading instructions (or program instructions) from an instruction memory outside the core and reading and writing data (or operation data) in a data memory outside the core. It should be noted that a program instruction may be regarded as special data and operation data as general data; both are binary information stored in a memory-related device, so the instructions in the instruction memory and the data in the data memory can collectively be referred to as data. The statement that the input/output module 10 is the only transmission interface for transmitting data and instructions can therefore equally be expressed as: the input/output module 10 is the only data transmission interface through which the processor core is uniformly connected to the out-of-core devices to transmit data. Whether the data handled by the processor core at a given stage is operation data or program instructions belongs to the prior art or can be derived by those skilled in the art from the prior art.
In this embodiment, the processor core may be a processor core of the fifth-generation Reduced Instruction Set Computing (RISC-V) architecture, that is, the processor core applies the RISC-V architecture. RISC-V is an open-source instruction set architecture based on a reduced instruction set, characterized by a simple architecture, modular design and easy porting. Its instruction set is open, and any person or organization can freely use it for processor core design and software development. In this embodiment, the processor core combines the simplicity and full openness of the RISC-V architecture with the programmability and easy extensibility of the FPGA; it has the advantages of simple structure, small area, low power consumption, easy extension and easy porting, and meets a variety of lightweight, deeply embedded data processing requirements.
In addition, the processor core of the embodiment of the invention can also apply reduced instruction set architectures of generations other than the fifth. As an alternative, the processor core according to the embodiment of the present invention may also apply other architectures such as a Complex Instruction Set Computing (CISC) architecture, an Explicitly Parallel Instruction Computing (EPIC) architecture, or a Very Long Instruction Word (VLIW) architecture.
As one embodiment, the operating method of the processor core comprises the following steps: the input/output module 10, serving as the only transmission interface through which the processor core is uniformly connected to the out-of-core devices to transmit data and instructions, performs a transmission operation with an out-of-core device according to the address signal of that device, wherein the out-of-core devices comprise an instruction memory and a data memory; the decoding module 20 performs a decoding operation on the instruction read from the instruction memory by the input/output module 10 to generate a decoding result; the execution module 30 performs a processing operation according to the decoding result.
In the above embodiment of the operating method, the statement that the input/output module 10 serves as the only transmission interface for the processor core to be uniformly connected to the out-of-core devices means that the processor core has only this one transmission interface, the input/output module, through which it is uniformly connected to the out-of-core devices and transmits data and instructions with them. For the transmission operation performed with an out-of-core device according to its address signal, the address signal may be set for the out-of-core device in advance, and interaction takes place with the out-of-core device corresponding to that address signal. In different working phases, the processor core transmits the data and/or instructions corresponding to each phase with different or the same out-of-core devices, including but not limited to reading instructions from the out-of-core instruction memory and reading and writing data in the out-of-core data memory. The instruction memory and the data memory may be implemented by two separate memories, or by the same memory having both instruction and data storage functions. Besides the processing operations performed according to decoding results, the execution module can be used for various other operations. In different working phases, the operating method may only read instructions from the instruction memory through the input/output module, or may also read data from or write data to the data memory before or after reading an instruction; this belongs to the prior art and is not described herein. Embodiments of the present invention do not limit when the input/output module is used to transmit data or instructions.
The input/output module 10 is the only transmission interface through which the processor core is uniformly connected to the out-of-core devices to transmit data and instructions, so only one unified set of transmission logic is needed to control it, whereas the prior art, with several transmission interfaces of different functions, requires several sets of transmission logic to be designed. With only one external transmission interface, the core structure and layout are simplified, the core area is reduced, the input power is reduced, and the core power consumption is lowered.
It should be noted that, in the embodiment of the present invention, names are defined for a plurality of modules, units, and registers, and when implementing the present technical solution, a person skilled in the art may modify the names; the module, unit and register with modified names are within the scope of protection of the present invention as long as the implementation functions of the module, unit and register are the same as or equivalent to the implementation functions of the module, unit and register of the embodiments of the present invention.
Referring to fig. 2, a schematic diagram of an input/output module 10 according to an embodiment of the disclosure is shown. In this embodiment, the input/output module 10 includes an address register 101, a data valid register 102, an output data register 103, a data byte valid register 104, and an input data register 105.
The input/output module 10 is specifically configured as follows. When an output operation is executed, the output data valid signal written in the data valid register 102 is a second value, the address signal of the out-of-core device is written in the address register 101, the output data is written in the output data register 103, and the byte valid signal of the output data is written in the data byte valid register 104, so that the out-of-core device corresponding to the address signal reads the output data from the output data register 103 according to the byte valid signal of the output data. When an input operation is executed, the output data valid signal written in the data valid register 102 is a first value, and the address signal of the out-of-core device is written in the address register 101, so that the out-of-core device corresponding to the address signal writes an input instruction or input data in the input data register 105. The input operation comprises a read instruction operation on the instruction memory or a read data operation on the data memory. Here the second value is 1 and the first value is 0; of course, as an alternative embodiment, the two values may be set differently according to the actual application scenario, for example, the second value is 0 and the first value is 1.
When the output data valid signal is the second value, the out-of-core device corresponding to the address signal is notified that the output data can be read from the output data register 103. When reading the output data, the out-of-core device reads according to the byte valid signal of the output data: for example, where the byte valid signal is 1, the corresponding byte of the output data is read, and where the byte valid signal is 0, the corresponding byte of the output data is not read. When the output data valid signal is the first value, the out-of-core device corresponding to the address signal may write the input data into the input data register 105.
Specifically, the out-of-core device corresponding to the address signal reads the output data from the output data register according to the byte valid signal of the output data stored in the data byte valid register.
The out-of-core device corresponding to the address signal writes the input instruction or the input data into the input data register as follows: when the input operation is a read instruction operation on the instruction memory, the instruction memory corresponding to the address signal writes the input instruction into the input data register; when the input operation is a read data operation on the data memory, the data memory corresponding to the address signal writes the input data into the input data register; in addition, when the input operation is a read data operation on the peripheral register, the peripheral register corresponding to the address signal may write the input data into the input data register.
The input operation includes but is not limited to a read instruction operation on the instruction memory or a read data operation on the data memory; input operations may also include read data operations on other types of out-of-core devices, such as a peripheral register. Each individual input operation may be a read instruction operation or a read data operation. In the course of executing a plurality of input operations, several read instruction operations on the instruction memory may be executed in sequence, several read data operations on the data memory may be executed in sequence, or read instruction operations and read data operations may be interleaved.
The registers of the input/output module 10 have the following functions: the address register 101 is used for storing the address signal of the out-of-core device; the data valid register 102 is used for storing the output data valid signal; the output data register 103 is used for storing output data; the data byte valid register 104 is used for storing the byte valid signal of the output data; the input data register 105 is used for storing input data from the out-of-core device.
The output data valid signal of the second value may be written in the data valid register 102 at the time the output operation is executed, or it may be written before the output operation is executed. Likewise, the output data valid signal of the first value may be written in the data valid register 102 at the time the input operation is executed, or before the input operation is executed.
In this embodiment, the output data is written in the output data register 103 and the byte valid signal of the output data is written in the data byte valid register 104, so that the out-of-core device corresponding to the address signal reads the output data from the output data register 103 according to the byte valid signal of the output data. When a bit of the byte valid signal is valid, the out-of-core device reads the byte of the output data corresponding to that bit; when the bit is invalid, the out-of-core device does not read the corresponding byte. The byte valid signal of the output data thus provides masking/enabling information to the out-of-core device, which avoids possible erroneous write operations and ensures the accuracy of the operation.
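As a hedged illustration of the byte masking described above, the following Python sketch shows how an out-of-core device might merge a 32-bit output word into a stored word under a 4-bit byte valid signal. The function name and the assumption that bit i of the byte valid signal governs byte i counted from the least significant byte are illustrative; the patent does not state the bit ordering.

```python
def apply_byte_valid(stored: int, output_data: int, byte_valid: int) -> int:
    """Merge output_data into stored, taking only the bytes whose valid bit is 1.

    Assumption: bit i of byte_valid controls byte i, least significant first.
    """
    result = stored
    for i in range(4):
        if (byte_valid >> i) & 1:
            mask = 0xFF << (8 * i)
            # Replace byte i of the stored word with byte i of the output data.
            result = (result & ~mask) | (output_data & mask)
    return result & 0xFFFF_FFFF
```

With a byte valid signal of `0b0001` only the lowest byte of the stored word changes, which is the behavior a byte-wide store needs; with `0b1111` the whole word is replaced.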
In this embodiment, 32-bit RISC-V instructions are applied, and the input/output module 10 is a 32-bit transmission interface; of course, the bit width of the input/output module 10 may be set according to the actual application scenario, with the module interface signals changing correspondingly. The input/output module 10 determines the currently interacting out-of-core device according to the address signal; if it is interacting with the instruction memory, it sends the instruction that was read to the decoding module 20, and if it is interacting with the data memory, it transmits data between the data memory and the execution module 30.
The interface signals of the input/output module 10 of this embodiment are as follows:
Address signal: the bit width is 32; it provides the read and write address for the targeted out-of-core device, such as an out-of-core storage device or a device external to the processor.
Output data valid signal: the bit width is 1. After the output data and the address signal of an output operation are ready, the output data valid signal is set to 1 to notify the out-of-core device corresponding to the address signal that it may read the output data. In one implementation, the output data valid signal is automatically set to 0 after the output operation is finished; in another implementation, the output data valid signal is set to 0 when data is to be input. When the signal is set to 0, the out-of-core device is notified that an input operation can be performed.
Output data signal: the bit width is 32; it is the output data to be written to the targeted out-of-core device.
Byte valid signal of output data: the bit width is 4. The 32 bits of output data are divided into 4 bytes, and the 4 bits of the byte valid signal indicate the validity of those 4 bytes, each bit corresponding to 1 byte of the output data. When a bit of the byte valid signal is 1, the corresponding byte of the output data is valid and the out-of-core device needs to read that byte from the output data signal; when a bit is 0, the corresponding byte of the output data is invalid and the out-of-core device does not need to read that byte.
Input data valid signal: the bit width is 1. When the targeted out-of-core device has the input data ready, it sets the input data valid signal to 1, informing the processor core to read the data in the input data register 105.
Input data signal: the bit width is 32; it is the input data transmitted from the targeted out-of-core device.
As an embodiment, the input/output module 10 performing a transmission operation with the out-of-core device according to an address signal of the out-of-core device includes the following. When an output operation is executed, the output data valid signal written in the data valid register 102 is a second value, the address signal of the out-of-core device is written in the address register 101, the output data is written in the output data register 103, and the byte valid signal of the output data is written in the data byte valid register 104, so that the out-of-core device corresponding to the address signal reads the output data from the output data register 103 according to the byte valid signal of the output data. When an input operation is executed, the output data valid signal written in the data valid register 102 is a first value, and the address signal of the out-of-core device is written in the address register 101, so that the out-of-core device corresponding to the address signal writes an input instruction or input data in the input data register 105. The input operation comprises a read instruction operation on the instruction memory or a read data operation on the data memory.
In this embodiment, by setting the value of the data valid register 102, the address register 101, the data valid register 102, the output data register 103, the data byte valid register 104 and the input data register 105 work together so that either an output operation or an input operation is executed at any given time. The interaction with the out-of-core devices is thus time-division multiplexed, covering reading instructions from the instruction memory, reading and writing data in the data memory, and also reading and writing data in the peripheral register, so that the occupation of logic resources is significantly reduced and the core area is optimized.
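The cooperation of the five registers can be sketched as a small behavioral model in Python. This is an illustrative sketch rather than the patented hardware: the `devices` dictionary is a hypothetical byte-addressed stand-in for all out-of-core devices, little-endian byte order is assumed, and the device-side behavior is folded into the same class for brevity.

```python
from dataclasses import dataclass, field
from typing import Dict

FIRST_VALUE = 0   # output data valid signal during an input operation
SECOND_VALUE = 1  # output data valid signal during an output operation


@dataclass
class IOModule:
    # Hypothetical byte-addressed stand-in for every out-of-core device.
    devices: Dict[int, int] = field(default_factory=dict)
    address: int = 0                 # address register 101
    output_valid: int = FIRST_VALUE  # data valid register 102
    output_data: int = 0             # output data register 103
    byte_valid: int = 0              # data byte valid register 104
    input_data: int = 0              # input data register 105

    def output_operation(self, address: int, data: int, byte_valid: int) -> None:
        self.output_valid = SECOND_VALUE
        self.address = address
        self.output_data = data & 0xFFFF_FFFF
        self.byte_valid = byte_valid & 0xF
        # Device side: take only the bytes whose valid bit is 1.
        for i in range(4):
            if (self.byte_valid >> i) & 1:
                self.devices[address + i] = (self.output_data >> (8 * i)) & 0xFF

    def input_operation(self, address: int) -> int:
        self.output_valid = FIRST_VALUE
        self.address = address
        # Device side: place a 32-bit word in the input data register.
        self.input_data = sum(
            self.devices.get(address + i, 0) << (8 * i) for i in range(4)
        )
        return self.input_data
```

The same object handles an instruction fetch, a data read and a data write one after another, which mirrors the time-division use of a single interface.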
Referring to fig. 3, a decoding module 20 according to an embodiment of the present invention is shown. The decoding module 20 includes an input data register 201, an instruction type table and an immediate decoding logic module 202, and the decoding module 20 is configured to receive an instruction read from the instruction memory through the input output module 10, generate a decoding result such as an instruction type, a source register number, a destination register number and an immediate, and send the decoding result to the execution module 30.
In this embodiment, the RISC-V instruction set is used, the decoding module 20 performs decoding operation on the read instruction according to the encoding rule of the RISC-V instruction set, divides the read 32-bit instruction according to the encoding rule, determines the meaning and decoding mode of each code segment according to the content of the operation code, and performs decoding and recombination, thereby extracting information such as the instruction type, the source register number, the destination register number, the immediate number, and the like. The decode module 20 and the input-output module 10 may each have their own independent input data register.
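The field splitting described above can be sketched in Python using the standard RV32I base instruction formats. Only the I, S and B immediate formats are handled here; the U and J formats are omitted for brevity, and the function and key names are illustrative assumptions rather than names from the patent.

```python
def sign_extend(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign = 1 << (bits - 1)
    return (value & (sign - 1)) - (value & sign)


def decode(inst: int) -> dict:
    """Split a 32-bit RV32I instruction into its fields."""
    inst &= 0xFFFF_FFFF
    opcode = inst & 0x7F
    fields = {
        "opcode": opcode,
        "rd": (inst >> 7) & 0x1F,      # destination register number
        "funct3": (inst >> 12) & 0x7,
        "rs1": (inst >> 15) & 0x1F,    # first source register number
        "rs2": (inst >> 20) & 0x1F,    # second source register number
        "imm": None,
    }
    if opcode in (0x13, 0x03, 0x67):   # I-type: OP-IMM, LOAD, JALR
        fields["imm"] = sign_extend(inst >> 20, 12)
    elif opcode == 0x23:               # S-type: STORE
        imm = (((inst >> 25) & 0x7F) << 5) | ((inst >> 7) & 0x1F)
        fields["imm"] = sign_extend(imm, 12)
    elif opcode == 0x63:               # B-type: BRANCH
        imm = (
            (((inst >> 31) & 0x1) << 12)
            | (((inst >> 7) & 0x1) << 11)
            | (((inst >> 25) & 0x3F) << 5)
            | (((inst >> 8) & 0xF) << 1)
        )
        fields["imm"] = sign_extend(imm, 13)
    return fields
```

For example, `decode(0x00500093)` (the encoding of `addi x1, x0, 5`) yields destination register 1, source register 0 and immediate 5.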
Fig. 4 is a schematic diagram of state transition of a finite state machine according to an embodiment of the present invention. The processor core of the embodiment controls the instruction processing flow by using the finite state machine, whose state types comprise a read instruction state, a decoding state, an operand and instruction type acquisition state, an execution state and a register write-back state. When the state type of the finite state machine is the read instruction state, the input/output module 10 is controlled to perform a read instruction operation; when it is the decoding state, the decoding module 20 is controlled to perform a decoding operation; when it is the operand and instruction type acquisition state, the execution module 30 is controlled to acquire the operand and the instruction type, and this operation can also acquire information such as the source register number and the destination register number; when it is the execution state, the execution module 30 is controlled to perform an execution operation; when it is the register write-back state, the execution module 30 is controlled to perform a register write-back operation.
The embodiment is based on the structure of a finite state machine and is designed and implemented using a hardware description language. The finite state machine has five state types: the read instruction state, the decoding state, the operand and instruction type acquisition state, the execution state and the register write-back state. The read instruction state is executed by the input/output module 10, the decoding state is executed by the decoding module 20, and the three states of operand and instruction type acquisition, execution and register write-back are executed by the execution module 30. This architecture avoids the use of synchronization logic function modules between modules and can improve the degree of coupling between modules, thereby significantly reducing the logic resource usage of the processor core.
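A minimal Python sketch of the five state types and the module responsible for each follows. The strictly linear cycle from one state to the next is an assumption made for illustration; the actual transitions are those shown in fig. 4, which may include additional paths.

```python
from enum import Enum, auto


class State(Enum):
    READ_INSTRUCTION = auto()
    DECODE = auto()
    GET_OPERAND_AND_TYPE = auto()
    EXECUTE = auto()
    REGISTER_WRITE_BACK = auto()


# Module that carries out the work of each state, as described in the text.
OWNER = {
    State.READ_INSTRUCTION: "input/output module 10",
    State.DECODE: "decoding module 20",
    State.GET_OPERAND_AND_TYPE: "execution module 30",
    State.EXECUTE: "execution module 30",
    State.REGISTER_WRITE_BACK: "execution module 30",
}

ORDER = list(State)  # Enum members iterate in definition order


def next_state(state: State) -> State:
    """Advance to the following state, wrapping back to READ_INSTRUCTION."""
    return ORDER[(ORDER.index(state) + 1) % len(ORDER)]
```

Since exactly one state is active at a time, only one module is working on the instruction in any given state, which is why no extra synchronization logic between the modules is needed.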
Referring to fig. 5, an execution module 30 according to an embodiment of the disclosure is shown. The execution module 30 of the present embodiment includes a register group management unit 301, a register write-back unit 302, an arithmetic logic unit 303, and an input-output management unit 304, and the register group management unit 301 is provided with a general-purpose register 305 and a program pointer register 306.
The execution module 30 is specifically configured to receive the instruction type, the source register number, the destination register number and the immediate generated by the decoding module 20, obtain the value at the corresponding position in the general register 305 according to the source register number as a source operand, determine a sub-state of the execution state of the finite state machine according to the instruction type, and execute the processing operation corresponding to the sub-state. The sub-states of the execution state include an arithmetic logic operation state, a read operation state, a write operation state, a conditional branch state and an unconditional jump state, which correspond respectively to the arithmetic logic operation, the read operation, the write operation, the conditional branch operation and the unconditional jump operation.
As an optional implementation, executing the processing operation corresponding to the sub-state includes:
when executing the arithmetic logic operation, the source operand and the immediate are transmitted to the arithmetic logic unit 303 to be operated, and the operated result is stored in the general register 305 corresponding to the destination register number;
when executing the conditional branch operation, the source operand and the immediate are transmitted to the arithmetic logic unit 303 for operation, the next jump offset of the program pointer is determined according to the operation result, the value of the program pointer in the program pointer register 306 is read, the value of the program pointer after the jump is calculated by the arithmetic logic unit 303 according to the value of the program pointer and the jump offset, and the value of the program pointer after the jump is written back to the program pointer register 306 through the register write-back unit 302;
when executing unconditional jump operation, taking the source operand or immediate as the next jump offset of the program pointer, reading the value of the program pointer in the program pointer register 306, calculating the value of the program pointer after jumping by the arithmetic logic unit 303 according to the value of the program pointer and the jump offset, and writing the value of the program pointer after jumping back into the program pointer register 306 through the register write-back unit 302;
when a write operation is performed, the input/output management unit 304 sets the output data valid signal written in the data valid register 102 of the input/output module 10 to the second value, writes the address signal of the out-of-core device in the address register 101 of the input/output module 10, writes the output data in the output data register 103 of the input/output module 10, and writes the byte valid signal of the output data in the data byte valid register 104 of the input/output module 10, so that the out-of-core device corresponding to the address signal reads the output data from the output data register 103 according to the byte valid signal of the output data; the operation result returned by the input/output module 10 is stored in the general register 305 corresponding to the destination register number;
when a read operation is performed, the address signal of the out-of-core device is written in the address register 101 of the input/output module 10, so that the out-of-core device corresponding to the address signal writes an input instruction or input data in the input data register 105 of the input/output module 10; the operation result returned by the input/output module 10 is stored in the general register 305 corresponding to the destination register number.
When a read operation is performed, if the output data valid signal is automatically set to 0 after the end of outputting data, the input/output management unit 304 is not required to set the output data valid signal written in the data valid register 102 of the input/output module 10 to 0 (i.e., the first value); if the output data valid signal is not automatically set to 0, the input/output management unit 304 sets the output data valid signal written in the data valid register 102 of the input/output module 10 to 0.
The arithmetic logic unit 303 calculates the value of the program pointer after the jump according to the value of the program pointer and the jump offset; specifically, the arithmetic logic unit 303 adds the value of the program pointer and the jump offset, and the sum is used as the value of the program pointer after the jump. It should be noted that the operations of each module of the processor core during running, such as generating the instruction type, the source register number, the destination register number and the immediate, and transmitting the source operand and the immediate to the arithmetic logic unit 303 for operation, depend on the instruction set applied.
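The addition of program pointer and jump offset can be sketched in Python as follows. The 32-bit wrap-around and the use of the 4-byte instruction length as the offset of a branch that is not taken are assumptions consistent with 32-bit RISC-V instructions; the patent itself only states that the offset is determined from the operation result.

```python
def branch_offset(condition_met: bool, immediate: int, instruction_bytes: int = 4) -> int:
    """Choose the jump offset of a conditional branch.

    Assumption: a branch that is not taken advances by one instruction.
    """
    return immediate if condition_met else instruction_bytes


def next_program_pointer(program_pointer: int, jump_offset: int) -> int:
    """Add the jump offset to the program pointer, wrapping to 32 bits."""
    return (program_pointer + jump_offset) & 0xFFFF_FFFF
```

For instance, a taken branch with immediate 8 at program pointer `0x100` yields `0x108`, while the same branch not taken yields `0x104` under the stated assumption.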
Determining the sub-state of the execution state of the finite-state machine according to the instruction type, specifically, establishing an instruction type table in advance, and inquiring the sub-state of the execution state of the finite-state machine corresponding to the instruction type according to the instruction type table; the instruction type may further include sub-state information of the execution state, and the sub-state of the execution state of the finite state machine may be determined according to the sub-state information included in the instruction type.
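An instruction type table of this kind can be sketched in Python as a lookup keyed on the RV32I major opcode. The opcode values are those of the RISC-V base instruction set; the assignment of each opcode to one of the five sub-states and all names are illustrative assumptions, since the patent does not list the table contents.

```python
from enum import Enum
from typing import Optional


class SubState(Enum):
    ARITHMETIC_LOGIC = "arithmetic logic operation"
    READ = "read operation"
    WRITE = "write operation"
    CONDITIONAL_BRANCH = "conditional branch operation"
    UNCONDITIONAL_JUMP = "unconditional jump operation"


# Hypothetical instruction type table keyed on the 7-bit major opcode.
INSTRUCTION_TYPE_TABLE = {
    0x13: SubState.ARITHMETIC_LOGIC,    # OP-IMM (register-immediate)
    0x33: SubState.ARITHMETIC_LOGIC,    # OP (register-register)
    0x03: SubState.READ,                # LOAD
    0x23: SubState.WRITE,               # STORE
    0x63: SubState.CONDITIONAL_BRANCH,  # BRANCH
    0x6F: SubState.UNCONDITIONAL_JUMP,  # JAL
    0x67: SubState.UNCONDITIONAL_JUMP,  # JALR
}


def sub_state_for(inst: int) -> Optional[SubState]:
    """Look up the execution sub-state for a 32-bit instruction word."""
    return INSTRUCTION_TYPE_TABLE.get(inst & 0x7F)
```

A table lookup keeps the selection logic flat: adding an instruction class means adding one entry rather than another branch in the control logic.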
In this embodiment, as shown in fig. 4, the execution state is divided into five sub-states; the sub-state of the execution state of the finite state machine is determined according to the instruction type, and the operation corresponding to the sub-state is executed, so that the processing logic of the execution module 30 is optimized and the logic resource usage of the processor core is significantly reduced. It is noted that, as an alternative embodiment, the finite state machine may have more than five state types or fewer than five; the execution state of the finite state machine may likewise be divided into more than five sub-states or fewer than five.
The execution module 30 is a core module of the processor core, and is responsible for executing various processing operations such as the arithmetic logic operation, the conditional branch operation and the unconditional jump operation, and for completing the read operation and the write operation through the input/output module 10. The execution module 30 also processes interrupt-related instructions, such as an interrupt return instruction; when the interrupt return instruction is executed, a program jump operation is performed to jump the program to the address of the instruction following the address value held in the interrupt return address register 403, and an interrupt return signal is generated to notify the interrupt control module that the current interrupt processing is finished.
As an embodiment, the execution module 30 performs processing operations according to the decoding result as follows. The execution module 30 receives the instruction type, the source register number, the destination register number and the immediate generated by the decoding module 20, obtains the value at the position in the general register 305 corresponding to the source register number as a source operand, determines a sub-state of the execution state of the finite state machine according to the instruction type, and executes the processing operation corresponding to that sub-state. The sub-states of the execution state include an arithmetic logic operation state, a read operation state, a write operation state, a conditional branch state and an unconditional jump state, which correspond respectively to the arithmetic logic operation, the read operation, the write operation, the conditional branch operation and the unconditional jump operation.
When the arithmetic logic operation is executed, the source operand and the immediate are transmitted to the arithmetic logic unit 303 for operation, and the operation result is stored in the general register 305 corresponding to the destination register number.
When the conditional branch operation is executed, the source operand and the immediate are transmitted to the arithmetic logic unit 303 for operation, the next jump offset of the program pointer is determined according to the operation result, the value of the program pointer in the program pointer register 306 is read, the arithmetic logic unit 303 calculates the value of the program pointer after the jump from the value of the program pointer and the jump offset, and the value of the program pointer after the jump is written back to the program pointer register 306 through the register write-back unit 302.
When the unconditional jump operation is executed, the source operand or the immediate is taken as the next jump offset of the program pointer, the value of the program pointer in the program pointer register 306 is read, the arithmetic logic unit 303 calculates the value of the program pointer after the jump from the value of the program pointer and the jump offset, and the value of the program pointer after the jump is written back to the program pointer register 306 through the register write-back unit 302.
When the write operation is executed, the input/output management unit 304 sets the output data valid signal written in the data valid register 102 of the input/output module 10 to the second value, writes the address signal of the out-of-core device in the address register 101 of the input/output module 10, writes the output data in the output data register 103 of the input/output module 10, and writes the byte valid signal of the output data in the data byte valid register 104 of the input/output module 10, so that the out-of-core device corresponding to the address signal reads the output data from the output data register 103 according to the byte valid signal of the output data; the operation result returned by the input/output module 10 is stored in the general register 305 corresponding to the destination register number.
When the read operation is executed, the address signal of the out-of-core device is written in the address register 101 of the input/output module 10, so that the out-of-core device corresponding to the address signal writes an input instruction or input data in the input data register 105 of the input/output module 10; the operation result returned by the input/output module 10 is stored in the general register 305 corresponding to the destination register number.
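As an illustration only, the arithmetic logic, conditional branch and unconditional jump sub-states described above can be sketched in Python as follows. This is a minimal behavioural model, not the patented hardware: the class name `ExecSketch`, the operation names `add`, `sub` and `eq`, the `taken_offset` parameter, and the fall-through step of 4 bytes for a branch that is not taken are all assumptions introduced for this example.

```python
from enum import Enum, auto

MASK32 = 0xFFFFFFFF  # assumption: 32-bit registers, as in the RISC-V embodiment
INSTR_BYTES = 4      # assumption: offset applied when a branch is not taken


class SubState(Enum):
    ALU = auto()     # arithmetic logic operation state
    BRANCH = auto()  # conditional branch state
    JUMP = auto()    # unconditional jump state


class ExecSketch:
    """Hypothetical model of execution module 30 (read/write sub-states omitted)."""

    def __init__(self):
        self.gpr = [0] * 32  # stands in for general register 305
        self.pc = 0          # stands in for program pointer register 306

    def alu(self, op, a, b):
        # Stand-in for arithmetic logic unit 303; the op names are illustrative.
        if op == "add":
            return (a + b) & MASK32
        if op == "sub":
            return (a - b) & MASK32
        if op == "eq":
            return int(a == b)
        raise ValueError(f"unknown ALU op: {op}")

    def execute(self, sub_state, op, rs, rd, imm, taken_offset=0):
        # Source operand is fetched by source register number (None = no source).
        src = self.gpr[rs] if rs is not None else None
        if sub_state is SubState.ALU:
            # Result goes to the general register named by the destination number.
            self.gpr[rd] = self.alu(op, src, imm)
        elif sub_state is SubState.BRANCH:
            # The ALU result decides which jump offset is used.
            result = self.alu(op, src, imm)
            offset = taken_offset if result else INSTR_BYTES
            self.pc = self.alu("add", self.pc, offset)  # write back to the pointer
        elif sub_state is SubState.JUMP:
            # Source operand or immediate is used directly as the offset.
            offset = src if src is not None else imm
            self.pc = self.alu("add", self.pc, offset)
        else:
            raise ValueError(sub_state)
```

In this sketch the new program pointer is always produced by the same `alu` routine that handles ordinary arithmetic, mirroring the description's reuse of the arithmetic logic unit 303 for pointer calculation.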
In the present embodiment, in the processing operations corresponding to the execution sub-states, five sub-states correspond to five operations, and through the cooperation of the register write-back unit 302, the arithmetic logic unit 303, the input/output management unit 304, the register group management unit 301, the general-purpose register 305, and the program pointer register 306, the operations of different sub-states are efficiently completed, the operation logic between units/registers is simplified, and the overall operation efficiency of the kernel is improved.
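The write and read operations through the input/output module 10 can likewise be sketched in software. The following is a hedged illustration, not the actual register-transfer logic: it assumes that the first value is 0 and the second value is 1, that the byte valid signal is a 4-bit mask with one bit per byte lane of a 32-bit word, and that the out-of-core device behaves like a simple word store. The names `IoModule`, `OutOfCoreDevice`, `write_operation` and `read_operation` are hypothetical.

```python
from dataclasses import dataclass, field

FIRST_VALUE, SECOND_VALUE = 0, 1  # assumption: 0 selects input, 1 marks valid output


@dataclass
class IoModule:
    address: int = 0               # address register 101
    data_valid: int = FIRST_VALUE  # data valid register 102
    output_data: int = 0           # output data register 103
    byte_valid: int = 0            # data byte valid register 104
    input_data: int = 0            # input data register 105


@dataclass
class OutOfCoreDevice:
    """Hypothetical out-of-core device modelled as an address-to-word store."""
    words: dict = field(default_factory=dict)

    def service(self, io):
        if io.data_valid == SECOND_VALUE:
            # Output: take only the byte lanes flagged in the byte valid mask.
            word = self.words.get(io.address, 0)
            for lane in range(4):
                if (io.byte_valid >> lane) & 1:
                    mask = 0xFF << (8 * lane)
                    word = (word & ~mask) | (io.output_data & mask)
            self.words[io.address] = word
        else:
            # Input: the device places the addressed word in the input register.
            io.input_data = self.words.get(io.address, 0)


def write_operation(io, device, address, data, byte_valid):
    io.data_valid = SECOND_VALUE
    io.address = address
    io.output_data = data & 0xFFFFFFFF
    io.byte_valid = byte_valid
    device.service(io)


def read_operation(io, device, address):
    io.data_valid = FIRST_VALUE
    io.address = address
    device.service(io)
    return io.input_data
```

For example, writing `0xAABBCCDD` with all four lanes valid and then writing `0x11223344` with only the lowest lane valid leaves the word `0xAABBCC44` at that address in this model.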
Referring to fig. 6, a schematic diagram of an interrupt management module 40 according to an embodiment of the disclosure is shown. The processor core of this embodiment further includes an interrupt management module 40, and the interrupt management module 40 includes an interrupt enable register 401, an interrupt waiting register 402, an interrupt return address register 403, and an interrupt idle state register 404.
The interrupt management module 40 is configured to, when receiving an interrupt request signal, compare the interrupt request signal with the value in the interrupt enable register 401 and determine whether the interrupt request signal is in an enabled state; if not, the interrupt request signal is masked; if so, the value at the position of the interrupt waiting register 402 corresponding to the interrupt request signal is set to the second value. If it is determined from the interrupt idle state register 404 that no interrupt request signal is currently being processed, then after execution of the current instruction is finished, the current program pointer is saved into the interrupt return address register 403, the value of the interrupt idle state register 404 is set to the first value, the current values in the general register 305 of the register group management unit 301 of the execution module 30 are saved to a stack space, the current program pointer jumps to the interrupt processing function entry, and the interrupt function corresponding to the value in the interrupt waiting register 402 is executed. After the interrupt function has been executed, the context before the interrupt is restored from the values saved in the stack space, the value of the interrupt idle state register 404 is set to the second value, and execution jumps to the next instruction before the interrupt according to the program pointer saved in the interrupt return address register 403, thereby completing one interrupt handling operation. Here the second value is 1 and the first value is 0; of course, the second value and the first value may be set to other values according to the actual application scenario.
The value of the interrupt idle state register 404 is 0 if the processor core is currently processing an interrupt, and 1 otherwise. The interrupt idle state register 404 is not accessible to read and write operations by instructions. Specifically, the value of the interrupt idle state register 404 is read, and if the value is 1, it is determined that no interrupt request signal is currently being processed.
As an alternative embodiment, the interrupt management module 40 is further configured to, if it is determined from the interrupt idle state register 404 that an interrupt request signal is currently being processed, wait until processing of the current interrupt request signal ends, and then continue with processing of the received interrupt request signal. Of course, when it is determined that an interrupt request signal is currently being processed, other methods may be used, such as processing the received interrupt request signal preferentially if it is a priority signal.
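The interrupt flow described above can be summarised in a short Python sketch. It is a behavioural illustration under stated assumptions rather than the module's actual logic: the class name `InterruptManager`, the method names, the single `handler_entry` address, and the clearing of the waiting bit when the handler finishes are all hypothetical details not given in the description.

```python
class InterruptManager:
    """Hypothetical model of interrupt management module 40."""

    def __init__(self, handler_entry):
        self.enable = 0       # interrupt enable register 401, one bit per source
        self.waiting = 0      # interrupt waiting register 402
        self.return_addr = 0  # interrupt return address register 403
        self.idle = 1         # interrupt idle state register 404: 1 = idle, 0 = busy
        self.handler_entry = handler_entry  # assumed interrupt function entry address
        self.stack = []       # stands in for the stack space

    def request(self, irq):
        """Compare a request with the enable register; mask it or mark it waiting."""
        bit = 1 << irq
        if not (self.enable & bit):
            return False      # not enabled: the request is masked
        self.waiting |= bit   # enabled: set the corresponding position to 1
        return True

    def enter_if_possible(self, next_pc, gpr):
        """Call once the current instruction has finished; returns the new pointer."""
        if self.waiting == 0 or self.idle != 1:
            return next_pc    # nothing waiting, or an interrupt is already in progress
        self.return_addr = next_pc
        self.idle = 0
        self.stack.append(list(gpr))  # save the general registers
        return self.handler_entry

    def leave(self, gpr, irq):
        """Restore the saved context after the interrupt function has run."""
        gpr[:] = self.stack.pop()
        self.waiting &= ~(1 << irq)   # assumption: waiting bit cleared on completion
        self.idle = 1
        return self.return_addr
```

A request that arrives while `idle` is 0 stays recorded in `waiting` and is only taken up by a later call to `enter_if_possible`, which corresponds to waiting for the current interrupt to finish.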
As an alternative embodiment, the interrupt enable register 401, the interrupt waiting register 402, and the interrupt return address register 403 are provided in the register group management unit 301 of the execution module 30, and are uniformly numbered, uniformly addressed, and uniformly read/write controlled together with the general register 305 and the program pointer register 306 in the register group management unit 301. Corresponding to the RISC-V architecture, the register group management unit 301 is provided with 32 32-bit general registers 305 (x0 to x31) of the RISC-V architecture, one 32-bit program pointer register 306, one 32-bit interrupt enable register 401, one 32-bit interrupt waiting register 402, and one 32-bit interrupt return address register 403; the interrupt management module 40 may support interrupt request management and response for 32 interrupt sources. Because these three registers are managed and controlled through the same numbering and addressing as the other registers in the register group management unit 301, control can be optimized, access speed is increased, and the real-time performance of interrupt management is improved.
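To make the idea of uniform numbering concrete, the sketch below places all of these registers in one indexed array with a single read path and a single write path. The specific register numbers 32 to 35 and the names `RegisterGroup` and `enabled_sources` are assumptions chosen for illustration; the description does not state the actual numbering.

```python
GPR_COUNT = 32
# Hypothetical unified numbers following the 32 general registers.
PC, INT_ENABLE, INT_WAIT, INT_RETURN = 32, 33, 34, 35
MASK32 = 0xFFFFFFFF  # every register in this sketch is 32 bits wide


class RegisterGroup:
    """Hypothetical model of register group management unit 301."""

    def __init__(self):
        self._regs = [0] * (GPR_COUNT + 4)

    def read(self, number):
        # One read path serves general, program pointer and interrupt registers.
        return self._regs[number]

    def write(self, number, value):
        # One write path; values are truncated to 32 bits.
        self._regs[number] = value & MASK32


def enabled_sources(group):
    """List the interrupt source numbers whose enable bit is set (up to 32)."""
    value = group.read(INT_ENABLE)
    return [n for n in range(32) if (value >> n) & 1]
```

Since the interrupt enable register is 32 bits wide, one bit per source gives the 32 interrupt sources mentioned above.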
In this embodiment, the interrupt control module implements real-time management of and response to external and internal interrupt request signals. The interrupt control module includes an interrupt enable register 401, an interrupt waiting register 402, and an interrupt return address register 403 and, using the coding space reserved in the RISC-V architecture, defines a set of interrupt control instructions for managing these three registers: an interrupt enable instruction, an interrupt register read/write instruction, and an interrupt return instruction. Interrupt enable instruction: masks or enables the interrupt with the designated number through a write operation on the interrupt enable register 401; for example, the interrupt request signal may be compared with the value in the interrupt enable register 401 using the interrupt enable instruction. Interrupt register read/write instruction: reads or modifies the value of the interrupt waiting register 402 to implement certain specific functions; for example, the value at the position of the interrupt waiting register 402 corresponding to the interrupt request signal can be set to the second value by the interrupt register read/write instruction. Interrupt return instruction: passes the value of the interrupt return address register 403 to the program pointer to jump back to the location before interrupt handling, and may set the interrupt idle state register 404 to 1, indicating that the processor core can currently receive a new interrupt request; for example, the interrupt return instruction may set the value of the interrupt idle state register 404 to the second value, pass the value of the interrupt return address register 403 to the current program pointer, and jump to the next instruction before the interrupt.
As one embodiment, the method for operating the interrupt control module includes: when receiving an interrupt request signal, comparing the interrupt request signal with the value in the interrupt enable register 401 and determining whether the interrupt request signal is in an enabled state; if not, masking the interrupt request signal; if so, setting the value at the position of the interrupt waiting register 402 corresponding to the interrupt request signal to the second value; if it is determined from the interrupt idle state register 404 that no interrupt request signal is currently being processed, then after execution of the current instruction is finished, saving the current program pointer into the interrupt return address register 403, setting the value of the interrupt idle state register 404 to the first value, saving the current values in the general register 305 of the register group management unit 301 of the execution module 30 to a stack space, jumping the current program pointer to the interrupt processing function entry, and executing the interrupt function corresponding to the value in the interrupt waiting register 402; after the interrupt function has been executed, restoring the context before the interrupt from the values saved in the stack space, setting the value of the interrupt idle state register 404 to the second value, and jumping to the next instruction before the interrupt according to the program pointer saved in the interrupt return address register 403.
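The effect of the three interrupt control instructions on the registers can be sketched as plain functions. The patent text gives no mnemonics or encodings, so the function names `int_enable`, `int_reg_read`, `int_reg_write` and `int_return`, and the container class `IntRegs`, are purely hypothetical; only the register effects follow the description.

```python
class IntRegs:
    """Hypothetical container for the registers touched by the instructions."""

    def __init__(self):
        self.enable = 0       # interrupt enable register 401
        self.waiting = 0      # interrupt waiting register 402
        self.return_addr = 0  # interrupt return address register 403
        self.idle = 1         # interrupt idle state register 404
        self.pc = 0           # program pointer


def int_enable(regs, irq, on):
    # Interrupt enable instruction: enable or mask one numbered interrupt
    # by writing the interrupt enable register.
    bit = 1 << irq
    regs.enable = (regs.enable | bit) if on else (regs.enable & ~bit)


def int_reg_read(regs):
    # Interrupt register read: return the value of the waiting register.
    return regs.waiting


def int_reg_write(regs, value):
    # Interrupt register write: modify the value of the waiting register.
    regs.waiting = value & 0xFFFFFFFF


def int_return(regs):
    # Interrupt return instruction: pass the return address to the program
    # pointer and mark the core as able to accept a new interrupt request.
    regs.pc = regs.return_addr
    regs.idle = 1
```

Note that in this sketch the idle state register is changed only as a side effect of `int_return`, never by a direct read or write function, in line with the statement that instructions cannot read or write it directly.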
In the embodiment, the interrupt enable register 401, the interrupt waiting register 402, the interrupt return address register 403 and the interrupt idle state register 404 of the interrupt control module are controlled, so that the interrupt response speed is improved, the interrupt control logic is simplified, and convenience is provided for embedded development.
The above description is only a preferred embodiment of the present invention. For those skilled in the art, the content of this description should not be construed as limiting the present invention.

Claims (18)

1. A processor core based on a field programmable gate array is characterized by comprising an input/output module, a decoding module and an execution module;
the input and output module is used as a unique transmission interface for uniformly connecting the processor core with the out-of-core equipment to transmit data and instructions, and performs transmission operation with the out-of-core equipment according to an address signal of the out-of-core equipment, wherein the out-of-core equipment comprises an instruction memory and a data memory; the equipment outside the core is uniformly numbered and uniformly addressed;
the decoding module is used for decoding the instruction read by the input and output module from the instruction memory to generate a decoding result;
the execution module is used for processing operation according to the decoding result;
the input and output module comprises an address register, a data valid register, an output data register, a data byte valid register and an input data register;
the input and output module is specifically configured to:
when the output operation is executed, the output data valid signal written in the data valid register is a second value, the address signal of the out-of-core device is written in the address register, the output data is written in the output data register, and the byte valid signal of the output data is written in the data byte valid register, so that the out-of-core device corresponding to the address signal reads the output data from the output data register according to the byte valid signal of the output data;
when the input operation is executed, the output data valid signal written in the data valid register is a first value, and the address signal of the out-of-core device is written in the address register, so that the out-of-core device corresponding to the address signal writes the input instruction or the input data in the input data register.
2. The processor core of claim 1, wherein the processor core controls instruction processing flow using a finite state machine, the state types of the finite state machine including a read instruction state, a decode state, a fetch operand and instruction type state, an execute state, and a register write back state;
when the state type of the finite state machine is a read instruction state, controlling the input/output module to carry out read instruction operation;
when the state type of the finite state machine is a decoding state, controlling a decoding module to perform decoding operation;
when the state type of the finite state machine is the state of obtaining the operand and the instruction type, controlling an execution module to carry out operation of obtaining the operand and the instruction type;
when the state type of the finite state machine is an execution state, controlling an execution module to execute an execution operation;
and when the state type of the finite state machine is the register write-back state, controlling the execution module to perform register write-back operation.
3. The processor core according to claim 1, wherein the execution module is specifically configured to receive an instruction type, a source register number, a destination register number, and an immediate number generated by the decoding module, obtain a value of a corresponding position in the general purpose register corresponding to the source register number as a source operand, determine a sub-state of an execution state of the finite state machine according to the instruction type, and execute a processing operation corresponding to the sub-state;
the sub-states of the execution state include an arithmetic logic operation state, a read operation state, a write operation state, a conditional branch state, and an unconditional jump state, which correspond respectively to an arithmetic logic operation, a read operation, a write operation, a conditional branch operation, and an unconditional jump operation.
4. The processor core of claim 3, wherein the execution module comprises a register bank management unit, a register write back unit, an arithmetic logic unit, and an input output management unit, the register bank management unit having a general purpose register and a program pointer register;
the executing the processing operation corresponding to the sub-state includes:
when the arithmetic logic operation is executed, the source operand and the immediate are transmitted to the arithmetic logic unit for operation, and the operation result is stored in the general register corresponding to the destination register number;
when executing conditional branch operation, transmitting the source operand and the immediate number to an arithmetic logic unit for operation, determining the next jump offset of a program pointer according to the operation result, reading the value of the program pointer in a program pointer register, calculating the value of the program pointer after jumping by the arithmetic logic unit according to the value of the program pointer and the jump offset, and writing the value of the program pointer after jumping back to the program pointer register through a register write-back unit;
when executing unconditional jump operation, taking a source operand or an immediate as the next jump offset of a program pointer, reading the value of the program pointer in a program pointer register, calculating the value of the program pointer after jumping by an arithmetic logic unit according to the value of the program pointer and the jump offset, and writing the value of the program pointer after jumping back into the program pointer register through a register write-back unit;
when the write operation is executed, the input/output management unit sets the output data valid signal written in the data valid register of the input/output module to be a second value, writes the address signal of the out-of-core device in the address register of the input/output module, writes the output data in the output data register of the input/output module, and writes the byte valid signal of the output data in the data byte valid register of the input/output module, so that the address signal corresponds to the out-of-core device to read the output data from the output data register according to the byte valid signal of the output data; storing the operation result returned by the input and output module into a general register corresponding to the number of the destination register;
when the read operation is executed, writing an address signal of the out-of-core equipment into an address register of the input/output module so that the address signal corresponds to the out-of-core equipment and writes an input instruction or input data into an input data register of the input/output module; and storing the operation result returned by the input and output module into the general register corresponding to the destination register number.
5. The processor core of claim 1, wherein the processor core further comprises an interrupt management module, the interrupt management module comprising an interrupt enable register, an interrupt pending register, an interrupt return address register, and an interrupt idle state register;
the interrupt management module is used for comparing the interrupt request signal with the value in the interrupt enable register when receiving the interrupt request signal and judging whether the interrupt request signal is in an enabled state; if not, masking the interrupt request signal; if so, setting the value at the position in the interrupt waiting register corresponding to the interrupt request signal to a second value;
if it is determined from the interrupt idle state register that no interrupt request signal is currently being processed, after the current instruction execution is finished, storing the current program pointer into the interrupt return address register, setting the value of the interrupt idle state register to a first value, storing the current values in the general register of the register group management unit of the execution module into a stack space, jumping the current program pointer to an interrupt processing function entry, and executing the interrupt function corresponding to the value in the interrupt waiting register; after the interrupt function is executed, restoring the context before the interrupt from the values saved in the stack space, setting the value of the interrupt idle state register to the second value, and jumping to the next instruction before the interrupt according to the program pointer saved in the interrupt return address register.
6. The processor core of claim 5, wherein the interrupt management module is further configured to, if it is determined from the interrupt idle status register that an interrupt request signal is currently being processed, wait for the end of processing of the current interrupt request signal, and continue to process the received interrupt request signal after the end of processing.
7. The processor core of claim 5, wherein the interrupt enable register, the interrupt wait register, and the interrupt return address register are located in a register bank management unit of the execution module, and are numbered, addressed, and read and write controlled uniformly with a general register and a program pointer register in the register bank management unit.
8. The processor core of claim 1, wherein: the out-of-core device also includes a peripheral register.
9. The processor core of any of claims 1 to 8, wherein: the processor core is a processor core of a fifth generation reduced instruction set architecture.
10. The operating method of a processor core based on field programmable gate array is characterized in that the processor core comprises an input/output module, a decoding module and an execution module;
the input and output module is used as a unique transmission interface for uniformly connecting the kernel of the processor with the out-of-core equipment to transmit data and instructions, and performs transmission operation with the out-of-core equipment according to an address signal of the out-of-core equipment, wherein the out-of-core equipment comprises an instruction memory and a data memory; the equipment outside the core is uniformly numbered and uniformly addressed;
the decoding module carries out decoding operation on the instruction read by the input and output module from the instruction memory to generate a decoding result;
the execution module carries out processing operation according to the decoding result;
the input and output module comprises an address register, a data valid register, an output data register, a data byte valid register and an input data register;
the performing of the transmission operation with the out-of-core device according to the address signal of the out-of-core device includes:
when the output operation is executed, the output data valid signal written in the data valid register is a second value, the address signal of the out-of-core device is written in the address register, the output data is written in the output data register, and the byte valid signal of the output data is written in the data byte valid register, so that the out-of-core device corresponding to the address signal reads the output data from the output data register according to the byte valid signal of the output data;
when the input operation is executed, the output data valid signal written in the data valid register is a first value, and the address signal of the out-of-core device is written in the address register, so that the out-of-core device corresponding to the address signal writes the input instruction or the input data in the input data register.
11. The method of claim 10, wherein the processor core controls instruction processing flow using a finite state machine, the state types of the finite state machine including a read instruction state, a decode state, a fetch operand and instruction type state, an execute state, and a register write back state;
when the state type of the finite state machine is a read instruction state, controlling the input/output module to carry out read instruction operation;
when the state type of the finite state machine is a decoding state, controlling a decoding module to perform decoding operation;
when the state type of the finite state machine is the state of obtaining the operand and the instruction type, controlling an execution module to carry out operation of obtaining the operand and the instruction type;
when the state type of the finite state machine is an execution state, controlling an execution module to execute an execution operation;
and when the state type of the finite state machine is the register write-back state, controlling the execution module to perform register write-back operation.
12. The method according to claim 10, wherein the executing module performs a processing operation according to the decoding result, and includes: the execution module receives the instruction type, the source register number, the target register number and the immediate number generated by the decoding module, obtains the value of the corresponding position in the general register corresponding to the source register number as a source operand, determines the sub-state of the execution state of the finite state machine according to the instruction type, and executes the processing operation corresponding to the sub-state;
the sub-states of the execution state include an arithmetic logic operation state, a read operation state, a write operation state, a conditional branch state, and an unconditional jump state, which correspond respectively to an arithmetic logic operation, a read operation, a write operation, a conditional branch operation, and an unconditional jump operation.
13. The method of claim 12, wherein the execution module includes a register bank management unit, a register write-back unit, an arithmetic logic unit, and an input-output management unit, the register bank management unit being provided with a general-purpose register and a program pointer register;
the executing the processing operation corresponding to the sub-state includes:
when the arithmetic logic operation is executed, the source operand and the immediate are transmitted to the arithmetic logic unit for operation, and the operation result is stored in the general register corresponding to the destination register number;
when executing conditional branch operation, transmitting the source operand and the immediate number to an arithmetic logic unit for operation, determining the next jump offset of a program pointer according to the operation result, reading the value of the program pointer in a program pointer register, calculating the value of the program pointer after jumping by the arithmetic logic unit according to the value of the program pointer and the jump offset, and writing the value of the program pointer after jumping back to the program pointer register through a register write-back unit;
when executing unconditional jump operation, taking a source operand or an immediate as the next jump offset of a program pointer, reading the value of the program pointer in a program pointer register, calculating the value of the program pointer after jumping by an arithmetic logic unit according to the value of the program pointer and the jump offset, and writing the value of the program pointer after jumping back into the program pointer register through a register write-back unit;
when the write operation is executed, the input/output management unit sets the output data valid signal written in the data valid register of the input/output module to be a second value, writes the address signal of the out-of-core device in the address register of the input/output module, writes the output data in the output data register of the input/output module, and writes the byte valid signal of the output data in the data byte valid register of the input/output module, so that the address signal corresponds to the out-of-core device to read the output data from the output data register according to the byte valid signal of the output data; storing the operation result returned by the input and output module into a general register corresponding to the number of the destination register;
when the read operation is executed, writing an address signal of the out-of-core equipment into an address register of the input/output module so that the address signal corresponds to the out-of-core equipment and writes an input instruction or input data into an input data register of the input/output module; and storing the operation result returned by the input and output module into the general register corresponding to the destination register number.
14. The method of operation of claim 10, wherein the processor core further comprises an interrupt management module, the interrupt management module comprising an interrupt enable register, an interrupt pending register, an interrupt return address register, and an interrupt idle state register;
the operation method further comprises the following steps:
when receiving the interrupt request signal, comparing the interrupt request signal with the value in the interrupt enable register, and judging whether the interrupt request signal is in an enabled state; if not, masking the interrupt request signal; if so, setting the value at the position in the interrupt waiting register corresponding to the interrupt request signal to a second value;
if it is determined from the interrupt idle state register that no interrupt request signal is currently being processed, after the current instruction execution is finished, storing the current program pointer into the interrupt return address register, setting the value of the interrupt idle state register to a first value, storing the current values in the general register of the register group management unit of the execution module into a stack space, jumping the current program pointer to an interrupt processing function entry, and executing the interrupt function corresponding to the value in the interrupt waiting register; after the interrupt function is executed, restoring the context before the interrupt from the values saved in the stack space, setting the value of the interrupt idle state register to the second value, and jumping to the next instruction before the interrupt according to the program pointer saved in the interrupt return address register.
15. The method according to claim 14, wherein after setting the value of the position corresponding to the interrupt request signal in the interrupt waiting register to the second value, the method further comprises:
and if the current interrupt request signal is confirmed to be processed according to the interrupt idle state register, waiting for the end of processing of the current interrupt request signal, and continuously processing the received interrupt request signal after the end of processing.
16. The method according to claim 14, wherein the interrupt enable register, the interrupt wait register, and the interrupt return address register are located in a register bank management unit of the execution module, and are uniformly numbered, uniformly addressed, and uniformly controlled for reading and writing with a general register and a program pointer register in the register bank management unit.
17. The method of operation of claim 10, wherein: the out-of-core device also includes a peripheral register.
18. The operating method according to any one of claims 10 to 17, characterized in that: the processor core is a processor core of a fifth generation reduced instruction set architecture.
CN201910930708.XA 2019-09-29 2019-09-29 Processor core based on field programmable gate array and operation method thereof Active CN110427337B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910930708.XA CN110427337B (en) 2019-09-29 2019-09-29 Processor core based on field programmable gate array and operation method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910930708.XA CN110427337B (en) 2019-09-29 2019-09-29 Processor core based on field programmable gate array and operation method thereof

Publications (2)

Publication Number Publication Date
CN110427337A CN110427337A (en) 2019-11-08
CN110427337B true CN110427337B (en) 2020-01-03

Family

ID=68419099

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910930708.XA Active CN110427337B (en) 2019-09-29 2019-09-29 Processor core based on field programmable gate array and operation method thereof

Country Status (1)

Country Link
CN (1) CN110427337B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111177067B (en) * 2019-12-13 2023-09-19 广东高云半导体科技股份有限公司 System on chip
CN112099853B (en) * 2020-09-17 2021-10-29 广东高云半导体科技股份有限公司 RISC-V processor, FPGA chip and system on chip based on FPGA
CN112463723A (en) * 2020-12-17 2021-03-09 王志平 Method for realizing microkernel array
CN117008972B (en) * 2023-09-27 2023-12-05 武汉深之度科技有限公司 Instruction analysis method, device, computing equipment and storage medium
CN117472637B (en) * 2023-12-27 2024-02-23 苏州元脑智能科技有限公司 Interrupt management method, system, equipment and medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6825698B2 (en) * 2001-08-29 2004-11-30 Altera Corporation Programmable high speed I/O interface
CN102073480B (en) * 2010-12-28 2013-08-07 清华大学 Method for simulating cores of multi-core processor by adopting time division multiplex
CN103150146B (en) * 2013-01-31 2015-11-25 西安电子科技大学 Based on ASIP and its implementation of scalable processors framework
CN107341053B (en) * 2017-06-01 2020-12-15 深圳大学 Heterogeneous multi-core programmable system and memory configuration and programming method of computing unit thereof

Also Published As

Publication number Publication date
CN110427337A (en) 2019-11-08

Similar Documents

Publication Publication Date Title
CN110427337B (en) Processor core based on field programmable gate array and operation method thereof
KR101817397B1 (en) Inter-architecture compatability module to allow code module of one architecture to use library module of another architecture
KR950003552B1 (en) Programmable controller
CN103714039B (en) universal computing digital signal processor
JP2928695B2 (en) Multi-thread microprocessor using static interleave and instruction thread execution method in system including the same
CN102750133B (en) 32-Bit triple-emission digital signal processor supporting SIMD
US20110231616A1 (en) Data processing method and system
CN102150139A (en) Data processing device and semiconductor integrated circuit device
KR20160130324A (en) Instruction for shifting bits left with pulling ones into less significant bits
CN112667289B (en) CNN reasoning acceleration system, acceleration method and medium
CN110488764A (en) A kind of engraving machine motion controller and its engraving equipment and method based on FPGA
CN103793208B (en) The data handling system of vector dsp processor and coprocessor Collaboration
CN104346132A (en) Control device applied to running of intelligent card virtual machine and intelligent card virtual machine
JP2004086837A (en) Data processor
CN112182999B (en) Three-stage pipeline CPU design method based on MIPS32 instruction system
TW201712534A (en) Decoding information about a group of instructions including a size of the group of instructions
JP6094356B2 (en) Arithmetic processing unit
CN111124360B (en) Accelerator capable of configuring matrix multiplication
CN111095197B (en) Code processing method and device
CN116204232A (en) Method and device for expanding data operation bit width
US20210089305A1 (en) Instruction executing method and apparatus
CN108234147B (en) DMA broadcast data transmission method based on host counting in GPDSP
TW201005649A (en) Operating system fast run command
JPH04104350A (en) Micro processor
KR20150081148A (en) Processor and method of controlling the same

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant