WO2017054132A1 - Method for generating address and data processing device - Google Patents

Publication number
WO2017054132A1
WO2017054132A1 PCT/CN2015/091079 CN2015091079W WO2017054132A1 WO 2017054132 A1 WO2017054132 A1 WO 2017054132A1 CN 2015091079 W CN2015091079 W CN 2015091079W WO 2017054132 A1 WO2017054132 A1 WO 2017054132A1
Authority
WO
WIPO (PCT)
Prior art keywords
address
operation instruction
recirculation
area
step value
Prior art date
Application number
PCT/CN2015/091079
Other languages
French (fr)
Chinese (zh)
Inventor
汪涛
张广飞
宋风龙
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to PCT/CN2015/091079 (WO2017054132A1)
Priority to CN201580001436.5A (CN107580700B)
Publication of WO2017054132A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00: Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02: Addressing or allocation; Relocation
    • G06F12/08: Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems

Definitions

  • each instruction needs to explicitly indicate the destination address of the memory access operation, which makes the memory access instruction occupy a large amount of storage space.
  • each instruction needs to take care of the memory address, which also increases the burden on the programmer.
  • the embodiment of the invention provides a method for generating an address and a data processing device, which are used to solve the technical problem that the current memory access instruction occupies a large amount of storage space.
  • a method for generating an address including:
  • the first operation instruction is executed, and the start address of the first recirculation area in the memory is modified by a second step value corresponding to the second recirculation area, to obtain the address corresponding to the next received operation instruction, including:
  • executing the first operation instruction, modifying the start address of the first recirculation area in the memory by using the second step value corresponding to the second recirculation area, and modifying an internal offset address of the first recirculation area by using the first step value corresponding to the first recirculation area, to obtain an address corresponding to the operation instruction received next time.
  • Modifying a starting address of the first re-circulation area in the memory by using a second step value corresponding to the second re-circulation area including:
  • Modifying an internal offset address of the first re-circulation area by using a first step value corresponding to the first re-circulation area including:
  • the address corresponding to the next received operation instruction is obtained, including:
  • the result obtained by taking the modulus with respect to the length of the second recirculation area is added to the initialization address of the cyclic storage area to obtain an address corresponding to the next received operation instruction.
  • the method also includes:
  • Executing the second operation instruction and modifying an internal offset address of the first re-circulation area by using a first step value corresponding to the first re-circulation area to obtain an address corresponding to an operation instruction received next time.
  • the internal offset address of a recirculating area to obtain the address corresponding to the next received operation instruction including:
  • the address pointed by the first pointer corresponding to the first re-circulation area is modified by the first step value corresponding to the first re-circulation area to obtain an address corresponding to the next received operation instruction.
  • the address corresponding to the next received operation instruction is obtained, including:
  • the result obtained by taking the modulus with respect to the length of the second recirculation area is added to the initialization address of the cyclic storage area to obtain an address corresponding to the next received operation instruction.
  • in the seventh possible implementation manner of the first aspect, before the first operation instruction is received, the method further includes:
  • a data processing device including:
  • control unit is configured to:
  • control unit is configured to:
  • control unit is configured to:
  • the result obtained by taking the modulus with respect to the length of the second recirculation area is added to the initialization address of the cyclic storage area to obtain an address corresponding to the next received operation instruction.
  • control unit is also used to:
  • Executing the second operation instruction and modifying an internal offset address of the first re-circulation area by using a first step value corresponding to the first re-circulation area to obtain an address corresponding to an operation instruction received next time.
  • control unit is configured to:
  • the address pointed by the first pointer corresponding to the first re-circulation area is modified by the first step value corresponding to the first re-circulation area to obtain an address corresponding to the next received operation instruction.
  • control unit is configured to:
  • the result obtained by taking the modulus with respect to the length of the second recirculation area is added to the initialization address of the cyclic storage area to obtain an address corresponding to the next received operation instruction.
  • the device further includes a control register, configured to store a length of the first recirculation region, a length of the second recirculation region, a first step value corresponding to the first recirculation region, a second step value corresponding to the second recirculation region, and a start address of the cyclic storage area; the control unit is configured to:
  • initialize, by executing a dedicated instruction, the length of the first recirculation region, the length of the second recirculation region, the first step value, the second step value, and the start address of the cyclic storage area stored in the control register.
  • the memory includes a first re-circulation area and a second re-circulation area.
  • a first operation instruction for the second re-circulation area
  • the start address of the first recirculation area can also be modified by the second step value corresponding to the second recirculation area, so that the address corresponding to the operation instruction received when the next clock cycle arrives can be obtained. That is, the next time an operation instruction for the first recirculation area or the second recirculation area is received, the operation can be performed directly according to the last obtained address. There is no need to carry the destination address in the operation instruction, which reduces the code size of the instruction, avoids the need for a large amount of storage space for storing the code, and also reduces the programming burden on the programmer.
  • the embodiment of the present invention provides a double loop area (i.e., a first recirculation area and a second recirculation area), and the first recirculation area can be cyclically moved within the second recirculation area, so that automatic loop addressing can be realized, the use of the storage area is more flexible, and the utilization of the storage area can also be improved.
  • FIG. 1 is a schematic structural diagram of a data processing device according to an embodiment of the present invention.
  • FIG. 3 is a schematic diagram of an addressing area of a storage space according to an embodiment of the present invention.
  • FIG. 4 is a flowchart of a method for generating an address according to an embodiment of the present invention.
  • the memory 101 is composed of a register file (RF), and each register can correspond to a number.
  • the length of the first re-circulation area is 9, and the register 1-register 15 is the second re-circulation area.
  • the register 1-register 9 in the memory 101 can be used as the first re-circulation area, when the first re-circulation area is overall
  • the step value of the overall movement of the first recirculation region is the step value corresponding to the second recirculation region, referred to, for example, as the second step value
  • register 2-register 10 can be used as the first recirculation area after the first recirculation area is moved, and register 3-register 11 can be used as the first recirculation area after the first recirculation area is moved again, and so on; that is, the starting address of the first recirculation area is variable.
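The sliding window in this example can be sketched as follows. This is a minimal illustration, assuming a second step value of 1 and assuming that register numbers wrap within register 1-register 15; the function name `window_registers` and its parameters are hypothetical and do not come from the source.

```python
def window_registers(start, win_len=9, cir_len=15, base=1):
    """Return the register numbers covered by the first recirculation area.

    start   -- offset of the window inside the second recirculation area
    win_len -- length of the first recirculation area (9 in the example)
    cir_len -- length of the second recirculation area (register 1-register 15)
    base    -- number of the first register in the second recirculation area
    """
    # Wrap-around at the end of the second recirculation area is an assumption.
    return [base + (start + k) % cir_len for k in range(win_len)]


# Assumed second step value of 1: the window moves one register per step.
for move in range(3):
    regs = window_registers(start=move)
    print(f"after {move} move(s): register {regs[0]}-register {regs[-1]}")
```

With these assumed values the three printed windows are register 1-register 9, register 2-register 10, and register 3-register 11, matching the sequence described above.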
  • the memory 101 may include two interfaces for reading data and two interfaces for writing data.
  • the interface for reading data may be represented, for example, as an R interface, and the interface for writing data may be represented, for example, as a W interface. One interface for reading data reads data cyclically in accordance with a method to be described later in the embodiment of the present invention, and the other interface for reading data reads data acyclically, in a manner also to be described later.
  • the two commands Imm Wr and Imm Rd belong to the operation instruction of the ordinary immediate index type.
  • this operation instruction will be described later; the destination address can be carried in the operation instruction, so data can be read from or written to the memory 101 directly according to the operation instruction.
  • the RCU 1021 is used to generate a read operation instruction to the memory 101, which is commonly controlled by the CR and the operation instructions from the IB 105.
  • RCU1021 is mainly composed of comparator A, comparator B, comparator C, adder D, adder E, adder F, adder G, selector H, selector J, selector K, register I and register S.
  • the DEC in FIG. 2 is the DEC in FIG. 1, that is, another functional module that does not belong to the RCU 1021.
  • one possible working process of the RCU 1021 in FIG. 2 is as follows.
  • after the operation instruction enters the DEC, the DEC performs decoding to generate a corresponding control signal, and the control signal can control the update of the register S and/or the register I.
  • N is an integer greater than 3
  • the entire N-recycle can be represented by 2N+1 data structures, which can also be stored in the CR.
  • these data structures are:
  • the following parameters may be involved:
  • the internal small loop can be moved in the outer major loop, and the moved address growth step value can be expressed as Stride.
  • the step value is referred to as the second step value.
  • the inner small loop can also perform loop indexing.
  • the address growth step value between two sub-memory spaces can be represented as Inc.
  • the step value is called the first step value.
  • the external major loop start address that is, the start address of the second re-circulation area. After it is initialized, it can also be called the initialization address of the cyclic storage area.
  • % represents the modulo operation
  • S = (S + Stride) % CirLen
  • S can be initialized to
  • I = (I + Inc) % WinLen
  • I can be initialized to 0.
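The two update rules above can be sketched in Python. This is only an illustrative model under stated assumptions: the class name `LoopPointers` is hypothetical, the initial value of S is assumed to be 0 (the source text is cut off at that point), and the way S and I are combined in `address` is an assumption, since the text here only states that a result taken modulo the length of the second recirculation area is added to the initialization address.

```python
from dataclasses import dataclass


@dataclass
class LoopPointers:
    start: int    # Start: initialization address of the cyclic storage area
    cir_len: int  # CirLen: length of the second recirculation area
    win_len: int  # WinLen: length of the first recirculation area
    stride: int   # Stride: second step value
    inc: int      # Inc: first step value
    s: int = 0    # S pointer (initial value of 0 is an assumption)
    i: int = 0    # I pointer (initialized to 0, as stated in the text)

    def step_window(self):
        """S = (S + Stride) % CirLen"""
        self.s = (self.s + self.stride) % self.cir_len

    def step_inner(self):
        """I = (I + Inc) % WinLen"""
        self.i = (self.i + self.inc) % self.win_len

    def address(self):
        # Assumed combination of S and I; not confirmed by the source.
        return self.start + (self.s + self.i) % self.cir_len
```

Keeping the two pointers separate mirrors the description: one rule moves the window as a whole, the other moves the index inside the window, and both wrap by a modulo operation.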
  • if an instruction to read data or an instruction to write data is executed cyclically on positions pointed to by continuous physical addresses or continuous logical addresses, the instruction need not carry the destination address and can be addressed cyclically. If the instruction to read data or the instruction to write data is executed in an acyclic manner, the instruction may carry the destination address and reads or writes directly according to the destination address. Therefore, the types of operation instructions in the embodiments of the present invention may differ, and are briefly described below.
  • the internal loop index type of the window is an operation instruction for the first re-circulation area.
  • the operation instruction does not need to indicate the location of the data storage, that is, the destination address is not required to be carried in the operation instruction, and the system automatically addresses and operates in the operation.
  • the address pointed to by the first pointer corresponding to the first recirculation area automatically increases Inc at the next clock cycle.
  • the system can calculate by using formula (1) to obtain the address corresponding to the next operation instruction.
  • Instructions of this type of read data may be represented, for example, as RdI Dest
  • instructions of this type of write data may be represented, for example, as WrI Src.
  • Wr Ri Src writes the data Src to the storage area pointed to by the address Ri.
  • Ri may be a register number, or may be other information for indicating an address;
  • WrS Src: writes Src to the position pointed to by the self-indexing address in the second recirculation area, where the self-indexing address is the address obtained according to formula (1) in the current clock cycle after the last execution of the loop type instruction.
  • the self-index pointer S = (S + Stride) % CirLen
  • the self-index pointer I = (I + Inc) % WinLen;
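How the loop-type instructions could be dispatched is sketched below. This is a hedged illustration, not the patented design: the function `run`, the tuple encoding of instructions, the initial value 0 for S, and the address expression standing in for formula (1) are all assumptions. Only the basic behavior is modeled, in which an I-type instruction advances I and an S-type instruction advances S.

```python
def run(program, mem, start, cir_len, win_len, stride, inc):
    """Execute (opcode, operand) pairs against the list `mem`; return the values read."""
    s = i = 0                              # self-index pointers (S = 0 is assumed)
    out = []
    for op, arg in program:
        addr = start + (s + i) % cir_len   # assumed stand-in for formula (1)
        if op in ("RdI", "RdS"):
            out.append(mem[addr])          # operand names the destination; unused here
        elif op in ("WrI", "WrS"):
            mem[addr] = arg                # operand is the source data
        else:
            raise ValueError(f"unsupported opcode: {op}")
        if op.endswith("I"):
            i = (i + inc) % win_len        # instruction for the first recirculation area
        else:
            s = (s + stride) % cir_len     # instruction for the second recirculation area
    return out


mem = [0] * 16
values = run([("WrI", 7), ("WrI", 8), ("RdI", None), ("RdI", None)],
             mem, start=0, cir_len=15, win_len=2, stride=1, inc=1)
print(values)
```

None of the four instructions carries a destination address; the two reads return the two written values because I wraps back to the start of the two-entry window.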
  • system and “network” are used interchangeably herein.
  • the term “and/or” in this context merely describes an association between associated objects and indicates that three relationships may exist. For example, A and/or B may indicate three situations: A exists alone, both A and B exist, and B exists alone.
  • the character "/" in this article unless otherwise specified, generally indicates that the contextual object is an "or" relationship.
  • an embodiment of the present invention provides a method for generating an address, which may be completed by a control unit.
  • if the control unit is a CU, the method may be completed by the CU.
  • if the control unit includes the WCU 1022 and the RCU 1021, then if the operation instruction is an instruction to read data, the method can be completed by the RCU 1021, and if the operation instruction is an instruction to write data, the method can be completed by the WCU 1022.
  • the flow of the method is described below.
  • Step 401 Receive a first operation instruction for a second recirculation area in the memory 101;
  • Step 402 The first operation instruction is executed, and the start address of the first re-circulation area in the memory 101 is modified by the second step value corresponding to the second re-circulation area to obtain an address corresponding to the next received operation instruction;
  • the first recirculation region is cyclically moved in the second recirculation region according to the second step value.
  • the operation command may carry the destination address, or may not carry the destination address.
  • the first operation instruction does not carry the destination address, that is, the first operation instruction is a cyclic type instruction for the second re-circulation area.
  • the first operation instruction may further carry an address to which the data is directed, that is, indicate where the read data is to be stored; if the first operation instruction is an instruction to write data, the first operation instruction can also carry the address of the data source, that is, indicate where the data to be written comes from.
  • the above are just some possible examples, and the present invention is not limited thereto.
  • the system in the embodiment of the present invention can execute a loop type instruction or a non-loop type instruction, so operation instructions can be divided into different types, and different execution modes are used for different types of operation instructions. After receiving an operation instruction, the control unit may first determine the type information of the operation instruction, that is, determine what type of operation instruction it is.
  • after determining the type of the operation instruction, the control unit can determine the manner of execution corresponding to this type of operation instruction.
  • the first operation instruction may be a loop type instruction
  • the type of the first operation instruction may be a loop index type of the overall movement of the window, that is, a type of operation instruction for the second recirculation area; the execution mode of the first operation instruction is then to execute the external large loop for the second recirculation region.
  • the first operational instruction may be RdS Dest or may be WrS Src.
  • the first operation instruction may be executed: for example, if the first operation instruction is an instruction to read data, data can be read from the location pointed to by the first address, or, if the first operation instruction is an instruction to write data, data can be written to the location pointed to by the first address.
  • the address corresponding to the first operation instruction is the first address.
  • the first address is an initialization address of the cyclic storage area, that is, Start as described above. If the first operation instruction is not the first operation instruction received by the control unit after the system initializes the parameters involved in the cyclic class instruction, that is, before the first operation instruction, the control unit executes other instructions for reading data.
  • the first address is the address obtained by the system in the current clock cycle after the last execution of an operation instruction; the last executed operation instruction may be an operation instruction for the first recirculation area or for the second recirculation area, and may be an instruction to read data or an instruction to write data.
  • the system after each execution of the operation instruction is completed, the system only obtains the new I value and the new S value, but does not update the value of the corresponding register according to the new value, when the next clock cycle arrives (for example, the next time
  • the system updates the corresponding register according to the new I value and the new S value obtained last time, and then obtains the address corresponding to the current operation by calculation.
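The deferred update described here (new values are computed when an instruction finishes, but the registers are only updated when the next clock cycle arrives) can be sketched as follows. The class `DeferredPointers`, its method names, the initial value 0 for both registers, and the offset expression returned by `clock` are assumptions made for illustration.

```python
class DeferredPointers:
    """Hypothetical model of register I and register S with deferred commit."""

    def __init__(self, win_len, cir_len, inc, stride):
        self.win_len, self.cir_len = win_len, cir_len
        self.inc, self.stride = inc, stride
        self.reg_i = self.reg_s = 0    # committed register values (0 is assumed for S)
        self.new_i = self.new_s = 0    # computed after execution, not yet committed

    def execute(self, moves_window):
        """Finish an instruction: compute the new values, leave the registers alone."""
        if moves_window:   # instruction for the second recirculation area
            self.new_s = (self.reg_s + self.stride) % self.cir_len
        else:              # instruction for the first recirculation area
            self.new_i = (self.reg_i + self.inc) % self.win_len

    def clock(self):
        """Next clock cycle: commit the new values, then derive the offset."""
        self.reg_i, self.reg_s = self.new_i, self.new_s
        return (self.reg_s + self.reg_i) % self.cir_len  # assumed combination
```

Separating `execute` from `clock` makes the timing explicit: the registers still hold the old values between the end of one instruction and the arrival of the next clock cycle.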
  • the new internal offset address of the second recirculation area can be obtained, and when the next clock cycle arrives (generally, when the next loop-type operation instruction is received), the start address of the first recirculation area can be modified by the second step value corresponding to the second recirculation area, giving the address corresponding to the operation instruction for the first recirculation area or the second recirculation area. In this way, the next time a loop-type operation instruction is received, the control unit can address it automatically without the destination address having to be carried in the operation instruction.
  • the address corresponding to the next received operation instruction for the first recirculation area or the second recirculation area is obtained (here, the corresponding value is obtained, and the address itself is calculated when the next clock cycle arrives), so that when the next operation instruction is received, the data operation can be performed directly on the location pointed to by the address obtained this time.
  • the operation instruction does not need to carry the destination address, which reduces the code amount of the program to a large extent, does not need to consume too much storage space when storing the code, and also reduces the programming burden of the programmer.
  • the first operation instruction is executed, and the start address of the first recirculation area in the memory 101 is modified by the second step value corresponding to the second recirculation area to obtain the address corresponding to the next received operation instruction, including:
  • executing the first operation instruction, modifying the start address of the first recirculation region in the memory 101 by using the second step value corresponding to the second recirculation region, and modifying the internal offset address of the first recirculation region by using the first step value corresponding to the first recirculation region, to obtain the address corresponding to the next received operation instruction.
  • the first recirculation area may include at least one sub-storage space, and each operation is directed to one of the sub-storage spaces, that is, the first pointer points to only one of the sub-storage spaces at a time; the internal offset address of the first recirculation area may then refer to the address of the sub-storage space to which the first pointer currently points.
  • Modifying the start address of the first recirculation region in the memory 101 by using the second step value corresponding to the second recirculation region including:
  • Modifying the internal offset address of the first recirculation region by the first step value corresponding to the first recirculation region, including:
  • the address pointed by the first pointer corresponding to the first re-circulation area is modified by the first step value corresponding to the first re-circulation area.
  • obtaining an address corresponding to the next received operation instruction including:
  • the result obtained by modulo the length of the second recirculation area is added to the initialization address of the cyclic storage area, and the address corresponding to the next received operation instruction is obtained.
  • a new I value and a new S value can be obtained.
  • the system can update the values of the register I and the register S according to the new I value and the new S value obtained after executing the first operation instruction, and can obtain the address corresponding to the current operation instruction according to the updated register values. Evidently, the address corresponding to an operation instruction is actually determined after the last execution of an operation instruction. For example, when the system obtains the address corresponding to the current operation instruction, it can be calculated according to formula (1).
  • the first operation instruction is an instruction for the second re-circulation area
  • the second operation instruction is an instruction for the first recirculation area, that is, the first operation instruction and the second operation instruction are both cyclic type instructions that merely target different storage spaces.
  • the second operation instruction may be an instruction for reading data, such as RdI Dest, or the second operation instruction may be an instruction for writing data, such as WrI Src.
  • the address corresponding to the second operation instruction is the second address.
  • the second address is an initialization address of the cyclic storage area, that is, Start as described above. If the second operation instruction is not the first operation instruction received by the control unit after the system initializes the parameters involved in the cyclic class instruction, that is, before the second operation instruction, the control unit executes other instructions for reading data.
  • the second address is the address obtained by the system in the current clock cycle after the last execution of an operation instruction; the last executed operation instruction may be an operation instruction for the first recirculation area or for the second recirculation area, and may be an instruction to read data or an instruction to write data.
  • when the second operation instruction is received, the corresponding register may be directly updated according to the I value and the S value obtained after the last execution of an operation instruction; the second address is then calculated from the value of the updated register and the stored initialization address of the cyclic storage area, and the data operation is performed on the location pointed to by the second address.
  • the data operation performed on the location pointed to by the address may be reading data from the location pointed to by the address, or writing data to the location pointed to by the address. That is, the second operation instruction does not need to carry the destination address, and the system can operate according to the previously obtained address (i.e., the second address).
  • the address corresponding to the next received operation instruction for the first recirculation area or the second recirculation area is obtained (here, the corresponding value is obtained, and the address itself is calculated when the next clock cycle arrives), so that when the next operation instruction is received, the data operation can be performed directly on the location pointed to by the address obtained this time.
  • the operation instruction does not need to carry the destination address, which reduces the code amount of the program to a large extent, does not need to consume too much storage space when storing the code, and also reduces the programming burden of the programmer.
  • the internal offset address of the first re-circulation area is modified by the first step value corresponding to the first re-circulation area, to obtain an address corresponding to the next received operation instruction, include:
  • the address in the embodiment of the present invention is a result after modulo.
  • An example is as follows.
  • read-data instruction 1 for the first recirculation area is received. According to the I value and the S value obtained after the last operation instruction was executed, the address corresponding to read-data instruction 1 is the address of the fourth sub-storage space in the first first recirculation area, so the data is to be read from the fourth sub-storage space in the first first recirculation area.
  • the value of the first pointer is incremented by Inc, and the value of the second pointer is determined to be unchanged.
  • the value of the register I is updated according to the value after the first pointer is incremented by Inc.
  • the value of the register S may be unchanged, and the address corresponding to the next received operation instruction for the first recirculation area or the second recirculation area may be obtained: it is the address of the first sub-storage space in the first first recirculation area. Then, the next time an operation instruction for the first recirculation area or the second recirculation area is received, the data that needs to be processed is the data in the first sub-storage space in the first first recirculation area. It can be seen that if the first recirculation area is regarded as a window, and each of the sub-storage spaces is regarded as a small window, the loop operation within the window of the first recirculation area can be realized by performing a modulo operation on the address.
  • the same is true for the second recirculation area, and it is also a process that can implement a cyclic operation.
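The wrap from the fourth sub-storage space back to the first can be traced with a few lines of Python. This is a sketch under assumptions: the window is taken to hold four sub-storage spaces, Inc is assumed to be 1, and the helper `visit_order` is hypothetical.

```python
def visit_order(first, win_len, inc, count):
    """List the one-based sub-storage spaces visited, starting at `first`."""
    i = first - 1                      # zero-based internal offset
    order = []
    for _ in range(count):
        order.append(i + 1)
        i = (i + inc) % win_len        # I = (I + Inc) % WinLen
    return order


# Start at the fourth of four sub-storage spaces, with an assumed Inc of 1.
print(visit_order(first=4, win_len=4, inc=1, count=3))  # prints [4, 1, 2]
```

The modulo step is what turns the linear increment into a loop inside the window: after the fourth sub-storage space, the next operation lands on the first one again.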
  • the instruction sent is preferably an instruction for the second recirculation region; in this way, the instruction is naturally executed in the second recirculation region, and execution moves to the next first recirculation region to implement sequential execution. This also places certain requirements on the sender of the instruction (for example, a programmer): if sequential execution is to be performed, the sender of the instruction needs to know the position of the address pointed to by the first pointer after each execution.
  • obtaining an address corresponding to the next received operation instruction including:
  • the result obtained by modulo the length of the second recirculation area is added to the initialization address of the cyclic storage area, and the address corresponding to the next received operation instruction is obtained.
  • a new I value and a new S value can be obtained.
  • the system can update the values of the register I and the register S according to the new I value and the new S value obtained after executing the second operation instruction, and can obtain the address corresponding to the current operation instruction according to the updated register values. Evidently, the address corresponding to an operation instruction is actually determined after the last execution of an operation instruction. For example, when the system obtains the address corresponding to the current operation instruction, it can be calculated according to formula (1).
  • the process of receiving and executing the second operation instruction may occur before the process of receiving and executing the first operation instruction, or may occur after the process of receiving and executing the first operation instruction.
  • if the process of receiving and executing the second operation instruction occurs before the process of receiving and executing the first operation instruction, and the system executes the first operation instruction directly after executing the second operation instruction, then the second operation instruction is the last executed operation instruction, and the address corresponding to the next received operation instruction obtained after executing the second operation instruction may be the same address as the first address.
  • if, after executing the second operation instruction, the system executes other operation instructions and then executes the first operation instruction, then the first address is not the same address as the address corresponding to the next received operation instruction obtained after executing the second operation instruction.
  • if the process of receiving and executing the second operation instruction occurs after the process of receiving and executing the first operation instruction, and the system executes the second operation instruction directly after executing the first operation instruction, then the first operation instruction is the last executed operation instruction, and the address corresponding to the next received operation instruction obtained after executing the first operation instruction may be the same address as the second address.
  • if, after executing the first operation instruction, the system executes other operation instructions and then executes the second operation instruction, then the address corresponding to the next received operation instruction obtained after executing the first operation instruction is not the same address as the second address.
  • the process of receiving and executing the second operation instruction occurs before the process of receiving and executing the first operation instruction, and after the execution of the second operation instruction, the system executes the first operation instruction; that is, when the first operation instruction is executed, the second operation instruction is the most recently executed operation instruction.
  • the second operation instruction is an instruction for reading data
  • the second operation instruction is RdI Dest for the first recirculation area
  • the first pointer is I
  • the first step value is Inc.
  • the corresponding address is the address of the first sub-storage space in the first recirculation area, and the first recirculation area includes, for example, four sub-storage spaces.
  • the first operation instruction is an instruction for reading data
  • the first operation instruction is RdS Src for the second re-circulation area
  • the second pointer is S
  • the second step value is Stride.
  • the address corresponding to the operation instruction can be obtained by formula (1), for example, called address 2.
  • the method before receiving the first operation instruction, the method further includes:
  • parameters such as Start, CirLen, WinLen, Inc, and Stride can be initialized before the operation instruction is executed.
  • the dedicated instruction in this embodiment may be, for example, a dedicated configuration instruction, such as SetCR Src as described above, which is not limited by the present invention.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed to multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • the integrated unit if implemented in the form of a software functional unit and sold or used as a standalone product, may be stored in a computer readable storage medium.
  • the computer readable storage medium includes a number of instructions that cause a computer device (which may be a personal computer, a server, or a network device, etc.) or a processor to execute all or part of the steps of the methods described in the various embodiments of the present application.
  • the foregoing storage medium includes various media that can store program codes, such as a USB flash drive, a mobile hard disk, a ROM, a RAM, a magnetic disk, or an optical disk.

Abstract

A method for generating an address and a data processing device relate to the technical field of storage and are used to solve the technical problem that current memory access instructions occupy a large amount of storage space. A memory comprises a first recirculation region and a second recirculation region. When a first operation instruction for the second recirculation region is received, in addition to executing the first operation instruction, a start address of the first recirculation region can be modified by means of a second step value corresponding to the second recirculation region. When the next clock cycle arrives, the address corresponding to the operation instruction received at that time is already available; that is, when an operation instruction for the first recirculation region or the second recirculation region is next received, the operation can be performed directly at the address obtained previously, without carrying a destination address in the operation instruction. Consequently, the code size of the instructions is reduced, a large amount of storage space is not needed to store the code, and the programming burden on programmers is also reduced.

Description

Method for generating an address and data processing device
Technical Field
The present invention relates to the field of storage technologies, and in particular to a method for generating an address and a data processing device.
Background
Today's applications are increasingly powerful and process ever more data; data-intensive applications, for example, have emerged. "Data-intensive" refers to the storage and computation of massive amounts of data, and such applications are found in many fields, including the World Wide Web, scientific computing, and artificial intelligence. Since the beginning of the 21st century in particular, with the rapid development of the mobile Internet, cloud computing, and the Internet of Things, the global volume of information has grown exponentially, and a wave of data-intensive applications is taking shape.
Data-intensive applications access memory frequently, and their access patterns are irregular rather than simply linear. In existing memory access systems, each instruction must explicitly specify the destination address of the memory access operation, so memory access instructions occupy a large amount of storage space. In addition, the programmer must attend to the memory access address of every instruction, which increases the programming burden.
Summary
Embodiments of the present invention provide a method for generating an address and a data processing device, to solve the technical problem that current memory access instructions occupy a large amount of storage space.
According to a first aspect, a method for generating an address is provided, including:
receiving a first operation instruction for a second recirculation region in a memory; and
executing the first operation instruction, and modifying a start address of a first recirculation region in the memory by using a second step value corresponding to the second recirculation region, to obtain the address corresponding to the next received operation instruction, where the first recirculation region moves cyclically within the second recirculation region according to the second step value.
With reference to the first aspect, in a first possible implementation of the first aspect, the executing the first operation instruction, and modifying a start address of a first recirculation region in the memory by using a second step value corresponding to the second recirculation region, to obtain the address corresponding to the next received operation instruction includes:
executing the first operation instruction, modifying the start address of the first recirculation region in the memory by using the second step value corresponding to the second recirculation region, and modifying an internal offset address of the first recirculation region by using a first step value corresponding to the first recirculation region, to obtain the address corresponding to the next received operation instruction.
With reference to the first possible implementation of the first aspect, in a second possible implementation of the first aspect,
the modifying the start address of the first recirculation region in the memory by using the second step value corresponding to the second recirculation region includes:
modifying, by using the second step value corresponding to the second recirculation region, the address pointed to by a second pointer corresponding to the second recirculation region; and
the modifying an internal offset address of the first recirculation region by using a first step value corresponding to the first recirculation region includes:
modifying, by using the first step value corresponding to the first recirculation region, the address pointed to by a first pointer corresponding to the first recirculation region.
With reference to the second possible implementation of the first aspect, in a third possible implementation of the first aspect, the obtaining the address corresponding to the next received operation instruction includes:
taking the address pointed to by the first pointer after the first pointer is increased by one first step value modulo the length of the first recirculation region, and taking the address pointed to by the second pointer after the second pointer is increased by one second step value modulo the length of the second recirculation region;
adding the two modulo results, and taking the sum modulo the length of the second recirculation region; and
adding the result obtained modulo the length of the second recirculation region to the initialization address of a cyclic storage region, to obtain the address corresponding to the next received operation instruction.
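The three steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the circuit of the invention: the function name is hypothetical, and the parameter names follow the symbols used in the detailed description (I and S for the first and second pointers, Inc and Stride for the first and second step values, WinLen and CirLen for the lengths of the first and second recirculation regions, and Start for the initialization address of the cyclic storage region). Writing the wrapped values back into the pointers is also an assumption.

```python
def next_address_after_second_region_op(I, S, Inc, Stride, WinLen, CirLen, Start):
    """Sketch of the address update for an operation on the second recirculation region.

    Returns (new_I, new_S, address). Storing the wrapped values back into the
    pointers is an assumption made for this illustration.
    """
    # Step 1: advance each pointer by its step value and wrap it by its region length.
    new_I = (I + Inc) % WinLen
    new_S = (S + Stride) % CirLen
    # Step 2: add the two results and wrap the sum by the second region's length.
    offset = (new_I + new_S) % CirLen
    # Step 3: add the initialization address of the cyclic storage region.
    return new_I, new_S, Start + offset
```

Because the sum is wrapped by CirLen before Start is added, the resulting address always stays within the CirLen locations that begin at Start, whatever the step values are.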
With reference to the first aspect or any one of the first to the third possible implementations of the first aspect, in a fourth possible implementation of the first aspect, the method further includes:
receiving a second operation instruction for the first recirculation region; and
executing the second operation instruction, and modifying the internal offset address of the first recirculation region by using the first step value corresponding to the first recirculation region, to obtain the address corresponding to the next received operation instruction.
With reference to the first aspect or the fourth possible implementation of the first aspect, in a fifth possible implementation of the first aspect, the modifying the internal offset address of the first recirculation region by using the first step value corresponding to the first recirculation region, to obtain the address corresponding to the next received operation instruction includes:
modifying, by using the first step value corresponding to the first recirculation region, the address pointed to by the first pointer corresponding to the first recirculation region, to obtain the address corresponding to the next received operation instruction.
With reference to the first aspect or the fifth possible implementation of the first aspect, in a sixth possible implementation of the first aspect, the obtaining the address corresponding to the next received operation instruction includes:
taking the address pointed to by the first pointer after the first pointer is increased by one first step value modulo the length of the first recirculation region, adding the result to the address pointed to by the second pointer, and taking the sum modulo the length of the second recirculation region; and
adding the result obtained modulo the length of the second recirculation region to the initialization address of the cyclic storage region, to obtain the address corresponding to the next received operation instruction.
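The computation for an operation on the first recirculation region differs from the second-region case only in that the second pointer is left unchanged. A minimal Python sketch follows, assuming the same symbols as the detailed description (I and S for the first and second pointers, Inc for the first step value, WinLen and CirLen for the two region lengths, Start for the initialization address); the function name and the write-back of the wrapped first pointer are assumptions.

```python
def next_address_after_first_region_op(I, S, Inc, WinLen, CirLen, Start):
    """Sketch of the address update for an operation on the first recirculation region.

    Returns (new_I, address). The second pointer S is read but not modified.
    """
    # Advance the first pointer by the first step value, wrapped by the window length.
    new_I = (I + Inc) % WinLen
    # Add the unchanged second pointer, wrapped by the second region's length.
    offset = (new_I + S) % CirLen
    # Add the initialization address of the cyclic storage region.
    return new_I, Start + offset
```

For instance, with WinLen = 4 and CirLen = 15, a window whose start pointer S is 14 wraps around the end of the second recirculation region: an internal offset of 3 maps to offset (3 + 14) mod 15 = 2 from Start.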
With reference to the first aspect or any one of the first to the sixth possible implementations of the first aspect, in a seventh possible implementation of the first aspect, before the receiving a first operation instruction, the method further includes:
initializing, by executing a dedicated instruction, the length of the first recirculation region, the length of the second recirculation region, the first step value corresponding to the first recirculation region, the second step value corresponding to the second recirculation region, and the start address of the cyclic storage region.
According to a second aspect, a data processing device is provided, including:
a memory, including a first recirculation region and a second recirculation region; and
a control unit, connected to the memory and configured to: receive a first operation instruction for the second recirculation region in the memory; and execute the first operation instruction, and modify a start address of the first recirculation region in the memory by using a second step value corresponding to the second recirculation region, to obtain the address corresponding to the next received operation instruction, where the first recirculation region moves cyclically within the second recirculation region according to the second step value.
With reference to the second aspect, in a first possible implementation of the second aspect, the control unit is configured to:
execute the first operation instruction, modify the start address of the first recirculation region in the memory by using the second step value corresponding to the second recirculation region, and modify an internal offset address of the first recirculation region by using a first step value corresponding to the first recirculation region, to obtain the address corresponding to the next received operation instruction.
With reference to the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the control unit is configured to:
modify, by using the second step value corresponding to the second recirculation region, the address pointed to by a second pointer corresponding to the second recirculation region; and
modify, by using the first step value corresponding to the first recirculation region, the address pointed to by a first pointer corresponding to the first recirculation region.
With reference to the second possible implementation of the second aspect, in a third possible implementation of the second aspect, the control unit is configured to:
take the address pointed to by the first pointer after the first pointer is increased by one first step value modulo the length of the first recirculation region, and take the address pointed to by the second pointer after the second pointer is increased by one second step value modulo the length of the second recirculation region;
add the two modulo results, and take the sum modulo the length of the second recirculation region; and
add the result obtained modulo the length of the second recirculation region to the initialization address of a cyclic storage region, to obtain the address corresponding to the next received operation instruction.
With reference to the second aspect or any one of the first to the third possible implementations of the second aspect, in a fourth possible implementation of the second aspect, the control unit is further configured to:
receive a second operation instruction for the first recirculation region; and
execute the second operation instruction, and modify the internal offset address of the first recirculation region by using the first step value corresponding to the first recirculation region, to obtain the address corresponding to the next received operation instruction.
With reference to the fourth possible implementation of the second aspect, in a fifth possible implementation of the second aspect, the control unit is configured to:
modify, by using the first step value corresponding to the first recirculation region, the address pointed to by the first pointer corresponding to the first recirculation region, to obtain the address corresponding to the next received operation instruction.
With reference to the fifth possible implementation of the second aspect, in a sixth possible implementation of the second aspect, the control unit is configured to:
take the address pointed to by the first pointer after the first pointer is increased by one first step value modulo the length of the first recirculation region, add the result to the address pointed to by the second pointer, and take the sum modulo the length of the second recirculation region; and
add the result obtained modulo the length of the second recirculation region to the initialization address of the cyclic storage region, to obtain the address corresponding to the next received operation instruction.
With reference to the second aspect or any one of the first to the sixth possible implementations of the second aspect, in a seventh possible implementation of the second aspect, the device further includes a control register, configured to store the length of the first recirculation region, the length of the second recirculation region, the first step value corresponding to the first recirculation region, the second step value corresponding to the second recirculation region, and the start address of the cyclic storage region; and the control unit is configured to:
initialize, by executing a dedicated instruction, the length of the first recirculation region, the length of the second recirculation region, the first step value, the second step value, and the start address of the cyclic storage region that are stored in the control register.
In the embodiments of the present invention, the memory includes a first recirculation region and a second recirculation region. When an operation instruction for the second recirculation region (referred to as the first operation instruction) is received, in addition to executing the first operation instruction, the start address of the first recirculation region can be modified by using the second step value corresponding to the second recirculation region. When the next clock cycle arrives, the address corresponding to the operation instruction received at that time is already available. That is, when an operation instruction for the first recirculation region or the second recirculation region is next received, the operation can be performed directly at the address obtained previously, without carrying a destination address in the operation instruction. This reduces the code size of the instructions, avoids spending a large amount of storage space on storing code, and also reduces the programming burden on programmers.
In addition, the embodiments of the present invention provide two recirculation regions (namely, the first recirculation region and the second recirculation region), and the first recirculation region can move cyclically within the second recirculation region. This enables automatic cyclic addressing, makes the use of the storage region more flexible, and can improve the utilization of the storage region.
Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of a data processing device according to an embodiment of the present invention;
FIG. 2 is a schematic structural diagram of an RCU according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the addressing regions of a storage space according to an embodiment of the present invention; and
FIG. 4 is a flowchart of a method for generating an address according to an embodiment of the present invention.
Detailed Description
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The application scenario and hardware architecture of the present invention are introduced first.
Referring to FIG. 1, an embodiment of the present invention provides a data processing device, which can be used to perform the method for generating an address in the embodiments of the present invention. The data processing device may be, for example, a processor such as a CPU (central processing unit), or a sub-module within a processor. The data processing device may include a memory 101 and a control unit.
The memory (also referred to as a storage entity) 101 may be implemented by a storage medium such as an RF (Register File) or an SRAM (Static Random Access Memory); FIG. 1 uses an RF as an example. The memory 101 is used to store data. In the embodiments of the present invention, the storage region of the memory 101 may include a first recirculation region and a second recirculation region. The first recirculation region may include at least one sub-storage space, and the length of the second recirculation region is greater than or equal to the length of the first recirculation region. If the first recirculation region is regarded as a whole, it can move cyclically as a whole within the second recirculation region.
For example, if the memory 101 is formed by an RF, each register may correspond to a number. Suppose the length of the first recirculation region is 9 and registers 1 to 15 form the second recirculation region. Registers 1 to 9 in the memory 101 may then serve as the first recirculation region. When the first recirculation region moves as a whole within the second recirculation region, and the step value of this movement (that is, the step value corresponding to the second recirculation region, referred to as the second step value) is, for example, 1, registers 2 to 10 serve as the first recirculation region after the first move, registers 3 to 11 serve as the first recirculation region after the next move, and so on. In other words, the start address of the first recirculation region is variable.
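The register example above can be reproduced with a short Python sketch. The function and parameter names are hypothetical, and the wrap-around of the window at the end of the second recirculation region is an assumption drawn from the statement that the first recirculation region moves cyclically within the second one.

```python
def window_registers(window_offset, win_len=9, cir_len=15, first_reg=1):
    """Return the register numbers covered by the first recirculation region.

    window_offset is how far the window has moved from the start of the second
    recirculation region (second step value times the number of moves).
    Wrapping at the end of the second region is an assumption of this sketch.
    """
    return [first_reg + (window_offset + k) % cir_len for k in range(win_len)]


# With a second step value of 1, successive moves give offsets 0, 1, 2, ...
positions = [window_registers(move * 1) for move in range(3)]
```

Here positions[0] covers registers 1 to 9, positions[1] covers registers 2 to 10, and positions[2] covers registers 3 to 11, matching the description. Under the wrap-around assumption, an offset of 8 would cover registers 9 to 15 followed by registers 1 and 2.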
If the storage space corresponding to the second recirculation region is all of the available storage space in the memory 101, the start address of the second recirculation region can be considered fixed. Even if the storage space corresponding to the second recirculation region is only part of the available storage space in the memory 101, the storage space corresponding to the second recirculation region is itself fixed, so the start address of the second recirculation region can likewise be considered fixed.
The control unit may also be referred to as a cyclic read/write control circuit, and may include an RCU (Read Control Unit) 1021 and a WCU (Write Control Unit) 1022. The control unit is used to generate the index numbers for cyclic reads and for cyclic writes in the memory 101, respectively. RCU 1021 and WCU 1022 may be located in the same functional module, in which case the functional module may be referred to as a CU (Control Unit), and RCU 1021 and WCU 1022 may be regarded as two sub-modules of the CU. Alternatively, RCU 1021 and WCU 1022 may be located in different functional modules, in which case their hardware structures may be the same, and the DEC (decoding circuit) controls RCU 1021 and WCU 1022 through different control instructions. FIG. 1 may be understood as taking the case in which RCU 1021 and WCU 1022 are located in the CU as an example.
Optionally, still referring to FIG. 1, in another embodiment of the present invention, the data processing device may further include a decoding circuit 103 and a control register 104. The decoding circuit 103 is connected to the control unit, and the control register 104 is connected to both the control unit and the decoding circuit 103.
The control register 104 is used to store the length of the first recirculation region, the length of the second recirculation region, the first step value, the second step value, and the start address of the cyclic storage region. The meaning of these parameters is described later.
The decoding circuit 103 is used to initialize, by executing a dedicated instruction, the length of the first recirculation region, the length of the second recirculation region, the first step value, the second step value, and the start address of the cyclic storage region that are stored in the control register 104. After the start address of the cyclic storage region is initialized, the resulting address may be referred to as the initialization address of the cyclic storage region.
Optionally, still referring to FIG. 1, in another embodiment of the present invention, the data processing device may further include an IB (Instruction Buffer) 105, which is used to receive and temporarily buffer the various instructions for the data processing device, such as read instructions, write instructions, and control instructions. In the embodiments of the present invention, "instruction" and "operation instruction" may be regarded as the same concept and are used interchangeably.
The decoding circuit 103 is, for example, a DEC (Decode, decoding circuit). It can decode the instructions buffered in IB 105, generate the corresponding control signals, and send the control signals to the other modules of the data processing device.
The control register 104 is, for example, a CR (Control Register, programmable control register), which can be used to store configuration information for the data processing device. The CR is configured, that is, initialized, by a write operation performed through a dedicated instruction.
For example, before a read instruction, a write instruction, or another control instruction is executed, the CR may first be written by a dedicated configuration instruction (that is, the dedicated instruction), such as the SetCR Src instruction, which sets the CR with the data Src. For example, the CR contains five bit fields, which correspond to the length of the first recirculation region, the length of the second recirculation region, the first step value, the second step value, and the start address of the cyclic storage region described above. Let WinLen denote the length of the first recirculation region, CirLen the length of the second recirculation region, Inc the first step value, Stride the second step value, and Start the start address of the cyclic storage region, as shown in Table 1. Writing the CR through the dedicated configuration instruction then initializes the parameters corresponding to these bit fields.
Table 1
Start | CirLen | WinLen | Inc | Stride
The relative positions of the bit fields in Table 1 are only an example; in practice, they may be arranged in any order.
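As a rough software model of this configuration step, the CR can be pictured as a record with the five fields of Table 1 that a SetCR-style write fills in. The sketch below is hypothetical: the class name, the set_cr function, and the use of a dictionary for Src are assumptions, and the bit widths and encoding of the real register are not given in the text. The check that WinLen does not exceed CirLen reflects the stated requirement that the second recirculation region be at least as long as the first.

```python
from dataclasses import dataclass

CR_FIELDS = ("Start", "CirLen", "WinLen", "Inc", "Stride")


@dataclass
class ControlRegister:
    """Illustrative model of the CR bit fields listed in Table 1."""
    Start: int = 0    # start address of the cyclic storage region
    CirLen: int = 0   # length of the second recirculation region
    WinLen: int = 0   # length of the first recirculation region
    Inc: int = 0      # first step value
    Stride: int = 0   # second step value


def set_cr(cr, src):
    """Hypothetical model of SetCR Src: initialize all five fields from src."""
    if src["WinLen"] > src["CirLen"]:
        raise ValueError("WinLen must not exceed CirLen")
    for name in CR_FIELDS:
        setattr(cr, name, src[name])
    return cr


cr = set_cr(ControlRegister(),
            {"Start": 1, "CirLen": 15, "WinLen": 9, "Inc": 1, "Stride": 1})
```

The values in the usage line mirror the earlier register example (a window of 9 registers moving with a step of 1 within registers 1 to 15); the Inc value is chosen arbitrarily for illustration.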
In addition, FIG. 1 may be understood as taking the case in which RCU 1021 and WCU 1022 are located in different functional modules as an example, or it may be understood as showing the CU split apart for illustration, that is, RCU 1021 and WCU 1022 in FIG. 1 are in fact two functional modules within the CU.
In addition, in FIG. 1, the memory 101 may include two interfaces for reading data and two interfaces for writing data. In FIG. 1, an interface for reading data is denoted, for example, as an R interface, and an interface for writing data is denoted as a W interface. One read interface serves cyclic reads performed according to the method described later in the embodiments of the present invention, and the other read interface serves non-cyclic reads, that is, the ordinary immediate-index type described later. Likewise, one write interface serves cyclic writes performed according to the method described later, and the other write interface serves non-cyclic writes.
As can be seen from FIG. 1, the Imm Wr and Imm Rd instructions are operation instructions of the ordinary immediate-index type, which is described later. Such an operation instruction can carry a destination address, and data can be read from or written to the memory 101 directly according to the instruction.
The Cyc Wr and Cyc Rd instructions belong to the in-window cyclic index type or to the whole-window-movement cyclic index type, both of which are also described later. Such an instruction need not carry a destination address; data can be read or written in the cyclic manner provided by the embodiments of the present invention.
FIG. 1 merely shows one possible structure of the data processing device. Other possible structures also fall within the protection scope of the present invention, provided that the data processing device can implement the method for generating an address provided in the embodiments of the present invention.
Referring to FIG. 2, based on the same inventive concept and the foregoing embodiments, an embodiment of the present invention takes the RCU1021 as an example to describe one possible internal implementation of the RCU1021. The hardware structure of the WCU1022 may be the same as that of the RCU1021; alternatively, FIG. 2 may also be understood as the structure of the CU. The present invention imposes no limitation in this respect.
As shown in FIG. 2, the RCU1021 is configured to generate read operation instructions for the memory 101 and is controlled jointly by the CR and the operation instructions from the IB105. Internally, the RCU1021 consists mainly of comparator A, comparator B, comparator C, adder D, adder E, adder F, adder G, selector H, selector J, selector K, register I, and register S. The DEC in FIG. 2 is the DEC in FIG. 1, that is, a separate functional module that does not belong to the RCU1021. One possible working process of the RCU1021 in FIG. 2 is as follows.
1. System initialization. For example, when the system starts working, register I and register S are each initialized to 0. This may be done by the CU (or by the RCU1021 and the WCU1022); for example, the CU may initialize register I and register S to 0 upon receiving signaling (also called a control signal) sent by the DEC.
2. Set the initial value of each parameter in the CR, that is, initialize each parameter in the CR with a dedicated instruction such as SetCR Src. For example, after receiving the SetCR Src instruction the DEC generates a control signal, and the CU initializes each parameter in the CR upon receiving that control signal.
3. When the system receives a loop-type instruction, the value of register I is incremented and the result is taken modulo the length of the first loop region; likewise, the value of register S is incremented and the result is taken modulo the length of the second loop region. The two modulo results are then added, and comparator C compares the sum with CirLen. If the two are equal, the address of the operation instruction is Addr = Start; otherwise, the address of the operation instruction is Addr = S + I + Start.
The RCU1021 supports two types of loop instructions:
1) A window-internal index loop instruction is valid, that is, the received instruction is an operation instruction for the first loop region. In this case, adder D, which corresponds to Inc, performs the addition X = I + Inc, and comparator A compares the new value X with WinLen. When X equals WinLen, selector H outputs 0; otherwise selector H outputs X. When register I receives the control signal generated by the DEC, its value is updated according to the output of comparator A: if X = I + Inc = WinLen, I is updated to 0; otherwise I is updated to I + Inc.
2) A whole-window index loop instruction is valid, that is, the received instruction is an operation instruction for the second loop region. The adder corresponding to Stride performs the addition Y = S + Stride, and comparator B compares the new value Y with CirLen. When Y equals CirLen, selector J outputs 0; otherwise selector J outputs Y. When register S receives the control signal generated by the DEC, its value is updated according to the output of comparator B: if Y = S + Stride = CirLen, S is updated to 0; otherwise S is updated to S + Stride. In addition, when register I receives the control signal generated by the DEC, its value is updated according to the output of comparator A: if X = I + Inc = WinLen, I is updated to 0; otherwise I is updated to I + Inc.
After an operation instruction enters the DEC, the DEC decodes it and generates a corresponding control signal, which controls the update of register S and/or register I.
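The adder, comparator, and selector chain described above can be sketched in software as follows. This is an illustrative model under stated assumptions, not the circuit of FIG. 2: the function names next_i, next_s, and address are hypothetical, and the parameter values are chosen only for the example.

```python
def next_i(i: int, inc: int, win_len: int) -> int:
    """Adder D, comparator A, selector H: X = I + Inc, or 0 when X == WinLen."""
    x = i + inc
    return 0 if x == win_len else x


def next_s(s: int, stride: int, cir_len: int) -> int:
    """Stride adder, comparator B, selector J: Y = S + Stride, or 0 when Y == CirLen."""
    y = s + stride
    return 0 if y == cir_len else y


def address(s: int, i: int, start: int, cir_len: int) -> int:
    """Comparator C: Addr = Start when S + I == CirLen, otherwise S + I + Start."""
    total = s + i
    return start if total == cir_len else start + total


# Illustrative parameters (assumptions, not values from the description).
WIN_LEN, CIR_LEN, INC, STRIDE, START = 4, 16, 1, 4, 0x100

i, i_trace = 0, []
for _ in range(6):
    i_trace.append(i)
    i = next_i(i, INC, WIN_LEN)

s, s_trace = 0, []
for _ in range(5):
    s_trace.append(s)
    s = next_s(s, STRIDE, CIR_LEN)

print(i_trace)  # [0, 1, 2, 3, 0, 1]
print(s_trace)  # [0, 4, 8, 12, 0]
```

Note that an equality comparison reproduces a modulo operation only when the sum lands exactly on the length, for example when the step value divides the region length. The sketch keeps the equality test because that is what the description of the comparators states.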
FIG. 2 is only one possible example of the structure of the RCU1021; other possible structures of the RCU1021 also fall within the protection scope of the present invention.
The following describes some concepts used in the embodiments of the present invention under the architectures of FIG. 1 and FIG. 2.
In the embodiments of the present invention, the addressing region in the storage space includes a cyclically accessed region, and the cyclically accessed region may include multiple nested loops.
Taking a three-level loop index as an example, and using a method similar to that for two levels, the loop index in a three-level loop consists of three loop bodies (that is, three loop regions), namely:
First loop (outermost loop): accessed cyclically; the length of the loop body is denoted, for example, CirLen.
Second loop (middle loop): accessed cyclically; if the second loop body is regarded as a whole, it can move cyclically as a whole inside the first loop. The length of the loop body is denoted, for example, MidWinLen, where MidWinLen <= CirLen.
Third loop (innermost loop): accessed cyclically; if the third loop body is regarded as a whole, it can move cyclically as a whole inside the second loop body. The length of the loop body is denoted, for example, InnWinLen, where InnWinLen <= MidWinLen.
The entire three-level loop can be represented by the following seven data structures, which may also be stored in the CR:
Start: start address of the outermost loop body.
CirLen: length of the outermost loop body, where CirLen <= Size and Size denotes the entire addressing region in the memory 101.
MidWinLen: length of the middle loop body, where MidWinLen <= CirLen.
InnWinLen: length of the innermost loop body, where InnWinLen <= MidWinLen.
Inc: address increment step value inside the innermost loop body.
Stride: address increment step value when the innermost loop body moves as a whole within the middle loop body.
Step: address increment step value when the middle loop body moves as a whole within the outermost loop body.
Each loop body works in the same way as in the two-level case, that is, configuration first and control afterwards. The data structures stored in the CR are first initialized by a dedicated instruction, and then, depending on the application scenario, the following three indexing modes are implemented: cyclic indexing inside the innermost loop body, whole-body move indexing of the innermost loop body inside the middle loop, and whole-body move indexing of the middle loop body inside the outermost loop.
If this is extended to an N-level loop (N being an integer greater than 3), the entire N-level loop can be represented by 2N+1 data structures, which may also be stored in the CR:
the start address of the outermost loop body;
the length of each of the N loop bodies; for example, if the length of the outermost loop body is L_1, ..., and the length of the innermost loop body is L_N, then L_1 >= L_2 >= ... >= L_N.
Likewise, loop i+1 can move cyclically as a whole inside loop i; the move step values corresponding to the N loop bodies are denoted, for example, S_1 to S_N.
Likewise, the entire loop region of an N-level loop is configured first and then used.
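The parameter count 2N+1 (one start address, N lengths, N step values) can be checked with a small configuration model. This is a sketch under assumptions: the class NestedLoopConfig and its field names are hypothetical, and the ordering of the step values (outermost move step first, innermost increment last) is one possible reading of S_1 to S_N rather than something the description fixes.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class NestedLoopConfig:
    """Hypothetical container for the 2N+1 parameters of an N-level loop."""
    start: int          # start address of the outermost loop body
    lengths: List[int]  # L_1 .. L_N, outermost to innermost
    steps: List[int]    # S_1 .. S_N (assumed order: outermost move step first)

    def __post_init__(self):
        if len(self.lengths) != len(self.steps):
            raise ValueError("expected one step value per loop body")
        # The description requires L_1 >= L_2 >= ... >= L_N.
        pairs = zip(self.lengths, self.lengths[1:])
        if any(outer < inner for outer, inner in pairs):
            raise ValueError("loop lengths must be non-increasing")

    @property
    def field_count(self) -> int:
        return 1 + len(self.lengths) + len(self.steps)


# Three-level case: Start, CirLen/MidWinLen/InnWinLen, Step/Stride/Inc.
three_level = NestedLoopConfig(start=0, lengths=[64, 16, 4], steps=[16, 4, 1])
print(three_level.field_count)  # 7
```

For N = 3 the count is 7, which matches the seven data structures listed for the three-level loop; for N = 2 it is 5, which matches the five bit fields in Table 1.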
The following description takes a two-level loop as an example. A person skilled in the art will understand that, following the idea of the present invention, the multi-level loop schemes described above also fall within the protection scope of the present invention.
If the cyclically accessed region includes a two-level loop, the following parameters may be involved:
Inner small loop, also called the window loop. In the embodiments of the present invention its storage region is called the first loop region. The first loop region can be accessed cyclically, and the loop body length is the length of the first loop region, which, as described above, may be denoted WinLen, where WinLen <= CirLen. The first loop region includes at least one sub-storage space, and each sub-storage space corresponds to one address.
Outer large loop. In the embodiments of the present invention its storage region is called the second loop region. The second loop region can be accessed cyclically, and the loop body length (that is, the length of the second loop region) may, as described above, be denoted CirLen, where CirLen <= Size and Size denotes the entire addressing region in the storage space of the memory 101.
As described above, the inner small loop can move as a whole within the outer large loop. The address increment step value of this move may be denoted Stride and is called the second step value in the embodiments of the present invention. The address offset of the inner small loop within the outer large loop is S = S + Stride, where S may be initialized to 0, for example. The inner small loop can also be indexed cyclically internally: the address increment step value between two sub-storage spaces in the inner small loop may be denoted Inc and is called the first step value in the embodiments of the present invention. The offset of a sub-storage space in the inner small loop relative to the start address of the inner small loop is I = I + Inc, where I may be initialized to 0, for example. That is, the address indicated by I may be called the internal offset address of the inner small loop, and the address indicated by S may be called the internal offset address of the outer large loop.
It can be considered that, if the cyclically accessed region includes a two-level loop, the entire loop body includes five data structures, namely the information stored in the five bit fields described in Table 1:
Start: start address of the outer large loop, that is, the start address of the second loop region. After initialization it may also be called the initialization address of the cyclic storage area.
CirLen: length of the outer large loop, that is, the length of the second loop region.
WinLen: length of the inner small loop, that is, the length of the first loop region.
Inc: address increment step value inside the inner small loop, that is, the first step value.
Stride: address increment step value when the inner small loop window moves as a whole within the outer large loop, that is, the second step value.
In the embodiments of the present invention, the first loop region corresponds to a pointer, which may be called the first pointer (denoted I). The increment step value of the first pointer is Inc: after Inc is added to the address indicated by the first pointer, the first pointer indicates the next address in the first loop region. The second loop region corresponds to a pointer, which may be called the second pointer (denoted S). The increment step value of the second pointer is Stride: after Stride is added to the address indicated by the second pointer, the second pointer indicates the next address in the second loop region. Adding Stride to the address indicated by the second pointer can be regarded as moving the first loop as a whole by one second step value within the second loop.
FIG. 3 is a schematic diagram of one possible loop region in an embodiment of the present invention.
At any moment, the access address of the loop body, that is, the address targeted by an operation instruction (in the embodiments of the present invention an operation instruction may be a read instruction or a write instruction), can be expressed mathematically as:
Addr = Start + (S + I) % CirLen    (1)
In formula (1), % denotes the modulo operation; S = (S + Stride) % CirLen, where S may be initialized to 0, and I = (I + Inc) % WinLen, where I may be initialized to 0.
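A short numeric illustration of formula (1) may help. The function name access_address and the values Start = 100 and CirLen = 8 are assumptions chosen for the example; only the formula itself comes from the text.

```python
def access_address(start: int, s: int, i: int, cir_len: int) -> int:
    """Formula (1): Addr = Start + (S + I) % CirLen."""
    return start + (s + i) % cir_len


START, CIR_LEN = 100, 8  # illustrative values, not from the description

for s, i in [(0, 0), (2, 3), (6, 1), (6, 3)]:
    print(f"S={s} I={i} -> Addr={access_address(START, s, i, CIR_LEN)}")
# S=0 I=0 -> Addr=100
# S=2 I=3 -> Addr=105
# S=6 I=1 -> Addr=107
# S=6 I=3 -> Addr=101
```

The last case shows the wrap-around: S + I = 9 exceeds CirLen = 8, so the access returns to offset 1 from Start instead of leaving the second loop region.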
In the embodiments of the present invention, if a read or write instruction is executed cyclically on locations indicated by consecutive physical addresses or consecutive logical addresses, the instruction need not carry a destination address, and addressing is performed cyclically. If a read or write instruction is executed non-cyclically, the instruction may carry a destination address, and the read or write is performed directly at that address. The operation instructions in the embodiments of the present invention therefore come in different types, briefly described below.
Ordinary immediate index type: an instruction type that must explicitly indicate the data storage location in the operation instruction, that is, the operation instruction must carry a destination address. A read instruction of this type may be written, for example, as Rd Dest Ri, and a write instruction as Wr Ri Src, where Ri indicates the address of the data to be read, Dest indicates the destination address to which the read data is to be written, and Src indicates the data to be written.
Window-internal loop index type: an operation instruction for the first loop region. The operation instruction need not indicate the data storage location, that is, it need not carry a destination address, because the system addresses automatically. After such an operation instruction has been executed, when the next clock cycle arrives, the address indicated by the first pointer, which corresponds to the first loop region, is automatically increased by Inc. In the embodiments of the present invention, each time the address indicated by a pointer has been updated, the system can evaluate formula (1) to obtain the address for the next operation instruction. A read instruction of this type may be written, for example, as RdI Dest, and a write instruction as WrI Src.
Whole-window-move loop index type: an operation instruction for the second loop region. The operation instruction likewise need not indicate the data storage location, that is, it need not carry a destination address, because the system addresses automatically. After such an operation instruction has been executed, the address indicated by the second pointer, which corresponds to the second loop region, is automatically increased by Stride, and the address indicated by the first pointer, which corresponds to the first loop region, is also automatically increased by Inc. In the embodiments of the present invention, the addresses indicated by the pointers are updated when the next clock cycle arrives, and the system can evaluate formula (1) to obtain the address for the next operation instruction. A read instruction of this type may be written, for example, as RdS Dest, and a write instruction as WrS Src.
After receiving an operation instruction, the system can determine its type according to, for example, the structure of the instruction, and execute different types of operation instructions in different ways.
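One simple way to picture the type determination is a lookup on the instruction mnemonic. This is only a sketch: the description says the type is determined from the structure of the instruction without fixing an encoding, so the textual mnemonics as keys, the enum InstrType, and the helper names are all assumptions.

```python
from enum import Enum


class InstrType(Enum):
    IMMEDIATE = "ordinary immediate index"
    WINDOW_INTERNAL = "window-internal loop index"
    WINDOW_MOVE = "whole-window-move loop index"


# Mnemonics as named in this section; the table itself is hypothetical.
_TYPES = {
    "Rd": InstrType.IMMEDIATE,
    "Wr": InstrType.IMMEDIATE,
    "RdI": InstrType.WINDOW_INTERNAL,
    "WrI": InstrType.WINDOW_INTERNAL,
    "RdS": InstrType.WINDOW_MOVE,
    "WrS": InstrType.WINDOW_MOVE,
}


def classify(instruction: str) -> InstrType:
    """Return the instruction type from the leading mnemonic."""
    mnemonic = instruction.split()[0]
    return _TYPES[mnemonic]


def carries_memory_address(instr_type: InstrType) -> bool:
    """Only the ordinary immediate index type names the memory location explicitly."""
    return instr_type is InstrType.IMMEDIATE


for text in ["Rd Dest Ri", "WrI Src", "RdS Dest"]:
    t = classify(text)
    print(text, "->", t.value, "| explicit address:", carries_memory_address(t))
```

In hardware this role would be played by the decoder producing control signals; the dictionary merely stands in for that decoding step.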
The names given above to the data structures and to the types of operation instructions are only one possible representation provided by the embodiments of the present invention; in practice they may be named differently according to different requirements.
The meanings of the above instructions are explained as follows.
Wr Ri Src: write the data Src to the storage area indicated by the address Ri. Ri may be, for example, a register number, or other information that indicates an address;
Rd Dest Ri: read the data at the location in the memory 101 indicated by the address Ri out to the location indicated by the address Dest;
WrI Src: write the data Src to the location indicated by the self-index address in the first loop region. Here the self-index address is the address obtained from formula (1) in the current clock cycle after the previous loop-type instruction (window-internal loop index type or whole-window-move loop index type) was executed. The self-index pointer is updated at the same time: I = (I + Inc) % WinLen;
WrS Src: write Src to the location indicated by the self-index address in the second loop region. Here the self-index address is the address obtained from formula (1) in the current clock cycle after the previous loop-type instruction was executed. The self-index pointers are updated at the same time: S = (S + Stride) % CirLen and I = (I + Inc) % WinLen;
RdI Dest: read the data at the location indicated by the self-index address in the first loop region out to the location indicated by the address Dest. Here the self-index address is the address obtained from formula (1) in the current clock cycle after the previous loop-type instruction was executed. The self-index pointer is updated at the same time: I = (I + Inc) % WinLen. Dest indicates a storage area;
RdS Dest: read the data at the location indicated by the self-index address in the second loop region out to the location indicated by the address Dest. Here the self-index address is the address obtained from formula (1) in the current clock cycle after the previous loop-type instruction was executed. The self-index pointers are updated at the same time: S = (S + Stride) % CirLen and I = (I + Inc) % WinLen.
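The six instructions listed above can be exercised with a small behavioral simulator. This is a minimal sketch under several assumptions: the class CyclicMemory and its method names are hypothetical, memory is modeled as a Python list, Dest is modeled as the return value of a read, and the pointer update is applied immediately after the access rather than at the next clock cycle.

```python
class CyclicMemory:
    """Hypothetical behavioral model of the two-level cyclic addressing scheme."""

    def __init__(self, size, start, cir_len, win_len, inc, stride):
        self.mem = [0] * size
        self.start, self.cir_len, self.win_len = start, cir_len, win_len
        self.inc, self.stride = inc, stride
        self.i = 0  # first pointer (register I)
        self.s = 0  # second pointer (register S)

    def _addr(self):
        # Formula (1): Addr = Start + (S + I) % CirLen
        return self.start + (self.s + self.i) % self.cir_len

    def _advance(self, move_window):
        self.i = (self.i + self.inc) % self.win_len
        if move_window:
            self.s = (self.s + self.stride) % self.cir_len

    # Ordinary immediate index type: the address Ri is explicit.
    def wr(self, ri, src):
        self.mem[ri] = src

    def rd(self, ri):
        return self.mem[ri]

    # Window-internal loop index type: only I is updated.
    def wr_i(self, src):
        self.mem[self._addr()] = src
        self._advance(move_window=False)

    def rd_i(self):
        value = self.mem[self._addr()]
        self._advance(move_window=False)
        return value

    # Whole-window-move loop index type: both S and I are updated.
    def wr_s(self, src):
        self.mem[self._addr()] = src
        self._advance(move_window=True)

    def rd_s(self):
        value = self.mem[self._addr()]
        self._advance(move_window=True)
        return value


m = CyclicMemory(size=32, start=8, cir_len=8, win_len=4, inc=1, stride=4)
for value in (10, 11, 12):
    m.wr_i(value)  # addresses 8, 9, 10
m.wr_s(13)         # address 11, then the window moves by Stride
m.wr_i(14)         # address 12, the first slot of the moved window
print(m.mem[8:13])  # [10, 11, 12, 13, 14]
```

None of the cyclic calls passes a memory address, which is the point of the scheme: the address comes from the configured parameters and the two pointers.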
Of course, the structures and functions of the operation instructions above are given only as examples; any operation instruction consistent with the idea of the present invention falls within the protection scope of the present invention.
In addition, the terms "system" and "network" are often used interchangeably herein. The term "and/or" herein merely describes an association between associated objects and indicates that three relationships may exist; for example, A and/or B may mean: A alone, both A and B, or B alone. In addition, unless otherwise specified, the character "/" herein generally indicates an "or" relationship between the objects before and after it.
The embodiments of the present invention are described in further detail below with reference to the accompanying drawings.
Referring to FIG. 4, an embodiment of the present invention provides a method for generating an address. The method may be performed by a control unit: if the control unit is a CU, the method may be performed by the CU; if the control unit includes the WCU1022 and the RCU1021, the method may be performed by the RCU1021 when the operation instruction is a read instruction and by the WCU1022 when the operation instruction is a write instruction. The flow of the method is described as follows.
Step 401: Receive a first operation instruction for the second loop region in the memory 101;
Step 402: Execute the first operation instruction, and modify the start address of the first loop region in the memory 101 by the second step value corresponding to the second loop region, to obtain the address corresponding to the next received operation instruction, where the first loop region moves cyclically within the second loop region according to the second step value.
First, the control unit may receive an operation instruction, which is called the first operation instruction in the embodiments of the present invention. The first operation instruction may be a read instruction or a write instruction.
In the embodiments of the present invention, an operation instruction may or may not carry a destination address. Here the first operation instruction is taken not to carry a destination address, that is, the first operation instruction is a loop-type instruction for the second loop region.
In addition, if the first operation instruction is a read instruction, it may also carry the address to which the data is to go, that is, indicate where the read data is to be stored; if the first operation instruction is a write instruction, it may also carry the address of the data source, that is, indicate where the data to be written comes from. Of course, these are only some possible examples, and the present invention imposes no limitation in this respect.
As described above, the system in the embodiments of the present invention can execute both loop-type and non-loop-type instructions, so operation instructions can be divided into different types, and different types of operation instructions are executed in different ways. After receiving an operation instruction, the control unit may therefore first determine the type information of the operation instruction, that is, determine which type the operation instruction is.
After determining the type of the operation instruction, the control unit can determine the execution mode corresponding to that type.
In the embodiments of the present invention, the first operation instruction may be a loop-type instruction, and its type may be the whole-window-move loop index type, that is, the type of operation instruction for the second loop region. The execution mode of the first operation instruction is then the outer large loop performed on the second loop region. For example, the first operation instruction may be RdS Dest or WrS Src.
After the first operation instruction is determined to be an operation instruction for the second loop region, the first operation instruction can be executed. For example, if the first operation instruction is a read instruction, data can be read from the location indicated by the first address; if the first operation instruction is a write instruction, data can be written to the location indicated by the first address.
For example, the address corresponding to the first operation instruction is the first address. If the first operation instruction is the first operation instruction the control unit receives after the system initializes the parameters used by loop-type instructions, the first address is the initialization address of the cyclic storage area, that is, Start as described above. Otherwise, that is, if the control unit has already executed other read or write instructions before the first operation instruction, the first address is the address the system obtains in the current clock cycle after the most recently executed operation instruction. That most recent instruction may target either the first or the second recirculation area, and may be either a read instruction or a write instruction.
Each time an operation instruction finishes executing, a new internal offset address of the first recirculation area (the new I value) and a new internal offset address of the second recirculation area (the new S value) are obtained. When the next clock cycle arrives, the corresponding registers are updated with these values, and the address for the current operation is derived from the updated register values. In other words, after an operation instruction completes, the system has only computed the new I and S values and has not yet written them to the registers. When the next clock cycle arrives (for example, when the next loop-type operation instruction is received), the system updates the registers with the previously computed I and S values and then calculates the address for the current operation.
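The deferred register update described above can be sketched as a small model. This is only an illustration under stated assumptions: the class and method names are hypothetical, the parameter names mirror Start, WinLen, CirLen, Inc and Stride from the description, and the address expression is a paraphrase of the modulo steps given in the text rather than a quotation of formula (1).

```python
class CircularRegs:
    """Toy model (hypothetical): new I/S values are computed when an
    instruction finishes, but only latched into the registers when the
    next clock cycle begins."""

    def __init__(self, start, win_len, cir_len, inc, stride):
        self.start, self.win_len, self.cir_len = start, win_len, cir_len
        self.inc, self.stride = inc, stride
        self.reg_i = 0    # register I: offset inside the first recirculation area
        self.reg_s = 0    # register S: offset of the first recirculation area
        self.next_i = 0   # pending I value, not yet visible in reg_i
        self.next_s = 0   # pending S value, not yet visible in reg_s

    def begin_cycle(self):
        # Latch the pending values, then derive the address for this operation.
        self.reg_i, self.reg_s = self.next_i, self.next_s
        return self.start + (self.reg_i + self.reg_s) % self.cir_len

    def finish(self, moves_window):
        # Compute the new values only; the registers keep their old contents.
        self.next_i = (self.reg_i + self.inc) % self.win_len
        if moves_window:
            self.next_s = (self.reg_s + self.stride) % self.cir_len
```

In this sketch `finish(moves_window=True)` stands for an instruction aimed at the second recirculation area, and `finish(moves_window=False)` for one aimed at the first.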
After the first operation instruction is executed, a new internal offset address of the second recirculation area is obtained. When the next clock cycle arrives (generally when the next loop-type operation instruction is received), the start address of the first recirculation area is modified by the second step value corresponding to the second recirculation area, which yields the address corresponding to the operation instruction received at that time, whether it targets the first or the second recirculation area. The next time a loop-type operation instruction is received, the control unit can therefore address automatically, and the destination address need not be carried in the operation instruction.
That is, in this embodiment, when the first operation instruction is received, the first address can be computed directly and a data operation performed on the location it points to. Here, performing a data operation on the location pointed to by an address means either reading data from that location or writing data to it. The first operation instruction therefore need not carry a destination address; the system operates on the previously obtained address (the first address). Likewise, once the current operation completes, the address for the next received operation instruction for the first or second recirculation area is obtained automatically (more precisely, the new values are obtained, and the address itself is calculated when the next clock cycle arrives), so that the next operation instruction can operate directly on the location that address points to. Because operation instructions carry no destination address, the program's code size is reduced considerably, less storage space is needed for the code, and the programmer's burden is lighter.
Optionally, in another embodiment of the present invention, executing the first operation instruction and modifying the start address of the first recirculation area in the memory 101 by the second step value corresponding to the second recirculation area, to obtain the address corresponding to the next received operation instruction, includes:
executing the first operation instruction, modifying the start address of the first recirculation area in the memory 101 by the second step value corresponding to the second recirculation area, and modifying the internal offset address of the first recirculation area by the first step value corresponding to the first recirculation area, to obtain the address corresponding to the next received operation instruction.
The first recirculation area may include at least one sub-storage space, and each operation targets one of them; that is, at any moment the first pointer points to exactly one sub-storage space. The internal offset address of the first recirculation area may therefore be the address of the sub-storage space the first pointer currently points to.
Optionally, in another embodiment of the present invention,
modifying the start address of the first recirculation area in the memory 101 by the second step value corresponding to the second recirculation area includes:
modifying, by the second step value corresponding to the second recirculation area, the address pointed to by the second pointer corresponding to the second recirculation area;
and modifying the internal offset address of the first recirculation area by the first step value corresponding to the first recirculation area includes:
modifying, by the first step value corresponding to the first recirculation area, the address pointed to by the first pointer corresponding to the first recirculation area.
Optionally, in another embodiment of the present invention, obtaining the address corresponding to the next received operation instruction includes:
taking the address pointed to by the first pointer, after it is increased by one first step value, modulo the length of the first recirculation area, and taking the address pointed to by the second pointer, after it is increased by one second step value, modulo the length of the second recirculation area;
adding the two modulo results, and taking the sum modulo the length of the second recirculation area; and
adding the result of that modulo operation to the initialization address of the cyclic storage area, to obtain the address corresponding to the next received operation instruction.
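The three steps above can be written as one small function. This is a sketch, not the patent's own code: the function name is hypothetical, and the mapping of WinLen to the length of the first recirculation area and CirLen to the length of the second is inferred from the numeric example given later in the description.

```python
def next_address_after_window_move(i, s, inc, stride, win_len, cir_len, start):
    """Address for the next loop-type instruction after an instruction that
    targeted the second recirculation area (both pointers advance)."""
    inner = (i + inc) % win_len         # first pointer, wrapped inside the window
    outer = (s + stride) % cir_len      # second pointer, wrapped inside the whole area
    offset = (inner + outer) % cir_len  # combined offset, wrapped once more
    return start + offset


# Values borrowed from the numeric example later in the description.
print(next_address_after_window_move(i=3, s=0, inc=1, stride=1,
                                     win_len=4, cir_len=7, start=0))  # prints 1
```

With I on the last slot of the window (I = 3) and S = 0, the window slides by one and the next access lands on offset 1.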
In this embodiment, after the first operation instruction is executed, a new I value and a new S value are obtained. When the next clock cycle arrives, the system updates register I and register S with these values and derives the address for the current operation instruction from the updated registers. The address for the current operation instruction is therefore in effect determined when the previous operation instruction finishes. For example, the system may calculate the address for the current operation instruction according to formula (1).
Optionally, in another embodiment of the present invention, the method further includes:
receiving a second operation instruction for the first recirculation area; and
executing the second operation instruction, and modifying the internal offset address of the first recirculation area by the first step value corresponding to the first recirculation area, to obtain the address corresponding to the next received operation instruction.
That is, in this embodiment, the data processing device can execute various types of operation instructions.
In this embodiment, the first operation instruction targets the second recirculation area and the second operation instruction targets the first recirculation area. Both are loop-type instructions; they simply target different storage spaces.
The second operation instruction may be a read instruction, for example RdI Dest, or a write instruction, for example WrI Src.
For example, the address corresponding to the second operation instruction is the second address. If the second operation instruction is the first operation instruction the control unit receives after the system initializes the parameters used by loop-type instructions, the second address is the initialization address of the cyclic storage area, that is, Start as described above. Otherwise, that is, if the control unit has already executed other read or write instructions before the second operation instruction, the second address is the address the system obtains in the current clock cycle after the most recently executed operation instruction. That most recent instruction may target either the first or the second recirculation area, and may be either a read instruction or a write instruction.
In this embodiment, when the second operation instruction is received, the corresponding registers can be updated directly with the I and S values obtained after the previous operation instruction, and the second address is calculated from the updated register values together with parameters such as the initialization address of the cyclic storage area. A data operation is then performed on the location the second address points to. Here, performing a data operation on the location pointed to by an address means either reading data from that location or writing data to it. The second operation instruction therefore need not carry a destination address; the system operates on the previously obtained address (the second address). Likewise, once the current operation completes, the address for the next received operation instruction for the first or second recirculation area is obtained automatically (more precisely, the new values are obtained, and the address itself is calculated when the next clock cycle arrives), so that the next operation instruction can operate directly on the location that address points to. Because operation instructions carry no destination address, the program's code size is reduced considerably, less storage space is needed for the code, and the programmer's burden is lighter.
Optionally, in another embodiment of the present invention, modifying the internal offset address of the first recirculation area by the first step value corresponding to the first recirculation area, to obtain the address corresponding to the next received operation instruction, includes:
modifying, by the first step value corresponding to the first recirculation area, the address pointed to by the first pointer corresponding to the first recirculation area, to obtain the address corresponding to the next received operation instruction.
The first recirculation area includes at least one sub-storage area. For example, if the first pointer points to the address of the first sub-storage area in the first recirculation area when the second operation instruction is executed, then after the instruction completes the control unit may increment the first pointer by the first step value. When the next clock cycle arrives, the first pointer points to the address of the second sub-storage area in the first recirculation area, which realizes movement within the first recirculation area.
Moreover, addresses in this embodiment are results of modulo operations. An example follows.
Suppose the first recirculation area includes four sub-storage spaces. At a first moment, read instruction 1 for the first recirculation area is received, and the address calculated for it from the I and S values obtained after the previous operation instruction is the address of the fourth sub-storage space in the first instance of the first recirculation area; data is therefore read from that fourth sub-storage space. After the read completes, the value of the first pointer is incremented by Inc, and the value of the second pointer is left unchanged. When the next clock cycle arrives, register I is updated with the incremented value of the first pointer while register S keeps its value, which gives the address for the operation instruction received at that time (for the first or second recirculation area): the address of the first sub-storage space in the first instance of the first recirculation area. The next operation instruction for the first or second recirculation area therefore processes the data in that first sub-storage space. Thus, if the first recirculation area is viewed as a window and each of its sub-storage spaces as a small window, taking addresses modulo the relevant length makes operations loop within the window of the first recirculation area.
The same holds for the second recirculation area, in which looping can be realized in the same way.
Consequently, if looping within one first recirculation area is not wanted, then when the first pointer points to the last sub-storage space of that area, the instruction sent should preferably target the second recirculation area. Executing an instruction for the second recirculation area naturally moves on to the next first recirculation area, which gives sequential execution. This places a requirement on the sender of the instructions (for example, the programmer): to achieve sequential execution, the sender needs to know where the first pointer points after each instruction executes.
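The bookkeeping this asks of the instruction sender can be sketched as a helper that chooses between the two read mnemonics. The helper name and its `sequential` flag are hypothetical; only the mnemonics RdI and RdS come from the description, and the sketch assumes I is kept in the range 0 to WinLen - 1.

```python
def pick_read_mnemonic(i, inc, win_len, sequential=True):
    """Return 'RdS' when the first pointer sits on the last sub-storage space
    and sequential traversal is wanted; otherwise return 'RdI'."""
    on_last_slot = (i + inc) >= win_len  # the next step would wrap inside the window
    return "RdS" if (sequential and on_last_slot) else "RdI"


# With a four-slot window and Inc = 1, only the read at I = 3 slides the window.
print([pick_read_mnemonic(i, inc=1, win_len=4) for i in range(4)])
```

Passing `sequential=False` keeps every read inside the current window, which corresponds to the looping case described earlier.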
Optionally, in another embodiment of the present invention, obtaining the address corresponding to the next received operation instruction includes:
taking the address pointed to by the first pointer, after it is increased by one first step value, modulo the length of the first recirculation area, adding the result to the address pointed to by the second pointer, and taking the sum modulo the length of the second recirculation area; and
adding the result of that modulo operation to the initialization address of the cyclic storage area, to obtain the address corresponding to the next received operation instruction.
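Written out symbolically, the two steps above amount to the expression below. This is a restatement of the verbal steps, assuming WinLen denotes the length of the first recirculation area and CirLen the length of the second; the numeric substitution uses illustrative values (I = 3, S = 2) together with the parameter values from the example later in the description.

```latex
\[
\mathrm{Addr}_{\text{next}}
  = \mathrm{Start}
  + \Bigl( \bigl( (I + \mathrm{Inc}) \bmod \mathrm{WinLen} \bigr) + S \Bigr) \bmod \mathrm{CirLen}
\]
\[
I = 3,\ S = 2,\ \mathrm{Inc} = 1,\ \mathrm{WinLen} = 4,\ \mathrm{CirLen} = 7,\ \mathrm{Start} = 0
\;\Rightarrow\;
\mathrm{Addr}_{\text{next}} = 0 + \bigl( (4 \bmod 4) + 2 \bigr) \bmod 7 = 2
\]
```

The first pointer wraps from the last slot of the window back to its first slot, while the window itself (S = 2) stays where it is.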
In this embodiment, after the second operation instruction is executed, a new I value and a new S value are obtained. When the next clock cycle arrives, the system updates register I and register S with these values and derives the address for the current operation instruction from the updated registers. The address for the current operation instruction is therefore in effect determined when the previous operation instruction finishes. For example, the system may calculate the address for the current operation instruction according to formula (1).
In this embodiment, the second operation instruction may be received and executed either before or after the first operation instruction is received and executed.
For example, if the second operation instruction is received and executed before the first operation instruction, and the system executes the first operation instruction immediately after the second completes, then the second operation instruction is the most recently executed instruction from the point of view of the first, and the first address may be the same as the address for the next received operation instruction obtained after executing the second operation instruction. If, however, the system executes other operation instructions after the second operation instruction and only then executes the first, the first address is not the same as that address.
Conversely, if the second operation instruction is received and executed after the first operation instruction, and the system executes the second operation instruction immediately after the first completes, then the first operation instruction is the most recently executed instruction from the point of view of the second, and the address for the next received operation instruction obtained after executing the first operation instruction may be the same as the second address. If, however, the system executes other operation instructions after the first operation instruction and only then executes the second, the address obtained after executing the first operation instruction is not the same as the second address.
The following example assumes that the second operation instruction is received and executed before the first operation instruction, and that the system executes the first operation instruction immediately after the second completes; that is, from the point of view of the first operation instruction, the second is the most recently executed instruction.
Suppose the second operation instruction is a read instruction, for example RdI Src for the first recirculation area, the first pointer is I, and the first step value is Inc. When the second operation instruction is received, the corresponding address is that of the first sub-storage space in the first instance of the first recirculation area; the first recirculation area includes, say, four sub-storage spaces.
After the second operation instruction is received, it is executed and data is read from the first sub-storage space in the first instance of the first recirculation area. After the read, I is increased by Inc, so when the next clock cycle arrives (for example, when the first operation instruction is received) I points to the second sub-storage space in that area, while S is unchanged. From I+Inc, S, and Start, formula (1) gives the address for the next received operation instruction, called address 1 here; address 1 is the address of the second sub-storage space in the first instance of the first recirculation area.
Suppose the first operation instruction is also a read instruction, for example RdS Src for the second recirculation area, the second pointer is S, and the second step value is Stride.
After the first operation instruction is received, register I is updated with the new I value obtained after the second operation instruction, which gives the address for the first operation instruction: the address of the second sub-storage space in the first instance of the first recirculation area. The first operation instruction is executed and data is read from that second sub-storage space. After the read, S is increased by Stride and I is increased by Inc. When the next clock cycle arrives, formula (1) applied to I+Inc, S+Stride, and Start gives the address for the operation instruction received at that time, called address 2 here.
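The two-instruction example above can be traced numerically. The description leaves Inc, Stride and the region lengths symbolic at this point, so the concrete values below are assumptions borrowed from the later numeric example, and the `address` helper is a paraphrase of the modulo steps rather than a quotation of formula (1).

```python
# Assumed parameter values (not fixed by this part of the description).
WIN_LEN, CIR_LEN, INC, STRIDE, START = 4, 7, 1, 1, 0


def address(i, s):
    # Offset inside the window plus window offset, wrapped in the whole area.
    return START + (i % WIN_LEN + s % CIR_LEN) % CIR_LEN


i, s = 0, 0                  # first sub-storage space of the first window
# Second operation instruction (RdI): read at address(0, 0), then advance I only.
i = (i + INC) % WIN_LEN
address_1 = address(i, s)    # where the first operation instruction reads
# First operation instruction (RdS): read at address_1, then advance S and I.
s = (s + STRIDE) % CIR_LEN
i = (i + INC) % WIN_LEN
address_2 = address(i, s)    # address for whichever instruction comes next
print(address_1, address_2)  # prints: 1 3
```

Under these assumed values, address 1 is the second slot of the first window, and address 2 is the third slot of the window after it has moved by one.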
Optionally, in another embodiment of the present invention, before the first operation instruction is received, the method further includes:
initializing, by executing a dedicated instruction, the length of the first recirculation area, the length of the second recirculation area, the first step value, the second step value, and the start address of the cyclic storage area. The address obtained by initializing the start address of the cyclic storage area is the initialization address of the cyclic storage area.
That is, parameters such as Start, CirLen, WinLen, Inc, and Stride may be initialized before any operation instruction is executed.
The dedicated instruction in this embodiment may be, for example, a dedicated configuration instruction such as SetCR Src described above; the present invention imposes no limitation on this.
An example follows.
Assume that after initialization the parameters in the CR have the following values:
Start=0;
CirLen=7;
WinLen=4;
Inc=1;
Stride=1.
Assume a series of read instructions is to be executed, and the IDs of the registers to be accessed are, in order:
{0,1,2,3}, {0,1,2,3}, {1,2,3,4}, {2,3,4,5}, {3,4,5,6}, {4,5,6,0}, {5,6,0,1}, ...
Assume the range of the second recirculation area is {0,1,2,3,4,5,6,7}. The first four first recirculation areas cover {0,1,2,3}, {1,2,3,4}, {2,3,4,5}, and {3,4,5,6}, the next covers {4,5,6,0}, and so on, so the subsequent first recirculation areas cover {5,6,0,1}, {6,0,1,2}, etc.
Table 1 shows how this series of read instructions can be executed.
Table 1
Figure PCTCN2015091079-appb-000001
As Table 1 shows, when I equals 3 for the first time, if the next received operation instruction still targets the first recirculation area (for example, an RdI Src instruction), the address pointed to by I is increased by Inc and, according to formula (1), the address of register 0 is obtained again. Register 0 is therefore accessed again next time, which realizes the loop.
When I equals 3 for the second time, the next received instruction targets the second recirculation area, for example RdS Src. The address pointed to by S is increased by Stride, from 0 to 1, and the address pointed to by I is increased by Inc, from 3 to 0. According to formula (1), execution jumps from the first instance of the first recirculation area to the second instance of the first recirculation area. The rest of Table 1 follows the same pattern and is not described further.
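The access pattern in this example can be reproduced with a short simulation. It is a sketch under stated assumptions: the function name and the 'I'/'S' encoding of instruction kinds are hypothetical, the parameters default to the initialization values listed above, and the address expression paraphrases the modulo steps of the description.

```python
def simulate(ops, start=0, cir_len=7, win_len=4, inc=1, stride=1):
    """Return the register ID touched by each instruction in `ops`.
    'I' = instruction for the first recirculation area (e.g. RdI),
    'S' = instruction for the second recirculation area (e.g. RdS)."""
    i = s = 0
    visited = []
    for op in ops:
        visited.append(start + (i + s) % cir_len)  # address used by this instruction
        i = (i + inc) % win_len                    # both kinds advance I
        if op == "S":
            s = (s + stride) % cir_len             # only 'S' moves the window
    return visited


# Loop once inside the first window, then slide the window after every fourth read.
ops = "IIII" + "IIIS" * 6
ids = simulate(ops)
groups = [ids[k:k + 4] for k in range(0, len(ids), 4)]
print(groups)
```

The printed groups are [0, 1, 2, 3], [0, 1, 2, 3], [1, 2, 3, 4], [2, 3, 4, 5], [3, 4, 5, 6], [4, 5, 6, 0], [5, 6, 0, 1], which matches the sequence of register IDs listed in the example.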
Optionally, in another embodiment of the present invention, the method may further include:
receiving a third operation instruction; and
executing the third operation instruction, and performing a data operation on the location pointed to by the destination address carried in the third operation instruction.
In this embodiment, the third operation instruction may be a non-loop instruction (an ordinary immediate-index instruction). It may be a read instruction, for example Rd Dest Src described above, or a write instruction, for example Wr Ri Src described above. The third operation instruction may carry a destination address, in which case the system performs the data operation, a read or a write, directly on the location that address points to.
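The difference between loop-type instructions and an ordinary instruction that carries its own address can be sketched as a small dispatch function. Everything here is illustrative: the function, the 'loop'/'direct' labels and the toy register file are hypothetical and do not reflect the actual instruction encoding.

```python
def resolve_address(kind, implicit_address, explicit_address=None):
    """kind: 'loop' for instructions such as RdI/RdS/WrI/WrS,
    'direct' for ordinary instructions that carry their own address."""
    if kind == "loop":
        return implicit_address      # generated by the control unit
    if kind == "direct":
        if explicit_address is None:
            raise ValueError("a direct instruction must carry an address")
        return explicit_address      # taken from the instruction itself
    raise ValueError(f"unknown instruction kind: {kind!r}")


registers = list(range(100, 108))    # toy register file holding 100..107
print(registers[resolve_address("direct", implicit_address=2, explicit_address=5)])  # prints 105
print(registers[resolve_address("loop", implicit_address=2)])                        # prints 102
```

In this sketch a direct instruction ignores the automatically generated address, which is one possible reading of how the two instruction kinds coexist.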
本发明实施例中,存储器101中包括第一重循环区域和第二重循环区域,在接收针对第二重循环区域的操作指令(称为第一操作指令)时,除了执行第一操作指令外,还可以通过第二重循环区域对应的第二步进值修改第一重循环区域的起始地址,在下一个时钟周期到来时就可以自动得到该次接收的操作指令对应的地址,即下一次再接收到针对第一重循环区域或第二重循环区域的操作指令时,可以直接根据上次得到的地址进行操作,无需在操作指令中携带目的地址,减少了指令的代码量,无需耗费大量的存储空间用于存储代码,同时也减轻了编程人员的编程负担。In the embodiment of the present invention, the memory 101 includes a first re-circulation area and a second re-circulation area. When receiving an operation instruction (referred to as a first operation instruction) for the second re-circulation area, in addition to executing the first operation instruction, The start address of the first recirculation area may be modified by the second step value corresponding to the second recirculation area, and the address corresponding to the operation instruction received by the time may be automatically obtained when the next clock cycle arrives, that is, the next time When receiving an operation instruction for the first re-circulation area or the second re-circulation area, the operation can be directly performed according to the last obtained address, without carrying the destination address in the operation instruction, thereby reducing the code amount of the instruction, and does not need to consume a large amount. The storage space is used to store code while also reducing the programming burden on the programmer.
In addition, this embodiment of the present invention provides two levels of recirculation areas (namely the first recirculation area and the second recirculation area), and the first recirculation area can move cyclically within the second recirculation area. This enables automatic circular addressing, makes the use of the storage area more flexible, and can improve the utilization of the storage area.
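The two-level scheme described above can be sketched in software as follows. This is a minimal illustrative model, not the hardware implementation: the class name `TwoLevelAddressGenerator`, its field and method names, and the example lengths and step values are all assumptions introduced here. The sketch assumes that both pointers hold offsets, that an instruction for the second recirculation area advances both pointers, and that an instruction for the first recirculation area advances only the first pointer, following the computations recited in the claims.

```python
from dataclasses import dataclass


@dataclass
class TwoLevelAddressGenerator:
    """Illustrative model of the two nested recirculation areas (names are hypothetical)."""

    base: int      # initialization address of the cyclic storage area
    len1: int      # length of the first recirculation area
    len2: int      # length of the second recirculation area
    step1: int     # first step value
    step2: int     # second step value
    ptr1: int = 0  # first pointer: internal offset inside the first area
    ptr2: int = 0  # second pointer: start of the first area inside the second area

    def address(self) -> int:
        """Address that the next received operation instruction will use."""
        return self.base + (self.ptr1 + self.ptr2) % self.len2

    def second_instruction(self) -> int:
        """Instruction for the first area: use the current address, then step ptr1 only."""
        used = self.address()
        self.ptr1 = (self.ptr1 + self.step1) % self.len1
        return used

    def first_instruction(self) -> int:
        """Instruction for the second area: use the current address, then step both pointers."""
        used = self.address()
        self.ptr1 = (self.ptr1 + self.step1) % self.len1
        self.ptr2 = (self.ptr2 + self.step2) % self.len2
        return used


if __name__ == "__main__":
    # Assumed example values: a 4-entry first area sliding through an 8-entry second area.
    gen = TwoLevelAddressGenerator(base=0x100, len1=4, len2=8, step1=1, step2=4)
    for _ in range(2):
        for _ in range(3):
            print(hex(gen.second_instruction()))
        print(hex(gen.first_instruction()))
```

With these example values, the eight calls visit 0x100 through 0x107 in order, and the generator then wraps back to 0x100. None of the calls passes an address, which is the point of the scheme: the instruction stream carries no destination addresses.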
A person skilled in the art will clearly understand that, for convenience and brevity of description, the division into the functional units described above is used only as an example. In practice, the functions may be assigned to different functional units as required; that is, the internal structure of the apparatus may be divided into different functional units to perform all or some of the functions described above. For the specific working processes of the system, apparatus, and units described above, refer to the corresponding processes in the foregoing method embodiments; details are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in an actual implementation. For instance, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or units, and may be electrical, mechanical, or of another form.
The units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected as actually needed to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented as a software functional unit and sold or used as a standalone product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied as a software product. The computer software product is stored in a storage medium and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) or a processor to perform all or some of the steps of the methods described in the embodiments of this application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a ROM, a RAM, a magnetic disk, or an optical disc.
The foregoing embodiments are only intended to describe the technical solutions of this application in detail. The description of the embodiments is provided merely to help understand the method of the present invention and its core idea, and shall not be construed as limiting the present invention. Any variation or replacement that a person skilled in the art could readily conceive within the technical scope disclosed by the present invention shall fall within the protection scope of the present invention.

Claims (16)

  1. A method for generating an address, comprising:
    receiving a first operation instruction for a second recirculation area in a memory; and
    executing the first operation instruction, and modifying a start address of a first recirculation area in the memory by a second step value corresponding to the second recirculation area, to obtain an address corresponding to a next received operation instruction, wherein the first recirculation area moves cyclically within the second recirculation area according to the second step value.
  2. The method according to claim 1, wherein executing the first operation instruction, and modifying the start address of the first recirculation area in the memory by the second step value corresponding to the second recirculation area, to obtain the address corresponding to the next received operation instruction, comprises:
    executing the first operation instruction, modifying the start address of the first recirculation area in the memory by the second step value corresponding to the second recirculation area, and modifying an internal offset address of the first recirculation area by a first step value corresponding to the first recirculation area, to obtain the address corresponding to the next received operation instruction.
  3. The method according to claim 2, wherein
    modifying the start address of the first recirculation area in the memory by the second step value corresponding to the second recirculation area comprises:
    modifying, by the second step value corresponding to the second recirculation area, an address pointed to by a second pointer corresponding to the second recirculation area; and
    modifying the internal offset address of the first recirculation area by the first step value corresponding to the first recirculation area comprises:
    modifying, by the first step value corresponding to the first recirculation area, an address pointed to by a first pointer corresponding to the first recirculation area.
  4. The method according to claim 3, wherein obtaining the address corresponding to the next received operation instruction comprises:
    taking the address pointed to by the first pointer after it is increased by one first step value modulo the length of the first recirculation area, and taking the address pointed to by the second pointer after it is increased by one second step value modulo the length of the second recirculation area;
    adding the two modulo results, and taking the sum modulo the length of the second recirculation area; and
    adding the result of the modulo operation with the length of the second recirculation area to an initialization address of a cyclic storage area, to obtain the address corresponding to the next received operation instruction.
  5. The method according to any one of claims 1 to 4, further comprising:
    receiving a second operation instruction for the first recirculation area; and
    executing the second operation instruction, and modifying an internal offset address of the first recirculation area by a first step value corresponding to the first recirculation area, to obtain an address corresponding to a next received operation instruction.
  6. The method according to claim 5, wherein modifying the internal offset address of the first recirculation area by the first step value corresponding to the first recirculation area, to obtain the address corresponding to the next received operation instruction, comprises:
    modifying, by the first step value corresponding to the first recirculation area, an address pointed to by a first pointer corresponding to the first recirculation area, to obtain the address corresponding to the next received operation instruction.
  7. The method according to claim 6, wherein obtaining the address corresponding to the next received operation instruction comprises:
    taking the address pointed to by the first pointer after it is increased by one first step value modulo the length of the first recirculation area, adding the result to the address pointed to by the second pointer, and taking the sum modulo the length of the second recirculation area; and
    adding the result of the modulo operation with the length of the second recirculation area to an initialization address of a cyclic storage area, to obtain the address corresponding to the next received operation instruction.
  8. The method according to any one of claims 1 to 7, further comprising, before receiving the first operation instruction:
    initializing, by executing a dedicated instruction, the length of the first recirculation area, the length of the second recirculation area, the first step value corresponding to the first recirculation area, the second step value corresponding to the second recirculation area, and a start address of a cyclic storage area.
  9. A data processing device, comprising:
    a memory, including a first recirculation area and a second recirculation area; and
    a control unit, connected to the memory and configured to: receive a first operation instruction for the second recirculation area in the memory; and execute the first operation instruction, and modify a start address of the first recirculation area in the memory by a second step value corresponding to the second recirculation area, to obtain an address corresponding to a next received operation instruction, wherein the first recirculation area moves cyclically within the second recirculation area according to the second step value.
  10. The device according to claim 9, wherein the control unit is configured to:
    execute the first operation instruction, modify the start address of the first recirculation area in the memory by the second step value corresponding to the second recirculation area, and modify an internal offset address of the first recirculation area by a first step value corresponding to the first recirculation area, to obtain the address corresponding to the next received operation instruction.
  11. The device according to claim 10, wherein the control unit is configured to:
    modify, by the second step value corresponding to the second recirculation area, an address pointed to by a second pointer corresponding to the second recirculation area; and
    modify, by the first step value corresponding to the first recirculation area, an address pointed to by a first pointer corresponding to the first recirculation area.
  12. The device according to claim 11, wherein the control unit is configured to:
    take the address pointed to by the first pointer after it is increased by one first step value modulo the length of the first recirculation area, and take the address pointed to by the second pointer after it is increased by one second step value modulo the length of the second recirculation area;
    add the two modulo results, and take the sum modulo the length of the second recirculation area; and
    add the result of the modulo operation with the length of the second recirculation area to an initialization address of a cyclic storage area, to obtain the address corresponding to the next received operation instruction.
  13. The device according to any one of claims 9 to 12, wherein the control unit is further configured to:
    receive a second operation instruction for the first recirculation area; and
    execute the second operation instruction, and modify an internal offset address of the first recirculation area by a first step value corresponding to the first recirculation area, to obtain an address corresponding to a next received operation instruction.
  14. The device according to claim 13, wherein the control unit is configured to:
    modify, by the first step value corresponding to the first recirculation area, an address pointed to by a first pointer corresponding to the first recirculation area, to obtain the address corresponding to the next received operation instruction.
  15. The device according to claim 14, wherein the control unit is configured to:
    take the address pointed to by the first pointer after it is increased by one first step value modulo the length of the first recirculation area, add the result to the address pointed to by the second pointer, and take the sum modulo the length of the second recirculation area; and
    add the result of the modulo operation with the length of the second recirculation area to an initialization address of a cyclic storage area, to obtain the address corresponding to the next received operation instruction.
  16. The device according to any one of claims 9 to 15, further comprising a control register configured to store the length of the first recirculation area, the length of the second recirculation area, the first step value corresponding to the first recirculation area, the second step value corresponding to the second recirculation area, and a start address of a cyclic storage area, wherein the control unit is configured to:
    initialize, by executing a dedicated instruction, the length of the first recirculation area, the length of the second recirculation area, the first step value, the second step value, and the start address of the cyclic storage area stored in the control register.
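
Read as formulas, the address computations recited in claims 4 and 12 (for the first operation instruction) and in claims 7 and 15 (for the second operation instruction) can be written as below. The symbols are notation assumed here for illustration and do not appear in the original: $P_1$ and $P_2$ are the addresses currently pointed to by the first and second pointers, $S_1$ and $S_2$ are the first and second step values, $L_1$ and $L_2$ are the lengths of the first and second recirculation areas, and $B$ is the initialization address of the cyclic storage area.

```latex
\begin{aligned}
A_{\text{next}}^{(1)} &= B + \Bigl[\bigl((P_1 + S_1) \bmod L_1\bigr) + \bigl((P_2 + S_2) \bmod L_2\bigr)\Bigr] \bmod L_2 \\
A_{\text{next}}^{(2)} &= B + \Bigl[\bigl((P_1 + S_1) \bmod L_1\bigr) + P_2\Bigr] \bmod L_2
\end{aligned}
```

As a worked example with assumed values $B = 256$, $L_1 = 4$, $L_2 = 8$, $S_1 = 1$, $S_2 = 4$, $P_1 = 3$, and $P_2 = 4$: the first formula gives $256 + [(4 \bmod 4) + (8 \bmod 8)] \bmod 8 = 256$, and the second gives $256 + [(4 \bmod 4) + 4] \bmod 8 = 260$.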
PCT/CN2015/091079 2015-09-29 2015-09-29 Method for generating address and data processing device WO2017054132A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2015/091079 WO2017054132A1 (en) 2015-09-29 2015-09-29 Method for generating address and data processing device
CN201580001436.5A CN107580700B (en) 2015-09-29 2015-09-29 Address generating method and data processing equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2015/091079 WO2017054132A1 (en) 2015-09-29 2015-09-29 Method for generating address and data processing device

Publications (1)

Publication Number Publication Date
WO2017054132A1 true WO2017054132A1 (en) 2017-04-06

Family

ID=58422614

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/091079 WO2017054132A1 (en) 2015-09-29 2015-09-29 Method for generating address and data processing device

Country Status (2)

Country Link
CN (1) CN107580700B (en)
WO (1) WO2017054132A1 (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1437112A (en) * 2002-02-07 2003-08-20 旺宏电子股份有限公司 Circularly addressing method and system with effective memory
CN101354641A (en) * 2008-08-20 2009-01-28 炬力集成电路设计有限公司 Access control method and device of external memory
US20100199064A1 (en) * 2009-02-05 2010-08-05 Anderson Timothy D Fast Address Translation for Linear and Circular Modes
CN103365821A (en) * 2013-06-06 2013-10-23 北京时代民芯科技有限公司 Address generator of heterogeneous multi-core processor

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US3797002A (en) * 1972-11-16 1974-03-12 Ibm Dynamically double ordered shift register memory
US6401196B1 (en) * 1998-06-19 2002-06-04 Motorola, Inc. Data processor system having branch control and method thereof

Also Published As

Publication number Publication date
CN107580700B (en) 2020-10-09
CN107580700A (en) 2018-01-12

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15905044

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 15905044

Country of ref document: EP

Kind code of ref document: A1