WO2009157441A1 - Data processing device, information processing device, and information processing method - Google Patents

Data processing device, information processing device, and information processing method

Info

Publication number
WO2009157441A1
Authority
WO
WIPO (PCT)
Prior art keywords
tile
tiles
netlist
configuration information
unit
Prior art date
Application number
PCT/JP2009/061401
Other languages
English (en)
Japanese (ja)
Inventor
武 犬尾
亨 粟島
Original Assignee
日本電気株式会社
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 日本電気株式会社
Priority to JP2010518020A
Publication of WO2009157441A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F15/00: Digital computers in general; Data processing equipment in general
    • G06F15/16: Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs

Definitions

  • the present invention relates to a data processing device having context transition type reconfigurable hardware that performs desired processing by switching configuration information (context) every clock cycle, an information processing device that generates an object code including the context, and an information processing method.
  • CPU Central Processing Unit
  • This array type processor includes a state management unit and a data path unit, and the data path unit includes a processor element and a programmable wiring. The operation is performed while switching the processing to be performed in the data path unit every clock cycle.
  • an array type processor having a plurality of state management units is described in Japanese Patent No. 39877782, Japanese Patent No. 3987778, and Japanese Patent No. 3987784.
  • an array type processor having a plurality of state management units when the data path unit managed by each state management unit performs processing in synchronization, the same state transition information is sent to the state management unit that operates synchronously.
  • Each state management unit receives event data generated from any of a plurality of data path units operating in synchronization, performs state transition and circuit switching, and operates.
  • a technology for generating an object code that is a program for the array type processor is described in Japanese Patent No. 3921367 (hereinafter referred to as JP3921367B).
  • the state management unit and its data path can be operated with different clocks.
  • data is exchanged between different clock domains via one of circuits such as a clock transfer circuit, FIFO, and 2-port memory.
  • a clock generation circuit that generates clocks of various frequencies from a single or a plurality of master clocks is known, and an appropriate clock can be used without correcting the duty ratio.
  • a simple clock thinning circuit that only thins out pulses is also known.
  • Reconfigurable hardware that operates while switching circuits every clock cycle is hereinafter referred to as context transition type reconfigurable hardware. Further, the processing content that operates in one clock cycle is referred to as a context.
  • the one clock cycle mentioned here includes not only the case of one clock of the master clock signal but also the case where two or more master clock signals are set to one clock.
  • FIG. 1 is a block diagram showing a configuration example of related context transition type reconfigurable hardware.
  • the context transition type reconfigurable hardware 15 includes a control unit 100 and a calculation unit 200.
  • the control unit 100 includes a state number holding memory 102 that stores a current state number, which is a number indicating the current state, a context number holding memory 103 that stores the context number being executed, and a transition table 101 that outputs the next state number to transition to and the next context number.
  • the context number holding memory 103 has an input side connected to the transition table 101 and an output side connected to the context selection unit 214 of the arithmetic unit 200.
  • the transition table 101 obtains the next state number from the current state number and the event signal and outputs it to the state number holding memory 102.
  • the next context number is obtained from the current state number and the event signal and output to the context number holding memory 103.
  • the next context number is input from the context number holding memory 103 to the context selection unit 214 of the calculation unit 200.
  • the arithmetic unit 200 includes an arithmetic processing unit 212 including a large number of arithmetic units and programmable wiring, an object code storage unit 210 storing an object code including a plurality of contexts 211, and a context selection unit 214 that selects one of the plurality of contexts 211.
  • the context is configuration information including processing instruction information for the arithmetic unit in the arithmetic processing unit 212 and wiring information indicating how to connect the programmable wiring for processing contents operating in one cycle.
  • FIG. 2 is a schematic diagram for explaining the operation of the context transition type reconfigurable hardware shown in FIG. 1.
  • when the state number holding memory 102 of the control unit 100 changes to the next state number, the next context number is input from the transition table 101 to the context selection unit 214 of the arithmetic unit 200.
  • when the context number selected by the context selection unit 214 changes, the newly selected context is instruction-decoded, and the processing content in the arithmetic processing unit 212 changes.
  • when the arithmetic processing unit 212 finishes the arithmetic operation based on the new context, it transmits an event signal to the transition table 101 of the control unit 100.
  • the state number holding memory 102 then makes a state transition to the next state number, and the above operation is repeated.
  • the state transition in the control unit 100 occurs at the update timing of the numbers stored in the state number holding memory 102 and the context number holding memory 103, respectively.
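The control loop described above can be illustrated with a minimal Python sketch. It assumes the transition table can be modeled as a mapping from (current state number, event signal) to (next state number, next context number); the class name, the event names, and the two-state table are hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass
class ControlUnit:
    # Hypothetical model of transition table 101:
    # (current state number, event) -> (next state number, next context number)
    transition_table: dict
    state: int = 0    # plays the role of state number holding memory 102
    context: int = 0  # plays the role of context number holding memory 103

    def step(self, event: str) -> int:
        """Apply one event signal and return the context number to select next."""
        self.state, self.context = self.transition_table[(self.state, event)]
        return self.context


# Illustrative table: "done" advances to the other state, "hold" stays put.
table = {
    (0, "done"): (1, 1),
    (0, "hold"): (0, 0),
    (1, "done"): (0, 0),
    (1, "hold"): (1, 1),
}

control = ControlUnit(table)
selected = [control.step(event) for event in ["done", "hold", "done"]]
print(selected)
```

Both held numbers are updated together in `step`, which mirrors the statement that a state transition occurs at the update timing of the two holding memories.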
  • a buffer 300 including any one of circuits such as a clock transfer circuit, a FIFO, and a 2-port memory is usually provided between pieces of context transition type reconfigurable hardware.
  • the buffer 300 exchanges data between the context transition type reconfigurable hardware.
  • the reason for providing the buffer 300 is to facilitate data exchange between context transition type reconfigurable hardware that operates under different controls.
  • Control here refers to the state transition of processing executed by object code consisting of multiple contexts. That is, the context transition type reconfigurable hardware that operates under different control means that the processing executed in the context and its state transition are different.
  • an inter-tile buffer 300 is provided between the tile 27A and the tile 28B.
  • Fa: the operating frequency of the clock 4001A of the tile 27A
  • Pa: the throughput of the tile 27A (processing performance per unit time)
  • Fb: the operating frequency of the clock 4002B of the tile 28B
  • Pb: the throughput of the tile 28B
  • FIG. 4 is a diagram illustrating a case where there are three tiles.
  • an inter-tile buffer 301 is provided between the tile 27A and the tile 28B
  • an inter-tile buffer 302 is provided between the tile 28B and the tile 29C.
  • the tile 27A and the tile 29C operate based on the clock 4001A
  • the tile 28B operates based on the clock 4002B.
  • a tile 28B that operates under different control exists between the tile 27A and the tile 29C that operate under the same control, so the tile 27A and the tile 29C exchange data via the tile 28B.
  • clock transfer occurs in the inter-tile buffers 301 and 302, so latency occurs in data exchange between the tile 27A and the tile 29C.
  • FIG. 5 is a diagram illustrating another configuration example in the case of three tiles.
  • a dedicated wiring 400 for directly connecting the tile 27A and the tile 29C is provided.
  • the data can be directly exchanged via the dedicated wiring 400 without passing through the inter-tile buffers 301 and 302.
  • tiles that operate under the same control need to be connected by dedicated wires, and the number of dedicated wires increases in proportion to the number of tiles provided, so the area occupied by these dedicated wires increases. As a result, the area of the entire data processing apparatus increases.
  • An example of an object of the present invention is to provide a data processing device capable of satisfactorily exchanging data between tiles operating with different clock signals, and to provide an information processing device and an information processing method for generating an object code to be provided to the data processing device.
  • a data processing device includes a plurality of tiles and inter-tile wiring. Each tile includes reconfigurable hardware and a clock thinning unit. The reconfigurable hardware has a plurality of arithmetic units and a plurality of programmable wirings, stores a plurality of pieces of configuration information for setting the operation contents of the arithmetic units and the programmable wirings, switches the configuration information each time a clock signal is input, and executes arithmetic processing by setting the arithmetic units and the programmable wirings according to the configuration information.
  • the clock thinning unit inputs to the reconfigurable hardware, in correspondence with the configuration information, either a clock signal based on the master clock or a clock signal obtained by thinning arbitrary clock pulses from the master clock.
  • the inter-tile wiring is provided between the tiles and connects the programmable wirings of adjacent tiles.
  • the information processing apparatus generates an object code composed of a plurality of pieces of configuration information for a data processing apparatus having a plurality of tiles. Each tile includes reconfigurable hardware that has a plurality of arithmetic units and a plurality of programmable wirings, holds a plurality of pieces of configuration information for setting their operation contents, switches the configuration information each time a clock signal is input, sets the arithmetic units and the programmable wirings according to the configuration information, and executes arithmetic processing.
  • the information processing apparatus includes a net list dividing unit that, when first and second original net lists having different operating frequencies and including the setting contents of the arithmetic processing executed by the reconfigurable hardware are input, divides the first original net list into third and fourth net lists and a subnet list including the relay wiring setting.
  • it also includes a net list integrating unit that generates a fifth net list by integrating the subnet list and the second original net list.
  • it further includes a processing arrangement unit. Among the plurality of tiles, for first and second tiles having the same operating frequency and a third tile that is sandwiched between the two tiles and has a different operating frequency from them, the processing arrangement unit generates an object code including information indicating that the third netlist corresponds to the configuration information of the first tile, the fourth netlist corresponds to the configuration information of the second tile, and the fifth netlist corresponds to the configuration information of the third tile.
  • the information processing method of one aspect of the present invention generates an object code composed of a plurality of pieces of configuration information for a data processing apparatus having a plurality of tiles. Each tile includes reconfigurable hardware that has a plurality of arithmetic units and a plurality of programmable wirings, holds a plurality of pieces of configuration information for setting their operation contents, switches the configuration information each time a clock signal is input, sets the arithmetic units and the programmable wirings according to the configuration information, and executes arithmetic processing.
  • in the method, when first and second original net lists having different operating frequencies are input, the first original net list is divided into third and fourth net lists and a subnet list including the relay wiring setting.
  • the subnet list and the second original net list are integrated to generate a fifth net list.
  • among the plurality of tiles, for the first and second tiles having the same operating frequency and the third tile that is sandwiched between the two tiles and has a different operating frequency from them, an object code is generated that includes information indicating that the third netlist is associated with the configuration information of the first tile, the fourth netlist with the configuration information of the second tile, and the fifth netlist with the configuration information of the third tile.
  • FIG. 1 is a block diagram showing a configuration example of related context transition type reconfigurable hardware.
  • FIG. 2 is a schematic diagram for explaining the operation of the context transition type reconfigurable hardware shown in FIG.
  • FIG. 3 is a diagram illustrating an example of connection between tiles.
  • FIG. 4 is a diagram showing a case where there are three tiles.
  • FIG. 5 is a diagram showing another configuration example in the case of three tiles.
  • FIG. 6 is a block diagram illustrating a configuration example of the data processing apparatus according to the first embodiment.
  • FIG. 7 is a block diagram showing an example of the configuration of tiles in the data processing apparatus shown in FIG.
  • FIG. 8 is a diagram for explaining the operation of the data processing apparatus according to the first embodiment.
  • FIG. 9 is a diagram for explaining the operation of the data processing apparatus according to the first embodiment.
  • FIG. 10 is a diagram illustrating an example of state transition of the context transition type reconfigurable hardware of each tile when the clock thinning unit of each tile is operated by thinning the clock.
  • FIG. 11 is a diagram schematically showing the processing contents of the data processing device at the respective timings T101, T102, T103 and T104 shown in FIG.
  • FIG. 12A is a block diagram illustrating a configuration example of the net list conversion apparatus.
  • FIG. 12B is a block diagram for explaining a net list generation unit in the net list conversion apparatus.
  • FIG. 13 is a block diagram showing a specific configuration example of the net list conversion apparatus shown in FIG. 12A.
  • FIG. 14 is a diagram showing an object code generation procedure.
  • FIG. 15 is a diagram for explaining a method for generating a netlist corresponding to an object code.
  • FIG. 16 is a block diagram illustrating a configuration example of the data processing apparatus according to the second embodiment.
  • FIG. 17 is a block diagram showing an example of the configuration of tiles in the data processing apparatus shown in FIG.
  • FIG. 18 is a diagram illustrating another configuration example of the inter-tile wiring.
  • FIG. 6 is a block diagram illustrating a configuration example of the data processing apparatus according to the present embodiment.
  • the data processing apparatus 1 includes an I/F (interface) unit 2 to which an object code is input from the outside, a tile array 5 in which a plurality of tiles 20 are provided, a data distribution unit 3 that distributes the object code to the tiles 20, a clock signal generation unit 7 that supplies a clock signal to the tiles 20, and a data input/output unit 9.
  • the data input/output unit 9 is an input/output device such as an FDD (Floppy Disk Drive) or a CD drive.
  • the I/F unit 2 converts data received from the outside via a communication line (not shown) into a format that can be processed by the data processing apparatus 1, and converts data sent from the data processing apparatus 1 to the outside into a format that can be transmitted.
  • an FDD or CD drive may be used.
  • FIG. 7 is a block diagram showing an example of the configuration of tiles in the data processing apparatus shown in FIG.
  • FIG. 7 is an enlarged view of the broken line portion of FIG.
  • the tile 20 includes a context transition type reconfigurable hardware 10 including a control unit 100 and a calculation unit 200, and a clock thinning unit 500.
  • An inter-tile buffer 300 is provided between the tiles, and adjacent tiles are connected via the inter-tile buffer 300. Further, an inter-tile wiring 600 for connecting adjacent tiles is also provided between the tiles.
  • control unit 100 and the calculation unit 200 in the present embodiment have the same configuration as described in FIG. Therefore, detailed description thereof is omitted here.
  • Information on the context number output from the context number holding memory 103 is input to the context selection unit 214.
  • the context 211 executed by the arithmetic unit 200 is switched every clock cycle.
  • the clock thinning unit 500 of each tile 20 is connected to the clock signal generation unit 7. Information on the operating frequency set in the context transition type reconfigurable hardware 10 is input to the clock thinning unit 500 via the object code.
  • a master clock signal is supplied from the clock signal generation unit 7 to the clock thinning unit 500 of each tile 20.
  • when the clock thinning unit 500 receives the master clock signal 4000 from the clock signal generation unit 7, it thins out clock pulses in accordance with the operating frequency set in the context and sends the resulting clock signal to the context transition type reconfigurable hardware 10.
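As a rough illustration of clock thinning, the following Python sketch models the master clock as a sequence of numbered pulses and a thinning pattern as a repeating keep/drop mask. The mask representation, the function name, and the two example patterns are assumptions made for illustration; the patent does not specify how the thinning pattern is encoded.

```python
from typing import List, Sequence


def thin_clock(master_cycles: int, keep_mask: Sequence[int]) -> List[int]:
    """Return the master-clock pulse indices that are passed to the tile.

    keep_mask is a repeating pattern: 1 passes the pulse, 0 thins it out.
    """
    return [t for t in range(master_cycles) if keep_mask[t % len(keep_mask)]]


# Hypothetical patterns: one tile keeps every second pulse, another every third.
clock_a = thin_clock(8, [1, 0])
clock_b = thin_clock(8, [1, 0, 0])
print(clock_a, clock_b)
```

Because pulses are only removed and never shifted, every remaining pulse coincides with a master clock pulse, which is the property the tiles rely on when they exchange data without a clock transfer circuit.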
  • the inter-tile buffer 300 includes at least one of a clock transfer circuit, a FIFO, and a 2-port memory circuit.
  • the inter-tile buffer 300 is connected to the programmable wiring in the arithmetic processing unit 212 of the arithmetic unit 200, and relays data transmission between adjacent tiles.
  • the inter-tile wiring 600 is connected to the programmable wiring in the arithmetic processing unit 212 of the arithmetic unit 200 and directly connects the programmable wirings of adjacent tiles 20.
  • the data input from the data input / output unit 9 is processed by each calculation unit 200 in the tile array 5.
  • the result of the arithmetic processing in the tile array 5 is output via the data input / output unit 9.
  • the operation is an operation after the object code is registered in the data processing device 1.
  • FIGS. 8 and 9 are diagrams for explaining the operation of the data processing apparatus according to this embodiment. In order to simplify the description, it is assumed that three tiles 21A, 22B, and 23C among the tiles provided in the tile array 5 are used.
  • the upper diagram in FIG. 9 schematically shows circuits and wirings set by context in each tile of the data processing apparatus 1 of the present embodiment.
  • An ellipse shown in each context represents a state where a circuit for a predetermined calculation process is set in the calculation processing unit 212.
  • Curves and arrows shown in each context represent wiring set in the arithmetic processing unit 212.
  • a point shown in each context represents a terminal for connecting to an adjacent tile via the inter-tile wiring 600.
  • the tile 21A and the tile 23C are operated by the clock 4001A generated by the clock thinning unit 500 in each tile. It is assumed that the tile 22B operates with the clock 4002B generated by the clock thinning unit 500 in the tile. For this reason, the tile 21A and the tile 23C have the same operating frequency, operate under the same control, and exchange data with each other. The tile 22B operates under control different from that of the other tiles 21A and 23C.
  • the context A1-1 and context A1-2 of the object code A1 are the context of the tile 21A.
  • the context B1-1, context B1-2, and context B1-3 of the object code B1 are the context of the tile 22B.
  • the context A2-1 and context A2-2 of the object code A2 are the context of the tile 23C.
  • since the tile 21A and the tile 23C operate in synchronization, when the tile 21A performs the process of the context A1-1, the tile 23C performs the process of the context A2-1.
  • Each of the contexts B1-1 to B1-3 registered in the tile 22B includes the following information in addition to the circuit information for setting, in the arithmetic processing unit 212, the arithmetic processing to be executed by the tile 22B.
  • the information specifies that a terminal connected to the tile 21A via the inter-tile wiring 600 and a terminal connected to the tile 23C via the inter-tile wiring 600 are connected within the tile 22B so that these terminals are equipotential. That is, it is information for setting such a relay wiring in the programmable wiring of the arithmetic processing unit 212.
  • in each of these contexts, the positions of the terminals for connecting to the relay wiring are the same.
  • the tiles 21A and 23C sandwiching the tile 22B are also set so that the position of the terminal for connecting to the relay wiring of the tile 22B does not change even if the context changes.
  • the tile 21A and the tile 23C can exchange data via the tile 22B by using the relay wiring set in the tile 22B.
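The constraint on the relay wiring can be expressed as a simple check: whatever route the relay takes inside the tile 22B, its two end terminals must be identical in every context. The sketch below assumes each context's relay is represented as an ordered list of hypothetical routing-point names, with the first and last entries being the terminals toward the tiles 21A and 23C; this representation is illustrative only.

```python
def relay_terminals_fixed(relay_routes: dict) -> bool:
    """True if every context's relay route starts and ends at the same terminals."""
    endpoints = {(route[0], route[-1]) for route in relay_routes.values()}
    return len(endpoints) == 1


# Hypothetical relay routes for the three contexts of tile 22B.
tile_22b = {
    "B1-1": ["west_3", "sw_a", "east_3"],
    "B1-2": ["west_3", "sw_b", "sw_c", "east_3"],
    "B1-3": ["west_3", "sw_d", "east_3"],
}

# A variant that moves one terminal, which violates the constraint.
broken = dict(tile_22b, **{"B1-2": ["west_4", "sw_b", "east_3"]})

print(relay_terminals_fixed(tile_22b), relay_terminals_fixed(broken))
```

The interior of each route is free to differ between contexts, which corresponds to the relay wiring being "set by another route" while the terminal positions stay put.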
  • FIG. 10 is a diagram illustrating an example of state transition of the context transition type reconfigurable hardware of each tile when the clock thinning unit of each tile is operated by thinning the clock.
  • the clock 4002B input to the tile 22B is thinned by one clock pulse with respect to the master clock 4000.
  • the clock 4002B is thinned by two clock pulses with respect to the master clock 4000.
  • FIG. 11 is a diagram schematically showing the processing contents of the data processing device at each timing of T101, T102, T103, and T104 shown in FIG. 11A shows the time at T101, FIG. 11B shows the time at T102, FIG. 11C shows the time at T103, and FIG. 11D shows the time at T104.
  • the inter-tile wiring is not shown in the figure.
  • the tile 21A has selected the context A1-1
  • the tile 22B has selected the context B1-1
  • the tile 23C has selected the context A2-1.
  • Relay wiring is set in the tile 22B, and the tile 21A is connected to the tile 23C via the relay wiring. Therefore, the tile 21A and the tile 23C can exchange data via the tile 22B.
  • the tile 22B remains set by the context B1-1, but the tile 21A is switched to the setting by the context A1-2, and the tile 23C is switched to the setting by the context A2-2.
  • Since the tile 22B remains set by the context B1-1, the positions of the relay wiring and its terminals are the same as in FIG. Further, not only the position of the terminal of the tile 21A connected to the tile 22B but also the position of the terminal of the tile 23C connected to the tile 22B is the same as in the case of FIG. Therefore, the tile 21A and the tile 23C can exchange data via the tile 22B.
  • the circuit set for each of the tile 21A and the tile 23C is the same as that at the timing of T101.
  • the tile 22B is set by the context B1-2.
  • the relay wiring is set by another route, and the position of the terminal does not change. Therefore, the tile 21A and the tile 23C can exchange data via the tile 22B.
  • the tile 21A is set by the context A1-1 and the tile 23C is set by the context A2-1 from before the timing of T104.
  • the tile 22B is switched to the setting by the context B1-3. Although the setting in the tile 22B changes, the relay wiring is set by another route, and the position of the terminal does not change. Therefore, the tile 21A and the tile 23C can exchange data via the tile 22B.
  • the tile 21A and the tile 23C can exchange data via the tile 22B at all timings.
  • even if the clock thinning unit 500 arbitrarily thins the clock, the rising edge of the clock input to each tile is synchronized with the master clock 4000, so the timing of context switching for each tile is exactly the same.
  • the clock thinning unit 500 can operate without any problem even if the clock is thinned arbitrarily.
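To make the timing argument concrete, the following Python sketch records which context each tile has selected during each master clock cycle. It assumes, purely for illustration, that each tile steps through its contexts in round-robin order on every kept pulse and that thinning is described by a repeating keep/drop mask; the actual transitions are driven by the transition table, and the patterns here are not those of FIG. 10.

```python
def context_timeline(contexts, keep_mask, master_cycles):
    """Context selected by one tile during each master-clock cycle."""
    timeline, index = [], 0
    for t in range(master_cycles):
        # The tile switches context only on master pulses that are not thinned out.
        if t > 0 and keep_mask[t % len(keep_mask)]:
            index = (index + 1) % len(contexts)
        timeline.append(contexts[index])
    return timeline


def switch_points(timeline):
    """Master-clock cycles at which the selected context changes."""
    return [t for t in range(1, len(timeline)) if timeline[t] != timeline[t - 1]]


tile_21a = context_timeline(["A1-1", "A1-2"], [1, 0], 6)
tile_22b = context_timeline(["B1-1", "B1-2", "B1-3"], [1, 0, 0], 6)
tile_23c = context_timeline(["A2-1", "A2-2"], [1, 0], 6)

print(switch_points(tile_21a), switch_points(tile_22b), switch_points(tile_23c))
```

In this sketch the tiles 21A and 23C, which share a thinning pattern, switch at the same master cycles, while the tile 22B switches at its own cycles; every switch still falls on a master clock pulse.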
  • a net list conversion device will be described as an example of a device that supplies an object code to the data processing device 1.
  • FIG. 12A is a block diagram showing a configuration example of the net list conversion apparatus
  • FIG. 12B is a block diagram for explaining a net list generation unit in the net list conversion apparatus.
  • the netlist conversion device 150 includes a data input unit 201 to which source code is input, a storage unit 203 that holds constraint conditions such as the inter-tile wiring terminals in the data processing device 1, an object code generation unit 220 that generates an object code from the source code, and a data output unit 202 that outputs the generated object code.
  • the object code generation unit 220 includes a list generation unit 101 that generates a netlist from the source code, a netlist conversion unit 204 that converts the netlist received from the list generation unit 101 based on information in the storage unit 203, and a processing arrangement unit 102 that assigns the converted netlist to each tile as an object code.
  • the net list conversion unit 204 includes a net list division unit 301 and a net list integration unit 302.
  • the netlist includes information related to setting of arithmetic processing executed by the arithmetic processing unit 212 of the context transition type reconfigurable hardware 10.
  • FIG. 13 is a block diagram showing a specific configuration example of the net list conversion apparatus shown in FIG. 12A.
  • the net list conversion device 150 is a kind of information processing device, and includes a CPU 601 and, connected to the CPU 601 via a bus line 620, a RAM 602, a ROM 603, an HDD (Hard Disk Drive) 604, an FDD 606, a CD drive 607, a display 608, a keyboard 609, a mouse 610, and an I/F unit 605.
  • the FDD 606 is detachable from an FD (Floppy Disk) 611 of a recording medium, and writes data to the mounted FD 611 or reads data from the FD 611.
  • the CD drive 607 is detachable from the CD-ROM 612 as a recording medium, and writes data to the CD-ROM 612 loaded or reads data from the CD-ROM 612.
  • the keyboard 609 and the mouse 610 correspond to the data input unit 201
  • the display 608 corresponds to the data output unit 202.
  • Each of the I/F unit 605, the FDD 606, and the CD drive 607 has functions of both the data input unit 201 and the data output unit 202.
  • the RAM 602, the ROM 603, and the HDD 604 correspond to the storage unit 203.
  • Various restriction conditions such as an inter-tile wiring terminal in the data processing apparatus 1 are registered in advance in at least one of the ROM 603 and the HDD 604 of the storage unit 203.
  • a program for the CPU 601 to execute processing and various data necessary for the processing are stored in the storage unit 203. These programs and data may be stored in advance in the storage unit 203 or may be registered in the storage unit 203 from a recording medium.
  • a program for causing the CPU 601 to execute various processes is stored in the FD 611 or the CD-ROM 612 in advance.
  • this program is copied from the HDD 604 to the RAM 602 and read from the RAM 602 to the CPU 601 when the netlist conversion apparatus 150 is activated.
  • when the CPU 601 executes the program, the list generation unit 101, the net list conversion unit 204, and the processing arrangement unit 102 illustrated in FIG. 12B are virtually configured in the net list conversion device 150. Note that the processing of the list generation unit 101 and the processing arrangement unit 102 is the same as the method disclosed in the patent document of JP3921367B, and thus detailed description thereof is omitted here.
  • Source code is input from the outside to the netlist conversion apparatus 150 by the CPU 601 controlling the FDD 606, the CD drive 607, or the I/F unit 605 in accordance with a program stored in the RAM 602.
  • the CPU 601 controls the FDD 606 if the FD 611 on which the source code is recorded is attached to the FDD 606, and controls the CD drive 607 if the CD-ROM 612 on which the source code is recorded is attached to the CD drive 607.
  • if the I/F unit 605 receives the source code via a communication line (not shown), the CPU 601 controls the I/F unit 605.
  • the object code is output from the netlist conversion device 150 by the CPU 601 controlling the FDD 606, the CD drive 607, or the I/F unit 605 in accordance with a program stored in the RAM 602.
  • the CPU 601 controls the FDD 606 to write the object code in the FD 611.
  • the CPU 601 controls the CD drive 607 to write the object code in the CD-ROM 612.
  • the I/F unit 2 of the data processing device 1 and the I/F unit 605 of the net list conversion device 150 are connected by a communication line (not shown), and the object code is processed through the communication line (not shown).
  • the CPU 601 controls the I/F unit 605 to transmit the object code to the data processing apparatus 1 according to the program. In this way, the object code is transmitted from the netlist conversion apparatus 150 to the data processing apparatus 1 via the communication line (not shown).
  • FIG. 14 is a diagram showing an object code generation procedure.
  • FIG. 15 is a diagram for explaining a method for generating a netlist corresponding to an object code.
  • FIG. 15 shows a method of generating netlists 7011A, 7012A, and 7021B corresponding to the object codes A1, A2, and B1 shown in FIG. Note that a method for generating a netlist from source code is disclosed in JP3921367B patent document, and therefore detailed description thereof is omitted here.
  • a net list conversion unit 204 including a net list division unit 301 and a net list integration unit 302 is provided between the list generation unit 101 that generates a netlist and the processing arrangement unit 102 that allocates an object code to each tile.
  • When the source code A0 is input, the list generation unit 101 generates a net list 701A by the method disclosed in the patent document JP3921367B and passes it to the netlist division unit 301. When the source code B0 is input, the list generation unit 101 generates a net list 702B by the method disclosed in the patent document JP3921367B and passes it to the netlist integration unit 302.
  • Upon receiving the net list 701A from the list generation unit 101, the net list dividing unit 301 divides the net list 701A into a net list 7011A, a net list 7012A, and a subnet list 7013A.
  • the subnet list 7013A is a circuit that connects the inter-tile wiring terminals 901 and the inter-tile wiring terminals 902 shown in FIG. Then, the netlist dividing unit 301 passes the netlist 7011A and the netlist 7012A to the processing arrangement unit 102, and passes the subnet list 7013A to the netlist integration unit 302.
  • the netlist integration unit 302 Upon receiving the subnet list 7013A and the netlist 702B, the netlist integration unit 302 adds the subnet list 7013A as data to the original netlist for each of the contexts included in the netlist 702B, and the netlist 7021B. Is generated. Then, the generated netlist 7021B is transferred to the processing placement unit 102.
  • the processing arrangement unit 102 Upon receiving the netlists 7011A, 7021B, and 7012A, the processing arrangement unit 102 generates the object code A1 from the netlist 7011A and the object code B1 from the netlist 7021B in the same manner as disclosed in the patent document JP3921367B.
  • the object code A2 is generated from the net list 7012A.
  • The netlist conversion method converts the original netlist generated by the list generation unit 101 into a netlist for generating the object code data used in the data processing apparatus of this embodiment.
  • The netlist 701A shown in FIG. 15 is the netlist before placement and routing on which the object codes A1 and A2 are based, and the netlist 702B is the netlist before placement and routing on which the object code B1 is based.
  • Each netlist includes a plurality of contexts.
  • The netlist dividing unit 301 determines a dividing point 800 of the netlist 701A in order to divide the netlist 701A between the tile 21A and the tile 23C.
  • The dividing point 800 is made to correspond to the inter-tile wiring terminal 901, which connects the tile 21A and the tile 22B, and the inter-tile wiring terminal 902, which connects the tile 22B and the tile 23C.
  • The two netlists obtained by dividing the netlist 701A at the dividing point 800 are the netlist 7011A and the netlist 7012A described above.
  • The netlist dividing unit 301 generates a subnet list containing the relay wiring settings that make the inter-tile wiring terminal 901 and the inter-tile wiring terminal 902 in the tile 22B equipotential.
  • The netlist integration unit 302 adds the subnet list data to the original netlist for each of the contexts included in the netlist 702B, and sets the result as the netlist 7021B.
  • After that, the processing arrangement unit 102 generates object codes by associating the netlists 7011A, 7021B, and 7012A with the tiles 21A, 22B, and 23C, respectively, so that the object codes A1, B1, and A2 shown in FIG. are obtained.
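The divide, integrate, and place flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the data model (a netlist as a list of contexts, a context as a list of source/sink pairs), the function names `divide`, `integrate`, and `place`, the index-based dividing point, and the terminal labels `T901` and `T902` are all hypothetical and are not taken from the patent or from JP3921367B.

```python
from dataclasses import dataclass, field

# Hypothetical data model: a net is a (source, sink) pair,
# a context is a list of nets, and a netlist is a list of contexts.
Net = tuple[str, str]
Context = list[Net]


@dataclass
class Netlist:
    name: str
    contexts: list[Context] = field(default_factory=list)


def divide(netlist, split, term_in="T901", term_out="T902"):
    """Split every context at index `split` (a stand-in for dividing point 800).

    Returns the two partial netlists plus a relay subnet list that ties the
    two inter-tile wiring terminals of the intermediate tile together.
    """
    first = Netlist(netlist.name + "_1", [ctx[:split] for ctx in netlist.contexts])
    second = Netlist(netlist.name + "_2", [ctx[split:] for ctx in netlist.contexts])
    relay = [(term_in, term_out)]
    return first, second, relay


def integrate(netlist, relay):
    """Add the relay nets to every context of the intermediate tile's netlist."""
    return Netlist(netlist.name + "_relay", [ctx + relay for ctx in netlist.contexts])


def place(netlists_by_tile):
    """Stand-in for the processing arrangement unit: one object code per tile."""
    return {tile: {"tile": tile, "contexts": nl.contexts}
            for tile, nl in netlists_by_tile.items()}


nl_a = Netlist("701A", [[("a", "b"), ("b", "c")], [("a", "c"), ("c", "d")]])
nl_b = Netlist("702B", [[("x", "y")], [("y", "z")], [("z", "x")]])

a1, a2, relay = divide(nl_a, split=1)      # plays the role of 7011A, 7012A, 7013A
b1 = integrate(nl_b, relay)                # plays the role of 7021B
object_codes = place({"21A": a1, "22B": b1, "23C": a2})
```

The point of the sketch is that the relay nets are appended to every context of the intermediate netlist, so the path between the two terminals exists regardless of which context the intermediate tile is currently executing.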
  • The clock thinning unit thins out arbitrary clock pulses from the master clock, so that in any tile the timing of the state transition that accompanies switching of configuration information in the context transition type reconfigurable hardware coincides with a rising edge of the master clock. Therefore, even if tiles operating under the same control are not adjacent to each other, they can exchange data via tiles operating under different control. As a result, context transition type reconfigurable hardware units that are under the same control but not adjacent can exchange data without an increase in wiring resources or access latency.
  • The clock thinning unit thins out clock pulses from the master clock according to the context and sets the clock input to a plurality of context transition type reconfigurable hardware units having different throughputs. This suppresses the idle time of high-throughput context transition type reconfigurable hardware and equalizes the throughput of the hardware units. As a result, the operating power of the data processing device can be reduced without reducing the overall throughput.
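The alignment property of the thinned clocks can be illustrated with a small model. This is a sketch under assumptions, not the circuit of the embodiment: the function name `thin_clock` and the cyclic `keep_pattern` mask are hypothetical, and master clock rising edges are represented simply by their index.

```python
def thin_clock(num_master_pulses, keep_pattern):
    """Return the indices of master-clock rising edges that reach a tile.

    `keep_pattern` is a hypothetical mask applied cyclically:
    True passes the pulse to the tile, False thins it out.
    """
    return [i for i in range(num_master_pulses)
            if keep_pattern[i % len(keep_pattern)]]


master_edges = list(range(8))
tile_full = thin_clock(8, [True])                   # no thinning
tile_half = thin_clock(8, [True, False])            # every second pulse removed
tile_third = thin_clock(8, [True, False, False])    # two of three pulses removed

# Every edge seen by a tile is also a master-clock edge, so state
# transitions in different tiles can only occur on master-clock edges.
aligned = all(set(edges) <= set(master_edges)
              for edges in (tile_full, tile_half, tile_third))
```

Because thinning only removes pulses and never shifts them, tiles running at different effective rates still change state on shared master clock edges, which is what allows a tile under different control to relay data between them.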
  • The netlist conversion apparatus divides the content of the predetermined arithmetic processing into netlists for two context transition type reconfigurable hardware units that operate under the same control and are not adjacent to each other.
  • Into the netlist of the context transition type reconfigurable hardware that operates under different control and is located between these two units, it integrates a subnet list containing the relay wiring settings that connect the two units under the same control so that they are equipotential. This makes it possible to generate a netlist that uses the programmable wiring of context transition type reconfigurable hardware operating under different control as a data transfer path between non-adjacent context transition type reconfigurable hardware units operating under the same control.
  • The arithmetic processing unit 212 may have as few as one arithmetic unit, but it is desirable that a plurality of programmable wirings be provided. This is because, as shown in FIG. 11, in all contexts at least one programmable wiring is set so that both of its ends are equipotential.
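The requirement that a relay wiring exist in every context can be expressed as a simple check. The sketch below is illustrative only: the representation of a context as a list of terminal pairs, the function name `relay_present_in_all_contexts`, and the terminal labels are assumptions introduced here.

```python
def relay_present_in_all_contexts(contexts, term_a="901", term_b="902"):
    """Check that every context ties the two inter-tile terminals together.

    Each context is modeled (hypothetically) as a list of wiring settings,
    each setting being a pair of terminal names joined at equal potential.
    """
    wanted = frozenset((term_a, term_b))
    return all(any(frozenset(wire) == wanted for wire in ctx) for ctx in contexts)


# Tile with two contexts: both keep the relay, and each has one more wiring
# left over for the tile's own processing.
good = [[("901", "902"), ("alu_out", "reg_in")],
        [("902", "901"), ("reg_out", "alu_in")]]

# Second context lacks the relay, so the transfer path would be broken there.
bad = [[("901", "902")],
       [("reg_out", "alu_in")]]

ok_good = relay_present_in_all_contexts(good)
ok_bad = relay_present_in_all_contexts(bad)
```

Since one wiring per context is consumed by the relay, a tile with only a single programmable wiring would have none left for its own processing, which is consistent with the preference for a plurality of programmable wirings.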
  • the present embodiment is characterized by detecting the throughput of arithmetic processing of each tile and resetting the operation frequency of the tile so that the throughput of each tile becomes equal based on the detected throughput information.
  • FIG. 16 is a block diagram showing a configuration example of the data processing apparatus according to this embodiment.
  • the data processing apparatus 105 of this embodiment includes a tile array 50 including a plurality of tiles 250 and an operating frequency setting unit 510.
  • the operating frequency setting unit 510 includes a CPU (not shown) that executes predetermined processing according to a program, and a memory (not shown) for storing the program.
  • FIG. 17 is a block diagram showing a configuration example of tiles in the data processing apparatus shown in FIG.
  • FIG. 17 is an enlarged view of the broken line portion of FIG.
  • the tile 250 includes a throughput detection unit 520.
  • the operating frequency setting unit 510 is connected to the throughput detection unit 520 and the clock thinning unit 500 of each tile 250.
  • FIG. 17 shows a configuration in which the operating frequency setting unit 510 is connected to the throughput detection unit 520 and the clock thinning unit 500 of one tile 250.
  • When the data processing apparatus 1 in which the object code has been registered is activated and starts operating, the throughput detection unit 520 of each tile 250 periodically detects the throughput of the context transition type reconfigurable hardware 10 and transmits throughput information, which indicates the detected throughput, to the operating frequency setting unit 510.
  • When the operating frequency setting unit 510 receives the throughput information from the throughput detection unit 520 of each tile 250, it sets the operating frequency for the clock thinning unit 500 of each tile 250, based on the throughput information, so that the throughput becomes uniform. For example, the program describes a series of procedures for identifying, from the throughput information collected from each tile 250, a tile whose throughput is higher than the average value and changing the operating frequency of that tile 250.
  • The data processing apparatus 1 then starts operating.
  • The throughput detection unit 520 of each tile 250 periodically detects the throughput of the context transition type reconfigurable hardware 10 and transmits the throughput information to the operating frequency setting unit 510.
  • When the operating frequency setting unit 510 receives the throughput information from the throughput detection unit 520 of each tile 250, it sets the operating frequency for the clock thinning unit 500 of each tile 250 so that the throughput in each tile 250 becomes equal. A specific example is described below.
  • the operating frequency of the clock signal of a certain tile 250c is Fc, and the throughput of the tile 250c is Pc.
  • the clock thinning unit 500 is set to operate.
  • Even if the operating frequency set by the object code does not sufficiently equalize the throughput of the tiles, the operating frequency can be reset so that the throughput of each tile is equalized.
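One way to picture the resetting procedure is the following sketch. It is not the formula of the embodiment, which is not reproduced in this text. It assumes, purely for illustration, that a tile's throughput scales linearly with its operating frequency, and it only lowers the frequency of tiles whose throughput exceeds the average, as in the example procedure mentioned above. The function name `reset_frequencies`, the tile labels, and the numeric values are hypothetical.

```python
def reset_frequencies(tiles):
    """Compute new operating frequencies from collected throughput information.

    `tiles` maps a tile name to (frequency, throughput). Assumption: throughput
    is proportional to frequency, so scaling the frequency by avg / throughput
    brings an above-average tile down to the average throughput.
    """
    avg = sum(p for _, p in tiles.values()) / len(tiles)
    new_freqs = {}
    for name, (freq, throughput) in tiles.items():
        if throughput > avg:
            new_freqs[name] = freq * avg / throughput   # slow the fast tile down
        else:
            new_freqs[name] = freq                      # leave other tiles as they are
    return new_freqs


# Hypothetical measurements: same clock, different throughputs.
measured = {
    "250a": (100e6, 50.0),
    "250b": (100e6, 100.0),
    "250c": (100e6, 150.0),
}
new_freqs = reset_frequencies(measured)
```

In an actual device the realizable frequencies would be limited to those the clock thinning unit 500 can derive from the master clock, so a computed value like the one above would have to be rounded to an available thinning ratio. Because the detection is periodic, such a procedure could be repeated until the throughputs converge.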
  • the operating frequency setting unit 510 is configured by the CPU executing the program, but may be a dedicated circuit for executing the above-described processing.
  • FIG. 18 is a diagram showing another configuration example of wiring between tiles.
  • The programmable wiring resources of the tile 24D may be insufficient.
  • the tile 24D may use the programmable wiring of the tile 25E using the inter-tile wiring 600 between the tile 24D and the tile 25E adjacent thereto.
  • Information for setting a circuit in which the tile 24D and the tile 25E operate in cooperation is written in the object code registered in each of the tile 24D and the tile 25E.
  • the tile 24D and the tile 25E can operate with different controls and different thinning clocks in the same manner as described above.
  • The various units included in the data processing apparatus and the netlist conversion apparatus of the present invention only need to be formed so as to realize their functions.
  • For example, each unit may be dedicated hardware that realizes a predetermined function, a data processing device that realizes a predetermined function by executing processing according to a program, a predetermined function realized inside a data processing device by a program, or a combination of these.
  • each unit provided in the data processing apparatus and the net list conversion apparatus of the present invention does not have to be individually independent, and a configuration in which a certain part becomes a part of another part may be employed.
  • data can be exchanged between tiles operating with different clock signals without increasing wiring resources and without increasing access latency.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Hardware Design (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Design And Manufacture Of Integrated Circuits (AREA)
  • Logic Circuits (AREA)

Abstract

A data processing device comprises: a plurality of tiles (20), each having reconfigurable hardware (10) and a clock thinning unit (500); and inter-tile wirings (600) provided between the tiles (20) to connect the programmable wirings of adjacent tiles. The reconfigurable hardware (10) has a plurality of arithmetic units and a plurality of programmable wirings. Configuration information for setting the operation contents of the arithmetic units and the programmable wirings is switched each time a clock signal is input, so that the arithmetic units and the programmable wirings are set and arithmetic processing is executed according to the configuration information. The clock thinning unit (500) inputs to the reconfigurable hardware (10) either a clock signal based on a master clock corresponding to the configuration information or a clock signal obtained by thinning out an arbitrary clock pulse from the master clock.
PCT/JP2009/061401 2008-06-26 2009-06-23 Data processing device, information processing device, and information processing method WO2009157441A1 (fr)

Priority Applications (1)

Application Number Priority Date Filing Date Title
JP2010518020A JPWO2009157441A1 (ja) 2008-06-26 2009-06-23 Data processing device, information processing device, and information processing method

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
JP2008167331 2008-06-26
JP2008-167331 2008-06-26

Publications (1)

Publication Number Publication Date
WO2009157441A1 true WO2009157441A1 (fr) 2009-12-30

Family

ID=41444507

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2009/061401 WO2009157441A1 (fr) 2008-06-26 2009-06-23 Data processing device, information processing device, and information processing method

Country Status (2)

Country Link
JP (1) JPWO2009157441A1 (fr)
WO (1) WO2009157441A1 (fr)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2014112082A1 (fr) * 2013-01-17 2014-07-24 富士通株式会社 Dispositif logique programmable

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2000181566A (ja) * 1998-12-14 2000-06-30 Mitsubishi Electric Corp マルチクロック並列処理装置
JP2002215599A (ja) * 2001-01-18 2002-08-02 Mitsubishi Electric Corp マルチプロセッサシステムおよびその制御方法
JP2004310730A (ja) * 2003-01-15 2004-11-04 Sanyo Electric Co Ltd リコンフィギュラブル回路を備えた集積回路装置、処理装置およびそれらを利用した処理方法
JP2006163815A (ja) * 2004-12-07 2006-06-22 Matsushita Electric Ind Co Ltd 再構成可能な信号処理プロセッサ

Also Published As

Publication number Publication date
JPWO2009157441A1 (ja) 2011-12-15

Similar Documents

Publication Publication Date Title
JP5363064B2 (ja) ネットワーク・オン・チップ(noc)上のソフトウェア・パイプライン化の方法、プログラムおよび装置
US20160026606A1 (en) Node card management in a modular and large scalable server system
JP4547198B2 (ja) 演算装置、演算装置の制御方法、プログラム及びコンピュータ読取り可能記録媒体
JP5793690B2 (ja) インタフェース装置、およびメモリバスシステム
JP6939775B2 (ja) ネットワークシステム、その管理方法および装置
KR20090035538A (ko) 반도체 집적 회로, 프로그램 변환 장치 및 매핑 장치
WO2016107421A1 (fr) Appareil et procédé de reconstruction pour dispositif logique programmable
CN114691317A (zh) 可重新配置的计算结构中的循环执行
JP2011198228A (ja) 画像処理装置、画像形成装置及びプログラム
JP2010205108A (ja) 情報処理装置および情報処理プログラム
WO2009157441A1 (fr) Dispositif de traitement de données, dispositif de traitement d’informations et procédé de traitement d’informations
JP4859103B2 (ja) 画像形成装置
WO2011065139A1 (fr) Appareil de traitement d'informations, procédé de commande d'un appareil de traitement d'informations et support de stockage
JP5277615B2 (ja) データ処理装置及びデータ処理プログラム
US8359564B2 (en) Circuit design information generating equipment, function execution system, and memory medium storing program
EP1868293A1 (fr) Systeme informatique, structure de donnees indiquant des informations de configuration, et dispositif et procede de mappage
Berthelot et al. Partial and dynamic reconfiguration of FPGAs: a top down design methodology for an automatic implementation
CN109977053A (zh) 一种基于网络的共享io接口设计方法和系统
US8769142B2 (en) Data transfer apparatus, information processing apparatus and method of setting data transfer rate
JP2010244470A (ja) 分散処理システム及び分散処理方法
JP2013009044A (ja) 制御装置、処理装置、処理システム、制御プログラム
Brunner et al. An audio system application for the adaptive avionics platform
JP6618574B1 (ja) 制御装置、通信システムおよび制御プログラム
JP2010257411A (ja) マルチファンクション・カード・システム、マルチファンクション・カード、およびその制御方法
JP2005078177A (ja) 並列演算装置

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 09770151

Country of ref document: EP

Kind code of ref document: A1

WWE Wipo information: entry into national phase

Ref document number: 2010518020

Country of ref document: JP

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 09770151

Country of ref document: EP

Kind code of ref document: A1