US20180129605A1 - Information processing device and data structure - Google Patents

Information processing device and data structure

Info

Publication number
US20180129605A1
Authority
US
United States
Prior art keywords
zero
data
write
zero data
column
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
US15/861,533
Inventor
Hiroyuki Usui
Seiji Maeda
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Toshiba Corp
Original Assignee
Toshiba Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Toshiba Corp
Priority to US15/861,533
Assigned to KABUSHIKI KAISHA TOSHIBA. Assignment of assignors interest (see document for details). Assignors: MAEDA, SEIJI; USUI, HIROYUKI
Publication of US20180129605A1
Abandoned

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F12/00 Accessing, addressing or allocating within memory systems or architectures
    • G06F12/02 Addressing or allocation; Relocation
    • G06F12/08 Addressing or allocation; Relocation in hierarchically structured memory systems, e.g. virtual memory systems
    • G06F12/0802 Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    • G06F12/0806 Multiuser, multiprocessor or multiprocessing cache systems
    • G06F12/0813 Multiuser, multiprocessor or multiprocessing cache systems with a network or matrix configuration
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/15 Use in a specific computing environment
    • G06F2212/154 Networked environment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2212/00 Indexing scheme relating to accessing, addressing or allocation within memory systems or architectures
    • G06F2212/30 Providing cache or TLB in specific location of a processing system
    • G06F2212/305 Providing cache or TLB in specific location of a processing system being part of a memory device, e.g. cache DRAM

Definitions

  • An embodiment described herein relates generally to an information processing device and a data structure.
  • The processing speed of a processor or a hardware engine is higher than the data supply capability of a main memory such as a DRAM; therefore, a cache memory that compensates for this performance difference is used in some cases.
  • The cache memory is a memory, such as an SRAM, that is faster than the main memory, and it temporarily stores data in a data array.
  • The processor can carry out high-speed processing by accessing the data in the cache memory.
  • The cache memory acquires data from the main memory in units of the cache line size (for example, 256 bytes), which is larger than the accessed data size. Accessing the main memory in such large units improves the efficiency of access to the main memory.
  • The cache memory can return data from the data array without acquiring the data from the main memory; therefore, the processor or the hardware engine can access the data at high speed.
  • FIG. 1 is a diagram showing a configuration of a computer system provided with an information processing device according to the present embodiment;
  • FIG. 2 is a diagram for explaining a matrix to be processed in the present embodiment;
  • FIG. 3 is a diagram for explaining a processing unit of the matrix;
  • FIG. 4 is a diagram for explaining a management method of block rows;
  • FIG. 5 is a diagram for explaining an example of used memory space;
  • FIG. 6 is a diagram for explaining an example of a bit layout in matrix addresses;
  • FIG. 7 is a diagram for explaining a detailed configuration of a matrix management engine 4;
  • FIG. 8 is a diagram for explaining a process of address translation;
  • FIG. 9 is a diagram for explaining an example of a write request;
  • FIG. 10 is a diagram for explaining a detailed configuration of Read Ctrl 23a;
  • FIG. 11 is a flowchart for explaining an example of the flow of a process of reading data of S[m][n0] to S[m][n1];
  • FIG. 12 is a flowchart for explaining an example of the flow of a process of reading data of S[m][n0] to S[m][n1];
  • FIG. 13 is a diagram for explaining a detailed configuration of Write Ctrl 24a;
  • FIG. 14 is a flowchart for explaining an example of the flow of a process of writing data to a block column n0 to a block column n1 of an m-th row;
  • FIG. 15 is a flowchart for explaining an example of the flow of a process of searching for non-zero data columns B and A;
  • FIG. 16 is a flowchart for explaining an example of the flow of a process of updating non-zero management information at the position of “the start position of the non-zero data column B” - 1 and writing write data;
  • FIG. 17 is a flowchart for explaining an example of the flow of the process of updating the non-zero management information at the position of “the start position of the non-zero data column B” - 1 and writing write data;
  • FIG. 18 is a flowchart for explaining an example of the flow of the process of updating last non-zero management information and updating non-zero management information of the column n1;
  • FIG. 19 is a diagram for explaining a write operation by Write Ctrl 24a; and
  • FIG. 20 is a diagram for explaining a write operation by Write Ctrl 24a.
  • An information processing device of an embodiment has an input unit, a storage unit, a read control unit, and a write control unit.
  • A read request and a write request with respect to a predetermined range of a block row provided with one or more blocks, each consisting of one or more elements, are input to the input unit.
  • The storage unit stores management information in the region of zero data, in which all the elements in one block are zero; the management information represents the number of continuous non-zero data, each having one or more non-zero elements in one block, and a distance to the next non-zero data.
  • The read control unit reads read data including the management information from the storage unit, references the management information, and outputs only non-zero data included in a predetermined range of a block row.
  • The write control unit writes only non-zero data, which has one or more non-zero elements in one block in the data, to the storage unit and updates the management information immediately before a start position of the continuous non-zero data started from a largest position in the continuous non-zero data started from a position smaller than the predetermined range, a last management information stored in the predetermined range, and the last management information in the predetermined range.
  • FIG. 1 is a configuration diagram of the computer system provided with the information processing device according to the present embodiment.
  • The computer system 1 consists of a central processing unit (hereinafter referred to as CPU) 2, a hardware engine (hereinafter referred to as HWE) 3, a matrix management engine 4, a cache 5, an address translation unit 6, an interconnect 7, a main memory 8, and an input/output device (hereinafter referred to as I/O) 9.
  • The matrix management engine 4 serving as the information processing device is connected to the CPU 2, the HWE 3, and the cache 5.
  • The cache 5 is connected to the interconnect 7 via the address translation unit 6.
  • The interconnect 7 is further connected to the main memory 8 and the I/O 9.
  • The main memory 8 is, for example, a DRAM.
  • Input data to the computer system 1 is transferred to the main memory 8 via the I/O 9 and the interconnect 7.
  • The transferred input data is transferred to and processed by the CPU 2 or the HWE 3.
  • Output data processed by the CPU 2 or the HWE 3 is output via the main memory 8, the interconnect 7, and the I/O 9.
  • In a case in which data other than that of a matrix is accessed, the computer system 1 directly accesses the main memory 8 (or the cache 5) from the CPU 2 or the HWE 3 without the intermediation of the matrix management engine 4.
  • In a case in which a matrix is accessed, the matrix management engine 4 carries out processing.
  • FIG. 2 is a diagram for explaining a matrix processed in the present embodiment, and FIG. 3 is a diagram for explaining processing units of the matrix.
  • The matrix of FIG. 2 is a two-dimensional sparse matrix S consisting of 832 elements in 16 rows and 52 columns. Moreover, in the present embodiment, zeros/non-zeros are managed in units of a predetermined size of the sparse matrix S.
  • The predetermined size is one or more elements and is, for example, a 16-element unit of 4 × 4 as shown in FIG. 3. This unit is referred to as a block.
  • The block number in the horizontal (column) direction is represented by x, and x ranges from 1 to the number of columns of the sparse matrix S divided by 4 (rounded up).
  • The block number in the vertical (row) direction is represented by y, and y ranges from 1 to the number of rows of the sparse matrix S divided by 4 (rounded up).
  • The zero/non-zero state of a block unit is defined as follows: a block in which all the elements are 0 is zero, and a block that includes at least one element that is not 0 is non-zero.
  • S[1][1] and S[1][4] are zero since all 16 elements therein are 0.
  • A block in which at least one of the 16 elements is not 0, like S[1][2], and a block in which none of the 16 elements is 0, like S[1][3], are non-zero.
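The zero/non-zero classification of blocks can be illustrated with a minimal Python sketch. The function names `block_is_nonzero` and `block_map`, and the sample matrix contents, are hypothetical and chosen only for illustration; the block size of 4 × 4 and the 16 × 52 matrix shape follow the example in the text.

```python
from math import ceil

BLOCK = 4  # block edge length in elements (4 x 4 blocks, as in FIG. 3)

def block_is_nonzero(matrix, y, x, block=BLOCK):
    """Return True if block (y, x), counted from 1, holds at least one non-zero element."""
    rows = matrix[(y - 1) * block : y * block]
    return any(v != 0 for row in rows for v in row[(x - 1) * block : x * block])

def block_map(matrix, block=BLOCK):
    """Zero/non-zero flag for every block, stored at [y - 1][x - 1]."""
    n_rows, n_cols = len(matrix), len(matrix[0])
    ys, xs = ceil(n_rows / block), ceil(n_cols / block)
    return [[block_is_nonzero(matrix, y, x, block) for x in range(1, xs + 1)]
            for y in range(1, ys + 1)]

# Hypothetical 16 x 52 matrix with a single non-zero element inside block S[1][2].
S = [[0] * 52 for _ in range(16)]
S[0][5] = 7
flags = block_map(S)
print(len(flags), len(flags[0]))  # number of block rows and block columns
print(flags[0][:4])               # flags of S[1][1] to S[1][4]
```

With these inputs the matrix divides into 4 block rows of 13 blocks each, and a single non-zero element is enough to mark its whole block as non-zero.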
  • Non-zero management is carried out in block row units.
  • Even in a case in which S is a matrix of three or more dimensions, management can be similarly carried out in one-dimensional block row units.
  • For example, non-zero management is carried out while using S[1][1] to S[1][13] as one block row B1.
  • Non-zero management is similarly carried out for each of the other block rows B2, B3, and B4.
  • FIG. 4 is a diagram for explaining a management method of block rows.
  • For each block row, a memory region corresponding to one block (hereinafter referred to as a non-zero management region) is added at the head of the block row.
  • A block position in the non-zero management is expressed as R[y][x], where x is the block number in the horizontal (column) direction and y is the block number in the vertical (row) direction.
  • For example, the block row B1 uses R[1][0] to R[1][13].
  • Non-zero data is disposed so that its position in the sparse matrix S and its position among the block positions R are the same.
  • For example, S[1][2], which is non-zero data in the sparse matrix S, is disposed at R[1][2], and S[1][6] is disposed at R[1][6].
  • Position information of the non-zero data (hereinafter referred to as non-zero management information) is disposed at the position of zero data. More specifically, the non-zero management information is disposed at the position of the zero data immediately before one or more continuous non-zero data. Note that, in the present embodiment, one piece of non-zero management information is recorded at the position of the zero data immediately before continuous non-zero data; however, the position information of a plurality of non-zero data may be disposed at the position of one zero data.
  • The non-zero management information consists of the parameters (Num, Next): Num represents the number of blocks of continuous non-zero data, and Next represents the distance (the number of blocks) to the next non-zero data.
  • A Next value of 0 indicates that no further continuous non-zero data is present in the block row.
  • The non-zero management information of the non-zero data column starting from S[1][2] of FIG. 4 is disposed at R[1][1], which is zero data.
  • Num is 2 since R[1][2] and R[1][3] are non-zero.
  • Since the next non-zero data column starts from R[1][6], the non-zero management information stored at R[1][1] becomes (2, 4).
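As a worked illustration of the (Num, Next) scheme, the following Python sketch derives the non-zero management information for one block row from its zero/non-zero flags. The function name `encode_block_row` and the sample flags are hypothetical. The sketch also assumes that Next is measured from the start of one non-zero data column to the start of the next, which matches the (2, 4) example in the text; it does not model how the non-zero management region at position 0 links to the first column.

```python
def encode_block_row(nonzero_flags):
    """Derive non-zero management information for one block row.

    nonzero_flags[i] is True when block column i + 1 is non-zero.
    Returns {position: (Num, Next)}, where position is the R-coordinate of the
    zero block (or management region 0) immediately before each non-zero run.
    """
    runs = []  # (start, length) of each run of consecutive non-zero blocks
    x, n = 1, len(nonzero_flags)
    while x <= n:
        if nonzero_flags[x - 1]:
            start = x
            while x <= n and nonzero_flags[x - 1]:
                x += 1
            runs.append((start, x - start))
        else:
            x += 1

    info = {}
    for i, (start, length) in enumerate(runs):
        # Assumption: Next is the distance between run starts; 0 means "no further run".
        nxt = runs[i + 1][0] - start if i + 1 < len(runs) else 0
        info[start - 1] = (length, nxt)
    return info

# Hypothetical block row of 13 blocks with non-zero blocks at columns 2, 3, and 6.
flags = [x in (2, 3, 6) for x in range(1, 14)]
info = encode_block_row(flags)
print(info)
```

Because two runs are always separated by at least one zero block, the slot immediately before a run is never occupied by data, which is why the management information can share the storage of the matrix itself.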
  • The matrix management engine 4 holds matrix management information for each managed matrix.
  • The matrix management information consists of a matrix base address (base), the number of columns of the matrix (width), and the number of rows of the matrix (height), and is set from outside (for example, by the CPU 2).
  • As an example, assume that the address space is 32 bits wide with byte addressing, one element of the sparse matrix S is 8-byte data, and the base address is 0x48000000.
  • In this case, the parameter values are 0x48000000 as the matrix base address (base), 52 as the number of columns of the matrix (width), and 16 as the number of rows of the matrix (height).
  • The matrix management engine 4 uses the memory space in accordance with these parameters.
  • FIG. 5 is a diagram for explaining an example of the used memory space.
  • The number of elements of one line, including a non-zero management region and the row data of the matrix, is 2^n (n is an integer of 1 or higher).
  • The matrix data are disposed sequentially, row by row, in the line direction.
  • A non-zero management region Q11 (corresponding to 4 elements) and row data Q12 of the matrix (corresponding to 52 elements) are disposed from the base address (0x48000000).
  • The width of the non-zero management region Q11 corresponds to the column width of one block, which is the zero/non-zero management unit.
  • A region corresponding to 8 elements is a data region Q13, which is not used.
  • The width Z of the unused data region Q13 is the minimum value that satisfies the following equation, where 4 is the width of the non-zero management region Q11 in elements: 4 + width + Z = 2^n.
  • In this example, the value of Z is 8, and the size of one line corresponds to 64 elements. Since the memory volume of one line is 512 bytes (64 elements × 8 bytes), the data of the second row starts from the address 0x48000200.
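The line-size arithmetic can be checked with a short Python sketch. It assumes, consistent with the figures quoted in the text, that one line holds the non-zero management region, the row data, and the smallest padding Z that brings the total to a power of two. The function names `line_layout` and `row_start` are hypothetical.

```python
def line_layout(width, block_cols=4, elem_bytes=8):
    """Return (Z, elements per line, bytes per line) for one matrix row.

    Assumption: a line is the management region (block_cols elements) plus
    `width` row elements, padded with Z unused elements up to a power of two.
    """
    used = block_cols + width
    line_elems = 2  # smallest 2**n with n >= 1
    while line_elems < used:
        line_elems *= 2
    return line_elems - used, line_elems, line_elems * elem_bytes

def row_start(row_index, base, line_bytes):
    """Start address of a row; row_index counts from 0."""
    return base + row_index * line_bytes

z, line_elems, line_bytes = line_layout(52)
print(z, line_elems, line_bytes)                  # 8 64 512
print(hex(row_start(1, 0x48000000, line_bytes)))  # 0x48000200
```

Padding each line to a power of two means a row index and an in-row position occupy fixed bit fields of the address, which is what allows positions to be extracted by shifting and masking.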
  • FIG. 6 is a diagram for explaining an example of a bit layout of a matrix address (address before translation).
  • FIG. 7 is a configuration diagram of the matrix management engine 4.
  • The matrix management engine 4 of FIG. 7 is provided with a Packet Distributer 21, Packet I/Fs 22a to 22d, Read Ctrls 23a to 23d, and Write Ctrls 24a to 24d.
  • Although the matrix management engine 4 of FIG. 7 has four Packet I/Fs 22a to 22d, the number thereof is arbitrary, and the matrix management engine 4 has Read Ctrls 23 and Write Ctrls 24 corresponding thereto.
  • The cache 5 consists of L2 caches 25a to 25d.
  • The address translation unit 6 consists of Address Translators 26a to 26d corresponding to the L2 caches 25a to 25d, respectively.
  • The L2 caches 25 may be general caches.
  • Input from the master module (the CPU 2 or the HWE 3) is input to Packet Distributer 21.
  • The input/output relation between the matrix management engine 4 and the master module is as described below. Requests from the master module are made in block units.
  • In a read request from the HWE 3, the start address of an access target matrix, an X-coordinate and a Y-coordinate of access start, the number of transfer columns, and the number of transfer rows are input to the matrix management engine 4.
  • These parameters are examples.
  • The parameters may instead be parameters such as an access start element address and the number of transfers. In that case, the other parameters are calculated in the matrix management engine 4 from the access start element address.
  • The matrix management engine 4 outputs continuous non-zero data columns as the following read data.
  • The read data include an X-coordinate and a Y-coordinate of the start of the non-zero data, the continuous non-zero data columns, and a non-zero data flag (ON). Note that, in a case in which all data are zero with no non-zero data column, the non-zero data flag becomes OFF, and neither the X-coordinate and the Y-coordinate of the start of the read non-zero data nor non-zero data columns are output.
  • In a write request from the HWE 3, the start address of an access target matrix, an X-coordinate and a Y-coordinate of access start, the number of transfer columns, the number of transfer rows, and a non-zero data flag are input to the matrix management engine 4.
  • In a case in which non-zero data are written, the non-zero data flag becomes ON.
  • The data input in the case in which non-zero data are written are, for each continuous non-zero data column, an X-coordinate and a Y-coordinate of the start of the non-zero data, the continuous non-zero data column, and a terminal flag of the continuous non-zero data column. A last non-zero data flag of the write request is attached to the end thereof.
  • The write request from the HWE 3 writes zero data to any location in the write request range for which no non-zero data column is specified.
  • In a case in which only zero data are written, the non-zero data flag becomes OFF, and matrix data are not input from the HWE 3 to the matrix management engine 4.
  • A read request from the CPU 2 is the same as a normal read request from a CPU to a memory.
  • The address and the read size of a read access target are input to the matrix management engine 4.
  • The matrix management engine 4 returns read data of the requested read size to the CPU 2.
  • The matrix management engine 4 processes the request as a read request whose size is the smallest integral multiple of the block size that is larger than the requested read size. Then, the matrix management engine 4 returns to the CPU 2 only the portion of the obtained read data that was actually requested. Moreover, in a case in which the read data include 0, the matrix management engine 4 returns 0 to the CPU 2.
  • A write request from the CPU 2 is the same as a normal write request from a CPU to a memory.
  • The address of a write access target, a write size, and write data are input to the matrix management engine 4.
  • The matrix management engine 4 stores the write data of the requested write size.
  • The matrix management engine 4 treats the request as a write request whose size is the smallest integral multiple of the block size that is larger than the write size, and carries out processing while regarding the data other than the given write data as 0.
  • The matrix management engine 4 holds matrix information and, when an access is input, judges whether it is an access to a managed matrix. In a case in which it is judged that the access is not to a managed matrix, the matrix management engine 4 accesses the L2 cache 25, etc. as a normal access.
  • In a case in which only non-zero data are output to a request source, Read Ctrl 23 outputs the non-zero data read from the L2 cache 25 and the position information thereof to the request source. On the other hand, in a case of output to the request source as normal matrix data, Read Ctrl 23 outputs, based on the non-zero data read from the L2 cache 25 and the position information thereof, normal matrix data into which zero data have been inserted.
  • Write Ctrl 24 translates the access into the following two requests and outputs the requests to the L2 cache 25. Specifically, Write Ctrl 24 translates the access into a request for writing the non-zero data and updating the position information, and a request for updating the position information for a change from non-zero to zero (no data is written).
  • Packet Distributer 21 checks whether a read/write request is an access to a matrix whose address is managed. Then, in the case of a managed matrix, Packet Distributer 21 carries out address translation, judges whether it is an access to the cache 5, and distributes the request to the Packet I/Fs 22a to 22d for each row. In a case in which it is not an access to a managed matrix, Packet Distributer 21 returns an error to the master module.
  • Packet Distributer 21 uses the matrix management parameters (base, width, height) and checks whether the address space of the memory includes the requested address.
  • Packet Distributer 21 carries out address translation and disposes, in continuous address space, the non-zero-managed blocks, which are disposed at addresses separated row by row in the original address space.
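The second output mode, in which zero data are inserted back around the non-zero data, can be sketched in Python as follows. The function name `expand_to_dense`, the representation of a non-zero data column as a `(start_x, blocks)` pair, and the string payloads are assumptions made for illustration only.

```python
def expand_to_dense(runs, n0, n1, zero="0"):
    """Rebuild normal matrix data for block columns n0..n1 (inclusive).

    runs is a list of (start_x, [block, ...]) pairs, one per continuous
    non-zero data column; every position not covered by a run is filled
    with the `zero` placeholder.
    """
    dense = [zero] * (n1 - n0 + 1)
    for start_x, blocks in runs:
        for i, block in enumerate(blocks):
            dense[start_x + i - n0] = block
    return dense

# Hypothetical read result for block columns 3 to 7 of one block row.
runs = [(3, ["S13"]), (6, ["S16"])]
dense = expand_to_dense(runs, 3, 7)
print(dense)  # ['S13', '0', '0', 'S16', '0']
```

A requester that understands the sparse format can consume `runs` directly and skip the zero blocks, while a requester expecting an ordinary matrix receives the expanded form.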
  • FIG. 8 is a diagram for explaining the process of the address translation.
  • In a case in which the management unit is a 4 × 4 block, the lower 2 bits of X and Y represent the position in the block. Therefore, Packet Distributer 21 inserts the lower 2 bits of Y of the address before translation into the lower side of X of the address after translation. Note that, also in a case in which the matrix has three or more dimensions, the bits representing the position in the block, except those of X, are moved to the lower side of X. The bit positions of X and Y can be calculated from width of the input parameters. Moreover, Packet Distributer 21 adds MatrixID, which represents a managed matrix, to the address after translation.
  • Packet Distributer 21 confirms an L2 bank number of the address before translation.
  • The banks, in other words, the L2 caches 25a to 25d, are switched for each block row of the non-zero management matrix. Therefore, the 2 bits of Y shown by shading in the address before translation represent the L2 bank number.
  • Packet Distributer 21 outputs a request to the one of the L2 caches 25a to 25d indicated by the value of the 2 bits. Note that the 2 bits representing the L2 bank number are not included in the address after translation.
  • Output parameters are translated from the input parameters and include a base address of an access target row, an access-start X-coordinate, and the number of accesses. Note that, only in the case of a write request, the parameters include a flag representing write of all 0.
  • Packet Distributer 21 translates the input write data into a start X-coordinate of the write non-zero data, continuous non-zero data columns, a terminal flag (Flag-tail) of the continuous non-zero data columns, and a non-zero data flag (Flag-end) at the end of the write request, and outputs them.
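The address translation can be sketched in Python for the example matrix. The exact bit layout is given only in FIG. 6 and FIG. 8, so everything below is an assumption: a 3-bit byte offset (8-byte elements), a 6-bit X field (64 elements per line), a 4-bit Y field (16 rows), the upper 2 bits of Y used as the L2 bank number, the lower 2 bits of Y placed between the block-number bits and the in-block bits of X, and MatrixID placed directly above the translated fields. The function name `translate` is hypothetical.

```python
ELEM_BITS = 3   # assumed: 8-byte elements
X_BITS = 6      # assumed: 64 elements per line
Y_BITS = 4      # assumed: 16 rows
LOCAL_BITS = ELEM_BITS + X_BITS + 2  # translated address keeps only the lower 2 bits of Y

def translate(addr, base=0x48000000, matrix_id=1):
    """Return (bank, translated address) under the assumed bit layout."""
    off = addr - base
    byte = off & ((1 << ELEM_BITS) - 1)
    x = (off >> ELEM_BITS) & ((1 << X_BITS) - 1)
    y = (off >> (ELEM_BITS + X_BITS)) & ((1 << Y_BITS) - 1)
    x_lo, x_hi = x & 0x3, x >> 2    # position in block / block number along X
    y_lo, bank = y & 0x3, y >> 2    # position in block / block row (L2 bank)
    local = ((((x_hi << 2) | y_lo) << 2) | x_lo) << ELEM_BITS | byte
    return bank, (matrix_id << LOCAL_BITS) | local

# All 16 elements of the block at rows 0..3, element columns 4..7 of the line.
base = 0x48000000
translated = sorted(translate(base + y * 512 + x * 8)[1]
                    for y in range(4) for x in range(4, 8))
print(translated[0], translated[-1], len(translated))
```

Under these assumptions the 16 elements of one block, which sit in four separate 512-byte lines before translation, land in one contiguous 128-byte region after translation, and the bank bits select the L2 cache without appearing in the translated address.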
  • FIG. 9 is a diagram for explaining an example of the write request.
  • A write target 35 corresponds to k blocks from a start address g.
  • The write data actually transferred are only the non-zero regions 36, 37, and 38.
  • In the last non-zero data of each of the non-zero regions 36, 37, and 38, the terminal flag (Flag-tail) of the continuous non-zero data columns becomes 1; and, in the last non-zero data of the non-zero region 38, the non-zero data flag (Flag-end) at the end of the write request becomes 1.
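The flagging of a write request can be sketched in Python as follows. The function name `packetize`, the dictionary-based packet format, and the sample data are hypothetical; only the meaning of Flag-tail and Flag-end follows the text. Zero blocks are represented by `None` and are not transferred.

```python
def packetize(first_x, blocks):
    """Turn a write target into packets that carry only non-zero blocks.

    blocks[i] is the payload of block column first_x + i, or None for zero data.
    start_x is set only on the first block of each continuous non-zero column.
    """
    packets = []
    prev_nonzero = False
    for i, data in enumerate(blocks):
        if data is None:
            prev_nonzero = False
            continue
        is_last_of_run = i + 1 == len(blocks) or blocks[i + 1] is None
        packets.append({
            "start_x": None if prev_nonzero else first_x + i,
            "data": data,
            "flag_tail": 1 if is_last_of_run else 0,  # end of a continuous column
            "flag_end": 0,
        })
        prev_nonzero = True
    if packets:
        packets[-1]["flag_end"] = 1  # last non-zero data of the whole request
    return packets

# Hypothetical write target of 10 blocks starting at column 3, with three non-zero regions.
packets = packetize(3, ["a", "b", None, "c", None, None, "d", "e", "f", None])
for p in packets:
    print(p)
```

The receiver can rebuild the positions of all non-zero blocks from `start_x` and the two flags alone, so the zero blocks in the write target never need to cross the interface.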
  • Based on the read/write requests from Packet Distributer 21, the Packet I/Fs 22a to 22d distribute input data to Read Ctrls 23a to 23d or Write Ctrls 24a to 24d.
  • Read Ctrls 23a to 23d respectively access L2 data management structures in the L2 caches 25a to 25d and output the non-zero data included in access ranges to the Packet I/Fs 22a to 22d.
  • Read Ctrls 23a to 23d may acquire non-zero data and non-zero management information directly from the main memory 8 without using the L2 caches 25a to 25d.
  • Output data include a start X-coordinate of read non-zero data, continuous non-zero data columns, a terminal flag (Flag-tail) of the continuous non-zero data columns, a non-zero data flag (Flag-end) at the end of the read request, and the non-zero data flag.
  • Read Ctrls 23a to 23d output only the non-zero data in the non-zero regions. Namely, the data columns of zero data are not output. In a case in which there is at least one piece of non-zero data, Read Ctrl 23 sets the non-zero data flag to 1. On the other hand, in a case in which not even one piece of non-zero data is included in the read request range, Read Ctrl 23 returns only the non-zero data flag set to 0.
  • The data output from Read Ctrls 23a to 23d are input to Packet Distributer 21 via the Packet I/Fs 22a to 22d.
  • Packet Distributer 21 outputs data in accordance with a read data I/F for the master module.
  • Write Ctrls 24a to 24d respectively access the L2 data management structures on the L2 caches 25a to 25d and update non-zero data and non-zero management information. Moreover, Write Ctrls 24a to 24d may keep non-zero data and non-zero management information directly in the main memory 8 without using the L2 caches 25a to 25d.
  • Address Translators 26a to 26d reference the matrix management information and carry out address translation. This address translation is the reverse of the translation carried out by Packet Distributer 21.
  • The data of one block disposed in one continuous region in the L2 caches 25a to 25d are divided into continuous regions of respective rows in the main memory 8 and accessed.
  • The L2 cache 25, Read Ctrl 23, or Write Ctrl 24 may directly access the main memory 8 without using the address translation by Address Translator 26.
  • FIG. 10 is a configuration diagram of Read Ctrl 23a.
  • Read Ctrl 23a has Info Checker 41, Read Requestor 42, Data Requestor 43, Data Output 44, and Read Data Receiver 45.
  • Info Checker 41 reads non-zero management information and checks its contents.
  • Info Checker 41 outputs the read request of the non-zero management information to Read Requestor 42.
  • Read Requestor 42 outputs a read request of a block, which includes the non-zero management information, to the L2 cache 25a.
  • Read data are read from the L2 cache 25a and input to Read Data Receiver 45.
  • Read Data Receiver 45 reads the non-zero management information from the read data and outputs it to Info Checker 41.
  • When Info Checker 41 detects a non-zero column of a read target in accordance with the non-zero management information, Info Checker 41 outputs the coordinates of the non-zero column and a read start request to Data Requestor 43. Moreover, Info Checker 41 outputs the start position of non-zero data to Data Output 44.
  • Data Requestor 43 outputs a read start coordinate of the non-zero column to Data Output 44 and outputs a read request of the non-zero block(s) included in the read region to Read Requestor 42. Then, in accordance with the read request of the non-zero block, Read Requestor 42 outputs a read request to the L2 cache 25a. In accordance with this read request, the read data are read from the L2 cache 25a and input to Data Output 44 via Read Data Receiver 45.
  • Data Output 44 outputs a header by using the start position of the non-zero data from Info Checker 41. Moreover, Data Output 44 outputs the read data, which are input from Read Data Receiver 45, to the Packet I/F 22a.
  • When the read request of the continuous non-zero blocks is finished, Data Requestor 43 outputs an end flag to Data Output 44. When the end flag is input, Data Output 44 outputs the terminal flag (Flag-tail) of the continuous non-zero data columns.
  • Info Checker 41 outputs a read request of non-zero management information again to Read Requestor 42, and the above-described operation is carried out.
  • When read in the target read region is finished, Info Checker 41 outputs an end signal to Data Output 44.
  • When the end signal is input, Data Output 44 outputs a non-zero data flag (Flag-end) at the end of the read request, and the read operation is finished.
  • FIG. 11 and FIG. 12 are flowcharts for explaining the process of reading data of S[m][n0] to S[m][n1].
  • Read Ctrl 23a judges whether X is beyond the read range (S8). In a case in which n1 < X is not satisfied (S8-NO), the next non-zero column may be included in the read range, and the process returns to S3.
  • Read Ctrl 23a outputs a header (S12).
  • This header is start position information of the non-zero data column, and the value of X is output as the header.
  • The non-zero block of S[Y][X] is read (S13), and the read block is output to a read request source (S14).
  • Read Ctrl 23a judges whether read of the non-zero data column has been finished (S16). In a case in which it is judged that non-zero data to be read still remain (S16-NO), that is, a case of Num > 0, Read Ctrl 23a judges whether X is in the read range (n0 ≤ X ≤ n1) (S17). In a case in which n0 ≤ X ≤ n1 is satisfied (S17-YES), the process returns to S13.
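The overall shape of the read process can be sketched in Python as a walk along the (Num, Next) chain of one block row. This is not a transcription of the flowcharts of FIG. 11 and FIG. 12, whose full step sequence is not reproduced here. The function name `read_range`, the dictionary-based storage, and the way the first piece of management information is located (passed in as `first_info_pos`) are assumptions, as is measuring Next between the starts of consecutive non-zero data columns.

```python
def read_range(info, data, first_info_pos, n0, n1):
    """Collect the non-zero blocks of one block row that fall within n0..n1.

    info: {position: (Num, Next)} non-zero management information.
    data: {position: block payload} for non-zero blocks.
    Returns a list of (header_x, [blocks]); an empty list corresponds to
    the non-zero data flag being 0.
    """
    out = []
    pos = first_info_pos
    while True:
        num, nxt = info[pos]
        start, end = pos + 1, pos + num           # inclusive bounds of this column
        lo, hi = max(start, n0), min(end, n1)     # clip to the read range
        if lo <= hi:
            out.append((lo, [data[x] for x in range(lo, hi + 1)]))
        if nxt == 0 or pos + nxt + 1 > n1:        # no next column, or it starts past n1
            break
        pos += nxt
    return out

# Hypothetical block row: non-zero blocks at columns 2, 3, and 6.
info = {1: (2, 4), 5: (1, 0)}
data = {2: "S12", 3: "S13", 6: "S16"}
result = read_range(info, data, 1, 3, 13)
print(result)  # [(3, ['S13']), (6, ['S16'])]
```

Each returned pair corresponds to one header (the start X) followed by the blocks of one continuous non-zero data column, and the walk stops as soon as the next column would begin beyond n1.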
  • FIG. 13 is a configuration diagram of Write Ctrl 24 a .
  • Write Ctrl 24 a has B-Searcher 51 , Read Requestor 52 , A-Searcher 53 , Read Data Receiver 54 , Write Data Receiver 55 , B-Updater 56 , Write Requestor 57 , Data Writer 58 , and Last-Info Updater 59 .
  • B-Searcher 51 searches for a non-zero data column B
  • B-Searcher 51 outputs a read request of non-zero management information to Read Requestor 52 .
  • Read Requestor 52 outputs a read request of a block including the non-zero management information to the L2 cache 25 a .
  • read data are read from the L2 cache 25 a and input to Read Data Receiver 54 .
  • Read Data Receiver 54 reads the non-zero management information from the read data and outputs the information to B-Searcher 51 .
  • B-Searcher 51 finishes the search for the non-zero data column B using the read non-zero management information
  • B-Searcher 51 outputs an operation start request to A-Searcher 53 together with the information of the non-zero data column
  • A-Searcher 53 searches for a non-zero data column A Note that the search for the non-zero data column A will be described later. As well as B-Searcher 51 , A-Searcher 53 gives a read request of non-zero management information to Read Requestor 52 and reads the non-zero management information from Read Data Receiver 54 .
  • A-Searcher 53 When the search for the non-zero data column A using the read non-zero management information is finished, A-Searcher 53 outputs a write request to B-Updater 56 together with the information of the non-zero data columns A and B.
  • B-Updater 56 carries out update of the non-zero management information of “the start position of the non-zero data column B” ⁇ 1.
  • the start position of the non-zero data column is input from Write Data Receiver 55 .
  • B-Updater 56 outputs a write request to Write Requestor 57
  • Write Requestor 57 outputs a write request of a corresponding block to the L2 cache 25 a.
  • Write data of a non-zero data block and the start position of a non-zero data column are input to Write Data Receiver 55 from Packet I/F 22 a .
  • Write Data Receiver 55 outputs the input write data of the non-zero data block and the start position of the non-zero data column to B-Updater 56 and Data Writer 58 .
  • After the update of the non-zero management information at “the start position of the non-zero data column B”−1 is finished, B-Updater 56 outputs an operation start request to Data Writer 58.
  • When the operation start request is input, Data Writer 58 carries out write of the write data. Like B-Updater 56, Data Writer 58 outputs a write request about write of non-zero data and non-zero management information to Write Requestor 57, and Write Requestor 57 outputs a write request of the corresponding block to the L2 cache 25 a. After the write of the write data is finished, Data Writer 58 outputs an operation start request to Last-Info Updater 59.
  • Last-Info Updater 59 carries out write of the last non-zero management information and write of the write data at the position n 1 (the last column of the write data, which will be described later). Like B-Updater 56, Last-Info Updater 59 outputs a write request about write of the non-zero management information to Write Requestor 57, and Write Requestor 57 outputs a write request of the corresponding block to the L2 cache 25 a.
  • FIG. 14 is a flowchart for explaining the process of writing data to the block column n 0 to the block column n 1 of the m-th block row.
  • Write Ctrl 24 a searches for non-zero data columns B and A (S 22 ).
  • the non-zero data column B is a non-zero data column at the top among continuous non-zero data columns started from a position smaller than the region of the column n 0 to the column n 1 .
  • the non-zero data column A is a non-zero data column at the ending among continuous non-zero data columns including data at a position larger than the region of the column n 0 to the column n 1 .
  • the start position of the non-zero data column B is assumed to be 1; and, in a case in which a non-zero data column that satisfies the condition of the non-zero data column A is not present, it is assumed that the non-zero data column A is not present.
  • Write Ctrl 24 a carries out update of the non-zero management information at the position of “the start position of the non-zero data column B”−1 and write of write data (S 23 ).
  • Write Ctrl 24 a carries out update of the last non-zero management information of the column n 0 to the column n 1 and the non-zero management information of the column n 1 (S 24 ) and finishes the process.
  • FIG. 15 is a flowchart for explaining the process of search for the non-zero data columns B and A.
  • Write Ctrl 24 a judges whether next non-zero data column is present (S 35 ). In a case in which it is judged that the next non-zero data column is present (S 35 -YES), namely, a case in which Next is not 0, Write Ctrl 24 a judges whether the next non-zero data column is the column n 0 or thereafter (S 36 ). In a case in which it is judged that a top non-zero block of the next non-zero data column is also at a position less than the column n 0 (S 36 -NO), Write Ctrl 24 a updates X to X+Next (S 37 ), and the process returns to S 33 .
  • Write Ctrl 24 a judges whether a non-zero data column is present (S 39 ). In a case in which it is judged that the next non-zero data column is present (S 39 -YES), Write Ctrl 24 a judges whether the end of the non-zero data column is after the column n 1 (S 40 ). In a case in which it is judged that the end of the non-zero data column is not after the column n 1 (S 40 -NO), Write Ctrl 24 a judges whether a next non-zero data column is present (S 41 ). In a case in which it is judged that the next non-zero data column is present (S 41 -YES), Write Ctrl 24 a updates X to X+Next (S 42 ).
  • Write Ctrl 24 a reads the non-zero management information of R[Y][X- 1 ] (S 43 ) and sets the values of the read non-zero management information as Num and Next (S 44 ). Then, the process returns to S 39 .
  • the non-zero data column A and the management information representing the non-zero data column A are detected.
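  • The search for the non-zero data columns B and A can be sketched as follows. This is only an illustration under stated assumptions: the function name `find_b_and_a`, the dictionary representation of the management information, and the Num value stored at position 6 are hypothetical, and the column A is read here as the first column, scanning from B, whose last block lies beyond the column n1.

```python
def find_b_and_a(info, n0, n1):
    """Return (b_start, a_start); a_start is None when no run extends past n1.

    info maps a block position p to (Num, Next) describing the run that
    starts at p + 1; position 0 is the extra management block of the row.
    """
    pos = 0
    # B: follow Next while the following run still starts before n0.
    while True:
        num, nxt = info[pos]
        if nxt == 0 or pos + 1 + nxt >= n0:
            break
        pos += nxt
    b_start = pos + 1  # defaults to 1 when no run starts before n0
    # A: continue from B until a run ends after n1.
    while True:
        num, nxt = info[pos]
        if num > 0 and pos + num > n1:
            return b_start, pos + 1
        if nxt == 0:
            return b_start, None
        pos += nxt


# Hypothetical management information for block row 2 (runs 1, 3-4, 7-8, 11-12).
row2 = {0: (1, 2), 2: (2, 4), 6: (2, 4), 10: (2, 0)}
result = find_b_and_a(row2, 4, 11)
```

  • With a write range of columns 4 to 11, the sketch reports column 3 as the start of B and column 11 as the start of A, which matches the positions quoted for Example 1.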
  • FIG. 16 is a flowchart for explaining update of the non-zero management information of the non-zero data column B.
  • FIG. 17 is a flowchart for explaining a write process of write data.
  • Write Ctrl 24 a judges whether the write data in a range are all 0, namely, whether the write data include non-zero data (S 51 ). In a case in which it is judged that the write data include non-zero data (S 51 -NO), Write Ctrl 24 a inputs a start position (q) of a non-zero data column.
  • Write Ctrl 24 a judges whether the write of the non-zero data column is started from the column n 0 (S 53 ). In a case in which it is judged that the write of the non-zero data column is not from the column n 0 (S 53 -NO), Write Ctrl 24 a judges whether the non-zero data column B is included in the columns n 0 to n 1 (S 54 ). Namely, in the process of S 54 , whether the length of the non-zero data column B is changed by zero-data write is checked.
  • Write Ctrl 24 a reduces Num of the non-zero data column B by the amount overlapped with the columns n 0 to n 1 (S 55 ) and changes Next of the non-zero data column B so that it specifies the start position q of the non-zero data column (S 56 ).
  • Write Ctrl 24 a judges whether the non-zero data column B and the non-zero data column W 0 are in contact or overlapped with each other (S 58 ). In a case in which it is judged that the non-zero data column B and the non-zero data column W 0 are not in contact or not overlapped with each other (S 58 -NO), the process proceeds to S 56 .
  • Write Ctrl 24 a sets the start position of the non-zero data column B as the start position of the non-zero data column W 0 (S 59 ).
  • Write Ctrl 24 a judges whether a next write non-zero data column (assumed to be W 1 ) is present (S 64 ). In a case in which it is judged that the next write non-zero data column W 1 is present (S 64 -YES), Write Ctrl 24 a inputs the start position (herein, assumed to be p) of the next non-zero data column W 1 (S 65 ).
  • Write Ctrl 24 a judges whether the elements of the non-zero data column B are included in the columns n 0 to n 1 (S 68 ). In a case in which it is judged that the non-zero data column B is not included in the columns n 0 to n 1 (S 68 -NO), the process proceeds to S 70 .
  • Write Ctrl 24 a reduces Num of the non-zero data column B by the amount overlapped with the columns n 0 to n 1 (S 69 ).
  • Write Ctrl 24 a judges whether the non-zero data column A is present (S 70 ). In a case in which it is judged that the non-zero data column A is not present (S 70 -NO), Write Ctrl 24 a changes Next of the non-zero data column B to 0 (S 71 ), and the process is finished. On the other hand, in a case in which it is judged that the non-zero data column A is present (S 70 -YES), Write Ctrl 24 a judges whether there is a location where the non-zero column of the non-zero data column A is changed to 0 by write (S 72 ).
  • FIG. 18 is a flowchart for explaining a process of updating the last non-zero management information and updating the non-zero management information of the column n 1 .
  • Write Ctrl 24 a judges whether 0 is written in all the write range (S 81 ). In a case in which it is judged that 0 is not written in all the write range (S 81 -NO), Write Ctrl 24 a judges whether the non-zero data column A is present (S 82 ). In a case in which it is judged that the non-zero data column A is present (S 82 -YES), Write Ctrl 24 a judges whether the last block of write is 0 (S 83 ).
  • Write Ctrl 24 a judges whether the non-zero data column A is present (S 92 ). In a case in which it is judged that the non-zero data column A is not present (S 92 -NO), the process is finished. On the other hand, in a case in which it is judged that the non-zero data column A is present (S 92 -YES), Write Ctrl 24 a judges whether there is the location where the non-zero column of the non-zero data column A is changed to 0 by the write (S 93 ).
  • FIG. 19 and FIG. 20 are diagrams showing data columns to be written and non-zero management information of the second row.
  • a reference sign 70 represents the data of the block row B 2 of the second row of the sparse matrix S before data are written.
  • In the write data denoted by reference signs 71 a to 71 h , hatched squares represent non-zero data, gray squares represent zero data, and white squares represent that write is not carried out therein.
  • Reference signs 72 a to 72 h represent the data of the block row B 2 after write data (reference signs 71 a to 71 h ) of Example 1 to Example 8 are written.
  • the search process of the non-zero data columns B and A of S 22 will be explained by using Example 1 of FIG. 19 .
  • In the search for the non-zero data column B, first, Write Ctrl 24 a reads the non-zero management information R[ 2 ][ 0 ] at the top. Write Ctrl 24 a obtains the start position of a next non-zero column from the read value (Next). In the case of Example 1, it is R[ 2 ][ 3 ].
  • Write Ctrl 24 a checks whether the obtained start position of the non-zero column is in the range of write data (R[ 2 ][ 4 ] to R[ 2 ][ 11 ]).
  • “the position of the read non-zero management information”+1 is the start position of the non-zero data column B.
  • next non-zero management information is read.
  • Since R[ 2 ][ 3 ] is not in the range of the write data, the next non-zero management information R[ 2 ][ 2 ] is read.
  • Write Ctrl 24 a carries out a similar process until the non-zero data column B is detected.
  • the start position of the next non-zero column becomes R[ 2 ][ 7 ]
  • the start position of the non-zero data column B becomes “the position of the non-zero management information R[ 2 ][ 2 ]”+1, namely, 3.
  • Write Ctrl 24 a checks whether the last position of the obtained non-zero data column is in the range of the write data (R[ 2 ][ 4 ] to R[ 2 ][ 11 ]). In a case in which the last position of the obtained non-zero data column is not in the range of the write data, “the position of the current non-zero management information”+1 becomes the start position of the non-zero data column A. In this case, since R[ 2 ][ 4 ] is in the range of the write data, Write Ctrl 24 a updates the position of the non-zero management information to the position ( 6 ) of the next non-zero management information R[ 2 ][ 6 ].
  • Write Ctrl 24 a carries out a similar process until the non-zero data column A is detected.
  • the non-zero data column A is not present.
  • the start position of the non-zero data column A becomes “the position of the non-zero management information R[ 2 ][ 10 ]”+1, namely, 11.
  • R[ 2 ][ 12 ] is in the range of the write data, and the next non-zero data column thereof is not present; therefore, the non-zero data column A is not present.
  • In Example 1 of FIG. 19 and Example 7 of FIG. 20, write is started from zero data (R[ 2 ][n 0 ] is 0, n 0 ≠q), and, as a result of write of the zero data, the length of the non-zero data column is reduced. Therefore, Num of the non-zero management information R[ 2 ][ 2 ] is changed from 2 to 1 (becomes n 0 -B). Moreover, Next of the non-zero management information R[ 2 ][ 2 ] is changed so as to specify the start position q of the non-zero data column C (4→2).
  • In Example 2 of FIG. 19, write is started from zero data (R[ 2 ][n 0 ] is 0, n 0 ≠q), and, as a result of write of the zero data, the length of the non-zero data column is not changed. Therefore, only Next of the non-zero management information R[ 2 ][ 2 ] is changed so as to specify the start position q of the non-zero data column C (4→5).
  • In Example 5 of FIG. 19, all write data are zero data, and, as a result of write of the zero data, the length of the non-zero data column is reduced. Therefore, Num of the non-zero management information R[ 2 ][ 2 ] is reduced from 2 to 1 (changed to n 0 -“the start position of B”). Furthermore, as a result of write of the zero data, the length of the non-zero data column A is reduced. Therefore, Next of the non-zero management information R[ 2 ][ 2 ] is changed so as to specify n 1 +1 (4→11).
  • In Example 6 of FIG. 20, all write data are zero data, and, as a result of write of the zero data, the length of the non-zero data column is not changed. Therefore, Num of the non-zero management information R[ 2 ][ 2 ] is not changed. Furthermore, as a result of write of the zero data, the length of the non-zero data column A is not changed. Therefore, Next of the non-zero management information R[ 2 ][ 2 ] is changed so as to specify the start position of the non-zero data column A (4→8).
  • In Example 8 of FIG. 20, all write data are zero data, and, as a result of write of the zero data, the length of the non-zero data column is reduced.
  • Num of the non-zero management information R[ 2 ][ 2 ] is reduced from 2 to 1 (changed to n 0 -“the start position of B”).
  • Next of the non-zero management information R[ 2 ][ 2 ] is changed to 0.
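  • The reduction of Num described in these examples (Num becomes n0 minus the start position of B when zero data overwrite the tail of the column B) can be written as a one-line rule. The helper name `trim_b_num` is hypothetical, and the sketch covers only the case in which the write begins with zero data at the column n0; the case in which written non-zero data touch or overlap the column B is not modeled.

```python
def trim_b_num(b_start, b_num, n0):
    """Number of blocks of run B that survive when zeros are written from column n0 on.

    b_start is the first block of B and b_num its current length (Num).
    B starts before n0 by definition, so the result is at least 1.
    """
    return min(b_num, max(0, n0 - b_start))


# B occupies columns 3-4 (Num = 2); zeros written from column 4 leave one block.
reduced = trim_b_num(3, 2, 4)
# Zeros written from column 5 do not touch B, so Num is unchanged.
unchanged = trim_b_num(3, 2, 5)
```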
  • A write process of write data will be explained by using Example 1 of FIG. 19.
  • the write of the write data is carried out in a case in which the write data include non-zero data.
  • Example 2 to Example 4 of FIG. 19 are also similar. However, in the case of Example 3, the non-zero management information updated first is at the position “the start position of the non-zero data column B” ⁇ 1 (R[ 2 ][ 2 ]).
  • R[ 2 ][ 11 ] is the start position of the non-zero data column A.
  • the non-zero data column A is not present.
  • the write range is the columns n 0 to n 1 , and the column number at the largest position in the non-zero data in the write data is e.
  • the non-zero management information R[ 2 ][ 11 ] of the column n 1 is changed so that Num is reduced by the amount of reduction from Num ( 2 ) of the non-zero data column A by the zero write (2→1) and Next becomes the same as Next of the non-zero data column A.
  • Write Ctrl 24 a updates the non-zero management information of the column n 1 (S 86 ) and finishes the process.
  • the matrix management engine 4 is configured to retain only the non-zero data in the L2 caches 25 a to 25 d and store, in the region of zero data, the non-zero management information representing the number of the continuous non-zero data and the distance to the next non-zero data. Moreover, when a read request is input, the matrix management engine 4 is configured to reference the non-zero management information and return only the non-zero data to the request source. As a result, the used amount of the L2 caches 25 a to 25 d is reduced by retaining only the non-zero data, and the band width is reduced by transferring only the non-zero data.
  • the memory used amount and the band width can be suppressed by retaining/managing only the non-zero data in the cache memories.
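  • How a read can return only the non-zero data by following the (Num, Next) chain may be sketched as follows. This is a simplified illustration and not the flow of FIG. 11 and FIG. 12: the function name `read_nonzero_runs`, the dictionary representation, and the Num value stored at position 6 are assumptions made for the example.

```python
def read_nonzero_runs(info, n0, n1):
    """Return (start, end) pairs of non-zero runs clipped to block columns n0..n1.

    info maps a block position p to (Num, Next) describing the run that
    starts at p + 1; Next == 0 means no further run exists in the block row.
    """
    out = []
    pos = 0
    while True:
        num, nxt = info[pos]
        start, end = pos + 1, pos + num
        if num > 0 and start <= n1 and end >= n0:
            out.append((max(start, n0), min(end, n1)))
        # Stop when the chain ends or the next run starts beyond the range.
        if nxt == 0 or pos + nxt + 1 > n1:
            break
        pos += nxt
    return out


# Hypothetical management information for one block row (runs 1, 3-4, 7-8, 11-12).
row = {0: (1, 2), 2: (2, 4), 6: (2, 4), 10: (2, 0)}
runs = read_nonzero_runs(row, 4, 11)
```

  • Only the management blocks on the chain are consulted, so zero blocks inside the requested range are never transferred.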

Abstract

An information processing device of an embodiment has an input unit, a storage unit, a read control unit, and a write control unit. A read request and a write request are input to the input unit. The storage unit stores management information. When the read request is input, the read control unit reads read data including the management information from the storage unit, references the management information, and outputs only non-zero data included in a predetermined range of a block row. The write control unit writes only non-zero data to the storage unit and updates the management information immediately before a start position of the continuous non-zero data started from a largest position in the continuous non-zero data started from a position smaller than the predetermined range, a last management information stored in the predetermined range, and the last management information in the predetermined range.

Description

    CROSS REFERENCE TO RELATED APPLICATIONS
  • This application is a continuation of U.S. application Ser. No. 14/484,093 filed on Sep. 11, 2014 and titled “INFORMATION PROCESSING DEVICE AND DATA STRUCTURE,” the entire contents of which are incorporated herein by reference, and which is based upon and claims the benefit of priority from the prior Japanese Patent Application No. 2014-050584 filed on Mar. 13, 2014, the entire contents of which are incorporated herein by reference.
  • FIELD
  • An embodiment described herein relates generally to an information processing device and a data structure.
  • BACKGROUND
  • Conventionally, for a sparse matrix in which most of the matrix elements are 0, there has been a demand to suppress the used amount and the band width of a memory by retaining only non-zero components in the memory. Today, the demand is met by using a sparse-matrix management library of software that manages only the values of the non-zero components and the position information thereof.
  • However, since these processes depend on the software, a large overhead is present upon access to the non-zero components. Moreover, upon access to the sparse matrix, massive time is taken if access according to data management methods of respective libraries is not used. Therefore, access cannot be made like that for a matrix formed by a normal two-dimensional layout, which is inconvenient. Conventionally, there has been hardware that carries out management so as to retain only non-zero components in a DRAM; however, there is a problem that processing upon rewrite is complex and has a large overhead.
  • On the other hand, generally, the processing speed of a processor or a hardware engine is higher than the data supply ability of a main memory such as a DRAM; therefore, a cache memory which compensates for the performance difference thereof is used in some cases. The cache memory is a memory such as a SRAM which exhibits a higher speed than the main memory, and the cache memory temporarily stores data in a data array. The processor can carry out high-speed processing by accessing the data in the cache memory.
  • If there are no data in the data array, the cache memory acquires data from the main memory in the unit of a cache line size (for example, 256 bytes) larger than an accessed data size. By accessing the main memory in the large unit, efficiency of the access to the main memory is improved. On the other hand, in a case in which data are in the data array, the cache memory can return data from the data array without acquisition of data from the main memory; therefore, the processor or the hardware engine can access the data at high speed.
  • BRIEF DESCRIPTION OF THE DRAWINGS
  • FIG. 1 is a diagram showing a configuration of a computer system provided with an information processing device according to the present embodiment;
  • FIG. 2 is a diagram for explaining a matrix to be processed in the present embodiment;
  • FIG. 3 is a diagram for explaining a processing unit of the matrix;
  • FIG. 4 is a diagram for explaining a management method of block rows;
  • FIG. 5 is a diagram for explaining an example of used memory space;
  • FIG. 6 is a diagram for explaining an example of a bit layout in matrix addresses;
  • FIG. 7 is a diagram for explaining a detailed configuration of a matrix management engine 4;
  • FIG. 8 is a diagram for explaining a process of address translation;
  • FIG. 9 is a diagram for explaining an example of a write request;
  • FIG. 10 is a diagram for explaining a detailed configuration of Read Ctrl 23 a;
  • FIG. 11 is a flowchart for explaining an example of the flow of a process of reading data of S[m][n0] to S[m][n1];
  • FIG. 12 is a flowchart for explaining an example of the flow of a process of reading data of S[m][n0] to S[m][n1];
  • FIG. 13 is a diagram for explaining a detailed configuration of Write Ctrl 24 a;
  • FIG. 14 is a flowchart for explaining an example of the flow of a process of writing data to a block column n0 to a block column n1 of an m-th row;
  • FIG. 15 is a flowchart for explaining an example of the flow of a process of searching for non-zero data columns B and A;
  • FIG. 16 is a flowchart for explaining an example of the flow of a process of updating non-zero management information at the position of “the start position of the non-zero data column B”−1 and writing write data;
  • FIG. 17 is a flowchart for explaining an example of the flow of the process of updating the non-zero management information at the position of “the start position of the non-zero data column B”−1 and writing write data;
  • FIG. 18 is a flowchart for explaining an example of the flow of the process of updating last non-zero management information and updating non-zero management information of the column n1;
  • FIG. 19 is a diagram for explaining a write operation by Write Ctrl 24 a; and
  • FIG. 20 is a diagram for explaining a write operation by Write Ctrl 24 a.
  • DETAILED DESCRIPTION
  • An information processing device of an embodiment has an input unit, a storage unit, a read control unit, and a write control unit. A read request and a write request with respect to a predetermined range of a block row provided with at least one or more blocks consisting of one or more elements are input to the input unit. The storage unit stores, in the region of zero data having all the elements in one block being zero, management information, which stores information representing the number of continuous non-zero data having one or more non-zero elements in one block and a distance to next non-zero data. When the read request is input, the read control unit reads read data including the management information from the storage unit, references the management information, and outputs only non-zero data included in a predetermined range of a block row. The write control unit writes only non-zero data, which has one or more non-zero elements in one block in the data, to the storage unit and updates the management information immediately before a start position of the continuous non-zero data started from a largest position in the continuous non-zero data started from a position smaller than the predetermined range, a last management information stored in the predetermined range, and the last management information in the predetermined range.
  • Hereinafter, an embodiment of the present invention will be explained in detail with reference to drawings.
  • A computer system provided with an information processing device according to the present embodiment will be explained. FIG. 1 is a configuration diagram of the computer system provided with the information processing device according to the present embodiment. The computer system 1 consists of a central processing unit (hereinafter, referred to as CPU) 2, a hardware engine (hereinafter, referred to as HWE) 3, a matrix management engine 4, a cache 5, an address translation unit 6, an interconnect 7, a main memory 8, and an input/output device (hereinafter, referred to as I/O) 9.
  • The matrix management engine 4 serving as the information processing device is connected to the CPU 2, the HWE 3, and the cache 5. The cache 5 is connected to the interconnect 7 via the address translation unit 6. The interconnect 7 is further connected to the main memory 8 and the I/O 9. The main memory 8 is, for example, a DRAM.
  • Input data to the computer system 1 is transferred to the main memory 8 via the I/O 9 and the interconnect 7. The transferred input data is transferred to and processed by the CPU 2 or the HWE 3. Output data processed by the CPU 2 or the HWE 3 is output via the main memory 8, the interconnect 7, and the I/O 9.
  • In a case in which data other than that of a matrix is accessed, the computer system 1 directly accesses the main memory 8 (or the cache 5) from the CPU 2 or the HWE 3 without the intermediation of the matrix management engine 4. On the other hand, in a case in which a matrix is accessed, the matrix management engine 4 carries out processing.
  • An example of a matrix processed in the present embodiment will be explained. FIG. 2 is a diagram for explaining a matrix processed in the present embodiment, and FIG. 3 is a diagram for explaining processing units of the matrix.
  • The matrix of FIG. 2 is a two-dimensional sparse matrix S consisting of 832 elements of 16 rows and 52 columns. Moreover, in the present embodiment, zeros/non-zeros are managed by a predetermined size of the sparse matrix S. The predetermined size is one or more elements and is, for example, a 16-element unit of 4×4 as shown in FIG. 3. This unit is referred to as a block. The block of 4×4 in the sparse matrix S is expressed as S[y][x] (x=1 to 13, y=1 to 4). The block number in the horizontal (column) direction is represented by x, and x is in the range of 1 to the number obtained by dividing the number of the columns of the sparse matrix S by 4 (rounded up). The block number in the vertical (row) direction is y, and y is in the range of 1 to the number obtained by dividing the number of the rows of the sparse matrix S by 4 (rounded up).
  • Herein, as the definitions of the zero/non-zero of the block unit, a case in which all the elements of one block are 0 is zero, and a case in which one block includes at least one element that is not 0 is non-zero. For example, S[1][1] and S[1][4] are zero since all the 16 elements therein are 0. On the other hand, a case in which at least one of 16 elements is not 0 like S[1][2] and a case in which all of 16 elements are not 0 like S[1][3] are non-zero.
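  • The zero/non-zero definition of the block unit can be illustrated with a short sketch. The function name `block_flags` and the demonstration matrix are hypothetical, and indices in the code are 0-based, whereas the text numbers blocks from 1.

```python
def block_flags(matrix, block=4):
    """Return flags[y][x] (0-based), True when the block holds any non-zero element."""
    rows, cols = len(matrix), len(matrix[0])
    n_by = -(-rows // block)  # number of block rows, rounded up
    n_bx = -(-cols // block)  # number of block columns, rounded up
    flags = [[False] * n_bx for _ in range(n_by)]
    for r in range(rows):
        for c in range(cols):
            if matrix[r][c] != 0:
                flags[r // block][c // block] = True
    return flags


# Hypothetical 4x8 matrix: the left block is all zero, the right block
# contains a single non-zero element and is therefore non-zero.
demo = [[0] * 8 for _ in range(4)]
demo[2][5] = 7
flags = block_flags(demo)

# A 16x52 matrix of zeros yields 4 block rows of 13 blocks each.
empty = block_flags([[0] * 52 for _ in range(16)])
```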
  • In the present embodiment, non-zero management is carried out in a block row unit. Note that even if the sparse matrix S is a matrix of three or more dimensions, management can be similarly carried out by carrying out management in a one-dimensional block row unit. In the present embodiment, as shown in FIG. 3, non-zero management is carried out while using S[1][1] to S[1][13] as one block row B1. Non-zero management is similarly carried out for each of the other block rows B2, B3, and B4.
  • FIG. 4 is a diagram for explaining a management method of block rows. In non-zero management of a block row, a memory region corresponding to one block (hereinafter, referred to as a non-zero management region) different from that for matrix data is used and disposed at a top of the block row. Hereinafter, a block position in the non-zero management is expressed as R[y][x]. The block number in the horizontal (column) direction is x, and the block number in the vertical (row) direction is y. FIG. 4 shows R[y][x] (x=0 to 13, y=1 to 4). The block row B1 uses R[1][0] to R[1][13].
  • Non-zero data is disposed so that the position thereof in the sparse matrix S and the position thereof among the block positions R are the same. For example, S[1][2] which is non-zero data in the sparse matrix S is disposed at R[1][2], and S[1][6] is disposed at R[1][6].
  • Then, position information of the non-zero data (hereinafter, referred to as non-zero management information) is disposed at the position of zero data. More specifically, the non-zero management information is disposed at the position of the zero data immediately before one or more continuous non-zero data. Note that, in the present embodiment, one piece of non-zero management information is recorded at the position of the zero data immediately before continuous non-zero data, wherein the position information of a plurality of non-zero data may be configured to be disposed at the position of one zero data.
  • The non-zero management information consists of parameters (Num, Next). The number of blocks of continuous non-zero data is represented by Num, and the distance (the number of blocks) to next non-zero data is represented by Next. However, a case in which Next is 0 represents that next continuous non-zero data is not present in the block row. Note that, although the non-zero management information uses relative distances in FIG. 4, absolute coordinates (block positions x) may be used.
  • For example, the non-zero management information of the column of the non-zero data starting from S[1][2] of FIG. 4 is disposed at R[1][1], which is zero data. Num is 2 since R[1][2] and R[1][3] are non-zero. Moreover, since the column of next non-zero data is started from R[1][6], Next becomes 4 (=6-2). Thus, the non-zero management information stored at R[1][1] becomes (2, 4).
  • At the non-zero management region R[1][0] additionally ensured by the amount corresponding to one block column, the non-zero management information about the data from the row head R[1][1] is recorded. For example, since the row head R[1][1] is zero data in the block row B1, Num of R[1][0] becomes 0, and Next becomes the distance 1 (=2-1) to the non-zero data starting from R[1][2].
  • On the other hand, since R[2][1] is non-zero data in the block row B2, Num of R[2][0] becomes the number of the non-zero data from R[2][1]. Namely, Num of the non-zero management region R[2][0] of the block row B2 becomes 1, and Next becomes the distance 2 (=3-1) to the next non-zero data.
  • Moreover, in a case in which all of block rows are non-zero like the block row B3, the data of a non-zero management region R[3][0] becomes (13, 0). On the other hand, in a case in which all of block rows are zero like the block row B4, the data of a non-zero management region R[4][0] becomes (0, 0).
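  • The layout rules above can be checked with a small encoder that derives the (Num, Next) entries of one block row from its zero/non-zero flags. This is a sketch under assumptions: the function name `encode_block_row` and the dictionary representation are hypothetical, relative distances are used as in FIG. 4, and the sample rows contain only the blocks named in the text (the remaining blocks are assumed to be zero).

```python
def encode_block_row(nonzero):
    """nonzero[i] is True when block i + 1 of the row is non-zero.

    Returns {position: (Num, Next)}; position 0 is the extra management
    block R[y][0], and position p > 0 is the zero block just before a run.
    """
    n = len(nonzero)
    runs = []  # (start, length) of each run of consecutive non-zero blocks
    x = 1
    while x <= n:
        if nonzero[x - 1]:
            start = x
            while x <= n and nonzero[x - 1]:
                x += 1
            runs.append((start, x - start))
        else:
            x += 1
    # The head entry describes a run that starts at block 1, if there is one.
    if runs and runs[0][0] == 1:
        head_num, rest = runs[0][1], runs[1:]
    else:
        head_num, rest = 0, runs
    info = {0: (head_num, rest[0][0] - 1 if rest else 0)}
    for i, (start, length) in enumerate(rest):
        nxt = rest[i + 1][0] - start if i + 1 < len(rest) else 0
        info[start - 1] = (length, nxt)
    return info


def row_with(nonzero_blocks, width=13):
    """Build a flag list with the given 1-based block numbers set to non-zero."""
    return [(i + 1) in nonzero_blocks for i in range(width)]


b1 = encode_block_row(row_with({2, 3, 6}))       # like block row B1
b2 = encode_block_row(row_with({1, 3, 4}))       # row head is non-zero, like B2
b3 = encode_block_row(row_with(set(range(1, 14))))  # all non-zero, like B3
b4 = encode_block_row(row_with(set()))           # all zero, like B4
```

  • The encoder reproduces the values quoted in the text: (0, 1) at R[1][0], (2, 4) at R[1][1], (1, 2) at R[2][0], (13, 0) for an entirely non-zero row, and (0, 0) for an entirely zero row.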
  • The matrix management engine 4 has matrix management information for each of matrices managed. The matrix management information is a matrix base address (base), the number of columns of the matrix (width), and the number of rows of the matrix (height) and is set from outside (for example, the CPU 2). Herein, it is assumed that address space has a 32-bit width in a byte address, one element of the sparse matrix S is 8-byte data, and the base address thereof is 0x48000000. In this case, set values of parameters become 0x48000000 as the matrix base address (base), 52 as the number of columns of the matrix (width), and 16 as the number of rows of the matrix (height). The matrix management engine 4 uses memory space by using these parameters.
  • FIG. 5 is a diagram for explaining an example of the used memory space. The number of elements of one line is 2^n (n is an integer 1 or higher) including a non-zero management region and row data of the matrix. The matrix data are sequentially disposed one row by one row in a line direction. In the data of a first row, a non-zero management region Q11 (corresponding to 4 elements) and row data Q12 of the matrix (corresponding to 52 elements) are disposed from Base Address (0x48000000). The width of the non-zero management region Q11 corresponds to the width of the columns of one block, which is a zero/non-zero management unit.
  • Note that a region corresponding to 8 elements is a data region Q13, which is not used. The width Z of the unused data region Q13 is the minimum value that satisfies the equation below.

  • (Z+[the width of the management region Q11]+[the number of elements of the row data Q12])=2^n (n is a positive integer, n>=1), and Z>=0
  • In FIG. 5, the value of Z is 8, and the size of one line corresponds to 64 elements. Since a memory volume of one line is 512 bytes (64 elements×8 bytes), the data of a second row is started from an address 0x48000200.
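  • The padding rule can be verified numerically. The sketch below assumes the line size is the smallest power of two that holds the management region plus the row data; the function name `line_layout` is hypothetical.

```python
def line_layout(mgmt_elems, row_elems, elem_bytes=8):
    """Return (Z, elements per line, bytes per line) for one matrix row."""
    used = mgmt_elems + row_elems
    line = 2  # 2^n with n >= 1
    while line < used:
        line *= 2
    return line - used, line, line * elem_bytes


# Values of FIG. 5: 4 management elements, 52 row elements, 8-byte elements.
z, line_elems, line_bytes = line_layout(4, 52)
second_row_address = 0x48000000 + line_bytes
```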
  • FIG. 6 is a diagram for explaining an example of a bit layout of a matrix address (address before translation). The matrix address includes Base Address, a Y-coordinate, an X-coordinate, and Offset used when the data (8 bytes) of one element are accessed in a byte unit. If an element of the sparse matrix S is denoted s(y,x) (x=1 to 52, y=1 to 16), the element is disposed at the address given by X=x+3 and Y=y−1. For example, the address of s(1,1) is 0x48000020 since X=4 and Y=0, and the address of s(5,25) is 0x480008e0 since X=28 and Y=4.
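  • The two example addresses can be reproduced with simple arithmetic. The sketch assumes that the address is the base plus Y lines of 512 bytes plus X elements of 8 bytes, which is consistent with the values in the text; the exact bit fields of FIG. 6 are not modeled, and the function name `element_address` is hypothetical.

```python
def element_address(base, y, x, line_bytes=512, elem_bytes=8, mgmt_elems=4):
    """Byte address of element s(y, x), with y and x counted from 1."""
    big_x = x + mgmt_elems - 1  # X = x + 3: skip the 4-element management region
    big_y = y - 1               # Y = y - 1: lines are counted from 0
    return base + big_y * line_bytes + big_x * elem_bytes


first = element_address(0x48000000, 1, 1)
other = element_address(0x48000000, 5, 25)
```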
  • Next, the matrix management engine 4 will be explained. FIG. 7 is a configuration diagram of the matrix management engine 4. The matrix management engine 4 of FIG. 7 is provided with a Packet Distributer 21, Packet I/Fs 22 a to 22 d, Read Ctrls 23 a to 23 d, and Write Ctrls 24 a to 24 d. Note that the matrix management engine 4 of FIG. 7 has four Packet I/Fs 22 a to 22 d, wherein the number thereof is arbitrary, and the matrix management engine 4 has Read Ctrls 23 and Write Ctrls 24 to correspond thereto. Moreover, the cache 5 consists of L2 caches 25 a to 25 d. Furthermore, the address translation unit 6 consists of Address Translators 26 a to 26 d corresponding to the L2 caches 25 a to 25 d, respectively. Note that the L2 caches 25 may be general caches.
  • Input from the master module (the CPU 2 or the HWE 3) is input to Packet Distributer 21. The relation of input/output between the matrix management engine 4 and the master module is as described below. Requests from the master module are carried out in block units.
  • In a case of a read request from the HWE 3, the start address of an access target matrix, an X-coordinate and a Y-coordinate of access start, the number of transfer columns, and the number of transfer rows are input to the matrix management engine 4. Note that these parameters are examples. For example, in a case of access in one row, the parameters may be parameters such as an access start element address and the number of transfers. In that case, according to the access start element address, the other parameters are calculated in the matrix management engine 4.
  • With respect to a read request from the HWE 3, in a case in which non-zero data are in a target read region, the matrix management engine 4 outputs continuous non-zero data columns as below read data. The read data include an X-coordinate and a Y-coordinate of the start of the non-zero data, the continuous non-zero data columns, and a non-zero data flag (ON). Note that in a case in which all data are zero with no non-zero data column, the non-zero data flag becomes OFF, and nothing is output as the X-coordinate and the Y-coordinate of the start of read non-zero data and non-zero data columns.
  • Moreover, in a case of a write request from the HWE 3, the start address of an access target matrix, an X-coordinate and a Y-coordinate of access start, the number of transfer columns, the number of transfer rows, and a non-zero data flag are input to the matrix management engine 4.
  • In a case in which non-zero data are written, the non-zero data flag becomes ON. Data input in the case in which the non-zero data are written are an X-coordinate and a Y-coordinate of the start of non-zero data for every continuous non-zero data columns, the continuous non-zero data columns, and a terminal flag of the continuous non-zero data columns. Then, a last non-zero data flag of the write request is attached to the ending thereof.
  • In a write request from the HWE 3, zero data are written to every location in the write request range for which no non-zero data column is specified. In a case in which zero data are written all in the request range, the non-zero data flag becomes OFF, and matrix data are not input from the HWE 3 to the matrix management engine 4.
  • On the other hand, a read request from the CPU 2 is the same as a normal read request from a CPU to a memory. The address and the read size of a read access target are input to the matrix management engine 4.
  • With respect to the read request from the CPU 2, the matrix management engine 4 returns read data of the requested read size to the CPU 2. In a case in which the requested read size is smaller than a block size or is not an integral multiple of a block size, the matrix management engine 4 processes the request as a read request of a size that is larger than the requested read size and is minimum among block sizes of the integral multiples. Then, the matrix management engine 4 returns only the read size actually requested from the obtained read data to the CPU 2. Moreover, in a case in which the read data include 0, the matrix management engine 4 returns 0 to the CPU 2.
  • Moreover, a write request from the CPU 2 is the same as a normal write request from a CPU to a memory. The address of a write access target, a write size, and write data are input to the matrix management engine 4.
  • With respect to the write request from the CPU 2, the matrix management engine 4 stores the write data of the requested write size. In a case in which the write size is smaller than a block size or is not an integral multiple of the block size, the matrix management engine 4 carries out processing while considering that the data other than the given write data are 0, wherein the request serves as a write request of a size that is larger than the write size and is minimum among the block sizes of the integral multiples.
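  • A minimal sketch of the size handling for CPU requests described above is given below. It assumes, for simplicity, that the request starts on a block boundary; the function names and the block size used in the example are hypothetical.

```python
def round_up_to_blocks(size: int, block_size: int) -> int:
    """Smallest integral multiple of block_size that is not smaller than size."""
    return -(-size // block_size) * block_size


def pad_write(data: bytes, block_size: int) -> bytes:
    """Treat the data other than the given write data as 0 up to the block multiple."""
    padded = round_up_to_blocks(len(data), block_size)
    return data + bytes(padded - len(data))


def trim_read(block_data: bytes, size: int) -> bytes:
    """Return only the read size actually requested from the obtained read data."""
    return block_data[:size]


# Example with an assumed block size of 128 bytes.
print(round_up_to_blocks(100, 128), round_up_to_blocks(129, 128))  # 128 256
```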
  • The matrix management engine 4 has matrix information therein and, when access is input, judges whether it is access to a management matrix. In a case in which it is judged that the access is not to the management matrix, the matrix management engine 4 accesses the L2 cache 25, etc. as normal access.
  • In a case in which only non-zero data are output to a request source, Read Ctrl 23 outputs the non-zero data read from the L2 cache 25 and the position information thereof to the request source. On the other hand, in a case of output to the request source as normal matrix data, Read Ctrl 23 outputs the normal matrix data into which zero data have been inserted to the request source based on the non-zero data read from the L2 cache 25 and the position information thereof.
  • In both of the case in which only the non-zero data and the position information are input and the case in which the normal matrix data are input, Write Ctrl 24 translates the access to below two requests and outputs the requests to the L2 cache 25. Specifically, Write Ctrl 24 translates the access to the request for carrying out write of the non-zero data and update of the position information and to the request for carrying out update of the position information by translation from non-zero to zero (write of data is not carried out).
  • With respect to a request from the master module, Packet Distributer 21 checks whether it is access to a read/write-requested matrix, the address of which is managed. Then, in the case of the managed matrix, Packet Distributer 21 carries out address translation, judges whether it is access to the cache 5, and distributes the request to the Packet I/Fs 22 a to 22 d for each row. In a case in which it is not the access to the managed matrix, Packet Distributer 21 returns an error to the master module.
  • In order to check whether it is the access to the managed matrix, Packet Distributer 21 uses the matrix management parameters (base, width, height) and checks whether the address space of the memory includes the requested address.
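  • The check can be pictured with the following sketch. It assumes that width is the number of row-data elements, that height is the number of rows, and that each line is padded to a power of two as in FIG. 5; the class MatrixInfo and its fields are hypothetical and are not taken from the embodiment.

```python
from dataclasses import dataclass


@dataclass
class MatrixInfo:
    base: int               # base address of the managed matrix
    width: int              # row-data elements per row (assumed meaning)
    height: int             # number of rows (assumed meaning)
    mgmt_elems: int = 4     # non-zero management region per row (FIG. 5)
    element_bytes: int = 8  # bytes per element

    def line_bytes(self) -> int:
        """Bytes of one line after padding to a power of two."""
        used = self.mgmt_elems + self.width
        line = 2
        while line < used:
            line *= 2
        return line * self.element_bytes

    def contains(self, addr: int) -> bool:
        """True if addr lies in the address space occupied by the matrix."""
        return self.base <= addr < self.base + self.height * self.line_bytes()


m = MatrixInfo(base=0x48000000, width=52, height=16)
print(m.contains(0x480008E0), m.contains(0x48002000))  # True False
```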
  • In a case in which it is judged to be the access to the managed matrix, Packet Distributer 21 carries out address translation and disposes non-zero-managed blocks, which have been disposed at addresses away from each other by each row in the original address space, in continuous address space.
  • Herein, FIG. 8 is a diagram for explaining the process of the address translation. In a case in which the management unit is a 4×4 block, lower 2 bits of X and Y represent the position in the block. Therefore, Packet Distributer 21 inserts the lower 2 bits of Y of the address before translation into the lower side of X of the address after translation. Note that also in a case in which the matrix has three or more dimensions, the bit(s) representing the position in the block except that of X is moved to the lower side of X. The bit positions of X and Y can be calculated from width of the input parameters. Moreover, Packet Distributer 21 adds MatrixID, which represents a managed matrix, to the address after translation.
  • Moreover, Packet Distributer 21 confirms an L2 bank number of the address before translation. In the present embodiment, banks, in other words, the L2 caches 25 a to 25 d are switched for each block row of the non-zero management matrix. Therefore, the 2 bits at the position shown by shading of Y of the address before translation represent the L2 bank number. Packet Distributer 21 outputs a request to any of the L2 caches 25 a to 25 d represented by the value of the 2 bits. Note that the 2 bits representing the L2 bank number are not included in the address after translation. Output parameters are translated from the input parameters and include a base address of an access target row, an access-start X-coordinate, and the number of accesses. Note that only in a case of a write request, the parameters include a flag representing write of all 0.
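  • The bit rearrangement can be sketched as follows for a 4×4 block and four banks. The exact bit positions of FIG. 8 are not reproduced here; the sketch assumes that the bank bits are the 2 bits of Y immediately above the in-block bits, works on addresses relative to Base Address, and omits MatrixID. The function names and the parameter x_bits (number of X bits, 6 for a 64-element line) are hypothetical. The inverse function is included only to check that the rearrangement loses no information.

```python
OFFSET_BITS = 3  # byte offset inside an 8-byte element
BLOCK_BITS = 2   # lower 2 bits of X and of Y give the position in a 4x4 block
BANK_BITS = 2    # 2 bits of Y select one of four L2 banks (assumed position)


def translate(y: int, x: int, offset: int, x_bits: int):
    """Move the lower 2 bits of Y below X and split off the bank number."""
    y_lo = y & 0b11
    bank = (y >> BLOCK_BITS) & 0b11
    y_hi = y >> (BLOCK_BITS + BANK_BITS)
    addr = (y_hi << x_bits) | x
    addr = (addr << BLOCK_BITS) | y_lo
    addr = (addr << OFFSET_BITS) | offset
    return bank, addr


def reverse_translate(bank: int, addr: int, x_bits: int):
    """Inverse of translate(): recover (y, x, offset)."""
    offset = addr & 0b111
    y_lo = (addr >> OFFSET_BITS) & 0b11
    x = (addr >> (OFFSET_BITS + BLOCK_BITS)) & ((1 << x_bits) - 1)
    y_hi = addr >> (OFFSET_BITS + BLOCK_BITS + x_bits)
    y = (y_hi << (BLOCK_BITS + BANK_BITS)) | (bank << BLOCK_BITS) | y_lo
    return y, x, offset


# The 16 elements of one 4x4 block land in one continuous 128-byte region.
slots = sorted(translate(y, x, 0, 6)[1] for y in range(4) for x in range(4, 8))
print(slots[0], slots[-1])  # 128 248
```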
  • In the case of the write request, write data are separately input, and Packet Distributer 21 carries out translation from the input write data to a start X-coordinate of write non-zero data, continuous non-zero data columns, a terminal flag (Flag-tail) of the continuous non-zero data columns, and a non-zero data flag (Flag-end) in the end of the write request and outputs them.
  • FIG. 9 is a diagram for explaining an example of the write request. A write target 35 corresponds to k blocks from a start address g. However, actually transferred write data are only non-zero regions 36, 37, and 38. Regions 39 other than that are handled as requests for writing 0. In the non-zero data at the end of the non-zero regions 36, 37, and 38, the terminal flags (Flag-tail) of the continuous non-zero data columns become 1; and, in the non-zero data in the end of the non-zero region 38, the non-zero data flag (Flag-end) in the end of the write request becomes 1.
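  • The flagging of FIG. 9 can be illustrated with the following sketch, in which a write target is given as a list of blocks and None marks an all-zero block that is not transferred. The tuple layout (start X or None, block data, Flag-tail, Flag-end) and the function names are assumptions made for this illustration only.

```python
def split_runs(blocks, start):
    """Group consecutive non-zero blocks into (start_x, [blocks]) runs."""
    runs = []
    for i, blk in enumerate(blocks):
        if blk is None:
            continue  # zero block: nothing is transferred
        if i > 0 and blocks[i - 1] is not None:
            runs[-1][1].append(blk)        # continues the current run
        else:
            runs.append((start + i, [blk]))  # a new run begins here
    return runs


def encode_write(blocks, start):
    """Return (non_zero_flag, packets) for one write request."""
    runs = split_runs(blocks, start)
    packets = []
    for r, (start_x, data) in enumerate(runs):
        for j, blk in enumerate(data):
            tail = j == len(data) - 1            # Flag-tail: last block of the run
            end = tail and r == len(runs) - 1    # Flag-end: last block of the request
            packets.append((start_x if j == 0 else None, blk, tail, end))
    return bool(runs), packets


# Three non-zero regions inside a 9-block target starting at column 10.
flag, pkts = encode_write([None, "a", "b", None, "c", None, "d", "e", None], 10)
for p in pkts:
    print(p)
```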
  • Based on the read/write requests from Packet Distributer 21, the Packet I/Fs 22 a to 22 d distribute input data to Read Ctrls 23 a to 23 d or Write Ctrls 24 a to 24 d.
  • In accordance with the read requests, Read Ctrls 23 a to 23 d respectively access L2 data management structures in the L2 caches 25 a to 25 d and output the non-zero data included in access ranges to the Packet I/Fs 22 a to 22 d. Moreover, Read Ctrls 23 a to 23 d may acquire non-zero data and non-zero management information directly from the main memory 8 without using the L2 caches 25 a to 25 d.
  • Output data include a start X-coordinate of read non-zero data, continuous non-zero data columns, a terminal flag (Flag-tail) of the continuous non-zero data columns, a non-zero data flag (Flag-end) in the end of the read request, and the non-zero data flag.
  • Read Ctrls 23 a to 23 d output only the non-zero data in the non-zero regions. Namely, the data columns of zero data are not output. In a case in which there is at least one non-zero data, Read Ctrl 23 sets 1 as the non-zero data flag. On the other hand, in a case in which not even one non-zero data is included in the read request range, Read Ctrl 23 returns only the non-zero data flag 0.
  • The data output from Read Ctrls 23 a to 23 d are input to Packet Distributer 21 via the Packet I/Fs 22 a to 22 d. Packet Distributer 21 outputs data in accordance with a read data I/F for the master module.
  • On the other hand, in accordance with write requests, Write Ctrls 24 a to 24 d respectively access the L2 data management structures on the L2 caches 25 a to 25 d and update non-zero data and non-zero management information. Moreover, Write Ctrls 24 a to 24 d may keep non-zero data and non-zero management information directly in the main memory 8 without using the L2 caches 25 a to 25 d.
  • When the L2 cache 25, Read Ctrl 23, or Write Ctrl 24 is to access the main memory 8, Address Translators 26 a to 26 d reference the matrix management information and carry out address translation. In the address translation, reverse translation of the translation carried out by Packet Distributer 21 is carried out.
  • As a result of the address translation of Address Translators 26 a to 26 d, the data of one block disposed in one continuous region in the L2 caches 25 a to 25 d are divided into continuous regions of respective rows in the main memory 8 and accessed. Note that the L2 cache 25, Read Ctrl 23, or Write Ctrl 24 may directly access the main memory 8 without using the address translation by Address Translator 26.
  • Herein, a read operation of Read Ctrl 23 will be explained. Note that, since Read Ctrls 23 a to 23 d have similar configurations, only Read Ctrl 23 a will be explained. FIG. 10 is a configuration diagram of Read Ctrl 23 a. Read Ctrl 23 a has Info Checker 41, Read Requestor 42, Data Requestor 43, Data Output 44, and Read Data Receiver 45.
  • When a read request is input from the Packet I/F 22 a, Info Checker 41 carries out read of non-zero management information and check of contents.
  • Info Checker 41 outputs the read request of the non-zero management information to Read Requestor 42. In accordance with the read request of the non-zero management information, Read Requestor 42 outputs a read request of a block, which includes the non-zero management information, to the L2 cache 25 a. In accordance with the read request of the block, read data are read from the L2 cache 25 a and input to Read Data Receiver 45. Read Data Receiver 45 reads the non-zero management information from the read data and outputs that to Info Checker 41.
  • When Info Checker 41 detects a non-zero column of a read target in accordance with the non-zero management information, Info Checker 41 outputs the coordinates of the non-zero column and a read start request to Data Requestor 43. Moreover, Info Checker 41 outputs the start position of non-zero data to Data Output 44.
  • Data Requestor 43 outputs a read start coordinate of the non-zero column to Data Output 44 and outputs a read request of a non-zero block(s) included in the read region to Read Requestor 42. Then, in accordance with the read request of the non-zero block, Read Requestor 42 outputs a read request to the L2 cache 25 a. In accordance with that read request, the read data are read from the L2 cache 25 a and input to Data Output 44 via Read Data Receiver 45.
  • Data Output 44 outputs a header by using the start position of the non-zero data from Info Checker 41. Moreover, Data Output 44 outputs the read data, which are input from Read Data Receiver 45, to the Packet I/F 22 a.
  • When the read request of the continuous non-zero blocks is finished, Data Requestor 43 outputs an end flag to Data Output 44. When the end flag is input, Data Output 44 outputs the terminal flag (Flag-tail) of the continuous non-zero data columns. In a case in which there are a plurality of continuous non-zero blocks, Info Checker 41 outputs a read request of non-zero management information again to Read Requestor 42, and the above described operation is carried out.
  • Moreover, when read in the target read region is finished, Info Checker 41 outputs an end signal to Data Output 44. When the end signal is input, Data Output 44 outputs a non-zero data flag (Flag-end) in the end of the read request, and the read operation is finished.
  • Next, operations of the matrix management engine 4 configured in this manner will be explained.
  • An example of the process of a case of a read request will be explained by using FIG. 11 and FIG. 12. FIG. 11 and FIG. 12 are flowcharts for explaining the process of reading data of S[m][n0] to S[m][n1].
  • First, when the read request to the block columns n0 to n1 of an m-th row is input (S1), Read Ctrl 23 a sets Y=m and X=1 (S2). Then, Read Ctrl 23 a reads the non-zero management information of R[Y][X−1] (S3). Namely, in S3, the non-zero management information at the position immediately before the block represented by X and Y is read. Then, the values of the read non-zero management information are set as Num (the number of continuous non-zero blocks) and Next (the distance to next non-zero management information), and the current value of X is saved in Pos (S4).
  • Next, Read Ctrl 23 a judges whether a non-zero data column(s) is present (S5). In a case in which it is judged that no non-zero data column is present (S5-NO), that is, a case of Num=0, Read Ctrl 23 a judges whether a next non-zero data column is present (S6). In a case in which it is judged that the non-zero data column is present thereafter (S6-YES), update to X=Pos+Next is carried out (S7). Updated X represents the X-coordinate of the non-zero block at the top of the next non-zero data column.
  • Then, Read Ctrl 23 a judges whether X is larger than a read range (S8). In a case in which n1<X is not satisfied (S8-NO), the next non-zero column may be included in the read range, and the process returns to S3.
  • On the other hand, in a case in which it is judged that the non-zero data column is present (S5-YES), Read Ctrl 23 a judges whether the read range is included in the non-zero data column (((X<=n1)&&(X+Num>n0))) (S9). In a case in which the read range is not included therein (S9-NO), the process proceeds to S6. In a case in which the read range includes that (S9-YES), the process proceeds to S10.
  • In the case in which it is judged that the read range includes that (S9-YES), Read Ctrl 23 a judges whether X is in the read range (n0≤X≤n1) (S10). In the case in which X is not in the read range (S10-NO), update to X=X+1 and Num=Num−1 is carried out (S11), and the process returns to S10. On the other hand, if it is judged that X is in the read range (S10-YES), the process proceeds to S12 of FIG. 12.
  • Read Ctrl 23 a outputs a header (S12). This header is start position information of the non-zero data column, and the value of X is output as the header. Then, the non-zero block of S[Y][X] is read (S13), and the read block is output to a read request source (S14). Then, update to X=X+1 and Num=Num−1 is carried out (S15).
  • Then, Read Ctrl 23 a judges whether read of the non-zero data column has been finished (S16). In a case in which it is judged that non-zero data to be read are still remaining (S16-NO), that is, a case of Num>0, Read Ctrl 23 a judges whether X is in the read range (n0≤X≤n1) (S17). In a case in which n0≤X≤n1 is satisfied (S17-YES), the process returns to S13.
  • On the other hand, in a case in which it is judged that no non-zero data to be read are remaining (S16-YES), that is, a case of Num=0, the process proceeds to S18 since the continuous non-zero data columns are once finished. Moreover, also in a case in which it is judged that X is not in the read range (n0≤X≤n1) (S17-NO), the process proceeds to S18. Then, Flag-tail is output (S18), and the process proceeds to S6 of FIG. 11.
  • In a case in which the next non-zero data column is not present (S6-NO) and in a case in which X is larger than the read range (S8-YES), the process proceeds to S19. Then, Flag-end is output (S19), and the read process is finished.
  • Herein, a case in which Read Ctrl 23 a reads the fourth to eighth block columns of the block row B1 of FIG. 3 (S[1][4] to S[1][8]) will be explained. In the case in which the block columns S[1][4] to S[1][8] are to be read, the matrix data R[1][4] to R[1][8] of FIG. 4 are referenced, wherein m=1, n0=4, and n1=8.
  • First, when a read request to the fourth to eighth block columns of the first row is input (S1), Read Ctrl 23 a sets Y=1 and X=1 (S2).
  • Read Ctrl 23 a reads the non-zero management information R[1][0] of the non-zero management region (S3), and Num=0, Next=1, and Pos=1 are obtained (S4). In the case of the non-zero management region R[1][0], the process proceeds to S6 since no non-zero data is present (Num=0). Because Next=1, Read Ctrl 23 a judges that the next non-zero data column is present (S6-YES) and sets X=1+1 (S7). Because X=2<n1=8 (S8-NO), the process returns to S3.
  • Then, Read Ctrl 23 a reads the non-zero management information of R[1][1] (S3). Num=2, Next=4, and Pos=2 are obtained (S4). In this case, non-zero data are present (S5-YES); however, since the range in which the non-zero data are present (R[1][2] to R[1][3]) is not in the read range (S9-NO), the process proceeds to S6. Because Next=4, Read Ctrl 23 a judges that the next non-zero data column is present (S6-YES), and X=2+4 is set (S7). Because X=6<n1=8 (S8-NO), the process returns to S3.
  • Next, Read Ctrl 23 a reads the non-zero management information of R[1][5] (S3). Num=2, Next=5, and Pos=6 are obtained (S4). In this case, non-zero data are present (S5-YES), and the range in which the non-zero data are present (R[1][6] to R[1][7]) is included in the read range (S9-YES and S10-YES); therefore, the process proceeds to S12.
  • Read Ctrl 23 a outputs a header (S12) and reads data of S[1][6] from the cache (S13). Then, Read Ctrl 23 a outputs the data of S[1][6] to the read request source (S14) and sets X=6+1 and Num=2-1 (S15). Because Num=1, it is judged that non-zero data to be read are still remaining (S16-NO). Moreover, because X=7, X is present in the read range (S17-YES); therefore, the process returns to S13.
  • Read Ctrl 23 a reads the data of S[1][7] and outputs the data to the read request source (S13, S14) and sets X=7+1 and Num=1-1 (S15). Because Num=0, it is judged that no non-zero data to be read are remaining (S16-YES), Read Ctrl 23 a outputs Flag-tail (S18), and the process returns to S6.
  • Because Next=5, it is judged that a non-zero data column is present thereafter (S6-YES), and X=6+5 is set (S7). Because X=11>n1=8 (S8-YES), Read Ctrl 23 a outputs Flag-end (S19) and finishes the read process.
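  • The flow of FIG. 11 and FIG. 12 and the walk-through above can be sketched in software as follows, for one block row. This is an illustration under stated assumptions, not the circuit of Read Ctrl 23 a: the non-zero management information is modeled as a dictionary info mapping a position to (Num, Next), the block data as a dictionary blocks, and "no next non-zero data column" (S6-NO) is assumed to be encoded as Next=0. The function name read_row and the output tuples are hypothetical.

```python
def read_row(info, blocks, n0, n1):
    """Follow steps S2 to S19 for the block columns n0..n1 of one block row."""
    out = []
    x = 1                                          # S2
    while True:
        num, nxt = info[x - 1]                     # S3, S4: info just before X
        pos = x                                    # S4: save current X in Pos
        if num > 0 and x <= n1 and x + num > n0:   # S5-YES and S9-YES
            while not (n0 <= x <= n1):             # S10-NO
                x, num = x + 1, num - 1            # S11
            out.append(("header", x))              # S12: start position
            while True:
                out.append(("block", x, blocks[x]))    # S13, S14
                x, num = x + 1, num - 1                # S15
                if num == 0 or not (n0 <= x <= n1):    # S16-YES or S17-NO
                    break
            out.append(("tail",))                  # S18: Flag-tail
        if nxt == 0:                               # S6-NO (assumed encoding)
            break
        x = pos + nxt                              # S7
        if x > n1:                                 # S8-YES
            break
    out.append(("end",))                           # S19: Flag-end
    return out


# Management information read in the walk-through: R[1][0], R[1][1], R[1][5].
info = {0: (0, 1), 1: (2, 4), 5: (2, 5)}
blocks = {2: "b2", 3: "b3", 6: "b6", 7: "b7"}  # placeholder block contents
print(read_row(info, blocks, 4, 8))
```

  • With the values of the walk-through, the sketch emits a header with X=6, the blocks of columns 6 and 7, Flag-tail, and Flag-end, in that order.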
  • Next, a write operation of Write Ctrl 24 will be explained. Note that, since Write Ctrls 24 a to 24 d have similar configurations, only Write Ctrl 24 a will be explained. FIG. 13 is a configuration diagram of Write Ctrl 24 a. Write Ctrl 24 a has B-Searcher 51, Read Requestor 52, A-Searcher 53, Read Data Receiver 54, Write Data Receiver 55, B-Updater 56, Write Requestor 57, Data Writer 58, and Last-Info Updater 59.
  • When a write request is input from Packet I/F 22 a, B-Searcher 51 searches for a non-zero data column B.
  • First, B-Searcher 51 outputs a read request of non-zero management information to Read Requestor 52. In accordance with the read request to the non-zero management information, Read Requestor 52 outputs a read request of a block including the non-zero management information to the L2 cache 25 a. In accordance with the read request of the block, read data are read from the L2 cache 25 a and input to Read Data Receiver 54. Read Data Receiver 54 reads the non-zero management information from the read data and outputs the information to B-Searcher 51.
  • When B-Searcher 51 finishes the search for the non-zero data column B using the read non-zero management information, B-Searcher 51 outputs an operation start request to A-Searcher 53 together with the information of the non-zero data column B.
  • When the information of the non-zero data column B and the operation start request are input, A-Searcher 53 searches for a non-zero data column A. Note that the search for the non-zero data column A will be described later. As with B-Searcher 51, A-Searcher 53 gives a read request of non-zero management information to Read Requestor 52 and reads the non-zero management information from Read Data Receiver 54.
  • When the search for the non-zero data column A using the read non-zero management information is finished, A-Searcher 53 outputs a write request to B-Updater 56 together with the information of the non-zero data columns A and B.
  • In accordance with the input information of the non-zero data columns A and B and the write request, B-Updater 56 carries out update of the non-zero management information of “the start position of the non-zero data column B”−1. In this process, the start position of the non-zero data column is input from Write Data Receiver 55. In the update of the non-zero management information, B-Updater 56 outputs a write request to Write Requestor 57, and Write Requestor 57 outputs a write request of a corresponding block to the L2 cache 25 a.
  • Write data of a non-zero data block and the start position of a non-zero data column are input to Write Data Receiver 55 from Packet I/F 22 a. Write Data Receiver 55 outputs the input write data of the non-zero data block and the start position of the non-zero data column to B-Updater 56 and Data Writer 58.
  • After the update of the non-zero management information at “the start position of the non-zero data column B”−1 is finished, B-Updater 56 outputs an operation start request to Data Writer 58.
  • When the operation start request is input, Data Writer 58 carries out write of the write data. As with B-Updater 56, Data Writer 58 outputs a write request about write of non-zero data and non-zero management information to Write Requestor 57, and Write Requestor 57 outputs a write request of the corresponding block to the L2 cache 25 a. After the write of the write data is finished, Data Writer 58 outputs an operation start request to Last-Info Updater 59.
  • Last-Info Updater 59 carries out write of the last non-zero management information and write of the write data at the position n1 (the last column of the write data, which will be described later). As with B-Updater 56, Last-Info Updater 59 outputs a write request about write of the non-zero management information to Write Requestor 57, and Write Requestor 57 outputs a write request of the corresponding block to the L2 cache 25 a.
  • Herein, an example of the process of a case of a write request will be explained by using FIG. 14. FIG. 14 is a flowchart for explaining the process of writing data to the block column n0 to the block column n1 of the m-th block row.
  • First, when the write request to the column n0 to the column n1 of the m-th row is input (S21), Write Ctrl 24 a searches for non-zero data columns B and A (S22). Herein, the non-zero data column B is a non-zero data column at the top among continuous non-zero data columns started from a position smaller than the region of the column n0 to the column n1. The non-zero data column A is a non-zero data column at the ending among continuous non-zero data columns including data at a position larger than the region of the column n0 to the column n1. In a case in which a non-zero data column that satisfies the condition of the non-zero data column B is not present, the start position of the non-zero data column B is assumed to be 1; and, in a case in which a non-zero data column that satisfies the condition of the non-zero data column A is not present, it is assumed that the non-zero data column A is not present.
  • Then, Write Ctrl 24 a carries out update of the non-zero management information at the position of “the start position of the non-zero data column B”−1 and write of write data (S23).
  • Then, Write Ctrl 24 a carries out update of the last non-zero management information of the column n0 to the column n1 and the non-zero management information of the column n1 (S24) and finishes the process.
  • Next, a specific process of S22 will be explained by using FIG. 15. FIG. 15 is a flowchart for explaining the process of search for the non-zero data columns B and A.
  • First, when the write request to the column n0 to the column n1 of the m-th row is input (S31), Write Ctrl 24 a sets Y=m and X=1 (S32). Then, Write Ctrl 24 a reads the non-zero management information of R[Y][X-1] (S33). In the process of S33, the non-zero management information at the position immediately before the block represented by X and Y is read. Then, the values of the read non-zero management information are set as Num (the number of continuous non-zero blocks) and Next (the distance to next non-zero management information) (S34).
  • Write Ctrl 24 a judges whether next non-zero data column is present (S35). In a case in which it is judged that the next non-zero data column is present (S35-YES), namely, a case in which Next is not 0, Write Ctrl 24 a judges whether the next non-zero data column is the column n0 or thereafter (S36). In a case in which it is judged that a top non-zero block of the next non-zero data column is also at a position less than the column n0 (S36-NO), Write Ctrl 24 a updates X to X+Next (S37), and the process returns to S33.
  • On the other hand, in a case in which it is judged that the next non-zero data column is not present (S35-NO) or in a case in which it is judged that the top non-zero block of the next non-zero data column is in the n0 column or thereafter (S36-YES), Write Ctrl 24 a detects a non-zero column starting from X as the non-zero data column B and sets B=X, Num=Num, and Next=Next (S38). As a result of the process of S38, the non-zero data column B and the non-zero management information representing the non-zero data column B are detected.
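  • The search for the non-zero data column B (S32 to S38) can be sketched as follows. As an assumption of this illustration, the non-zero management information of one block row is modeled as a dictionary info mapping a position to (Num, Next); the function name search_b is hypothetical. The example values are the management information R[1][0], R[1][1], and R[1][5] used in the read example.

```python
def search_b(info, n0):
    """Return (B, Num, Next) of the non-zero data column B for a write from n0."""
    x = 1                                   # S32
    num, nxt = info[x - 1]                  # S33, S34
    # S35: a next column exists (Next != 0); S36-NO: it starts before n0.
    while nxt != 0 and x + nxt < n0:
        x += nxt                            # S37
        num, nxt = info[x - 1]              # back to S33, S34
    return x, num, nxt                      # S38


info = {0: (0, 1), 1: (2, 4), 5: (2, 5)}
print(search_b(info, 4))  # (2, 2, 4)
print(search_b(info, 8))  # (6, 2, 5)
print(search_b(info, 2))  # (1, 0, 1)
```

  • In the last call no non-zero data column starts before the column n0, so the start position of B remains 1, which matches the default described for the non-zero data column B.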
  • Then, Write Ctrl 24 a judges whether a non-zero data column is present (S39). In a case in which it is judged that the next non-zero data column is present (S39-YES), Write Ctrl 24 a judges whether the end of the non-zero data column is after the column n1 (S40). In a case in which it is judged that the end of the non-zero data column is not after the column n1 (S40-NO), Write Ctrl 24 a judges whether a next non-zero data column is present (S41). In a case in which it is judged that the next non-zero data column is present (S41-YES), Write Ctrl 24 a updates X to X+Next (S42).
  • Then, Write Ctrl 24 a reads the non-zero management information of R[Y][X-1] (S43) and sets the values of the read non-zero management information as Num and Next (S44). Then, the process returns to S39.
  • On the other hand, in a case in which it is judged that the end of the non-zero data column is after the column n1 (S40-YES), Write Ctrl 24 a detects the non-zero data column starting from X as the non-zero data column A and sets A=X, Num=Num, and Next=Next (S45), and the process is finished. As a result of the process of S45, the non-zero data column A and the management information representing the non-zero data column A are detected.
  • Note that in a case in which it is judged in S39 that the next non-zero data column is not present (S39-NO), Write Ctrl 24 a determines that the non-zero data column A is not present (S46) and finishes the process. Similarly, in a case in which it is determined in S41 that the next non-zero data column is not present (S41-NO), Write Ctrl 24 a determines that the non-zero data column A is not present (S47) and finishes the process. As a result of the above process, the start position of the non-zero data column B and the start position of the non-zero data column A are searched.
  • Next, a specific process of S23 will be explained. FIG. 16 is a flowchart for explaining update of the non-zero management information of the non-zero data column B. FIG. 17 is a flowchart for explaining a write process of write data.
  • Write Ctrl 24 a judges whether the write data in a range are all 0, namely, whether the write data includes non-zero data (S51). In a case in which it is judged that the write data include non-zero data (S51-NO), Write Ctrl 24 a inputs a start position (q) of a non-zero data column.
  • Then, Write Ctrl 24 a judges whether the write of the non-zero data column is started from the column n0 (S53). In a case in which it is judged that the write of the non-zero data column is not from the column n0 (S53-NO), Write Ctrl 24 a judges whether the non-zero data column B is included in the columns n0 to n1 (S54). Namely, in the process of S54, whether the length of the non-zero data column B is changed by zero-data write is checked. In a case in which it is judged that the non-zero data column B is included in the columns n0 to n1 (the length of the non-zero data column B is changed by zero-data write) (S54-YES), Write Ctrl 24 a reduces Num of the non-zero data column B by the amount overlapped with the columns n0 to n1 (S55) and changes Next of the non-zero data column B so that it specifies the start position q of the non-zero data column (S56).
  • Then, Write Ctrl 24 a sets the start position q of the non-zero data column as the start position of a non-zero data column W0 (S57). In this case, Write Ctrl 24 a sets Pos=q, Cnt=0, and Start=q and proceeds to the process of FIG. 17.
  • On the other hand, in a case in which it is judged that the write of the non-zero data column is started from the column n0 (S53-YES), Write Ctrl 24 a judges whether the non-zero data column B and the non-zero data column W0 are in contact or overlapped with each other (S58). In a case in which it is judged that the non-zero data column B and the non-zero data column W0 are not in contact or not overlapped with each other (S58-NO), the process proceeds to S56. On the other hand, in a case in which it is judged that the non-zero data column B and the non-zero data column W0 are in contact or overlapped with each other (S58-YES), Write Ctrl 24 a sets the start position of the non-zero data column B as the start position of the non-zero data column W0 (S59). In this case, Write Ctrl 24 a sets Pos=q, Cnt=n0-“the start position of B”, and Start=the start position of B, and the process proceeds to the process of FIG. 17.
  • When the process of S57 or S59 is executed, a transition to FIG. 17 is made, and Write Ctrl 24 a inputs a non-zero data block (S60) and writes the input non-zero data block at the position of S[m][Pos] (S61). Then, Write Ctrl 24 a increments Pos and Cnt (Pos=Pos+1, Cnt=Cnt+1) (S62) and judges whether the write of the non-zero data column W0 has been finished (S63). In a case in which it is judged that the write of the non-zero data column W0 has not been finished (S63-NO), the process returns to S60. On the other hand, in a case in which it is judged that the write of the non-zero data column W0 has been finished (S63-YES), Write Ctrl 24 a judges whether a next write non-zero data column (assumed to be W1) is present (S64). In a case in which it is judged that the next write non-zero data column W1 is present (S64-YES), Write Ctrl 24 a inputs the start position (herein, assumed to be p) of the next non-zero data column W1 (S65).
  • Then, Write Ctrl 24 a updates Num and Next of the non-zero data column W0 (S66). In this case, Write Ctrl 24 a sets Num=Cnt and Next=p-Start. Then, Write Ctrl 24 a sets the non-zero data column W1 as the next non-zero data column W0 (S67), and the process returns to S60. In this case, Write Ctrl 24 a sets Pos=p, Cnt=0, and Start=p. On the other hand, in a case in which it is judged that the next write non-zero data column W1 is not present (S64-NO), the process returns to the end of FIG. 16.
  • Moreover, in a case in which it is judged that no non-zero data is included in the write data (all of write data in the range are 0) (S51-YES), Write Ctrl 24 a judges whether the elements of the non-zero data column B are included in the columns n0 to n1 (S68). In a case in which it is judged that the non-zero data column B is not included in the columns n0 to n1 (S68-NO), the process proceeds to S70. On the other hand, in a case in which it is judged that the non-zero data column B is included in the columns n0 to n1 (S68-YES), Write Ctrl 24 a reduces Num of the non-zero data column B by the amount overlapped with the columns n0 to n1 (S69).
  • Then, Write Ctrl 24 a judges whether the non-zero data column A is present (S70). In a case in which it is judged that the non-zero data column A is not present (S70-NO), Write Ctrl 24 a changes Next of the non-zero data column B to 0 (S71), and the process is finished. On the other hand, in a case in which it is judged that the non-zero data column A is present (S70-YES), Write Ctrl 24 a judges whether there is a location where the non-zero column of the non-zero data column A is changed to 0 by write (S72). In a case in which it is judged that there is no location where the non-zero column of the non-zero data column A is changed by the write (S72-NO), Write Ctrl 24 a changes Next of the non-zero data column B so that it specifies the non-zero data column A (S73), and the process is finished. On the other hand, in a case in which it is judged that there is the location where the non-zero column of the non-zero data column A is changed to 0 by the write (S72-YES), Write Ctrl 24 a changes Next of the non-zero data column B so that it specifies n1+1 (S74), and the process is finished.
  • FIG. 18 is a flowchart for explaining a process of updating the last non-zero management information and updating the non-zero management information of the column n1.
  • First, Write Ctrl 24 a judges whether 0 is written in all the write range (S81). In a case in which it is judged that 0 is not written in all the write range (S81-NO), Write Ctrl 24 a judges whether the non-zero data column A is present (S82). In a case in which it is judged that the non-zero data column A is present (S82-YES), Write Ctrl 24 a judges whether the last block of write is 0 (S83).
  • In a case in which it is judged that the last block of the write is 0 (S83-YES), Write Ctrl 24 a judges whether there is the location where the non-zero column of the non-zero data column A is changed to 0 by the write (S84). In a case in which it is judged that there is the location where the non-zero column of the non-zero data column A is changed to 0 by the write (S84-YES), Write Ctrl 24 a updates the last non-zero management information (S85). In this case, Write Ctrl 24 a sets Next=n1+1-Start and Num=Cnt. Finally, Write Ctrl 24 a updates the non-zero management information of the column n1 (S86), and the process is finished. In this case, Write Ctrl 24 a sets Next=A_Next and Num=A_Num-(n1+1-“the start position of A”).
  • On the other hand, in a case in which it is judged that there is no location where the non-zero column of the non-zero data column A is changed to 0 by the write (S84-NO), Write Ctrl 24 a updates the last non-zero management information to Next=“the start position of A”-Start and Num=Cnt (S87), and the process is finished.
  • Moreover, in a case in which it is judged that the last block of the write is not 0 (S83-NO), Write Ctrl 24 a judges whether the last block of the write is connected or overlapped with the non-zero data column A (S88). In a case in which it is judged that the last block of the write is connected or overlapped with the non-zero data column A (S88-YES), Write Ctrl 24 a updates the last non-zero management information to Next=A_Next and Num=Cnt (S89), and the process is finished. On the other hand, in a case in which it is judged that the last block of the write is not connected nor overlapped with the non-zero data column A (S88-NO), Write Ctrl 24 a updates the last non-zero management information to Next=“the start position of A”-Start and Num=Cnt (S90), and the process is finished.
  • Moreover, in a case in which it is judged that the non-zero data column A is not present (S82-NO), Write Ctrl 24 a updates the last non-zero management information to Next=0 and Num=Cnt (S91), and the process is finished.
  • Moreover, in a case in which it is judged that 0 is written in all the write range (S81-YES), Write Ctrl 24 a judges whether the non-zero data column A is present (S92). In a case in which it is judged that the non-zero data column A is not present (S92-NO), the process is finished. On the other hand, in a case in which it is judged that the non-zero data column A is present (S92-YES), Write Ctrl 24 a judges whether there is the location where the non-zero column of the non-zero data column A is changed to 0 by the write (S93). In a case in which it is judged that there is no location where the non-zero column of the non-zero data column A is changed to 0 by the write (S93-NO), the process is finished. On the other hand, in a case in which it is judged that there is the location where the non-zero column of the non-zero data column A is changed to 0 by the write (S93-YES), Write Ctrl 24 a updates the non-zero management information of the column n1 to Next=A_Next and Num=A_Num-(n1+1-“the start position of A”) (S94), and the process is finished.
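  • The decision tree of FIG. 18 (S81 to S94) can be summarized in a short Python sketch. The names `ColumnA` and `finish_write` are hypothetical. The judgments of S84/S93 and S88 are passed in as the booleans `a_zeroed` and `touches_a`, because this passage does not give closed-form tests for them; the S83 test (Pos-1≠n1) follows the worked example for FIG. 18. Each branch returns the values named in the flowchart description, with `None` meaning that the management information is left unchanged.

```python
from typing import NamedTuple, Optional, Tuple

Mgmt = Tuple[int, int]  # (Num, Next)


class ColumnA(NamedTuple):
    start: int  # start position of the non-zero data column A
    num: int    # A_Num
    nxt: int    # A_Next


def finish_write(all_zero: bool, a: Optional[ColumnA], n1: int,
                 pos: int, cnt: int, start: int,
                 a_zeroed: bool = False, touches_a: bool = False
                 ) -> Tuple[Optional[Mgmt], Optional[Mgmt]]:
    """Return (last management info, management info of column n1)."""
    def tail_of_a() -> Mgmt:                     # values used in S86 and S94
        return (a.num - (n1 + 1 - a.start), a.nxt)

    if all_zero:                                 # S81-YES
        if a is not None and a_zeroed:           # S92-YES and S93-YES
            return None, tail_of_a()             # S94
        return None, None                        # S92-NO or S93-NO
    if a is None:                                # S82-NO
        return (cnt, 0), None                    # S91
    if pos - 1 != n1:                            # S83-YES: write ends with 0
        if a_zeroed:                             # S84-YES
            return (cnt, n1 + 1 - start), tail_of_a()  # S85, S86
        return (cnt, a.start - start), None      # S87
    if touches_a:                                # S88-YES
        return (cnt, a.nxt), None                # S89
    return (cnt, a.start - start), None          # S90
```

  • For Example 1 (A starts at 11 with A_Num=2 and A_Next=0, n1=11, Pos=11, Cnt=2, Start=9, and a block of A overwritten with 0), the sketch returns Num=2, Next=3 for the last non-zero management information and Num=1, Next=0 for the column n1.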
  • Next, an example of writing data to the block row B2 of the second row (m=2) of the sparse matrix S shown in FIG. 3 will be explained by using FIG. 19 and FIG. 20. FIG. 19 and FIG. 20 are diagrams showing data columns to be written and non-zero management information of the second row.
  • In FIG. 19, a reference sign 70 represents the data of the block row B2 of the second row of the sparse matrix S before data are written. At reference signs 71 a to 71 h, hatched squares represent non-zero data, gray squares represent zero data, and white squares represent that write is not carried out therein. Reference signs 72 a to 72 h represent the data of the block row B2 after write data (reference signs 71 a to 71 h) of Example 1 to Example 8 are written. First, the search process of the non-zero data columns B and A of S22 will be explained by using Example 1 of FIG. 19. In Example 1, write of data is carried out from fourth to eleventh columns (n0=4, n1=11), non-zero data are written to fifth, sixth, ninth, and tenth columns, and zero data are written to fourth, seventh, eighth, and eleventh columns.
  • In the search for the non-zero data column B, first, Write Ctrl 24 a reads the non-zero management information R[2][0] at the top. Write Ctrl 24 a obtains the start position of a next non-zero column from the read value (Next). In the case of Example 1, it is R[2][3].
  • Then, Write Ctrl 24 a checks whether the obtained start position of the non-zero column is in the range of write data (R[2][4] to R[2][11]). In the case in which the obtained start position of the non-zero column is in the range of the write data, “the position of the read non-zero management information”+1 is the start position of the non-zero data column B. In a case in which the start position is not in the range, next non-zero management information is read. In the case of Example 1, since R[2][3] is not in the range of the write data, the next non-zero management information R[2][2] is read.
  • Thereafter, Write Ctrl 24 a carries out a similar process until the non-zero data column B is detected. In the case of Example 1, if the start position of the next non-zero column becomes R[2][7], the start position of the non-zero data column B becomes “the position of the non-zero management information R[2][2]”+1, namely, 3.
  • On the other hand, in the search for the non-zero data column A, first, Write Ctrl 24 a initializes the position of the non-zero management information to the position of “the start position of the non-zero data column B”−1. Then, the non-zero management information is read from the position of the current non-zero management information, and the last position of the non-zero data column is obtained from “the position of the current non-zero management information”+Num. In the case of Example 1, R[2][4] is obtained from 2+2=4.
  • Then, Write Ctrl 24 a checks whether the last position of the obtained non-zero data column is in the range of the write data (R[2][4] to R[2][11]). In a case in which the last position of the obtained non-zero data column is not in the range of the write data, “the position of the current non-zero management information”+1 becomes the start position of the non-zero data column A. In this case, since R[2][4] is in the range of the write data, Write Ctrl 24 a updates the position of the non-zero management information to the position (6) of the next non-zero management information R[2][6].
  • Thereafter, Write Ctrl 24 a carries out a similar process until the non-zero data column A is detected. Herein, in a case in which the next non-zero management information is not present, the non-zero data column A is not present. In the case of Example 1, since the last position of the non-zero data column of the non-zero management information R[2][10] is R[2][12], it is not in the range of the write data; therefore, the start position of the non-zero data column A becomes “the position of the non-zero management information R[2][10]”+1, namely, 11. In the case of Example 7 and Example 8 of FIG. 20, R[2][12] is in the range of the write data, and the next non-zero data column thereof is not present; therefore, the non-zero data column A is not present.
  • Next, the process of searching for the non-zero data columns B and A in a case in which the data of Example 1 of FIG. 19 are written will be explained by using FIG. 15.
  • First, when a write request to the fourth column to eleventh column of the second row is input (S31), Write Ctrl 24 a carries out substitution of Y=2 and X=1 (S32) and reads the non-zero management information of R[2][0] (S33). Then, Write Ctrl 24 a sets Num=1 and Next=2 as the values of the read non-zero management information (S34).
  • Then, Write Ctrl 24 a judges that the next non-zero data column is present since Next is not 0 (S35-YES). Furthermore, ((X+Next)≥4) is not satisfied since X+Next=3, and Write Ctrl 24 a judges that the top non-zero block of the next non-zero data column is also at a position less than the fourth column (S36-NO). Write Ctrl 24 a updates X to 3 (=1+2) (S37), and the process returns to S33. The updated X (=3) represents the X-coordinate of the top non-zero block of the next non-zero data column.
  • Write Ctrl 24 a reads the non-zero management information of R[2][2] (S33) and sets Num=2 and Next=4 (S34). Write Ctrl 24 a judges that the next non-zero data column is present since Next is not 0 (S35-YES). Furthermore, since ((X+Next)≥4) is satisfied, Write Ctrl 24 a judges that the top non-zero block of the next non-zero data column is at a column of the fourth column or thereafter (S36-YES). Thus, Write Ctrl 24 a sets the non-zero data column starting from X=3 as the non-zero data column B and sets B=X=3, Num=2, and Next=4 (S38).
  • Then, Write Ctrl 24 a judges that the next non-zero data column is present because Num=2 (S39-YES). Furthermore, ((X+Num−1)>11) is not satisfied because X+Num−1=4, and Write Ctrl 24 a judges that the non-zero data column at the ending is before the eleventh column (S40-NO).
  • Then, Write Ctrl 24 a judges that the next non-zero data column is present since Next is 4 (S41-YES) and updates X to 7 (=3+4) (S42). This updated X (=7) represents the X-coordinate of the top non-zero block of the next non-zero data column.
  • Then, Write Ctrl 24 a reads the non-zero management information of R[2][6] (S43) and sets Num=2 and Next=4 (S44), and the process returns to S39.
  • Write Ctrl 24 a judges that the next non-zero data column is present because Num=2 (S39-YES). Because X+Num−1=8, ((X+Num−1)>11) is not satisfied, and Write Ctrl 24 a judges that the non-zero data column at the ending is before the eleventh column (S40-NO). Moreover, since Next is 4, Write Ctrl 24 a judges that the next non-zero data column is present (S41-YES) and updates X to 11 (=7+4) (S42). Then, Write Ctrl 24 a reads the non-zero management information of R[2][10] (S43) and sets Num=2 and Next=0 (S44), and the process returns to S39.
  • In this case, Write Ctrl 24 a judges that the next non-zero data column is present because Num=2 (S39-YES) and judges that ((X+Num−1)>11) is satisfied because X+Num−1=12 (S40-YES). Write Ctrl 24 a sets the non-zero data column that starts from X=11 as the non-zero data column A. As a result of the above process, the start position of the non-zero data column B is found to be 3, and the start position of the non-zero data column A is found to be 11.
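  • The search traced above can be reproduced with a minimal Python sketch. The name `find_b_and_a` and the dictionary `ROW2` (position of the management information mapped to (Num, Next), using the values read in the trace) are hypothetical. The handling of S35-NO (B is taken to be the current column) and of S39-NO (the search simply continues with S41) is an assumption, since those branches are not exercised in this passage.

```python
from typing import Dict, Optional, Tuple

# Management information of the second row: position -> (Num, Next).
# The entry at position p describes the non-zero data column starting at p + 1.
ROW2: Dict[int, Tuple[int, int]] = {0: (1, 2), 2: (2, 4), 6: (2, 4), 10: (2, 0)}


def find_b_and_a(r: Dict[int, Tuple[int, int]], n0: int,
                 n1: int) -> Tuple[int, Optional[int]]:
    """Return (start of B, start of A or None) for a write to columns n0..n1."""
    x = 1                       # S32: first data column of the row
    num, nxt = r[x - 1]         # S33, S34: read the management info at the top

    # S35 to S37: advance while the next non-zero data column starts before n0.
    while nxt != 0 and x + nxt < n0:
        x += nxt
        num, nxt = r[x - 1]
    b_start = x                 # S38

    # S39 to S44: advance until a column ends beyond n1 or the chain ends.
    while True:
        if num != 0 and x + num - 1 > n1:   # S39, S40
            return b_start, x               # this column is A
        if nxt == 0:                        # S41-NO: A is not present
            return b_start, None
        x += nxt                            # S42
        num, nxt = r[x - 1]                 # S43, S44
```

  • Calling `find_b_and_a(ROW2, 4, 11)` gives 3 and 11 as in Example 1, and with n1=12 (as in Example 7 and Example 8) the second value is `None`, that is, the non-zero data column A is not present.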
  • Next, update of the non-zero management information at the position of “the start position of the non-zero data column B”−1 will be explained by using Example 1 to Example 8 of FIG. 19 and FIG. 20. In all of the eight examples, R[2][3] is the start position of the non-zero data column B. Moreover, the write range is from the column n0 to the column n1, a non-zero data column at the top in the write data is C, and the start position of C is q.
  • In Example 1 of FIG. 19 and Example 7 of FIG. 20, write is started from zero data (R[2][n0] is 0, n0≠q), and, as a result of write of the zero data, the length of the non-zero data columns is reduced. Therefore, Num of the non-zero management information R[2][2] is changed from 2 to 1 (becomes n0-B). Moreover, Next of the non-zero management information R[2][2] is changed so as to specify the start position q of the non-zero data column C (4→2).
  • Moreover, in Example 2 of FIG. 19, write is started from zero data (R[2][n0] is 0, n0≠q), and, as a result of write of the zero data, the length of the non-zero data columns is not changed. Therefore, only Next of the non-zero management information R[2][2] is changed so as to specify q of the start position of the non-zero data column of C (4→5).
  • Moreover, in Example 3 of FIG. 19, write is started from non-zero data (R[2][n0] is non-zero, n0=q), and, as a result of non-zero write, the length of the non-zero data columns is changed. Therefore, Num of the non-zero management information R[2][2] is changed from 2 to 5 (“the number of non-zero data of C”+q-“the start position of B”). Moreover, Next of the non-zero management information R[2][2] is changed so as to specify the start position of the further next non-zero data column of C (4→6).
  • Moreover, in Example 4 of FIG. 19, write is started from non-zero data (R[2][n0] is non-zero, n0=q), and, as a result of write of the non-zero data, the length of the non-zero data columns is not changed. Therefore, only Next of the non-zero management information R[2][2] is changed so as to specify q of the start position of the non-zero data column of C (4→3).
  • Moreover, in Example 5 of FIG. 19, all write is zero data, and, as a result of write of the zero data, the length of the non-zero data columns is reduced. Therefore, Num of the non-zero management information R[2][2] is reduced from 2 to 1 (changed to n0-“the start position of B”). Furthermore, as a result of write of the zero data, the length of the data column of the non-zero data column A is reduced. Therefore, Next of the non-zero management information R[2][2] is changed so as to specify n1+1 (4→11).
  • Moreover, in Example 6 of FIG. 20, all write is zero data, and, as a result of write of the zero data, the length of the non-zero data columns is not changed. Therefore, Num of the non-zero management information R[2][2] is not changed. Furthermore, as a result of write of the zero data, the length of the data column of the non-zero data column A is not changed. Therefore, Next of the non-zero management information R[2][2] is changed so as to specify the start position of the non-zero data column A (4→8).
  • Moreover, in Example 8 of FIG. 20, all write is zero data, and, as a result of write of the zero data, the length of non-zero data columns is reduced. Thus, Num of the non-zero management information R[2][2] is reduced from 2 to 1 (changed to n0-“the start position of B”). Furthermore, since the non-zero data column A is not present, Next of the non-zero management information R[2][2] is changed to 0.
  • Next, a write process of write data will be explained by using Example 1 of FIG. 19. The write of the write data is carried out in a case in which the write data include non-zero data.
  • First, in Example 1 of FIG. 19, the start position (X=5) of write non-zero data is input, two non-zero data blocks are input, and the non-zero data are written to X=5, 6 (S[2][5], S[2][6]). Then, the start position (X=9) of the write non-zero data is input. Thus, the non-zero management information at the position of X=4 (R[2][4]) is defined; therefore, Num=2 and Next=9-5=4 are written at the position of X=4 (S[2][4]). Then, two non-zero data blocks are input, and the non-zero data are written to S[2][9] and S[2][10], and the write is finished. Update of the non-zero management information at the position of X=8 is carried out in next S24. In this manner, write of non-zero data column(s) and write of non-zero management information are repeatedly carried out. The non-zero management information is not read during the operation.
  • Example 2 to Example 4 of FIG. 19 are also similar. However, in the case of Example 3, the non-zero management information updated first is at the position “the start position of the non-zero data column B”−1 (R[2][2]).
  • Herein, a process of updating the non-zero management information at the position of “the start position of the non-zero data column B”−1 and writing write data of the case in which the data of Example 1 of FIG. 19 are written by Write Ctrl 24 a will be explained by using FIG. 16 and FIG. 17.
  • First, since the write data include non-zero data (S51-NO), Write Ctrl 24 a inputs the start position (q=5) of the non-zero data column (S52). Then, because q=5 and n0=4, q≠n0 is obtained (S53-NO), and the write is started from zero data. Then, according to “the start position of the non-zero data column B”=3 and B_Num=2, (“the start position of B”+B_Num−1)≥n0 is satisfied (3+2−1=4≥4), and elements of the non-zero data column B are included in the columns n0 to n1 (S54-YES).
  • Then, Write Ctrl 24 a updates the non-zero management information R[2][2] at the position of “the start position of the non-zero data column B”−1 to Num=1 and Next=2 (S55, S56).
  • Then, Write Ctrl 24 a sets the start position q of the non-zero data column as the start position of the non-zero data columns W0 and sets Pos=q=5, Cnt=0, and Start=q=5 (S57).
  • Then, Write Ctrl 24 a inputs a non-zero data block (S60) and writes the input non-zero data block to S[2][5] (S61). Then, Write Ctrl 24 a increments Pos and Cnt (Pos=Pos+1=6, Cnt=Cnt+1=1) (S62). Since write of the non-zero data columns W0 has not been finished (S63-NO), the process returns to S60.
  • Then, Write Ctrl 24 a inputs a non-zero data block (S60) and writes the input non-zero data block to S[2][6] (S61). Then, Write Ctrl 24 a carries out increment to Pos=7 and Cnt=2 (S62) and judges that write of the non-zero data columns W0 has been finished (S63-YES).
  • Then, since the next write non-zero data columns W1 are present (S64-YES), Write Ctrl 24 a inputs the start position (p=9) of the next non-zero data columns (S65).
  • Then, Write Ctrl 24 a sets Num=Cnt=2 and Next=p-Start=9-5=4 (S66). Then, Write Ctrl 24 a sets the non-zero data columns W1 as the next non-zero data columns W0 (S67), and the process returns to S60. In this case, Pos=p=9, Cnt=0, and Start=p=9 are set.
  • Then, Write Ctrl 24 a inputs a non-zero data block (S60) and writes the input non-zero data block to S[2][9] (S61). Then, Write Ctrl 24 a carries out increment to Pos=10 and Cnt=1 (S62), and, since write of the non-zero data columns W0 has not been finished (S63-NO), the process returns to S60.
  • Then, Write Ctrl 24 a inputs a non-zero data block (S60) and writes the input non-zero data block to S[2][10] (S61). Then, Write Ctrl 24 a carries out increment to Pos=11 and Cnt=2 (S62). The write of the non-zero data columns W0 has been finished (S63-YES), and the next write non-zero data column W1 is not present (S64-NO); therefore, Write Ctrl 24 a returns to END of FIG. 16.
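  • The loop of FIG. 17 traced above can be sketched in Python as follows. The name `write_nonzero_runs` is hypothetical, the rows are modelled as dictionaries keyed by column position, and integer values stand in for non-zero data blocks; these are assumptions made for illustration. The sketch covers only the case in which every run starts as in S57 or S67 (Cnt=0, Start equal to the start position of the run).

```python
from typing import Dict, List, Sequence, Tuple

Mgmt = Tuple[int, int]  # (Num, Next)


def write_nonzero_runs(s_row: Dict[int, int], r_row: Dict[int, Mgmt],
                       runs: Sequence[Tuple[int, List[int]]]
                       ) -> Tuple[int, int, int]:
    """Write each run of non-zero blocks; return the final (Pos, Cnt, Start)."""
    pos = cnt = start = 0
    for i, (q, blocks) in enumerate(runs):
        pos, cnt, start = q, 0, q             # S57 for the first run, then S67
        for block in blocks:                  # S60 to S63
            s_row[pos] = block                # S61: write at S[m][Pos]
            pos, cnt = pos + 1, cnt + 1       # S62
        if i + 1 < len(runs):                 # S64-YES: a next run W1 exists
            p = runs[i + 1][0]                # S65: start position of W1
            r_row[start - 1] = (cnt, p - start)  # S66: Num=Cnt, Next=p-Start
    return pos, cnt, start
```

  • For Example 1, the runs are two blocks starting at column 5 and two blocks starting at column 9. The sketch writes columns 5, 6, 9, and 10, stores Num=2 and Next=4 at position 4, and ends with Pos=11, Cnt=2, and Start=9, which are the values handed over to the process of FIG. 18.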
  • Then, a process of updating the last non-zero management information in the write data and updating the non-zero management information of the column n1 will be explained. In all of Example 1 to Example 6 of FIG. 19 and FIG. 20, R[2][11] is the start position of the non-zero data column A. In Example 7 and Example 8, the non-zero data column A is not present. The write range is the columns n0 to n1, and the column number at the largest position in the non-zero data in the write data is e.
  • In Example 1 of FIG. 19, write is finished with zero data (R[2][n1]=0, n1≠e), and, as a result of write of the zero data, the length of the data columns of the non-zero data columns A is reduced. Therefore, the last non-zero management information R[2][8] is changed so that Num specifies 2 (remains at the number of the written data columns) and Next specifies n1+1-Start (11+1-9=3). Moreover, the non-zero management information R[2][11] of the column n1 is changed so that Num is reduced by the amount of reduction from Num (2) of the non-zero data columns A by the zero write (2→1) and Next becomes the same as Next of the non-zero data column A.
  • Moreover, in Example 2 of FIG. 19, e=9, wherein write is finished with zero data (R[2][n1]=0, n1≠e), and, as a result of the write of the zero data, the length of the data columns of the non-zero data columns A is not reduced. Therefore, the last non-zero management information R[2][7] is changed so that Num is 2 (remains at the number of written data columns) and Next specifies “the start position of the non-zero data columns A”-Start (11-8=3).
  • Moreover, in Example 3 of FIG. 19, write is finished with non-zero data (R[2][n1]≠0, n1=e), and, as a result of the write of the non-zero data, the last non-zero data column and the non-zero data column A are connected or overlapped with each other. Therefore, the last non-zero management information R[2][8] is changed so that Num specifies 4 (“the number (3) of written data columns” +“the number of block(s) (1) which is in the non-zero data column A and outside the write range”) and Next specifies Next (0) of the non-zero data column A.
  • Moreover, in Example 4 of FIG. 19, e=9, wherein write is finished with non-zero data (R[2][n1]≠0, n1=e), and, as a result of the write of the non-zero data, the length of the data columns of the non-zero data columns A is not changed. Therefore, the last non-zero management information R[2][8] is changed so that Num specifies 1 (remains at the number of the written data column) and Next specifies “the start position of the non-zero data column A”-Start (11-9=2). If A is not present, Next is changed to 0.
  • Moreover, in Example 5 of FIG. 19, since non-zero data write is not present, write is finished with zero data (R[2][n1]=0), and, as a result of the write of the zero data, the length of the data columns of the non-zero data columns A is reduced. Therefore, the non-zero management information R[2][11] of the column n1 is changed so that Num is reduced from Num (2) of the non-zero data columns A by the amount reduced by the write of the zero data (2→1) and Next becomes the same as Next of the non-zero data columns A.
  • Moreover, in Example 6 of FIG. 20, since non-zero data write is not present, write is finished with zero data (R[2][n1]=0), and, as a result of the write of the zero data, the length of the data columns of the non-zero data columns A is not reduced. Therefore, the last non-zero management information and the non-zero management information of the column n1 are not changed.
  • Moreover, in Example 7 of FIG. 20, e=10 and n1=12, wherein write is finished with zero data (R[2][n1] is 0, n1≠e), and the non-zero data column A is not present. Therefore, Num is 2 (remains at the number of the written data columns), and Next is changed to 0.
  • Moreover, in Example 8 of FIG. 20, since non-zero data write is not present, the write is finished with zero data (R[2][n1]=0), and the non-zero data column A is not present. Therefore, the last non-zero management information and the non-zero management information of the column n1 are not changed.
  • Next, a process of updating the last non-zero management information and updating the non-zero management information of the column n1 in a case in which the data of Example 1 of FIG. 19 are written will be explained by using FIG. 18.
  • It is assumed that Pos=11, Cnt=2, and Start=9 are set by the process of FIG. 16 and FIG. 17. First, it is not the write of 0 in all the write range (S81-NO), the non-zero data columns A are present (S82-YES), Pos=11, and Pos-1≠n1 is satisfied; therefore, Write Ctrl 24 a judges that the last write data is 0 (S83-YES).
  • Then, since “the start position of the non-zero data column A”=11 is within the write range, a non-zero block of the non-zero data column A is changed to 0 by the write (S84-YES), and Write Ctrl 24 a updates the last non-zero management information and sets Next=n1+1-Start=11+1-9=3 and Num=Cnt=2 (S85).
  • Finally, Write Ctrl 24 a updates the non-zero management information of the column n1 (S86) and finishes the process. In this case, Next=A_Next=0 and Num=A_Num-(n1+1-“the start position of A”)=2-(11+1-11)=1. As a result of the above process, the non-zero management information R[2][11] of the column n1 is updated to Num=1 and Next=0.
  • As described above, the matrix management engine 4 is configured to retain only the non-zero data in the L2 caches 25 a to 25 d and to store, in the region of zero data, the non-zero management information representing the number of continuous non-zero data and the distance to the next non-zero data. Moreover, when a read request is input, the matrix management engine 4 is configured to reference the non-zero management information and return only the non-zero data to the request source. As a result, the used amount of the L2 caches 25 a to 25 d is reduced by retaining only the non-zero data, and the required bandwidth is reduced by transferring only the non-zero data.
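  • How the non-zero management information allows only the non-zero data to be returned can be illustrated with a small Python sketch. The names `nonzero_runs` and `read_nonzero` are hypothetical, the rows are modelled as dictionaries keyed by column position, and the sketch reads a whole block row rather than a requested range; it is an illustration under these assumptions, not the read process of the embodiment.

```python
from typing import Dict, Iterator, List, Tuple

Mgmt = Tuple[int, int]  # (Num, Next)


def nonzero_runs(r_row: Dict[int, Mgmt]) -> Iterator[Tuple[int, int]]:
    """Yield (start column, length) of each non-zero data column in a row."""
    x = 1                          # first data column; position 0 is the top
    while True:
        num, nxt = r_row[x - 1]    # management info stored just before column x
        if num:
            yield x, num
        if nxt == 0:               # Next=0: no further non-zero data column
            return
        x += nxt                   # jump over the zero data in between


def read_nonzero(s_row: Dict[int, int], r_row: Dict[int, Mgmt]) -> List[int]:
    """Collect only the non-zero blocks of a row, in column order."""
    return [s_row[c]
            for x, num in nonzero_runs(r_row)
            for c in range(x, x + num)]
```

  • With the management information of the second row before the write (Num=1, Next=2 at position 0; Num=2, Next=4 at positions 2 and 6; Num=2, Next=0 at position 10), the runs are columns 1, 3 to 4, 7 to 8, and 11 to 12, so seven blocks are returned and no zero block is touched.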
  • Therefore, according to the matrix management engine as the information processing device of the present embodiment, the amount of memory used and the bandwidth can be reduced by retaining and managing only the non-zero data in the cache memories.
  • Note that the processes in the flowcharts of the present specification may be executed in a changed order, a plurality of them may be simultaneously executed, or the processes may be executed in a different order in every execution, unless doing so is against the properties thereof.
  • While certain embodiments have been described, these embodiments have been presented by way of example only, and are not intended to limit the scope of the inventions. Indeed, the novel apparatuses, methods and circuits described herein may be embodied in a variety of other forms; furthermore, various omissions, substitutions and changes in the form of the apparatuses, methods and circuits described herein may be made without departing from the spirit of the inventions. The accompanying claims and their equivalents are intended to cover such forms or modifications as would fall within the scope and spirit of the inventions.

Claims (8)

What is claimed is:
1. A data structure of a block row provided with at least one or more blocks consisting of one or more elements,
the data structure being provided with a management region which is a first region at a top of the data structure, separately from the block row which is a data region,
the data structure storing, in the management region which is the first region, a first management information representing a number of continuous non-zero data from a row head of the block row and a distance to next non-zero data, and
the data structure storing, at a corresponding position in a layout of the block row, non-zero data having one or more non-zero elements in one block in the data region, and in a second region of zero data having all the elements in one block being zero and arranged immediately before the non-zero data, second management information representing a number of continuous non-zero data and a distance to next non-zero data.
2. The data structure according to claim 1, wherein
the second management information is stored in the second region of the zero data immediately before the continuous non-zero data.
3. The data structure according to claim 1, wherein
a plurality of pieces of the second management information are stored in the second region of the zero data immediately before the continuous non-zero data.
4. The data structure according to claim 1, wherein
the management region which is the first region is a storage region corresponding to one block.
5. The data structure according to claim 1, wherein
the second management information has a size that is same as a size of data stored in the block.
6. The data structure according to claim 1, wherein
the distance to the next non-zero data of the first management information represents a storage position of the second management information.
7. The data structure according to claim 3, wherein
when a plurality of pieces of the second management information exist, the distance to the next non-zero data of the first management information represents a storage position of any one of the plurality of pieces of the second management information.
8. The data structure according to claim 1, wherein
the first management information and the second management information have a same format.
US15/861,533 2014-03-13 2018-01-03 Information processing device and data structure Abandoned US20180129605A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
US15/861,533 US20180129605A1 (en) 2014-03-13 2018-01-03 Information processing device and data structure

Applications Claiming Priority (4)

Application Number Priority Date Filing Date Title
JP2014050584A JP2015176245A (en) 2014-03-13 2014-03-13 Information processing apparatus and data structure
JP2014-050584 2014-03-13
US14/484,093 US9990288B2 (en) 2014-03-13 2014-09-11 Information processing device and data structure
US15/861,533 US20180129605A1 (en) 2014-03-13 2018-01-03 Information processing device and data structure

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
US14/484,093 Continuation US9990288B2 (en) 2014-03-13 2014-09-11 Information processing device and data structure

Publications (1)

Publication Number Publication Date
US20180129605A1 (en) 2018-05-10

Family

ID=54069039

Family Applications (2)

Application Number Title Priority Date Filing Date
US14/484,093 Active 2036-09-24 US9990288B2 (en) 2014-03-13 2014-09-11 Information processing device and data structure
US15/861,533 Abandoned US20180129605A1 (en) 2014-03-13 2018-01-03 Information processing device and data structure

Family Applications Before (1)

Application Number Title Priority Date Filing Date
US14/484,093 Active 2036-09-24 US9990288B2 (en) 2014-03-13 2014-09-11 Information processing device and data structure

Country Status (3)

Country Link
US (2) US9990288B2 (en)
JP (1) JP2015176245A (en)
CN (1) CN104915300B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9684602B2 (en) 2015-03-11 2017-06-20 Kabushiki Kaisha Toshiba Memory access control device, cache memory and semiconductor device
US11048871B2 (en) * 2018-09-18 2021-06-29 Tableau Software, Inc. Analyzing natural language expressions in a data visualization user interface
JP2022167670A (en) * 2021-04-23 2022-11-04 富士通株式会社 Information processing program, information processing method, and information processing device

Family Cites Families (18)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPS5364439A (en) 1976-11-20 1978-06-08 Agency Of Ind Science & Technol Linear coversion system
JPH0795307B2 (en) 1989-12-04 1995-10-11 日本電気株式会社 Cache memory control circuit
US5991847A (en) 1997-06-06 1999-11-23 Acceleration Software International Corporation Data pattern caching for speeding up write operations
US6490654B2 (en) 1998-07-31 2002-12-03 Hewlett-Packard Company Method and apparatus for replacing cache lines in a cache memory
US6754776B2 (en) 2001-05-17 2004-06-22 Fujitsu Limited Method and system for logical partitioning of cache memory structures in a partitoned computer system
US6775751B2 (en) * 2002-08-06 2004-08-10 International Business Machines Corporation System and method for using a compressed main memory based on degree of compressibility
US6957306B2 (en) 2002-09-09 2005-10-18 Broadcom Corporation System and method for controlling prefetching
JP3981070B2 (en) 2003-12-26 2007-09-26 株式会社東芝 Cache replacement device and cache replacement method
US7130982B2 (en) * 2004-03-31 2006-10-31 International Business Machines Corporation Logical memory tags for redirected DMA operations
US20060143396A1 (en) 2004-12-29 2006-06-29 Mason Cabot Method for programmer-controlled cache line eviction policy
KR100990526B1 (en) 2006-08-23 2010-10-29 닛본 덴끼 가부시끼가이샤 Processing element, mixed mode parallel processor system, processing element method, mixed mode parallel processor method, processing element program, and mixed mode parallel processor program
JP4942095B2 (en) 2007-01-25 2012-05-30 インターナショナル・ビジネス・マシーンズ・コーポレーション Technology that uses multi-core processors to perform operations
WO2008095025A1 (en) 2007-01-31 2008-08-07 Qualcomm Incorporated Apparatus and methods to reduce castouts in a multi-level cache hierarchy
US8291194B2 (en) 2009-11-16 2012-10-16 Mediatek Inc. Methods of utilizing address mapping table to manage data access of storage medium without physically accessing storage medium and related storage controllers thereof
US9934108B2 (en) * 2010-10-27 2018-04-03 Veritas Technologies Llc System and method for optimizing mirror creation
US8762655B2 (en) * 2010-12-06 2014-06-24 International Business Machines Corporation Optimizing output vector data generation using a formatted matrix data structure
GB2510523B (en) 2011-12-22 2014-12-10 Ibm Storage device access system
CN103593304B (en) * 2012-08-14 2016-08-03 吉林师范大学 The quantization method of effective use based on LPT device model caching

Also Published As

Publication number Publication date
US9990288B2 (en) 2018-06-05
CN104915300A (en) 2015-09-16
US20150261675A1 (en) 2015-09-17
CN104915300B (en) 2019-05-07
JP2015176245A (en) 2015-10-05

Similar Documents

Publication Publication Date Title
US10318434B2 (en) Optimized hopscotch multiple hash tables for efficient memory in-line deduplication application
US9966152B2 (en) Dedupe DRAM system algorithm architecture
US20150261466A1 (en) Device and method for storing data in distributed storage system
US9569141B2 (en) Hash map support in a storage device
CN106462494A (en) Memory controllers employing memory capacity compression, and related processor-based systems and methods
US20180129605A1 (en) Information processing device and data structure
US9697111B2 (en) Method of managing dynamic memory reallocation and device performing the method
CN107273397B (en) Virtual bucket polyhistidine table for efficient memory online deduplication applications
CN109564545A (en) Method and apparatus for compressing address
TWI761419B (en) Method, memory system and article for maximized dedupable memory
US6684267B2 (en) Direct memory access controller, and direct memory access control method
US11030714B2 (en) Wide key hash table for a graphics processing unit
US20160379004A1 (en) Semiconductor device
CN107003932B (en) Cache directory processing method and directory controller of multi-core processor system
US11669327B2 (en) Computing device and method for loading data
TW202232310A (en) Dynamic metadata relocation in memory
US20160140034A1 (en) Devices and methods for linked list array hardware implementation
CN116185910B (en) Method, device and medium for accessing device memory and managing device memory
US6996675B2 (en) Retrieval of all tag entries of cache locations for memory address and determining ECC based on same
WO2023029441A1 (en) Data read-write method and apparatus for multi-port memory, and storage medium and electronic device
US9223708B2 (en) System, method, and computer program product for utilizing a data pointer table pre-fetcher
WO2016070431A1 (en) Memory access method and apparatus, and computer device
CN117472791A (en) Data access method and data access system
US20150071021A1 (en) Accessing independently addressable memory chips
CN115497542A (en) Hardware acceleration method and device for updating summary page of SSD, computer equipment and storage medium

Legal Events

Date Code Title Description
AS Assignment

Owner name: KABUSHIKI KAISHA TOSHIBA, JAPAN

Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:USUI, HIROYUKI;MAEDA, SEIJI;SIGNING DATES FROM 20140903 TO 20140908;REEL/FRAME:044528/0033

STCB Information on status: application discontinuation

Free format text: ABANDONED -- FAILURE TO RESPOND TO AN OFFICE ACTION