WO2021024300A1 - Information processing device - Google Patents

Information processing device

Info

Publication number
WO2021024300A1
Authority
WO
WIPO (PCT)
Prior art keywords
matrix
submatrix
rows
sparse
sparse matrix
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Ceased
Application number
PCT/JP2019/030484
Other languages
English (en)
French (fr)
Japanese (ja)
Inventor
裕太 井手口
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
NEC Corp
Original Assignee
NEC Corp
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by NEC Corp
Priority to US17/630,621 (US20220253507A1)
Priority to PCT/JP2019/030484 (WO2021024300A1)
Priority to JP2021538525A (JP7310892B2)
Publication of WO2021024300A1
Legal status: Ceased

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/76 Arrangements for rearranging, permuting or selecting data according to predetermined rules, independently of the content of the data
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/16 Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/50 Adding; Subtracting
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/52 Multiplying; Dividing
    • G06F7/523 Multiplying only
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/52 Multiplying; Dividing
    • G06F7/535 Dividing only
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/544 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices for evaluating functions by calculation
    • G06F7/5443 Sum of products
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning

Definitions

  • the present invention relates to an information processing device, an information processing method, and a program.
  • the learning process of learning a large amount of data and building a model takes a very long time, so speeding up is required.
  • data is expressed as a sparse matrix and operations are performed.
  • in logistic regression, the sparse matrix vector product is calculated. It is therefore important to calculate such a sparse matrix vector product at high speed, and many information processing methods have been proposed for that purpose.
  • a sparse matrix is stored in a plurality of compressed formats.
  • the elements of the columns in which the number of elements that are not zero (called non-zero elements) in a sparse matrix is a predetermined number or more are stored in the JDS (Jagged Diagonal Storage) format, and the elements of the other columns are stored in the CRS (Compressed Row Storage) format.
  • the product of the submatrix stored in the JDS format and the vector and the product of the submatrix stored in the CRS format and the vector are calculated individually, and the sum of the two results gives the product of the sparse matrix and the vector.
  • a dense submatrix that collects the rows having a predetermined number or more of non-zero elements in a sparse matrix and a sparse submatrix that collects the rows having fewer than the predetermined number of non-zero elements are used.
  • the sparse submatrix is stored in a format (linked list method) that stores the row and column numbers at which non-zero elements exist together with their values.
  • among sparse matrices, there is a type in which rows with many non-zero elements exist in part of the matrix and columns with many non-zero elements also exist in part of the matrix.
  • a row in which many non-zero elements are gathered is needlessly divided between the JDS format and the CRS format. Therefore, it is difficult to calculate the sparse matrix vector product at high speed.
  • An object of the present invention is to provide an information processing device that solves the problem that it is difficult to convert a sparse matrix in which rows and columns having many non-zero elements are present in part of the matrix into a format in which the product with a vector can be calculated at high speed.
  • the information processing device includes: a first conversion unit that divides a sparse matrix into a first submatrix composed of rows having a predetermined number or more of non-zero elements and a second submatrix composed of the other rows, and converts the first submatrix into a first matrix in a row-major dense matrix format;
  • a second conversion unit that divides the second submatrix into a third submatrix composed of columns having a predetermined number or more of non-zero elements and a fourth submatrix composed of the other columns, and converts the third submatrix into a second matrix in a column-major dense matrix format;
  • a third conversion unit that divides the fourth submatrix into a fifth submatrix and a sixth submatrix, and converts the fifth submatrix into a third matrix in a row-major sparse matrix compression format; and a fourth conversion unit that converts the sixth submatrix into a fourth matrix in a column-major sparse matrix compression format.
  • the information processing method is configured to: divide a sparse matrix into a first submatrix composed of rows having a predetermined number or more of non-zero elements and a second submatrix composed of the other rows, and convert the first submatrix into a first matrix in a row-major dense matrix format; divide the second submatrix into a third submatrix composed of columns having a predetermined number or more of non-zero elements and a fourth submatrix composed of the other columns, and convert the third submatrix into a second matrix in a column-major dense matrix format;
  • divide the fourth submatrix into a fifth submatrix and a sixth submatrix, and convert the fifth submatrix into a third matrix in a row-major sparse matrix compression format; and convert the sixth submatrix into a fourth matrix in a column-major sparse matrix compression format.
  • the computer-readable recording medium is configured to record a program that causes a computer to perform:
  • a process of dividing a sparse matrix into a first submatrix composed of rows having a predetermined number or more of non-zero elements and a second submatrix composed of the other rows, and converting the first submatrix into a first matrix in a row-major dense matrix format;
  • a process of dividing the second submatrix into a third submatrix composed of columns having a predetermined number or more of non-zero elements and a fourth submatrix composed of the other columns, and converting the third submatrix into a second matrix in a column-major dense matrix format;
  • a process of dividing the fourth submatrix into a fifth submatrix and a sixth submatrix, and converting the fifth submatrix into a third matrix in a row-major sparse matrix compression format; and
  • a process of converting the sixth submatrix into a fourth matrix in a column-major sparse matrix compression format.
  • the present invention can convert a sparse matrix in which rows and columns with many non-zero elements are present in part of the matrix into a format in which the product with a vector can be calculated at high speed.
  • FIG. 1 is a block diagram of the information processing apparatus 100 according to the first embodiment of the present invention.
  • the information processing apparatus 100 is configured to divide the input sparse matrix into a plurality of submatrices in different formats, each of which allows the product with a vector to be calculated at high speed, and to output them.
  • the information processing apparatus 100 includes a communication interface unit (hereinafter referred to as a communication I/F unit) 111, an operation input unit 112, a screen display unit 113, a storage unit 115, and an arithmetic processing unit 116.
  • the communication I / F unit 111 is composed of a dedicated data communication circuit, and is configured to perform data communication with various devices (not shown) connected via a communication line (not shown).
  • the operation input unit 112 is composed of an operation input device such as a keyboard and a mouse, and is configured to detect an operator's operation and output it to the arithmetic processing unit 116.
  • the screen display unit 113 is composed of a screen display device such as an LCD (Liquid Crystal Display) or a PDP (Plasma Display Panel), and is configured to display various information on the screen in response to an instruction from the arithmetic processing unit 116.
  • the storage unit 115 is composed of a storage device such as a hard disk or a memory, and is configured to store processing information and a program 1151 required for various processes in the arithmetic processing unit 116.
  • the program 1151 is a program that realizes various processing units by being read and executed by the arithmetic processing unit 116. It is read in advance from an external device (not shown) or a storage medium (not shown) via a data input/output function such as the communication I/F unit 111 and stored in the storage unit 115.
  • the main processing information stored in the storage unit 115 includes a sparse matrix 1152, a Rowmajor dense matrix 11531, a Colmajor dense matrix 11532, a CRS sparse matrix 11533, a JDS sparse matrix 11534, and row sorting information 11535.
  • the sparse matrix 1152 is a sparse matrix to be converted. Many of the elements of the sparse matrix 1152 are zero elements that are not necessary for processing. In addition, some rows of the sparse matrix 1152 have many non-zero elements. Furthermore, some columns of the sparse matrix 1152 have many non-zero elements.
  • the Rowmajor dense matrix 11531, the Colmajor dense matrix 11532, the CRS sparse matrix 11533, and the JDS sparse matrix 11534 are submatrices generated by converting the sparse matrix 1152.
  • the row sorting information 11535 is information indicating how the entire row of the sparse matrix 1152 is sorted by the conversion process.
  • the arithmetic processing unit 116 has a processor such as an MPU (Micro Processing Unit) or a GPU (Graphics Processing Unit) and its peripheral circuits, and is configured to realize various processing units through cooperation between this hardware and the program 1151 by reading the program 1151 from the storage unit 115 and executing it.
  • the main processing units realized by the arithmetic processing unit 116 are an input unit 1161, a matrix conversion unit 1162, and an output unit 1163.
  • the input unit 1161 is configured to input the sparse matrix 1152 through the operation input unit 112 and/or the communication I/F unit 111 and store it in the storage unit 115.
  • the matrix conversion unit 1162 is configured to read the sparse matrix 1152 from the storage unit 115, perform a matrix conversion process, generate a Rowmajor dense matrix 11531, a Colmajor dense matrix 11532, a CRS sparse matrix 11533, and a JDS sparse matrix 11534, and store them in the storage unit 115. Further, the matrix conversion unit 1162 is configured to generate row rearrangement information 11535 and store it in the storage unit 115 when the rows of the sparse matrix 1152 are rearranged in the course of the matrix conversion process.
  • the matrix conversion unit 1162 includes a Rowmajor dense matrix generation unit 11621, a Colmajor dense matrix generation unit 11622, a CRS sparse matrix generation unit 11623, and a JDS sparse matrix generation unit 11624.
  • the Rowmajor dense matrix generation unit 11621 is configured to divide the sparse matrix 1152 into a first submatrix 1152-1 composed of rows having a predetermined number (first threshold value) or more of non-zero elements and a second submatrix 1152-2 composed of the other rows, as shown in FIG.
  • the Rowmajor dense matrix generation unit 11621 divides the sparse matrix 1152 into the first submatrix 1152-1 and the second submatrix 1152-2 so that the rows whose number of non-zero elements is equal to or greater than the first threshold value are gathered at the top. Further, the Rowmajor dense matrix generation unit 11621 is configured to update the row rearrangement information 11535 in accordance with this division.
  • the Rowmajor dense matrix generation unit 11621 is configured to convert the first submatrix 1152-1 into a Rowmajor dense matrix 11531, which is in a row-major dense matrix format, and store it in the storage unit 115.
  • the Rowmajor dense matrix 11531 is composed of a value array, row information, and column information. Further, the Rowmajor dense matrix generation unit 11621 is configured to transmit the second submatrix 1152-2 to the Colmajor dense matrix generation unit 11622.
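The row split performed by the Rowmajor dense matrix generation unit can be illustrated with a minimal NumPy sketch. The function name `split_dense_rows`, the use of a plain 2-D array as input, and the 4x5 sample matrix are assumptions made for illustration; they are not taken from the publication or its figures.

```python
import numpy as np

def split_dense_rows(a, first_threshold):
    """Split `a` into rows with >= first_threshold non-zeros and the other rows."""
    nnz_per_row = np.count_nonzero(a, axis=1)
    dense_rows = np.flatnonzero(nnz_per_row >= first_threshold)
    other_rows = np.flatnonzero(nnz_per_row < first_threshold)
    # Row-major (C order) dense storage; zero elements are stored explicitly.
    rowmajor_dense = np.ascontiguousarray(a[dense_rows])
    second_sub = a[other_rows]
    # Row order after the split: dense rows gathered at the top,
    # original relative order kept inside each group.
    row_order = np.concatenate([dense_rows, other_rows])
    return rowmajor_dense, second_sub, row_order

# Hypothetical 4x5 input (not the matrix of FIG. 4).
a = np.array([[1, 2, 3, 4, 0],
              [0, 0, 5, 0, 0],
              [6, 7, 8, 0, 9],
              [0, 1, 0, 0, 0]])
dense, rest, order = split_dense_rows(a, first_threshold=4)
print(order.tolist())  # [0, 2, 1, 3]
```

Here `order` plays the role of the row rearrangement information: entry i names the original row that now sits at position i.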
  • the Colmajor dense matrix generation unit 11622 is configured to divide the second submatrix 1152-2 into a third submatrix 1152-3 composed of columns having a predetermined number (second threshold value) or more of non-zero elements and a fourth submatrix 1152-4 composed of the other columns, as shown in FIG.
  • the Colmajor dense matrix generation unit 11622 divides the second submatrix 1152-2 into the third submatrix 1152-3 and the fourth submatrix 1152-4 so that the columns whose number of non-zero elements is equal to or greater than the second threshold value are gathered on the left.
  • the Colmajor dense matrix generation unit 11622 is configured to convert the third submatrix 1152-3 into a Colmajor dense matrix 11532, which is in a column-major dense matrix format (column-major order). Further, the Colmajor dense matrix generation unit 11622 is configured to sort the rows of the fourth submatrix 1152-4 in ascending order of the number of non-zero elements in each row and transmit the sorted fourth submatrix 1152-4 to the CRS sparse matrix generation unit 11623.
  • the Colmajor dense matrix generation unit 11622 is configured to rearrange the rows of the Colmajor dense matrix 11532 in the same order as the sorted fourth submatrix 1152-4 and store the rearranged Colmajor dense matrix 11532 in the storage unit 115.
  • the Colmajor dense matrix 11532 is composed of a value array, row information, and column information. Further, the Colmajor dense matrix generation unit 11622 is configured to update the row sorting information 11535 according to the above sorting.
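The column split can be sketched in the same way. In this hedged example, `split_dense_cols` is a hypothetical helper, the returned index arrays stand in for the "column information" mentioned above, and the 3x4 input is invented for illustration. The subsequent row sort of the fourth submatrix is left out to keep the sketch short.

```python
import numpy as np

def split_dense_cols(second_sub, second_threshold):
    """Split columns by non-zero count; dense columns go to column-major storage."""
    nnz_per_col = np.count_nonzero(second_sub, axis=0)
    dense_cols = np.flatnonzero(nnz_per_col >= second_threshold)
    other_cols = np.flatnonzero(nnz_per_col < second_threshold)
    # Column-major (Fortran order) dense storage of the third submatrix.
    colmajor_dense = np.asfortranarray(second_sub[:, dense_cols])
    fourth_sub = second_sub[:, other_cols]
    return colmajor_dense, dense_cols, fourth_sub, other_cols

# Hypothetical 3x4 second submatrix (not the matrix of FIG. 7).
second_sub = np.array([[1, 0, 0, 2],
                       [3, 0, 4, 5],
                       [6, 7, 0, 8]])
colmajor, dense_cols, fourth, other_cols = split_dense_cols(second_sub, 3)
print(dense_cols.tolist())  # [0, 3]
print(fourth.tolist())      # [[0, 0], [0, 4], [7, 0]]
```

Keeping `dense_cols` and `other_cols` matters later: each partial product must read the vector entries that belong to its own columns.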
  • the CRS sparse matrix generation unit 11623 is configured to divide the fourth submatrix 1152-4, as shown in FIG. 2, into a fifth submatrix 1152-5 composed of rows having a predetermined number (third threshold value) or more of non-zero elements and a sixth submatrix 1152-6 composed of the other rows. Further, the CRS sparse matrix generation unit 11623 is configured to convert the fifth submatrix 1152-5 into a CRS sparse matrix 11533 and store it in the storage unit 115.
  • the CRS sparse matrix 11533 is composed of a value array, a column number array, and an offset array. Further, the CRS sparse matrix generation unit 11623 is configured to transmit the sixth submatrix 1152-6 to the JDS sparse matrix generation unit 11624.
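As an illustration of the three arrays named above, the following sketch builds a CRS representation from a list of rows. The function name `to_crs` and the sample rows are assumptions for illustration only.

```python
def to_crs(rows):
    """Return (values, col_numbers, offsets) for a list of equally long rows."""
    values, col_numbers, offsets = [], [], [0]
    for row in rows:
        for col, v in enumerate(row):
            if v != 0:
                values.append(v)
                col_numbers.append(col)
        # offsets[r + 1] marks where row r ends in `values`.
        offsets.append(len(values))
    return values, col_numbers, offsets

# Hypothetical 2x4 submatrix.
rows = [[0, 5, 0, 7],
        [1, 0, 2, 3]]
values, col_numbers, offsets = to_crs(rows)
print(values)       # [5, 7, 1, 2, 3]
print(col_numbers)  # [1, 3, 0, 2, 3]
print(offsets)      # [0, 2, 5]
```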
  • the JDS sparse matrix generation unit 11624 is configured to convert the sixth submatrix 1152-6 into a JDS sparse matrix 11534 and store it in the storage unit 115.
  • the JDS sparse matrix 11534 is composed of a value array, a column number array, and an offset array.
  • the row swap information in the JDS sparse matrix is managed in the row sort information 11535.
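A JDS conversion can be sketched as follows. This is a simplified illustration, not the publication's implementation: `to_jds` is a hypothetical name, rows are sorted here by decreasing non-zero count inside the function, and the resulting row permutation is returned separately instead of being merged into the row sort information 11535.

```python
def to_jds(rows):
    """Return (values, col_numbers, offsets, perm) in jagged-diagonal order."""
    # Left-justify: keep only (column, value) pairs of non-zero elements.
    packed = [[(c, v) for c, v in enumerate(r) if v != 0] for r in rows]
    # Stable sort of row indices by decreasing non-zero count (assumption).
    perm = sorted(range(len(rows)), key=lambda i: -len(packed[i]))
    width = len(packed[perm[0]]) if perm else 0
    values, col_numbers, offsets = [], [], [0]
    for d in range(width):  # d-th jagged diagonal
        for i in perm:
            if d < len(packed[i]):
                c, v = packed[i][d]
                values.append(v)
                col_numbers.append(c)
        offsets.append(len(values))
    return values, col_numbers, offsets, perm

# Hypothetical 3x3 submatrix.
rows = [[0, 4, 0],
        [1, 0, 2],
        [0, 0, 3]]
values, col_numbers, offsets, perm = to_jds(rows)
print(values)       # [1, 4, 3, 2]
print(col_numbers)  # [0, 1, 2, 2]
print(offsets)      # [0, 3, 4]
print(perm)         # [1, 0, 2]
```

Unlike CRS, the offset array here delimits jagged diagonals (the d-th non-zero of every row), not rows.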
  • the output unit 1163 is configured to read out the Rowmajor dense matrix 11531, the Colmajor dense matrix 11532, the CRS sparse matrix 11533, the JDS sparse matrix 11534, and the row rearrangement information 11535 from the storage unit 115 and, as the conversion result of the sparse matrix 1152, display them on the screen display unit 113 and/or transmit them to an external device through the communication I/F unit 111.
  • FIG. 3 is a flowchart showing an example of the operation of the information processing device 100.
  • the operation of the information processing apparatus 100 will be described with reference to FIG.
  • the input unit 1161 inputs the sparse matrix 1152 through the operation input unit 112 and/or the communication I/F unit 111, and stores the sparse matrix 1152 in the storage unit 115 (step S1).
  • FIG. 4 shows an example of a sparse matrix 1152 input by the input unit 1161.
  • the sparse matrix 1152 in this example is composed of 10 rows × 9 columns. The intersection of a row and a column corresponds to one element; a blank element indicates a zero element, and an element for which a value is set indicates a non-zero element.
  • the input unit 1161 stores the row rearrangement information 11535 in the initial state in the storage unit 115.
  • FIG. 5 shows an example of row rearrangement information 11535 in the initial state.
  • the Rowmajor dense matrix generation unit 11621 creates the Rowmajor dense matrix 11531 (step S2).
  • the Rowmajor dense matrix generation unit 11621 first divides the sparse matrix 1152 into a first submatrix 1152-1 composed of rows having a predetermined number (first threshold value) or more of non-zero elements and a second submatrix 1152-2 composed of the other rows.
  • for example, when the first threshold value is 7, the Rowmajor dense matrix generation unit 11621 divides the sparse matrix 1152 into the first submatrix 1152-1 shown in FIG. 6 and the second submatrix 1152-2 shown in FIG. 7.
  • the first submatrix 1152-1 shown in FIG. 6 is composed of the 0th row and the 2nd row of the sparse matrix 1152.
  • the second submatrix 1152-2 shown in FIG. 7 is composed of the first row and the third to ninth rows of the sparse matrix 1152.
  • the Rowmajor dense matrix generation unit 11621 creates a Rowmajor dense matrix 11531 by storing the value 0 in each zero element, which has no value, of the first submatrix 1152-1, and stores it in the storage unit 115.
  • for example, in the case of the first submatrix 1152-1 described above, the Rowmajor dense matrix generation unit 11621 creates a Rowmajor dense matrix 11531 of 2 rows and 9 columns and stores it in the storage unit 115. Further, the Rowmajor dense matrix generation unit 11621 updates the row rearrangement information 11535 to "0, 2, 1, 3, 4, 5, 6, 7, 8, 9".
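The update of the row rearrangement information in this step amounts to a stable partition of the row list. The sketch below assumes that the initial state is simply the row numbers 0 to 9 in order, and `gather_rows_to_top` is a hypothetical helper name.

```python
def gather_rows_to_top(row_order, dense_rows):
    """Move the given rows to the top while keeping relative order in both groups."""
    dense = set(dense_rows)
    top = [r for r in row_order if r in dense]
    rest = [r for r in row_order if r not in dense]
    return top + rest

initial = list(range(10))  # assumed initial state: rows 0..9 in order
updated = gather_rows_to_top(initial, dense_rows=[0, 2])
print(updated)  # [0, 2, 1, 3, 4, 5, 6, 7, 8, 9]
```

With rows 0 and 2 selected for the first submatrix, the result matches the updated row rearrangement information given in the text.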
  • the Colmajor dense matrix generation unit 11622 creates the Colmajor dense matrix 11532 (step S3).
  • the Colmajor dense matrix generation unit 11622 first divides the second submatrix 1152-2 into a third submatrix 1152-3 composed of columns having a predetermined number (second threshold value) or more of non-zero elements and a fourth submatrix 1152-4 composed of the other columns. For example, in the case of the second submatrix 1152-2 shown in FIG. 7, assuming that the second threshold value is 5, the Colmajor dense matrix generation unit 11622 divides it into the third submatrix 1152-3 of 8 rows and 2 columns and the fourth submatrix 1152-4.
  • the third submatrix 1152-3 shown in FIG. 9 is composed of columns 0 and 6 of the second submatrix 1152-2.
  • the fourth submatrix 1152-4 shown in FIG. 10 is composed of columns 1 to 5 and columns 7 and 8 of the second submatrix 1152-2.
  • the Colmajor dense matrix generation unit 11622 creates the Colmajor dense matrix 11532 by storing the value 0 in the zero element having no value in the third submatrix 1152-3.
  • the Colmajor dense matrix generation unit 11622 creates the Colmajor dense matrix 11532 as shown in FIG.
  • the Colmajor dense matrix generation unit 11622 sorts the rows of the fourth submatrix 1152-4 in ascending order of the number of non-zero elements in each row, and transmits the sorted fourth submatrix 1152-4 to the CRS sparse matrix generation unit 11623. Further, the Colmajor dense matrix generation unit 11622 rearranges the rows of the Colmajor dense matrix 11532 in the same order as the sorted fourth submatrix 1152-4, and stores the rearranged Colmajor dense matrix 11532 in the storage unit 115.
  • for example, the Colmajor dense matrix generation unit 11622 rearranges the rows of the fourth submatrix 1152-4 as shown in FIG. 12, and rearranges the rows of the Colmajor dense matrix 11532 in the same way. Further, the Colmajor dense matrix generation unit 11622 updates the row sorting information 11535 to "0, 2, 8, 9, 3, 4, 6, 1, 7, 5" in response to the above sorting.
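How a sort of the lower rows carries over into the row sorting information can be sketched as a composition of permutations. The helper name `apply_local_order` is hypothetical, and the `local_order` list below is back-derived from the two row orders stated in the text ("0, 2, 1, 3, ..." before and "0, 2, 8, 9, ..." after), not read from the figures.

```python
def apply_local_order(row_order, n_fixed, local_order):
    """Reorder everything after the first n_fixed entries by local indices."""
    fixed = row_order[:n_fixed]
    movable = row_order[n_fixed:]
    return fixed + [movable[i] for i in local_order]

before = [0, 2, 1, 3, 4, 5, 6, 7, 8, 9]   # after the row split (step S2)
local_order = [6, 7, 1, 2, 4, 0, 5, 3]    # positions within the 8 lower rows
after = apply_local_order(before, n_fixed=2, local_order=local_order)
print(after)  # [0, 2, 8, 9, 3, 4, 6, 1, 7, 5]
```

The first two entries stay fixed because they belong to the Rowmajor dense matrix, which is not affected by this sort.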
  • the CRS sparse matrix generation unit 11623 creates a CRS sparse matrix 11533 (step S4).
  • the CRS sparse matrix generation unit 11623 first divides the fourth submatrix 1152-4 into a fifth submatrix 1152-5 composed of rows having a predetermined number (third threshold value) or more of non-zero elements and a sixth submatrix 1152-6 composed of the other rows.
  • for example, assuming that the third threshold value is 3, the CRS sparse matrix generation unit 11623 divides the fourth submatrix 1152-4 into the fifth submatrix 1152-5 shown on the upper side of FIG. 14 and the sixth submatrix 1152-6 shown on the upper side of FIG. 15.
  • the CRS sparse matrix generation unit 11623 creates a CRS sparse matrix 11533 that stores the non-zero elements in the fifth submatrix 1152-5 in the CRS format, and stores it in the storage unit 115.
  • the CRS sparse matrix generation unit 11623 creates a CRS sparse matrix 11533 composed of a value array, a column number array, and an offset array, as shown on the lower side of FIG.
  • the JDS sparse matrix generation unit 11624 creates a JDS sparse matrix 11534 (step S5).
  • the JDS sparse matrix generation unit 11624 creates a JDS sparse matrix 11534 from the sixth submatrix 1152-6 and stores it in the storage unit 115.
  • for example, in the case of the sixth submatrix 1152-6 on the upper side of FIG. 15, the JDS sparse matrix generation unit 11624 left-justifies the non-zero elements and creates a JDS sparse matrix 11534 composed of a value array, a column number array, and an offset array, as shown in the lower part of FIG.
  • the procedure for creating a CRS sparse matrix and a JDS sparse matrix from the fourth submatrix 1152-4 is not limited to the above.
  • for example, the non-zero elements of the fourth submatrix 1152-4 may first be left-justified; after the left-justification, a CRS sparse matrix may be created from the rows having a predetermined number or more of non-zero elements, and a JDS sparse matrix may be created from the remaining rows.
  • the output unit 1163 reads out the Rowmajor dense matrix 11531, the Colmajor dense matrix 11532, the CRS sparse matrix 11533, the JDS sparse matrix 11534, and the row rearrangement information 11535 from the storage unit 115 and, as the conversion result of the sparse matrix 1152, displays them on the screen display unit 113 and/or transmits them to the external device through the communication I/F unit 111 (step S6).
  • the rows and columns in which many non-zero elements are gathered in the input sparse matrix can be stored as a Rowmajor dense matrix and a Colmajor dense matrix without unnecessary division, and the remaining sparse submatrix is divided into a CRS sparse matrix and a JDS sparse matrix and stored. This makes it possible to convert a sparse matrix into a multi-format matrix whose product with a vector can be calculated at high speed.
  • FIG. 16 is a block diagram of the information processing apparatus 200 according to the second embodiment of the present invention. Like the information processing apparatus 100 shown in FIG. 1, the information processing apparatus 200 has a function of converting a sparse matrix into a plurality of submatrices whose product with a vector can be calculated at high speed, and it further has a function of calculating the product of the sparse matrix and a vector using the converted submatrices. Referring to FIG. 16, the information processing apparatus 200 differs from the information processing apparatus 100 shown in FIG. 1 in that the storage unit 115 further stores a vector 1154 and a sparse matrix vector product calculation result 1155 and in that the arithmetic processing unit 116 further includes a matrix vector product calculation unit 1164; it is otherwise configured in the same manner as the information processing apparatus 100.
  • the vector 1154 is a vector whose product with the sparse matrix 1152 can be calculated.
  • FIG. 17 shows an example of the vector 1154.
  • the vector 1154 in this example is composed of 9 rows and 1 column.
  • the sparse matrix vector product calculation result 1155 is the calculation result of the product of the sparse matrix 1152 and the vector 1154.
  • the matrix vector product calculation unit 1164 is configured to calculate the product of the sparse matrix 1152 and the vector 1154.
  • the matrix vector product calculation unit 1164 includes a Rowmajor dense matrix vector product calculation unit 11641, a Colmajor dense matrix vector product calculation unit 11642, a CRS sparse matrix vector product calculation unit 11643, a JDS sparse matrix vector product calculation unit 11644, a sum calculation unit 11645, and a rearrangement unit 11646.
  • the Rowmajor dense matrix vector product calculation unit 11641 is configured to calculate the product of the Rowmajor dense matrix 11531 and the vector 1154.
  • the Colmajor dense matrix vector product calculation unit 11642 is configured to calculate the product of the Colmajor dense matrix 11532 and the vector 1154.
  • the CRS sparse matrix vector product calculation unit 11643 is configured to calculate the product of the CRS sparse matrix 11533 and the vector 1154.
  • the JDS sparse matrix vector product calculation unit 11644 is configured to calculate the product of the JDS sparse matrix 11534 and the vector 1154.
  • the sum calculation unit 11645 is configured to add together, row by row, the products calculated by the Rowmajor dense matrix vector product calculation unit 11641, the Colmajor dense matrix vector product calculation unit 11642, the CRS sparse matrix vector product calculation unit 11643, and the JDS sparse matrix vector product calculation unit 11644.
  • the rearrangement unit 11646 is configured to rearrange the rows of the calculation result of the sum calculation unit 11645.
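The CRS and JDS products computed by the units above could look like the following plain-Python kernels. This is a sketch under assumptions: the function names are hypothetical, the arrays are small invented examples, and the JDS kernel returns its result in the sorted row order of the JDS matrix, so a later rearrangement step is still needed.

```python
def crs_matvec(values, col_numbers, offsets, x):
    """y[r] = sum of values in row r times the matching entries of x."""
    y = []
    for r in range(len(offsets) - 1):
        total = 0
        for k in range(offsets[r], offsets[r + 1]):
            total += values[k] * x[col_numbers[k]]
        y.append(total)
    return y

def jds_matvec(values, col_numbers, offsets, n_rows, x):
    """Accumulate one jagged diagonal at a time; rows are in JDS (sorted) order."""
    y = [0] * n_rows
    for d in range(len(offsets) - 1):
        start = offsets[d]
        for i in range(offsets[d + 1] - start):
            y[i] += values[start + i] * x[col_numbers[start + i]]
    return y

# CRS form of the hypothetical rows [[0, 5, 0, 7], [1, 0, 2, 3]].
print(crs_matvec([5, 7, 1, 2, 3], [1, 3, 0, 2, 3], [0, 2, 5], [1, 2, 3, 4]))
# [38, 19]

# JDS form of the hypothetical rows [[0, 4, 0], [1, 0, 2], [0, 0, 3]],
# stored in sorted row order 1, 0, 2.
print(jds_matvec([1, 4, 3, 2], [0, 1, 2, 2], [0, 3, 4], 3, [1, 2, 3]))
# [7, 8, 9]
```

The JDS inner loop runs over consecutive rows for a fixed diagonal, which is the access pattern that makes the format attractive for long vector operations.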
  • FIG. 18 is a flowchart showing an example of the operation of the information processing device 200.
  • the operation of the information processing apparatus 200 will be described with reference to FIG.
  • the input unit 1161 inputs the sparse matrix 1152 and the vector 1154 through the operation input unit 112 and/or the communication I/F unit 111, and stores them in the storage unit 115 (step S11).
  • the matrix conversion unit 1162 reads out the sparse matrix 1152 from the storage unit 115, performs the same matrix conversion process as the information processing apparatus 100 according to the first embodiment, generates the Rowmajor dense matrix 11531, the Colmajor dense matrix 11532, the CRS sparse matrix 11533, the JDS sparse matrix 11534, and the row rearrangement information 11535, and stores them in the storage unit 115 (step S12).
  • the Rowmajor dense matrix vector product calculation unit 11641 calculates the product of the Rowmajor dense matrix 11531 and the vector 1154 (step S13).
  • FIG. 19 shows the result of calculating the product of the Rowmajor dense matrix 11531 shown in FIG. 8 and the vector 1154 shown in FIG.
  • the Colmajor dense matrix vector product calculation unit 11642 calculates the product of the Colmajor dense matrix 11532 and the vector 1154 (step S14).
  • FIG. 20 shows the result of calculating the product of the Colmajor dense matrix 11532 shown in FIG. 11 and the vector 1154 shown in FIG.
  • the CRS sparse matrix vector product calculation unit 11643 calculates the product of the CRS sparse matrix 11533 and the vector 1154 (step S15).
  • FIG. 21 shows the result of calculating the product of the CRS sparse matrix 11533 shown in FIG. 14 and the vector 1154 shown in FIG. 17.
  • the JDS sparse matrix vector product calculation unit 11644 calculates the product of the JDS sparse matrix 11534 and the vector 1154 (step S16).
  • FIG. 22 shows the result of calculating the product of the JDS sparse matrix 11534 and the vector 1154 shown in FIG. 17.
  • the sum calculation unit 11645 adds together, row by row, the products calculated by the Rowmajor dense matrix vector product calculation unit 11641, the Colmajor dense matrix vector product calculation unit 11642, the CRS sparse matrix vector product calculation unit 11643, and the JDS sparse matrix vector product calculation unit 11644 (step S17). As is clear from comparing FIG. 23, which shows the calculation result of the product of the 10-by-9 sparse matrix shown in FIG. 4 and the 9-by-1 vector 1154 shown in FIG. 17, with FIGS. 19 to 22, the calculation results of the Colmajor dense matrix vector product calculation unit 11642, the CRS sparse matrix vector product calculation unit 11643, and the JDS sparse matrix vector product calculation unit 11644 each represent a partial product of rows of the sparse matrix vector product calculation result. Therefore, the sum calculation unit 11645 adds the products calculated by the Colmajor dense matrix vector product calculation unit 11642, the CRS sparse matrix vector product calculation unit 11643, and the JDS sparse matrix vector product calculation unit 11644 together row by row to obtain the product for each entire row.
  • the sorting unit 11646 rearranges the rows of the calculation result of the sum calculation unit 11645 based on the row rearrangement information 11535 (step S18). That is, since the row order of the sparse matrix vector product calculation result calculated by the sum calculation unit 11645 differs from the row order of the sparse matrix 1152, the rows are rearranged, based on the row rearrangement information 11535, into the same order as the rows of the sparse matrix 1152.
  • the output unit 1163 reads the sparse matrix vector product calculation result 1155 from the storage unit 115 and displays it on the screen display unit 113 as the calculation result of the product of the sparse matrix 1152 and the vector 1154, and/or transmits it to an external device through the communication I/F unit 111 (step S19).
  • the product of the sparse matrix 1152 and the vector 1154 can be calculated at high speed.
  • the reason is that the rows and columns with many non-zero elements in the sparse matrix 1152 are stored as the Rowmajor dense matrix 11531 and the Colmajor dense matrix 11532 without unnecessary division, the remaining sparse submatrix is stored separately as the CRS sparse matrix 11533 and the JDS sparse matrix 11534, the product of the Rowmajor dense matrix 11531 and the vector 1154, the product of the Colmajor dense matrix 11532 and the vector 1154, the product of the CRS sparse matrix 11533 and the vector 1154, and the product of the JDS sparse matrix 11534 and the vector 1154 are each calculated, the results are summed row by row, and the rows are rearranged at the end.
  • the reason why the product of the Rowmajor dense matrix 11531 and the vector 1154 and the product of the Colmajor dense matrix 11532 and the vector 1154 can be calculated at high speed is that the data required for the calculation can be acquired from the storage unit 115 into the arithmetic processing unit 116 in a small number of transfer cycles. For example, assuming that three elements can be acquired from the storage unit 115 into the arithmetic processing unit 116 in one transfer cycle, the 18 elements of the Rowmajor dense matrix 11531 in FIG. 19 can be acquired in 6 transfer cycles, and the 9 elements of the vector 1154 can be acquired in 3 transfer cycles. Of the 27 acquired elements, only one is a useless zero element. From this, the Rowmajor dense matrix vector product operation can be performed at high speed. Further, the 16 elements of the Colmajor dense matrix 11532 in FIG. 20 can be acquired in 6 transfer cycles, and the 9 elements of the vector 1154 can be acquired in 3 transfer cycles. Of the 25 acquired elements, only 3 are useless zero elements. From this, the Colmajor dense matrix vector product operation can also be performed at high speed.
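  • The transfer-cycle counts in this example follow from a ceiling division by the assumed width of three elements per cycle. The following is a minimal sketch of that arithmetic only; the helper name `transfer_cycles` is hypothetical and does not appear in the specification.

```python
import math

def transfer_cycles(num_elements: int, elements_per_cycle: int = 3) -> int:
    """Number of transfer cycles needed to fetch num_elements values."""
    return math.ceil(num_elements / elements_per_cycle)

rowmajor_cycles = transfer_cycles(18)  # 18 stored elements of the Rowmajor dense matrix
colmajor_cycles = transfer_cycles(16)  # 16 stored elements of the Colmajor dense matrix
vector_cycles = transfer_cycles(9)     # 9 elements of the vector

# Fraction of fetched elements that are useless zeros in each case.
rowmajor_waste = 1 / (18 + 9)
colmajor_waste = 3 / (16 + 9)
```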
  • FIG. 24 shows an example of the program of the Rowmajor dense matrix vector product calculation unit 11641 in the matrix vector product calculation unit 1164.
  • Val[], nrow, and ncol represent the value array, the number of rows, and the number of columns of the Rowmajor dense matrix 11531, respectively, U[] represents the value array of the vector 1154, and P'[] represents the value array of the operation result vector.
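  • Since FIG. 24 is not reproduced here, the following is only a sketch of what a row-major dense matrix vector product with these variables could look like. It assumes that Val[] holds the block row by row, that row i of the block contributes to P'[i], and that column j multiplies U[j]; the function name and the indexing are assumptions, not the program of the figure.

```python
def rowmajor_dense_matvec(val, nrow, ncol, u, p):
    """Accumulate (dense block) * u into p, reading val row by row."""
    for i in range(nrow):
        acc = 0.0
        for j in range(ncol):
            # Elements of one row are adjacent in val, so reads are sequential.
            acc += val[i * ncol + j] * u[j]
        p[i] += acc  # accumulate so that later kernels can add to the same row
    return p

val = [1.0, 2.0, 3.0,
       4.0, 5.0, 6.0]   # 2-by-3 block stored row by row
u = [1.0, 0.0, -1.0]
p = rowmajor_dense_matvec(val, 2, 3, u, [0.0, 0.0])
```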
  • FIG. 25 shows an example of the program of the Colmajor dense matrix vector product calculation unit 11642 in the matrix vector product calculation unit 1164.
  • Val[], nrow, and ncol represent the value array, the number of rows, and the number of columns of the Colmajor dense matrix 11532, respectively, U[] represents the value array of the vector 1154, and P'[] represents the value array of the operation result vector.
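  • As a sketch only (FIG. 25 is not reproduced here), a column-major dense matrix vector product could be written as below. It assumes that Val[] holds the block column by column and that column j of the block multiplies U[j]; how the figure maps block columns to vector elements is not stated in this text, so that mapping is an assumption.

```python
def colmajor_dense_matvec(val, nrow, ncol, u, p):
    """Accumulate (dense block) * u into p, reading val column by column."""
    for j in range(ncol):
        uj = u[j]  # one vector element is reused for a whole column
        for i in range(nrow):
            # Elements of one column are adjacent in val.
            p[i] += val[j * nrow + i] * uj
    return p

val = [1.0, 4.0,   # column 0
       2.0, 5.0,   # column 1
       3.0, 6.0]   # column 2
u = [1.0, 0.0, -1.0]
p = colmajor_dense_matvec(val, 2, 3, u, [0.0, 0.0])
```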
  • FIG. 26 shows an example of the program of the CRS sparse matrix vector product calculation unit 11643 in the matrix vector product calculation unit 1164.
  • Val[], Col[], and off[] represent the value array, the column number array, and the offset array of the elements of the CRS sparse matrix 11533, respectively, nrow represents the number of rows of the CRS sparse matrix 11533, U[] represents the value array of the vector 1154, and P'[] represents the value array of the operation result vector.
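  • The CRS (Compressed Row Storage) product can be sketched as follows. This is the textbook CRS kernel written with the variable names listed for FIG. 26, not a transcription of the figure; it assumes off[] has nrow + 1 entries, with off[i] and off[i + 1] delimiting the non-zero elements of row i.

```python
def crs_matvec(val, col, off, nrow, u, p):
    """Accumulate (CRS matrix) * u into p."""
    for i in range(nrow):
        acc = 0.0
        for k in range(off[i], off[i + 1]):
            # col[k] selects the vector element matching the k-th non-zero.
            acc += val[k] * u[col[k]]
        p[i] += acc
    return p

# The 2-by-3 matrix [[0, 2, 0], [1, 0, 3]] in CRS form.
val = [2.0, 1.0, 3.0]
col = [1, 0, 2]
off = [0, 1, 3]
p = crs_matvec(val, col, off, 2, [1.0, 2.0, 3.0], [0.0, 0.0])
```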
  • FIG. 27 shows an example of the program of the JDS sparse matrix vector product calculation unit 11644 in the matrix vector product calculation unit 1164.
  • Val[], Col[], and off[] represent the value array, the column number array, and the offset array of the elements of the JDS sparse matrix 11534, respectively, colmax represents the maximum number of columns of the JDS sparse matrix 11534, U[] represents the value array of the vector 1154, and P'[] represents the value array of the operation result vector.
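  • A JDS (Jagged Diagonal Storage) product can be sketched as below. The sketch assumes the usual JDS layout: rows are ordered by decreasing number of non-zero elements, the non-zero elements are left-justified, and the resulting columns ("jagged diagonals") are stored one after another, with off[d] marking the start of diagonal d and colmax being the number of diagonals. FIG. 27 is not reproduced here, so these conventions are assumptions.

```python
def jds_matvec(val, col, off, colmax, u, p):
    """Accumulate (JDS matrix) * u into p, one jagged diagonal at a time."""
    for d in range(colmax):
        start = off[d]
        length = off[d + 1] - start  # number of rows that have a d-th non-zero
        for i in range(length):
            p[i] += val[start + i] * u[col[start + i]]
    return p

# Rows already ordered by non-zero count:
#   row 0: value 1 in column 0, value 2 in column 2
#   row 1: value 3 in column 1
val = [1.0, 3.0, 2.0]   # diagonal 0 = [1, 3], diagonal 1 = [2]
col = [0, 1, 2]
off = [0, 2, 3]
p = jds_matvec(val, col, off, 2, [1.0, 2.0, 3.0], [0.0, 0.0])
```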
  • FIG. 28 shows an example of the program of the sorting unit 11646 in the matrix vector product calculation unit 1164.
  • Row[] represents the value array of the row rearrangement information 11535, nrow represents the number of rows of the sparse matrix 1152, P'[] represents the value array of the operation result vector, and P[] represents the value array of the sparse matrix vector product operation result.
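  • The final row rearrangement can be sketched as a scatter through the row information array. The sketch assumes that Row[i] holds the original row number of the i-th row of P'[]; the opposite convention (a gather) is equally possible, and the text here does not say which one FIG. 28 uses.

```python
def restore_row_order(row, nrow, p_perm):
    """Return P, where P[row[i]] = P'[i] for every permuted row i."""
    p = [0.0] * nrow
    for i in range(nrow):
        p[row[i]] = p_perm[i]
    return p

# Permuted rows 0, 1, 2 came from original rows 2, 0, 1.
restored = restore_row_order([2, 0, 1], 3, [30.0, 10.0, 20.0])
```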
  • the program of FIG. 24 is executed first, then the programs of FIG. 25, FIG. 26, and FIG. 27 are executed in this order, and finally the program of FIG. 28 is executed.
  • the value array P'[] of the operation result vector is carried over between the programs. For example, if 9 × 1 + 11 × 7 is stored in the array element P'[1] at the end of execution of the program of FIG. 25, 10 × 5 is further added to P'[1] in the program of FIG. 27. Such an operation corresponds to the operation of the sum calculation unit 11645. Therefore, in the above program example, no dedicated program for the sum calculation unit 11645 is provided. However, if the value array P'[] of the operation result vector is provided independently for each of the calculation units 11641 to 11644, a program for the sum calculation unit 11645 that obtains the row-by-row sum after executing them is required.
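  • The two arrangements described here, a shared accumulator versus independent result arrays followed by an explicit sum, give the same result. The following sketch illustrates this with the numbers from the example; the two-row layout and the variable names are illustrative assumptions.

```python
# Shared accumulator: each kernel adds its partial products into the same list.
p_shared = [0.0, 0.0]
p_shared[1] += 9 * 1 + 11 * 7   # value held in P'[1] after the Colmajor program
p_shared[1] += 10 * 5           # contribution added later by the JDS program

# Independent buffers: one result vector per kernel, then an explicit
# row-by-row sum, which is the role of a separate sum calculation program.
partials = [
    [0.0, 9 * 1 + 11 * 7],
    [0.0, 10 * 5],
]
p_summed = [sum(row_values) for row_values in zip(*partials)]
```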
  • This is because a sparse matrix is converted into a plurality of submatrices for which the product with a vector can be calculated quickly, namely a Rowmajor dense matrix, a Colmajor dense matrix, a CRS sparse matrix, and a JDS sparse matrix, the product of each submatrix and the vector is calculated individually, and the calculation results are added together to generate the result of the sparse matrix vector product operation.
  • FIG. 29 is a block diagram of the information processing apparatus 300 according to the third embodiment of the present invention. Similar to the information processing apparatus 200 shown in FIG. 16, the information processing apparatus 300 has a function of converting a sparse matrix into a plurality of submatrices for which the product with a vector can be calculated at high speed and a function of obtaining the product of the sparse matrix and a vector from the converted submatrices, and it also has a function of performing statistical machine learning. Referring to FIG. 29, the information processing apparatus 300 differs from the information processing apparatus 200 shown in FIG. 16 in that the storage unit 115 is configured to store data W, U, P, W^T, Q, Y, and A, and in that the arithmetic processing unit 116 is configured to further include a learning unit 1165; otherwise it is configured in the same manner as the information processing apparatus 200.
  • Data W is input data for statistical machine learning.
  • the data W is a sparse matrix in which many of its elements are zero elements that are not necessary for processing, similar to the sparse matrix 1152 of FIG. Further, the data W has rows and columns in which many non-zero elements are gathered as a part of the matrix.
  • the user information shown in FIG. 30 is composed of a label, a user name, an environment, a time, a feature amount 1, a feature amount 2, and so on, and the first four items (label, user name, environment, time) are dense data that every piece of user information has.
  • the feature items in the latter half are sparse data as a whole, but most users may have a specific feature value, or a specific user may have a large number of features. When most users have a specific feature value, the column for that feature item becomes dense data. When a specific user has a large number of features, that user's row becomes dense data.
  • the data W^T is the transposed matrix of the data W.
  • Data A is teacher data.
  • the data U is a vector composed of a set of parameters of the model to be trained.
  • Data P is the product of data W and data U.
  • the data Q is a vector calculated from the difference between the data P and the teacher data A.
  • Data Y is the product of the data W^T and the data Q.
  • the learning unit 1165 is configured to perform statistical machine learning using the data W, U, P, W^T, Q, Y, and A.
  • logistic regression (LR) is used in this embodiment. In LR, the calculation shown in FIG. 31 is repeated many times during execution, and the final data U is obtained.
  • FIG. 32 is a flowchart showing an example of the operation of the information processing device 300.
  • the operation of the information processing apparatus 300 will be described with reference to FIG. 32.
  • the matrix conversion unit 1162 reads the data W and W^T from the storage unit 115, performs the same matrix conversion processing as the information processing apparatus 200 according to the second embodiment, converts each of the data W and W^T into a plurality of submatrices having mutually different formats, and stores them in the storage unit 115 (step S22). That is, the matrix conversion unit 1162 converts the data W into a Rowmajor dense matrix, a Colmajor dense matrix, a CRS sparse matrix, and a JDS sparse matrix. The matrix conversion unit 1162 likewise converts the data W^T into a Rowmajor dense matrix, a Colmajor dense matrix, a CRS sparse matrix, and a JDS sparse matrix.
  • the learning unit 1165 sets a random number in the data U and stores it in the storage unit 115 (step S23).
  • the learning unit 1165 repeats steps S25 to S28 until a predetermined condition is satisfied (steps S24 and S29).
  • an example of the predetermined condition is that the number of repetitions reaches a predetermined number, but the condition is not limited thereto.
  • in step S25, the learning unit 1165 calculates the product of the data W and the data U using the matrix vector product calculation unit 1164, and stores the data P, which is the calculation result, in the storage unit 115.
  • at this time, the matrix vector product calculation unit 1164 individually calculates the product of the data U with each of the Rowmajor dense matrix, the Colmajor dense matrix, the CRS sparse matrix, and the JDS sparse matrix generated from the data W, and calculates the data P by adding those products together.
  • in step S26, the learning unit 1165 calculates the data Q from the difference between the calculated data P and the teacher data A, and stores the data Q in the storage unit 115.
  • in step S27, the learning unit 1165 calculates the product of the data W^T and the data Q using the matrix vector product calculation unit 1164, and stores the data Y, which is the calculation result, in the storage unit 115.
  • at this time, the data Y is calculated by individually calculating the products with the data Q for the submatrices generated from the data W^T and adding those products together.
  • in step S28, the learning unit 1165 updates the data U with the calculated data Y.
  • the output unit 1163 reads the data U from the storage unit 115 and displays it on the screen display unit 113 as a learning result (parameters of the trained model), and/or transmits it to an external device through the communication I/F unit 111 (step S30).
  • the method of the present invention speeds up the sparse matrix vector product that must be executed repeatedly in statistical machine learning.
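  • The loop of steps S24 to S29 can be sketched as a plain gradient-descent logistic regression. The calculation of FIG. 31 is not reproduced in this text, so the sigmoid, the learning rate `lr`, the division by the number of samples, the fixed iteration count, and the zero initial value (the embodiment uses random numbers) are all assumptions chosen for a reproducible illustration. Dense Python lists stand in for the converted submatrices.

```python
import math

def matvec(m, v):
    """Dense stand-in for the matrix vector product calculation unit."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def train_lr(w, a, num_iters=100, lr=0.5):
    wt = [list(c) for c in zip(*w)]          # data W^T
    u = [0.0] * len(wt)                      # S23 (zeros instead of random numbers)
    for _ in range(num_iters):               # S24 / S29: fixed number of repetitions
        p = matvec(w, u)                     # S25: P = W U
        q = [1.0 / (1.0 + math.exp(-pi)) - ai
             for pi, ai in zip(p, a)]        # S26: Q from P and teacher data A
        y = matvec(wt, q)                    # S27: Y = W^T Q
        u = [ui - lr * yi / len(w)
             for ui, yi in zip(u, y)]        # S28: update U with Y
    return u

W = [[1.0, -1.5], [1.0, -0.5], [1.0, 0.5], [1.0, 1.5]]  # bias column + one feature
A = [0, 0, 1, 1]                                        # teacher data
U = train_lr(W, A)
pred = [1 if p > 0 else 0 for p in matvec(W, U)]
```

  • In each iteration the two matrix vector products (with W and with W^T) dominate the cost, which is why converting both matrices once before the loop pays off.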
  • FIG. 33 is a block diagram of the information processing device 400 according to the fourth embodiment.
  • the information processing apparatus 400 includes a first conversion unit 401, a second conversion unit 402, a third conversion unit 403, and a fourth conversion unit 404.
  • the first conversion unit 401 is configured to divide the sparse matrix into a first submatrix composed of rows having a predetermined number or more of non-zero elements and a second submatrix composed of the other rows, and to convert the first submatrix into a first matrix in a row-priority dense matrix format.
  • the first conversion unit 401 can be configured by, for example, the Rowmajor dense matrix generation unit 11621 of FIG. 1, but is not limited thereto.
  • the second conversion unit 402 is configured to divide the second submatrix into a third submatrix composed of columns having a predetermined number or more of non-zero elements and a fourth submatrix composed of the other columns, and to convert the third submatrix into a second matrix in a column-priority dense matrix format.
  • the second conversion unit 402 can be configured by, for example, the Colmajor dense matrix generation unit 11622 of FIG. 1, but is not limited thereto.
  • the third conversion unit 403 is configured to divide the fourth submatrix into a fifth submatrix and a sixth submatrix, and to convert the fifth submatrix into a third matrix in a row-priority sparse matrix compression format.
  • the third conversion unit 403 can be configured by, for example, the CRS sparse matrix generation unit 11623 of FIG. 1, but is not limited thereto.
  • the fourth conversion unit 404 is configured to convert the sixth submatrix into a fourth matrix in a column-priority sparse matrix compression format.
  • the fourth conversion unit 404 can be configured by, for example, the JDS sparse matrix generation unit 11624 of FIG. 1, but is not limited thereto.
  • the information processing device 400 configured as described above operates as follows. That is, the first conversion unit 401 divides the sparse matrix into a first submatrix composed of rows having a predetermined number or more of non-zero elements and a second submatrix composed of the other rows, and converts the first submatrix into the first matrix in the row-priority dense matrix format. Next, the second conversion unit 402 divides the second submatrix into a third submatrix composed of columns having a predetermined number or more of non-zero elements and a fourth submatrix composed of the other columns, and converts the third submatrix into the second matrix in the column-priority dense matrix format.
  • the third conversion unit 403 divides the fourth submatrix into a fifth submatrix and a sixth submatrix, and converts the fifth submatrix into the third matrix in the row-priority sparse matrix compression format.
  • the fourth conversion unit 404 converts the sixth submatrix into the fourth matrix in the column-priority sparse matrix compression format.
  • according to the information processing apparatus 400 configured and operating as described above, the matrix vector product can be calculated at high speed for a sparse matrix in which rows and columns with many non-zero elements are present in a part of the matrix.
  • the reason is that the rows and columns with many non-zero elements in the input sparse matrix can be retained, without being divided unnecessarily, as the first matrix in the row-priority dense matrix format and the second matrix in the column-priority dense matrix format, and the remaining sparse submatrix is retained in a sparse matrix compression format.
  • the matrix conversion unit 1162 divides the fourth submatrix 1152-4 into a fifth submatrix 1152-5 composed of rows having a predetermined number or more of non-zero elements and a sixth submatrix 1152-6 composed of the other rows, stores the non-zero elements of the fifth submatrix 1152-5 in the CRS format, and stores the sixth submatrix 1152-6 in the JDS format.
  • the matrix conversion unit 1162 may instead be configured to left-justify the non-zero elements of the fourth submatrix 1152-4, store the submatrix for the columns in which the number of non-zero elements is greater than or equal to a predetermined number in the JDS format, and store the submatrix for the other columns of the left-justified matrix in the CRS format.
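  • The division performed by the four conversion units can be sketched on a coordinate list of non-zero elements. The function name, the three threshold parameters, and the (row, column, value) representation are assumptions for illustration; the sketch splits the second submatrix by columns and the fourth submatrix by the per-row non-zero count, and leaves the conversion of each part into its storage format aside.

```python
from collections import Counter

def split_sparse(entries, row_thr, col_thr, crs_thr):
    """entries: list of (row, col, value) non-zeros. Returns four entry lists."""
    row_nnz = Counter(i for i, _, _ in entries)
    dense_rows = {i for i, n in row_nnz.items() if n >= row_thr}
    first = [e for e in entries if e[0] in dense_rows]       # -> row-priority dense
    second = [e for e in entries if e[0] not in dense_rows]

    col_nnz = Counter(j for _, j, _ in second)
    dense_cols = {j for j, n in col_nnz.items() if n >= col_thr}
    third = [e for e in second if e[1] in dense_cols]        # -> column-priority dense
    fourth = [e for e in second if e[1] not in dense_cols]

    rest_nnz = Counter(i for i, _, _ in fourth)
    crs_rows = {i for i, n in rest_nnz.items() if n >= crs_thr}
    fifth = [e for e in fourth if e[0] in crs_rows]          # -> row-priority sparse (e.g. CRS)
    sixth = [e for e in fourth if e[0] not in crs_rows]      # -> column-priority sparse (e.g. JDS)
    return first, third, fifth, sixth

entries = [
    (0, 0, 1.0), (0, 1, 2.0), (0, 2, 3.0), (0, 3, 4.0),  # row 0 is dense
    (1, 0, 5.0), (1, 2, 6.0), (1, 3, 7.0),
    (2, 0, 8.0),
    (3, 0, 9.0), (3, 1, 10.0),                           # column 0 is dense below row 0
]
first, third, fifth, sixth = split_sparse(entries, row_thr=4, col_thr=3, crs_thr=2)
```

  • Every non-zero element lands in exactly one of the four parts, which is what allows the four partial products to be summed row by row afterwards.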
  • the matrix conversion unit 1162 uses the CRS format and the JDS format as the sparse matrix compression format for storing the non-zero elements of the fourth submatrix 1152-4.
  • the matrix conversion unit 1162 may use, instead of the CRS format, another sparse matrix compression format that stores the matrix elements with row priority, and may use, instead of the JDS format, another sparse matrix compression format that stores the matrix elements with column priority.
  • the present invention is applied to the sparse matrix vector product performed in statistical machine learning.
  • the present invention is not limited to such applications, and can be applied to the calculation of sparse matrix vector products in various scientific and technological calculations such as K-means, singular value decomposition, and Lanczos method.
  • the present invention can be used in the field of compressing and holding a sparse matrix and the field of calculating a sparse matrix vector product.
  • [Appendix 1] An information processing device comprising: a first conversion unit that divides a sparse matrix into a first submatrix composed of rows having a predetermined number or more of non-zero elements and a second submatrix composed of the other rows, and converts the first submatrix into a first matrix in a row-priority dense matrix format; a second conversion unit that divides the second submatrix into a third submatrix composed of columns having a predetermined number or more of non-zero elements and a fourth submatrix composed of the other columns, and converts the third submatrix into a second matrix in a column-priority dense matrix format; a third conversion unit that divides the fourth submatrix into a fifth submatrix and a sixth submatrix, and converts the fifth submatrix into a third matrix in a row-priority sparse matrix compression format; and a fourth conversion unit that converts the sixth submatrix into a fourth matrix in a column-priority sparse matrix compression format.
  • [Appendix 2] The information processing device according to Appendix 1, wherein the third conversion unit is configured to divide the fourth submatrix into the fifth submatrix composed of rows having a predetermined number or more of non-zero elements and the sixth submatrix composed of the other rows.
  • [Appendix 3] The information processing device according to Appendix 2, wherein the second conversion unit is configured to rearrange the fourth submatrix in ascending order of the number of non-zero elements in each row, and to rearrange the second matrix in the same order as the rearranged fourth submatrix.
  • [Appendix 4] The information processing device according to any one of Appendices 1 to 3, wherein the row-priority sparse matrix compression format is a CRS (Compressed Row Storage) format.
  • [Appendix 5] The information processing device according to any one of Appendices 1 to 4, wherein the column-priority sparse matrix compression format is a JDS (Jagged Diagonal Storage) format.
  • [Appendix 6] The information processing device according to any one of Appendices 1 to 5, further comprising: a storage unit that stores the sparse matrix converted into a set of the first matrix, the second matrix, the third matrix, and the fourth matrix, and a vector whose product with the sparse matrix is to be calculated; and a matrix vector product calculation unit that obtains the product of the sparse matrix and the vector by individually calculating the product of the first matrix and the vector, the product of the second matrix and the vector, the product of the third matrix and the vector, and the product of the fourth matrix and the vector, and adding the calculated products together.
  • [Appendix 7] The information processing device according to Appendix 6, wherein the matrix vector product calculation unit is configured to perform a sparse matrix vector product calculation instructed by a learning unit that controls statistical machine learning.
  • [Appendix 8] An information processing method comprising: dividing a sparse matrix into a first submatrix composed of rows having a predetermined number or more of non-zero elements and a second submatrix composed of the other rows, and converting the first submatrix into a first matrix in a row-priority dense matrix format; dividing the second submatrix into a third submatrix composed of columns having a predetermined number or more of non-zero elements and a fourth submatrix composed of the other columns, and converting the third submatrix into a second matrix in a column-priority dense matrix format; dividing the fourth submatrix into a fifth submatrix and a sixth submatrix, and converting the fifth submatrix into a third matrix in a row-priority sparse matrix compression format; and converting the sixth submatrix into a fourth matrix in a column-priority sparse matrix compression format.
  • [Appendix 9] A computer-readable recording medium on which a program is recorded, the program causing a computer to perform: a process of dividing a sparse matrix into a first submatrix composed of rows having a predetermined number or more of non-zero elements and a second submatrix composed of the other rows, and converting the first submatrix into a first matrix in a row-priority dense matrix format; a process of dividing the second submatrix into a third submatrix composed of columns having a predetermined number or more of non-zero elements and a fourth submatrix composed of the other columns, and converting the third submatrix into a second matrix in a column-priority dense matrix format; a process of dividing the fourth submatrix into a fifth submatrix and a sixth submatrix, and converting the fifth submatrix into a third matrix in a row-priority sparse matrix compression format; and a process of converting the sixth submatrix into a fourth matrix in a column-priority sparse matrix compression format.
  • Information processing device; 111 ... Communication I/F unit; 112 ... Operation input unit; 113 ... Screen display unit; 115 ... Storage unit; 1151 ... Program; 1152 ... Sparse matrix; 1152-1 ... First submatrix; 1152-2 ... Second submatrix; 1152-3 ... Third submatrix; 1152-4 ... Fourth submatrix; 1152-5 ... Fifth submatrix; 1152-6 ... Sixth submatrix; 11531 ... Rowmajor dense matrix; 11532 ... Colmajor dense matrix; 11533 ... CRS sparse matrix; 11534 ... JDS sparse matrix; 11535 ... Row rearrangement information; 116 ... Arithmetic processing unit; 1161 ... Input unit; 1162 ... Matrix conversion unit; 11621 ... Rowmajor dense matrix generation unit; 11622 ... Colmajor dense matrix generation unit; 11623 ... CRS sparse matrix generation unit; 11624 ... JDS sparse matrix generation unit; 1163 ... Output unit; 1164 ... Matrix vector product calculation unit; 11641 ... Rowmajor dense matrix vector product calculation unit; 11642 ... Colmajor dense matrix vector product calculation unit; 11643 ... CRS sparse matrix vector product calculation unit; 11644 ... JDS sparse matrix vector product calculation unit; 11645 ... Sum calculation unit; 11646 ... Sorting unit; 1165 ... Learning unit; 401 ... First conversion unit; 402 ... Second conversion unit; 403 ... Third conversion unit; 404 ... Fourth conversion unit

PCT/JP2019/030484 2019-08-02 2019-08-02 Information processing apparatus Ceased WO2021024300A1 (ja)

Priority Applications (3)

Application Number Priority Date Filing Date Title
US17/630,621 US20220253507A1 (en) 2019-08-02 2019-08-02 Information processing apparatus
PCT/JP2019/030484 WO2021024300A1 (ja) 2019-08-02 2019-08-02 Information processing apparatus
JP2021538525A JP7310892B2 (ja) 2019-08-02 2019-08-02 Information processing apparatus

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/JP2019/030484 WO2021024300A1 (ja) 2019-08-02 2019-08-02 Information processing apparatus

Publications (1)

Publication Number Publication Date
WO2021024300A1 true WO2021024300A1 (ja) 2021-02-11

Family

ID=74503371

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/JP2019/030484 Ceased WO2021024300A1 (ja) 2019-08-02 2019-08-02 Information processing apparatus

Country Status (3)

Country Link
US (1) US20220253507A1
JP (1) JP7310892B2
WO (1) WO2021024300A1

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117436370A (zh) * 2023-12-06 2024-01-23 山东省计算中心(国家超级计算济南中心) 面向流体力学网格生成的超定矩阵方程并行方法及系统

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118520206A (zh) * 2023-02-20 2024-08-20 华为技术有限公司 一种数据处理方法、系统及相关设备
CN117609677B (zh) * 2023-12-08 2024-06-18 上海交通大学 一种稀疏矩阵乘法加速方法、fpga、计算系统及存储介质

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6064808A (en) * 1997-08-01 2000-05-16 Lucent Technologies Inc. Method and apparatus for designing interconnections and passive components in integrated circuits and equivalent structures by efficient parameter extraction
JP2016066329A (ja) * 2014-09-26 2016-04-28 日本電気株式会社 情報処理装置、情報処理方法、及び、コンピュータ・プログラム
WO2017154946A1 (ja) * 2016-03-09 2017-09-14 日本電気株式会社 情報処理装置、情報処理方法、データ構造およびプログラム

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11392829B1 (en) * 2018-05-02 2022-07-19 Nvidia Corporation Managing data sparsity for neural networks
US10620951B2 (en) * 2018-06-22 2020-04-14 Intel Corporation Matrix multiplication acceleration of sparse matrices using column folding and squeezing

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117436370A (zh) * 2023-12-06 2024-01-23 山东省计算中心(国家超级计算济南中心) 面向流体力学网格生成的超定矩阵方程并行方法及系统
CN117436370B (zh) * 2023-12-06 2024-03-19 山东省计算中心(国家超级计算济南中心) 面向流体力学网格生成的超定矩阵方程并行方法及系统

Also Published As

Publication number Publication date
JPWO2021024300A1 2021-02-11
JP7310892B2 (ja) 2023-07-19
US20220253507A1 (en) 2022-08-11

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19940513

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2021538525

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19940513

Country of ref document: EP

Kind code of ref document: A1