WO2020063225A1 - Data processing method and apparatus - Google Patents

Data processing method and apparatus

Info

Publication number
WO2020063225A1
Authority
WO
WIPO (PCT)
Prior art keywords
weight
data
data set
address
matrix
Prior art date
Application number
PCT/CN2019/102252
Other languages
French (fr)
Chinese (zh)
Inventor
梁晓峣
景乃锋
崔晓松
廖健行
Original Assignee
Huawei Technologies Co., Ltd.
Priority date
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Publication of WO2020063225A1

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/15Correlation function computation including computation of convolution operations
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/15Correlation function computation including computation of convolution operations
    • G06F17/153Multidimensional correlation or convolution
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10Complex mathematical operations
    • G06F17/16Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/52Multiplying; Dividing
    • G06F7/523Multiplying only
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30Arrangements for executing machine instructions, e.g. instruction decode
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means

Definitions

  • the present application relates to the field of information technology, and more particularly, to a method and a data processing apparatus for processing data.
  • the core of a convolutional neural network (CNN) operation is the convolution operation.
  • the amount of data that a convolution operation needs to process is usually large, so the convolution operation occupies a large amount of storage and computing resources.
  • today's processors increasingly struggle to meet the demands of convolution operations.
  • with the development of mobile smart devices, mobile smart devices also require convolution operations, but mobile devices have limited computing and storage capabilities. Therefore, how to improve the efficiency of the convolution operation is an urgent problem.
  • the present application provides a method and a data processing apparatus for processing data, which can reduce the number of times to access a storage device.
  • an embodiment of the present application provides a data processing apparatus.
  • the data processing apparatus includes a data processing module, which is configured to obtain a first weight matrix in a first weight data set, where the first weight matrix is represented as n rows and m columns of weight data.
  • the data in the first weight data set comes from the same input channel, where n is an integer greater than or equal to 2, and m is an integer greater than or equal to 2.
  • a second weight matrix is obtained according to the first weight matrix, where the second weight matrix is a matrix obtained after the rows of the first weight matrix are rearranged; the first weight matrix is used to perform a first multiplication operation with the first feature data set; and the second weight matrix is used to perform a second multiplication operation with the first feature data set;
  • the data processing apparatus further includes a control module, configured to determine a target data set according to the operation results of the first multiplication operation and the second multiplication operation.
  • the target data set includes product results between elements in the first feature data set and elements in the first weight matrix. Based on these product results, a partial Cartesian product and a partial convolution result of the first feature data set and the first weight matrix can be further obtained; the partial Cartesian product and partial convolution result can be output from the data processing apparatus, so that prediction of the convolution result can be realized with a small amount of calculation and at a fast calculation rate.
  • for example, when the first weight matrix is a matrix with 3 rows and 3 columns, and the second weight matrix is the first weight matrix rearranged by rows, the convolution result of the feature data and the first weight matrix can be obtained, as well as partial convolution sums of the first weight matrix with the 3-row, 3-column feature data at adjacent positions; since feature data at adjacent positions often have continuity, the data processing apparatus can use the convolution results and the partial convolution sums in the above target data set to make predictions.
  • for example, when the data processing apparatus uses feature data to perform object recognition according to the solution provided in this application, if the convolution results and partial convolution sums in the obtained target data set do not match the expected range of values, subsequent calculations can be skipped directly, saving computation.
  • after the data processing apparatus implements object recognition according to the technical solution provided in the present application, it can further use the object recognition result to realize other functions; for example, it can use the object recognition result to sort products, monitor targets, and so on.
  • the data processing apparatus obtains a second weight matrix according to the first weight matrix, where the second weight matrix is a matrix in which the rows of the first weight matrix are rearranged. By performing multiplication operations with the first weight matrix and the second weight matrix on the first feature data set, the feature data can be reused when obtaining a partial Cartesian product and a partial convolution result of the first feature data set and the first weight matrix, thereby improving the operation efficiency.
  • in this way, the acquired feature data is multiplexed, which improves the efficiency of the operation.
  • the data processing apparatus further includes an address processing module, where the address processing module is configured to obtain the addresses of the weight data in the first weight matrix and the second weight matrix, and to perform address operations between the addresses of the weight data in the first weight matrix and the second weight matrix and the addresses in the first feature data set; the control module is configured to determine the target data set according to the operation results of the multiplication operations and the operation results of the address operations.
  • the address processing module calculates the address at which the product of the weight data in the first weight matrix and the second weight matrix and the feature data in the first feature data set is stored, so that the Cartesian product of the feature data and the weight matrix and the convolution result can be further obtained as the target data set, thereby expanding the functions of the data processing apparatus.
  • the data processing module is further configured to obtain a third weight matrix to an nth weight matrix in the first weight data set, where the third to nth weight matrices are matrices in which the rows of the first weight matrix are rearranged, and among the row vectors of the first to nth weight matrices located in the same row, any two row vectors are not the same; the address processing module is further configured to obtain the addresses of the weight data in the third to nth weight matrices, and to perform address operations between the addresses of the weight data of the third to nth weight matrices and the addresses of the feature data in the first feature data set.
  • the first weight matrix with n rows is rearranged by rows to obtain n weight matrices, and among the n row vectors of the n weight matrices located in the same row, any two are different. Therefore, after the feature data is multiplied with the n weight matrices, the Cartesian product of the feature data and the first weight matrix is obtained, thereby increasing the degree of reuse of the feature data and further improving the operation efficiency.
  • the target data set includes a result matrix
  • the result matrix is a result of a convolution operation performed on the first feature data set and the first weight data set.
  • the first feature data set is represented as a first feature matrix
  • the address processing module is further configured to determine a first target address according to the address of the weight data stored in the array, the address of the first feature data set, the size of the first feature matrix, the padding size, and the weight size, where the weight size is n rows and m columns, the padding size includes a horizontal padding size and a vertical padding size, the horizontal padding size is (n-1)/2, and the vertical padding size is (m-1)/2.
  • this solution further refines the method of obtaining the target data address based on the address of the weight data and the address of the feature data, thereby improving the feasibility of the data processing apparatus obtaining the convolution result through the Cartesian product.
  • the data processing apparatus further includes a compression module, configured to: obtain a second feature data set, and remove elements with a value of 0 in the second feature data set to obtain the first feature data set; obtain a second weight data set, and remove elements with a value of 0 in the second weight data set to obtain the first weight data set; and determine the address of each feature data in the first feature data set and the address of each weight in the first weight data set.
  • this solution sparsifies the feature data and weight data, that is, removes elements with a value of 0 in the feature data set and the weight data set, which reduces the amount of convolution operations and thus improves the operation efficiency of the data processing apparatus.
  • an embodiment of the present application provides a data processing method.
  • the method includes: obtaining a first weight matrix in a first weight data set, where the first weight matrix is represented as n rows and m columns of weight data, the data in the first weight data set comes from the same input channel, n is an integer greater than or equal to 2, and m is an integer greater than or equal to 2; obtaining a second weight matrix, where the second weight matrix is a matrix obtained after the rows of the first weight matrix are rearranged; using the first weight matrix to perform a first multiplication operation with the first feature data set; using the second weight matrix to perform a second multiplication operation with the first feature data set; and determining a target data set according to the operation results of the first multiplication operation and the second multiplication operation.
  • the method further includes: obtaining the addresses of the weight data in the first weight matrix and the second weight matrix; and performing address operations between the addresses of the weight data in the first weight matrix and the second weight matrix and the addresses in the first feature data set; and determining the target data set according to the operation results of the first multiplication operation and the second multiplication operation includes: determining the target data set according to the operation results of the first multiplication operation and the second multiplication operation and the operation results of the address operations.
  • the method further includes: obtaining a third weight matrix to an nth weight matrix in the first weight data set, where the third to nth weight matrices are matrices in which the rows of the first weight matrix are rearranged, and among the row vectors of the first to nth weight matrices located in the same row, any two row vectors are not the same; obtaining the addresses of the weight data in the third to nth weight matrices; and performing address operations between the addresses of the weight data in the third to nth weight matrices and the addresses of the feature data in the first feature data set.
  • the target data set includes a result matrix
  • the result matrix is a result of a convolution operation performed on the first feature data set and the first weight data set.
  • the first feature data set is represented as a first feature matrix
  • the method further includes: determining a first target address according to the address of the weight data stored in the array, the address of the first feature data set, the size corresponding to the first feature matrix, the padding size, and the weight size, where the weight size is n rows and m columns, the padding size includes a horizontal padding size and a vertical padding size, the horizontal padding size is (n-1)/2, and the vertical padding size is (m-1)/2.
  • the method further includes: obtaining a second feature data set, and removing elements with a value of 0 in the second feature data set to obtain the first feature data set; obtaining a second weight data set, and removing elements with a value of 0 in the second weight data set to obtain the first weight data set; and determining the address of each feature data in the first feature data set and the address of each weight in the first weight data set.
  • the present application provides a data processing device.
  • the data processing device includes a processor and a memory.
  • the memory stores program code, and the processor is configured to call the program code in the memory to execute the data processing method provided in the second aspect of the application.
  • FIG. 1 is a schematic diagram of a convolution operation process in the prior art.
  • FIG. 2 is a structural block diagram of a data processing apparatus according to an embodiment of the present application.
  • FIG. 3 is a schematic diagram of a data calculation array according to an embodiment of the present application.
  • FIG. 4 is a structural block diagram of a data calculation unit in a data calculation array provided by an embodiment of the present application.
  • FIG. 5 is a schematic diagram of performing a multiplication operation on a first feature data set according to an embodiment of the present application.
  • FIG. 6 is a schematic diagram of an address of a first feature data set and an address of a weight data set provided in an embodiment of the present application.
  • FIG. 7 is a schematic diagram of an address calculation array according to an embodiment of the present application.
  • FIG. 8 is a structural block diagram of an address calculation unit in an address calculation array according to an embodiment of the present application.
  • FIG. 9 is a schematic diagram of weight data stored in two data calculation arrays according to an embodiment of the present application.
  • FIG. 10 is a schematic diagram of weight data stored in a data calculation array according to an embodiment of the present application.
  • FIG. 11 is a schematic diagram of a weight matrix with three filters and thinning processing provided in an embodiment of the present application.
  • FIG. 12 is a schematic diagram of a weight matrix that has not undergone thinning processing according to an embodiment of the present application.
  • FIG. 13 is a schematic flowchart of a data processing method according to an embodiment of the present application.
  • FIG. 14 is a structural block diagram of a data processing apparatus provided by an embodiment of the present application.
  • “at least one” means one or more, and “multiple” means two or more.
  • “and/or” describes an association relationship between related objects and indicates that three relationships can exist; for example, A and/or B can represent: A exists alone, A and B exist simultaneously, and B exists alone, where A and B can be singular or plural.
  • the character “/” generally indicates that the related objects are in an “or” relationship.
  • “at least one of the following” or similar expressions refers to any combination of these items, including any combination of single items or plural items.
  • for example, “at least one of a, b, or c” may represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c may be single or multiple.
  • the words “first”, “second”, and the like do not limit the number or the execution order.
  • FIG. 1 is a schematic diagram of a convolution operation process in the prior art.
  • FIG. 1 shows a feature data set, which includes a total of 5×5 feature data.
  • FIG. 1 also shows a weight data set, which includes a total of 3×3 weight data.
  • the weight data set can be used as a convolution kernel to perform a convolution operation with the feature data set.
  • FIG. 1 also shows a schematic diagram of a two-step operation with a step size of 1 during a convolution operation on a feature data set using a weight data set.
  • in each step, the 3×3 weight data in the weight data set needs to be multiplied with 3×3 feature data in the feature data set, respectively.
  • the results of the multiplication operations are added to obtain the value of one data of the convolution result.
  • the convolution result c11 can be expressed as Formula 1.1, and the convolution result c12 can be expressed as Formula 1.2:
  • c11 = a11×b11 + a12×b12 + a13×b13 + a21×b21 + a22×b22 + a23×b23 + a31×b31 + a32×b32 + a33×b33  (Formula 1.1)
  • c12 = a12×b11 + a13×b12 + a14×b13 + a22×b21 + a23×b22 + a24×b23 + a32×b31 + a33×b32 + a34×b33  (Formula 1.2)
  • the feature data set continues to slide to the right, and the next operation is continued until the entire feature data set is traversed.
  • the Cartesian product of the set E1 and the set F1 includes all the multiplication results needed to calculate c11: a11×b11, a12×b12, a13×b13, a21×b21, a22×b22, a23×b23, a31×b31, a32×b32, and a33×b33.
  • the results of the Cartesian product of the sets E1 and F1 also include some of the multiplication results needed to calculate c12: a12×b11, a13×b12, a22×b21, a23×b22, a32×b31, and a33×b32.
  • the results of the Cartesian product of the sets E2 and F1 include some of the multiplication results needed to calculate c12: a14×b13, a24×b23, and a34×b33.
  • the convolution operation can be decomposed into Cartesian product operations.
  • the result obtained by one Cartesian product operation can be used for multiple steps of the convolution operation.
  • the result of a one-step convolution operation may be the sum of the results of one or more Cartesian product operations.
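  • as an illustration only (not part of the patent text), the following Python sketch decomposes the convolution step of Formula 1.1 into elementwise products, i.e., into a subset of the Cartesian product of the feature data and the weight data; the values of a and b are made up for the example.

    # hypothetical values standing in for the 5x5 feature data and 3x3 weight data
    a = [[(i + 1) * 10 + (j + 1) for j in range(5)] for i in range(5)]
    b = [[(i + 1) * 10 + (j + 1) for j in range(3)] for i in range(3)]

    # Formula 1.1: c11 is the sum of nine products a_ij * b_ij
    c11 = sum(a[i][j] * b[i][j] for i in range(3) for j in range(3))

    # the full Cartesian product of the window {a_ij} and the kernel {b_kl}
    # contains those nine products, plus products reused by neighbouring
    # convolution steps such as c12 (a12*b11, a13*b12, ...)
    cartesian = {(i, j, k, l): a[i][j] * b[k][l]
                 for i in range(3) for j in range(3)
                 for k in range(3) for l in range(3)}
    assert c11 == sum(cartesian[(i, j, i, j)] for i in range(3) for j in range(3))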
  • FIG. 2 is a structural block diagram of a data processing apparatus according to an embodiment of the present application.
  • the data processing apparatus 200 shown in FIG. 2 includes a storage module 210, a data processing module 220, an address processing module 230, and a control module 240.
  • the storage module 210 is configured to store a first feature data set, an address of each feature data in the first feature data set, a first weight set, and an address of each weight in the first weight set.
  • the data processing module 220 includes N data calculation arrays.
  • each of the N data calculation arrays includes n×m data calculation units, where N is a positive integer greater than or equal to 2, n is a positive integer greater than or equal to 2, and m is a positive integer greater than or equal to 2.
  • the address processing module 230 includes N address calculation arrays, and each of the N address calculation arrays includes n×m address calculation units.
  • each data calculation array is configured to obtain n×m weight data from the storage module 210 and save the obtained weight data to the n×m data calculation units of that data calculation array.
  • each address calculation array is configured to obtain the addresses of n×m weight data from the storage module 210 and save the obtained addresses to the n×m address calculation units of that address calculation array.
  • the addresses of the weight data stored in the N address calculation arrays are the addresses of the weight data stored in the N data calculation arrays.
  • the N address calculation arrays are in one-to-one correspondence with the N data calculation arrays, and each address calculation array in the N address calculation arrays holds the addresses of the weight data stored in the corresponding data calculation array.
  • for example, if the weight data stored by one of the N data calculation arrays are b11, b12, b13, b21, b22, b23, b31, b32, and b33, the addresses stored in the address calculation array corresponding to that data calculation array are the addresses of b11, b12, b13, b21, b22, b23, b31, b32, and b33.
  • the N data calculation arrays use the weight data they store to perform multiplication operations on the first feature data set; during the operations on the first feature data set, the weight data stored in the N data calculation arrays remain unchanged.
  • the N address calculation arrays use the addresses of the weight data they store to perform address operations on the addresses of the first feature data set; during the operations on the addresses of the first feature data set, the addresses of the weight data stored in the N address calculation arrays remain unchanged.
  • the control module 240 is configured to determine a target data set according to the operation results of the multiplication operations performed by the N data calculation arrays and the operation results of the address operations.
  • the N data calculation arrays can determine the operation result of the convolution operation on the first feature data set based on the weight data stored by the N data calculation arrays according to the multiplication operation result and the operation result of the address operation.
  • the target data set may be a data set obtained by performing a convolution operation on the first feature data set with weight data stored by the N data calculation arrays.
  • FIG. 3 is a schematic diagram of a data calculation array according to an embodiment of the present application.
  • the data calculation array 300 shown in FIG. 3 includes a total of 9 data calculation units: a data calculation unit 311, a data calculation unit 312, a data calculation unit 313, a data calculation unit 321, a data calculation unit 322, a data calculation unit 323, a data calculation unit 331, a data calculation unit 332, and a data calculation unit 333.
  • the data calculation array may further include an input/output unit (not shown in the figure).
  • the input/output unit is used to acquire data that needs to be input to the data calculation array 300.
  • the input/output unit is further configured to send data to be output by the data calculation array 300 to the corresponding unit and/or module.
  • for example, the input/output unit may obtain weight data and feature data from the storage module, and send the obtained weight data and feature data to the corresponding data calculation units.
  • the input/output unit is further configured to obtain the target data calculated by each data calculation unit and send the target data to the storage module.
  • the data transfer between the computing units in the data computing array is unidirectional.
  • the arrows used to connect the data calculation units in FIG. 3 may indicate a unidirectional transmission direction of data.
  • the data calculation unit 311 can send data (for example, feature data) to the data calculation unit 312, but the data calculation unit 312 cannot send data to the data calculation unit 311.
  • the data calculation unit 312 can send data to the data calculation unit 313, but the data calculation unit 313 cannot send data to the data calculation unit 312.
  • FIG. 4 is a structural block diagram of a data calculation unit in a data calculation array provided by an embodiment of the present application.
  • the data calculation unit 400 may include a storage subunit 401 and a data calculation subunit 402. It can be understood that the data calculation unit 400 may further include an input/output subunit.
  • the input/output subunit is configured to obtain the data required by the data calculation unit and output the data that the data calculation unit needs to output.
  • the data calculation array 300 shown in FIG. 3 may obtain the 3×3 weight data in the weight data set shown in FIG. 1, and save the 3×3 weight data respectively to the 3×3 data calculation units of the data calculation array 300.
  • the weight data b11 may be stored in the storage subunit of the data calculation unit 311, the weight data b12 may be stored in the storage subunit of the data calculation unit 312, the weight data b13 may be stored in the storage subunit of the data calculation unit 313, and so on.
  • at this point, the data calculation array 300 stores 3×3 weight data.
  • the data calculation array 300 may slide the first feature data set unidirectionally, and use the weight data saved by the data calculation array 300 to perform multiplication operations on the first feature data set.
  • during this process, the weight data stored in the data calculation array 300 do not change.
  • in other words, the data calculation units in the data calculation array 300 will not delete the saved weight data, and will not read and save new weight data from the storage module.
  • FIG. 5 is a schematic diagram of a process of multiplying the first feature data set according to an embodiment of the present application.
  • the first feature data set may be flipped 180 degrees first.
  • the first column of the first feature data set becomes the fifth column after the flip, the second column becomes the fourth column, and so on. It should be noted that, as shown in FIG. 5, the first feature data set is first flipped 180 degrees and then slid to the right for the convenience of describing the calculation process of the feature data a11, a21, a31, a12, a22, a32, a13, a23, and a33 with the weight data b11, b21, b31, b12, b22, b32, b13, b23, and b33.
  • in practice, the first feature data set can also be directly multiplied with the weight data stored in the data calculation array 300 by sliding rightward without flipping; the values of the calculation results are the same as those of the calculation performed by first flipping the first feature data set 180 degrees and then sliding it to the right in the manner shown in FIG. 5, and only the order of the final data is different.
  • the flipped first feature data set slides to the right unidirectionally and performs multiplication operations with the weight data stored in the data calculation array 300. Specifically, in the first operation, the feature data a11, a21, and a31 are multiplied with the weight data b11, b21, and b31, respectively. After the first operation, the flipped first feature data set slides to the right to perform the second operation. In the second operation, the feature data a11, a21, and a31 are multiplied by the weight data b12, b22, and b32, respectively, and the feature data a12, a22, and a32 are multiplied by the weight data b11, b21, and b31, respectively.
  • the flipped feature data set then continues to slide to the right for the third operation, and so on.
  • the step size of each sliding of the first feature data set is 1.
  • the step size of each sliding of the first feature data set may also be a positive integer greater than 1.
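  • the sliding schedule can be sketched as follows (an illustrative Python fragment, not the patent's hardware): after the 180-degree flip, feature column j meets weight column w during operation t exactly when j + w = t + 1, which reproduces the pairings described above (operation 1: feature column 1 with weight column 1; operation 2: feature columns 1 and 2 with weight columns 2 and 1; and so on).

    # sketch of the column pairings produced by the unidirectional slide
    feat_cols, w_cols = 3, 3  # the a11..a33 block and the 3x3 weight array
    for t in range(1, feat_cols + w_cols):
        pairs = [(j, w) for j in range(1, feat_cols + 1)
                 for w in range(1, w_cols + 1) if j + w == t + 1]
        print(f"operation {t}:",
              ", ".join(f"feature column {j} x weight column {w}" for j, w in pairs))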
  • the data calculation unit 311 may acquire the feature data a11 of the first feature data set stored in the storage module 210, and store the acquired feature data a11 in the storage subunit of the data calculation unit 311.
  • at this point, the storage subunit of the data calculation unit 311 holds the weight data b11 and the feature data a11.
  • the data calculation sub-unit in the data calculation unit 311 multiplies the weight data b 11 and the feature data a 11 stored in the storage sub-unit to obtain intermediate data k (11,11) .
  • the multiplication operation of the weight data b11 and the feature data a11 may be implemented by a multiplier in the data calculation subunit.
  • the data calculation unit 311 may also obtain the cache data r (11,11) stored in the first target address according to the target address determined by the address calculation unit corresponding to the data calculation unit 311. Specifically, the address calculation unit corresponding to the data calculation unit 311 may determine the first target address according to the address of the characteristic data a 11 and the address of the weight data b 11 . The data calculation unit 311 may obtain the current cache data r (11,11) stored in the first target address. The manner in which the address calculation unit determines the first target address will be described later. The data calculation subunit adds the intermediate data k (11,11) and the current buffer data r (11,11) to obtain target data d (11,11) .
  • the addition operation of the intermediate data k (11,11) and the current buffered data r (11,11) can be implemented by an adder in a data calculation subunit.
  • the target data d (11,11) can be stored in the first target address.
  • the current cache data r (11,11) stored in the first target address is updated to the target data d (11,11) .
  • the data calculation unit 321 can determine the product of the weight data b 21 and the feature data a 21 (hereinafter referred to as the intermediate data k (21, 21) ) held by the data calculation unit 321 in the same manner.
  • the target address determined by the address calculation unit corresponding to the data calculation unit 321 is also the first target address.
  • the data calculation unit 321 adds the intermediate data k(21,21) and the current cache data stored at the first target address (the current cache data has been updated to the target data d(11,11)) to obtain the target data d(21,21).
  • the target data d(21,21) can be stored in the first target address; in other words, the current cache data d(11,11) stored in the first target address is updated to the target data d(21,21).
  • the data calculation unit 331 can determine the product of the weight data b 31 and the feature data a 31 (hereinafter referred to as the intermediate data k (31, 31) ) held by the data calculation unit 331 in the same manner.
  • the target address determined by the address calculation unit corresponding to the data calculation unit 331 is also the first target address.
  • the data calculation unit 331 adds the intermediate data k(31,31) and the current cache data at the first target address (the current cache data has been updated to the target data d(21,21)) to obtain the target data d(31,31).
  • the target data d(31,31) can be stored in the first target address; in other words, the current cache data d(21,21) stored in the first target address is updated to the target data d(31,31).
  • the target data stored in the first target address is a11×b11 + a21×b21 + a31×b31.
  • the data calculation array 300 may continue to perform operations on the first feature data set using the weight data saved by the data calculation unit in the data calculation array 300.
  • after the third operation, the data stored in the first target address is a11×b11 + a21×b21 + a31×b31 + a12×b12 + a22×b22 + a32×b32. That is, during the third operation, the target address determined by the address calculation units corresponding to the data calculation unit 312, the data calculation unit 322, and the data calculation unit 332 is also the first target address.
  • therefore, after the third operation, the target data stored in the first target address is the sum of the data stored in the first target address after the first operation, the a12×b12 determined by the data calculation unit 312, the a22×b22 determined by the data calculation unit 322, and the a32×b32 determined by the data calculation unit 332.
  • after the fifth operation, the data stored in the first target address is a11×b11 + a21×b21 + a31×b31 + a12×b12 + a22×b22 + a32×b32 + a13×b13 + a23×b23 + a33×b33.
  • that is, during the fifth operation, the target address determined by the address calculation units corresponding to the data calculation unit 313, the data calculation unit 323, and the data calculation unit 333 is also the first target address. Therefore, after the fifth operation, the target data stored in the first target address is the sum of the data stored in the first target address after the third operation, the a13×b13 determined by the data calculation unit 313, the a23×b23 determined by the data calculation unit 323, and the a33×b33 determined by the data calculation unit 333.
  • at this point, the data stored in the first target address is the convolution result c11 shown in Formula 1.1.
  • in this way, the multiplication operations and the address operation results can be used to complete the convolution operation of the first feature data set and the weight data set.
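  • the accumulate-at-address behaviour described above can be summarized with a small sketch (hypothetical Python, with a dict standing in for the addressed cache): every product whose address operation yields the same target address is added into the same cache entry, so that entry ends up holding one convolution result.

    cache = {}

    def accumulate(target_addr, product):
        # r <- r + k : add the intermediate data to the current cache data
        cache[target_addr] = cache.get(target_addr, 0) + product

    # made-up values: all nine products contributing to c11 share one target address
    a = {(i, j): i * 10 + j for i in range(1, 4) for j in range(1, 4)}
    b = {(i, j): i + j for i in range(1, 4) for j in range(1, 4)}
    for key in a:
        accumulate("first_target_address", a[key] * b[key])
    assert cache["first_target_address"] == sum(a[k] * b[k] for k in a)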
  • FIG. 6 is a schematic diagram of an address of a first feature data set and an address of a weight data set provided in an embodiment of the present application.
  • the address of the first feature data set shown in FIG. 6 is the address of the first feature data set shown in FIG. 1.
  • the address Add a11 is the address of the feature data a11, the address Add a12 is the address of the feature data a12, and so on.
  • the address of the weight data set shown in FIG. 6 is the address of the weight data set shown in FIG. 1.
  • the address Add b11 is the address of the weight data b 11
  • the address Add b12 is the address of the weight data b 12 , and so on.
  • FIG. 7 is a schematic diagram of an address calculation array according to an embodiment of the present application.
  • the address calculation array 700 shown in FIG. 7 includes nine address calculation units: an address calculation unit 711, an address calculation unit 712, an address calculation unit 713, an address calculation unit 721, an address calculation unit 722, an address calculation unit 723, an address calculation unit 731, an address calculation unit 732, and an address calculation unit 733.
  • the address calculation array may further include an input/output unit (not shown in the figure).
  • the input/output unit is used to obtain data that needs to be input to the address calculation array 700.
  • the input/output unit is further configured to send data to be output by the address calculation array 700 to the corresponding unit and/or module.
  • for example, the input/output unit may obtain the addresses of the weight data and the addresses of the feature data from the storage module, and send the obtained addresses to the corresponding address calculation units.
  • the input/output unit is further configured to obtain the target address calculated by each address calculation unit, and send the target address to the corresponding data calculation unit.
  • the N address calculation arrays are in one-to-one correspondence with the N data calculation arrays.
  • the one-to-one correspondence here means that one data calculation array among the N data calculation arrays corresponds to one address calculation array among the N address calculation arrays, and different data calculation arrays correspond to different address calculation arrays. For example, suppose N is equal to 3, the three data calculation arrays are data calculation array 1, data calculation array 2, and data calculation array 3, and the three address calculation arrays are address calculation array 1, address calculation array 2, and address calculation array 3.
  • the data calculation array 1 corresponds to the address calculation array 1, the data calculation array 2 corresponds to the address calculation array 2, and the data calculation array 3 corresponds to the address calculation array 3.
  • the address calculation array corresponding to the data calculation array is used to calculate a target address of each target data in the data calculation array.
  • the data calculation units in the data calculation array and the address calculation units in the address calculation array also correspond one-to-one. Assuming that the data calculation array shown in FIG. 3 corresponds to the address calculation array shown in FIG. 7, the data calculation unit 311 corresponds to the address calculation unit 711, the data calculation unit 312 corresponds to the address calculation unit 712, the data calculation unit 313 corresponds to the address calculation unit 713, and so on.
  • the address calculation unit is used to determine the address of the target data of the corresponding data calculation unit. Specifically, as described above, the first target address from which the data calculation unit 311 obtains the cache data r(11,11) is obtained after the address calculation unit 711 performs an address operation.
  • FIG. 8 is a structural block diagram of an address calculation unit in an address calculation array according to an embodiment of the present application.
  • the address calculation unit 800 may include a storage subunit 801 and an address calculation subunit 802. It can be understood that the address calculation unit 800 may further include an input/output subunit.
  • the input/output subunit is configured to obtain the data required by the address calculation unit and output the data that the address calculation unit needs to output.
  • the address calculation array 700 shown in FIG. 7 may obtain the addresses of the 3×3 weight data among the addresses of the weight data set shown in FIG. 6, and save the addresses of the 3×3 weight data respectively in the 3×3 address calculation units of the address calculation array 700.
  • the address Add b11 may be stored in the storage subunit of the address calculation unit 711, the address Add b12 may be stored in the storage subunit of the address calculation unit 712, the address Add b13 may be stored in the storage subunit of the address calculation unit 713, and so on.
  • at this point, the address calculation array 700 stores the addresses of 3×3 weight data.
  • the address calculation array 700 may unidirectionally slide the addresses of the first feature data set, and use the addresses of the weight data stored by the address calculation array 700 to perform address operations on the addresses of the first feature data set.
  • during this process, the addresses of the weight data stored in the address calculation array 700 do not change.
  • in other words, the address calculation units in the address calculation array 700 will not delete the saved addresses of the weight data, and will not read and save addresses of new weight data from the storage module.
  • the process of sliding the addresses of the first feature data set to the right unidirectionally to perform the address operations is similar to the process of sliding the first feature data set to the right to perform the multiplication operations, and is not repeated here.
  • for ease of description, the address of the weight data obtained by the address calculation unit 800 is referred to as the address of the first weight data, the address of the feature data obtained by the address calculation unit 800 is referred to as the address of the first feature data, and the address obtained after the address calculation unit 800 performs an address operation is called the first target address.
  • the input/output subunit in the address calculation unit 800 can obtain, from the storage module, the following information in addition to the address of the first feature data and the address of the first weight data: the size of the input data corresponding to the first feature data set, the padding size, and the weight size.
  • the weight size is the size of the address calculation array to which the address calculation unit 800 belongs, and the padding size is a preset size. In this example, the weight size is 3×3.
  • the size of the input data corresponding to the first feature data set, the padding size, and the weight size may also be stored in the storage subunit 801 of the address calculation unit 800.
  • the address calculation subunit 802 may determine the first target address according to the address of the first weight data, the address of the first feature data, the size of the input data corresponding to the first feature data set, the padding size, and the weight size.
  • assuming that the size of the input picture is a×b and the convolution kernel has n rows and m columns, the size of the output picture after convolution is (a-n+1)×(b-m+1).
  • this has two disadvantages: 1. the size of the output picture is reduced after each convolution operation; 2. the corners and edges of the original picture are used less in the output, so the output picture loses much of the information about edge positions.
  • to solve these problems, the original picture may be padded on the boundary to increase the size of the matrix, and 0 is usually used as the padding value.
  • if the size of the original picture after padding is (a+2p)×(b+2q) and the size of the convolution kernel remains n rows and m columns, the output picture size is (a+2p-n+1)×(b+2q-m+1). The numbers of pixels p and q expanded in each direction are the padding sizes; for the output picture size to remain equal to the input picture size, it can be concluded that the horizontal padding size p is equal to (n-1)/2, and the vertical padding size q is equal to (m-1)/2.
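  • a minimal sketch of this padding arithmetic (assuming odd n and m so that (n-1)/2 and (m-1)/2 are integers):

    def padding_sizes(n, m):
        # horizontal padding p and vertical padding q from the text
        return (n - 1) // 2, (m - 1) // 2

    def output_size(a, b, n, m, p, q):
        return (a + 2 * p - n + 1), (b + 2 * q - m + 1)

    p, q = padding_sizes(3, 3)                      # p = q = 1 for a 3x3 kernel
    assert output_size(5, 5, 3, 3, p, q) == (5, 5)  # output size equals input size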
  • the address calculation subunit 802 may specifically determine the target address according to the following formula:
  • result_cord = (input_cord/input_size_x - w_cord/kernel_size_x + padding_size_x) × input_size_y + (input_cord%input_size_y - w_cord%kernel_size_y + padding_size_y)  (Formula 1.3)
  • % represents the modulo (remainder) operation
  • result_cord represents the target address
  • input_cord represents the address of the feature data
  • input_size_x represents the abscissa of the size of the input data corresponding to the first feature data set
  • input_size_y represents the ordinate of the size of the input data corresponding to the first feature data set
  • w_cord represents the address of the weight data
  • kernel_size_x represents the abscissa of the weight size
  • kernel_size_y represents the ordinate of the weight size
  • padding_size_x represents the horizontal padding size
  • padding_size_y represents the vertical padding size.
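  • a direct transcription of Formula 1.3 into Python (a sketch only; integer division and modulo are assumed, with the addresses taken as the flattened absolute addresses described below):

    def target_address(input_cord, w_cord,
                       input_size_x, input_size_y,
                       kernel_size_x, kernel_size_y,
                       padding_size_x, padding_size_y):
        return ((input_cord // input_size_x - w_cord // kernel_size_x
                 + padding_size_x) * input_size_y
                + (input_cord % input_size_y - w_cord % kernel_size_y
                   + padding_size_y))

    # a11 (address 0) x b11 (address 0) and a21 (address 5) x b21 (address 3)
    # both contribute to c11, so they must map to the same target address
    assert target_address(0, 0, 5, 5, 3, 3, 1, 1) == \
           target_address(5, 3, 5, 5, 3, 3, 1, 1)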
  • the address of the feature data and the address of the weight data in Formula 1.3 are absolute addresses.
  • the absolute address refers to the absolute position of the feature data/weight data in the corresponding feature data set/weight data set.
  • assuming the feature data set includes X feature data, the absolute address of the x-th feature data among the X feature data is x-1, where X is a positive integer greater than 1, and x is a positive integer less than or equal to X.
  • for example, the feature data set includes: 5, 0, 0, 32, 0, 0, 0, 0, 23; the absolute addresses of the feature data 5, 32, and 23 are 0, 3, and 8, respectively.
  • the absolute address listed above refers to the position of the feature data in the feature data set, and can be converted into an address composed of the abscissa and the ordinate according to the specifications of the feature matrix. Similarly, the absolute address of the weight data can also be converted into an address composed of the abscissa and the ordinate.
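  • for illustration, a flattened absolute address can be converted to abscissa/ordinate form as follows (a sketch assuming row-major layout and a known matrix width):

    def to_coordinates(abs_addr, width):
        return abs_addr // width, abs_addr % width

    # the example above, read as a 3x3 feature matrix: addresses 0, 3 and 8
    assert [to_coordinates(x, 3) for x in (0, 3, 8)] == [(0, 0), (1, 0), (2, 2)]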
  • the address calculation subunit 802 may further determine the target address according to the following formula:
  • result_cord = ((base_input + input_cord)/input_size_x - (base_w + w_cord)/kernel_size_x + padding_size_x) × input_size_y + ((base_input + input_cord)%input_size_y - (base_w + w_cord)%kernel_size_y + padding_size_y)  (Formula 1.4)
  • % represents the modulo (remainder) operation
  • result_cord represents the target address
  • input_cord represents the address of the feature data
  • input_size_x represents the abscissa of the size of the input data corresponding to the first feature data set
  • input_size_y represents the ordinate of the size of the input data corresponding to the first feature data set
  • w_cord represents the address of the weight data
  • kernel_size_x represents the abscissa of the weight size
  • kernel_size_y represents the ordinate of the weight size
  • padding_size_x represents the horizontal padding size
  • padding_size_y represents the vertical padding size
  • base_input represents the base address of the address of the feature data
  • base_w represents the base address of the address of the weight data.
  • the address of the feature data and the address of the weight data in Formula 1.4 are relative addresses.
  • the relative address refers to the position of the feature data/weight data in the corresponding feature data set/weight data set relative to the address of the first feature data/weight data. Assuming that the address of the first feature data in the feature data set is Y, the address of the y-th feature data in the feature data set is Y+y-1, where Y and y are both positive integers greater than or equal to 1.
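  • a sketch of Formula 1.4 under the same assumptions as the Formula 1.3 sketch above; the only difference is that the relative addresses are first rebased by base_input and base_w:

    def target_address_relative(input_cord, w_cord, base_input, base_w,
                                input_size_x, input_size_y,
                                kernel_size_x, kernel_size_y,
                                padding_size_x, padding_size_y):
        i = base_input + input_cord   # absolute address of the feature data
        w = base_w + w_cord           # absolute address of the weight data
        return ((i // input_size_x - w // kernel_size_x + padding_size_x)
                * input_size_y
                + (i % input_size_y - w % kernel_size_y + padding_size_y))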
  • after determining the target address, the address calculation unit may directly send the target address to the corresponding data calculation unit, and the data calculation unit may determine the cache data at the target address according to the target address.
  • alternatively, the address calculation unit may determine the cache data at the target address, and then send the cache data and the target address together to the corresponding data calculation unit.
  • the data processing apparatus may include two or more data calculation arrays and corresponding address calculation arrays.
  • the weight data set shown in FIG. 1 includes only 3×3 weight data, and only one weight data set is used for the convolution operation on the feature data set.
  • in practice, two or more weight data sets may also be used to perform the convolution operation on the feature data set.
  • in this case, each of the N data calculation arrays may obtain and save a weight data set, and use the saved weight data to perform multiplication operations on the first feature data set.
  • each of the N address calculation arrays may obtain and save the addresses of the corresponding weight data, and use the saved addresses of the weight data to perform address operations on the addresses of the first feature data set.
  • the N data calculation arrays can obtain N weight data sets at a time and perform multiplication operations on the first feature data set; if the number of weight data sets that remain to be acquired is less than N, all the remaining weight data sets are acquired to perform multiplication operations on the first feature data set. Assume that the value of N is 4 and the number of weight data sets is 9. In this case, the four data calculation arrays can first obtain the first to fourth weight data sets and perform multiplication operations on the first feature data set; the four data calculation arrays can then obtain the fifth to eighth weight data sets and perform multiplication operations on the first feature data set; and finally the four data calculation arrays obtain the ninth weight data set and perform multiplication operations on the first feature data set.
  • the manner in which the N address calculation arrays perform address operations is similar, and it is unnecessary to repeat them here.
  • the weight data stored in different data calculation arrays in the N data calculation arrays may be the result of rearranging the same weight data in rows.
  • for example, the N data calculation arrays include a first data calculation array and a second data calculation array, and the n×m weight data stored in the second data calculation array are the n×m weight data stored in the first data calculation array after row rearrangement.
  • FIG. 9 is a schematic diagram of weight data stored in two data calculation arrays according to an embodiment of the present application.
  • the data calculation array 1 stores 3×3 weight data, where the weight data of the first row are b11, b12, and b13; the weight data of the second row are b21, b22, and b23; and the weight data of the third row are b31, b32, and b33.
  • the data calculation array 2 holds 3×3 weight data, where the weight data of the first row are b31, b32, and b33; the weight data of the second row are b11, b12, and b13; and the weight data of the third row are b21, b22, and b23.
  • the result of rearranging the weight data stored in the data calculation array 1 is the weight data stored in the data calculation array 2.
  • the weight data stored in the data calculation array 1 may also be considered as a result of rearranging the weight data stored in the data calculation array 2 in rows.
  • for ease of description, the weight data obtained after rearrangement by rows is referred to as rearranged weight data, and the weight data stored by the two data calculation arrays shown in FIG. 9 are referred to as mutual rearrangement weight data.
  • Figure 9 shows the relationship between the weight data stored in the two data calculation arrays.
  • the weight data stored in any two of three or more data calculation arrays may also be mutual rearrangement weight data.
  • suppose the N data calculation arrays also include a data calculation array 3 as shown in FIG. 10. The data calculation array 3 stores 3×3 weight data, where the weight data of the first row are b21, b22, and b23; the weight data of the second row are b31, b32, and b33; and the weight data of the third row are b11, b12, and b13. It can be seen that the weight data stored in the data calculation array 1 and the data calculation array 3 shown in FIG. 9 and FIG. 10 are mutual rearrangement weight data, and the weight data stored in the data calculation array 2 and the data calculation array 3 are also mutual rearrangement weight data.
  • the weight data can be rearranged at most n-1 times.
  • in other words, the weight data stored in the 2nd to nth data calculation arrays are the weight data stored in the first data calculation array of the n data calculation arrays after row rearrangement, where any two row vectors located in the same row among the n weight data sets stored in the n data calculation arrays are different.
  • N is a positive integer greater than or equal to n.
  • the first data calculation array and the second data calculation array are any two data calculation arrays among the n data calculation arrays.
  • in other words, the first row of weight data held by each of the n data calculation arrays is one of the second row to the nth row of weight data held by the remaining n-1 data calculation arrays.
  • the data calculation array 2 and the data calculation array 3 may first obtain the 3×3 weight data shown in FIG. 1, and then perform row rearrangement to obtain the rearranged weight data.
  • alternatively, the storage module may store the rearranged weight data, and the data calculation array 2 and the data calculation array 3 directly obtain the rearranged weight data from the storage module.
  • correspondingly, the addresses of the weight data stored in the second address calculation array corresponding to the second data calculation array are also the result of row-wise rearrangement of the addresses of the weight data held by the first address calculation array corresponding to the first data calculation array.
  • the weight data includes a total of n rows, and the addresses of the weight data also include n rows.
  • similarly, the addresses of the weight data can be rearranged at most n-1 times.
  • in other words, the addresses of the weight data stored in the 2nd to nth address calculation arrays among the n address calculation arrays of the N address calculation arrays are the addresses of the weight data stored in the first address calculation array after row rearrangement.
  • N is a positive integer greater than or equal to n.
  • the first address calculation array and the second address calculation array are any two address calculation arrays among the n address calculation arrays.
  • in other words, the addresses of the first row of weight data stored in each of the n address calculation arrays are, respectively, among the addresses of the second row to the nth row of weight data in the remaining n-1 address calculation arrays.
  • in this way, the feature data can be reused, further reducing the number of times the data calculation arrays and the address calculation arrays access the storage module.
  • c21 = a21×b11 + a22×b12 + a23×b13 + a31×b21 + a32×b22 + a33×b23 + a41×b31 + a42×b32 + a43×b33  (Formula 1.5)
  • when the data calculation array 2 performs multiplication operations on the feature data of the first to third rows, the operation results of a21×b11, a22×b12, a23×b13, a31×b21, a32×b22, and a33×b23 can be obtained; according to the operation rules described earlier, the sum of these six operation results is saved to the same target address.
  • assuming the data processing apparatus includes only the data calculation array 1 and the data calculation array 2, and the weight data stored in the data calculation array 1 and the data calculation array 2 are as shown in FIG. 9, after the data calculation array 1 and the data calculation array 2 have performed multiplication operations on the feature data of the first to third rows of the feature data set, the feature data of the third to fifth rows of the feature data set may be multiplied next; that is, the step size for sliding down may be 2.
  • if the weight data is not rearranged (in other words, the data processing device has only the data calculation array 1 shown in FIG. 9), the data calculation array 1 is used to perform the multiplication on the feature data of the second to fourth rows, and this multiplication needs to obtain the feature data of the second to third rows of the feature data set again.
  • the feature data in the second to third rows of the feature data set needs to be read a second time to obtain the operation results of a 21 × b 11, a 22 × b 12, a 23 × b 13, etc.
  • the same feature data needs to be read multiple times.
  • the data calculation array 2 performs a multiplication operation on the feature data in the second to third rows of the feature data set, which is equivalent to sliding the data calculation array 1 down with a step size of 1 and then multiplying it with the feature data in the second to third rows. In other words, as long as the feature data of the second to third rows of the feature data set is read once, the multiplication of the two weight data sets with the feature data of the second to third rows can be realized. In this way, more partial Cartesian products can be obtained from a single read of the feature data.
  • after the feature data set is multiplied with the n weight matrices, the Cartesian product of the feature data set and the first weight matrix can be obtained, and the convolution of the feature data set and the first weight matrix can be further obtained.
  • each feature data in the data set needs to be loaded into the data processing unit only once.
  • FIG. 10 is a schematic diagram of weight data stored in a data calculation array according to an embodiment of the present application.
  • the third data calculation array may perform multiplication on the feature data in the third to fifth rows.
  • the step size for sliding down may be 3.
  • Three data calculation arrays can complete the Cartesian product operation on the feature data set.
  • the feature data a 11 , a 21 , a 31 , a 12 , a 22 , a 32 , a 13 , a 23 , and a 33 are also taken as examples.
  • the three data calculation arrays can perform the multiplication process shown in FIG. 5 with the characteristic data a 11 , a 21 , a 31 , a 12 , a 22 , a 32 , a 13 , a 23 , and a 33, respectively.
  • these three data calculation arrays use the weight data they store to complete the multiplication of the feature data a 11, a 21, a 31, a 12, a 22, a 32, a 13, a 23, a 33, as shown in Table 1.
  • the weight data can be rearranged at most n-1 times. If the weight data is rearranged once, the step size for sliding down while traversing the feature data set for multiplication may be 2; if the weight data is rearranged twice, the step size may be 3; and if the weight data is rearranged n-1 times, the step size may be n.
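The relationship between the number of rearrangements and the sliding step size can be illustrated with a hedged sketch (Python; the helper and its control flow are assumptions introduced for illustration, not this application's exact datapath):

```python
import numpy as np

def traverse(features, w, k):
    """Slide an n-row window down the feature data set with step k+1,
    multiplying each loaded block against the k+1 row-rotated copies
    of w, so every block of feature rows is loaded only once."""
    n, m = w.shape
    kernels = [np.roll(w, -j, axis=0) for j in range(k + 1)]
    partial_products = []
    for top in range(0, features.shape[0] - n + 1, k + 1):
        block = features[top:top + n, :m]          # one load of n rows
        for ker in kernels:                        # reuse the same block
            partial_products.append(block * ker)   # element-wise products
    return partial_products
```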
  • the first feature data set is a feature data set obtained by thinning the second feature data set.
  • the first weight data set is a weight data set obtained after thinning.
  • the data processing apparatus 200 shown in FIG. 2 may further include a compression module.
  • the compression module is configured to obtain a second feature data set, and perform thinning processing on the second feature data set to obtain the first feature data set.
  • the second feature data set includes feature data corresponding to the input data.
  • the compression module is further configured to obtain a second weight data set, and perform thinning processing on the second weight data set to obtain the first weight data set.
  • the compression module is further configured to determine an address of each feature data in the first feature data set, and determine an address of each weight data in the first weight data set.
  • the compression module sends the first feature data set, the first weight data set, the address of each feature data in the first feature data set, and the address of each weight data in the first weight data set to the storage module, which saves them. If the number of thinned weight data is less than n × m, the remaining positions are padded with zeros.
  • the input data referred to in the embodiments of the present application may be any data capable of performing a multiplication operation, a Cartesian product operation, and / or a convolution operation.
  • it may be image data, voice data, and the like.
  • the input data is a collective term for all data input to the data processing device.
  • the input data may consist of characteristic data.
  • the feature data corresponding to the input data may be all data included in the input data, or may be part of the feature data of the input data. Taking image data as an example, assuming that the input data is an entire image, all the data of the image is called feature data.
  • the second feature data set may include all feature data of the input data, or may be all or part of the feature data of the image after some processing.
  • for example, the second feature data set may be the feature data of a partial image obtained after the image is segmented.
  • the second feature data set includes: 5, 0, 0, 32, 0, 0, 0, 0, 23, 0, 0, 0, 0, 0, 43, 54, 0, 0, 0, 1, 4, 9, 34, 0, 0, 0, 0, 0, 0, 87, 0, 0, 0, 0, 0, 5, 8; the first feature data set obtained after thinning then includes: 5, 32, 23, 43, 54, 1, 4, 9, 34, 87, 5, 8.
  • the address of the first feature data in the second feature data set is 0, the address of the second feature data is 1, the address of the third feature data is 2, and the address of the nth feature data is n-1.
  • the address (absolute address) of the first feature data set is: 0, 3, 8, 14, 15, 19, 20, 21, 22, 29, 34, 35.
  • the second weight data set includes: 8, 4, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 24, 54, 0, 0, 0, 0, 0, 12, 0, 0, 22, 3, 45, 0, 0, 0, 67, 44, 0, 0, 0, 0, 0, 0, 0, 0, 35, 65, 75
  • the thinned second weight data set includes: 8, 4, 2, 24, 54, 12, 22, 3, 45, 67, 44, 35, 65, 75.
  • the thinned second weight data set includes 14 weight data. It is assumed that each data calculation array includes 3 × 3 data calculation units.
  • the number of weight data in the thinned second weight data set is less than the number of data calculation units included in two data calculation arrays. Therefore, 4 zeros are appended to the thinned second weight data set to obtain the first weight data set. The first weight data set corresponding to the second weight data set is therefore: 8, 4, 2, 24, 54, 12, 22, 3, 45, 67, 44, 35, 65, 75, 0, 0, 0, 0.
  • assuming the address of the first weight data in the second weight data set is 0, the address of the second weight data is 1, the address of the third weight data is 2, and the address of the n-th weight data is n-1, the addresses (absolute addresses) of the first weight data set are: 0, 1, 6, 16, 17, 23, 26, 27, 28, 33, 34, 43, 44, 45.
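A minimal sketch of this compression step (Python; `sparsify` is a hypothetical helper name, and the input below is made up for illustration):

```python
def sparsify(flat, n, m):
    """Drop zero elements, record each kept element's absolute address
    (its index in the original flattened data set), and zero-pad the
    kept values up to a multiple of n*m so that they fill whole
    n x m data calculation arrays."""
    addrs = [i for i, v in enumerate(flat) if v != 0]
    vals = [flat[i] for i in addrs]
    vals += [0] * ((-len(vals)) % (n * m))
    return vals, addrs

# Hypothetical input, for illustration only:
vals, addrs = sparsify([5, 0, 0, 32, 0, 23], 3, 3)
# vals  == [5, 32, 23, 0, 0, 0, 0, 0, 0]   (padded to one 3x3 array)
# addrs == [0, 3, 5]
```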
  • the first feature data set may also be a feature data set that has not been thinned. In other words, the first feature data set may be equal to the second feature data set.
  • the first feature data set in the above embodiment corresponds to a matrix, and accordingly, the weight data used to perform the convolution operation on the first feature data set also corresponds to a matrix.
  • the convolution operation described in the above embodiment is a two-dimensional convolution operation.
  • T is a positive integer greater than or equal to 3.
  • the first feature data set may be a three-dimensional tensor.
  • a three-dimensional convolution operation may be performed on the first feature data set.
  • the first feature data set includes three subsets: feature data subset 1, feature data subset 2, and feature data subset 3.
  • the feature data of the three subsets correspond to the three input channels of red, green, and blue, respectively.
  • the feature data in each of the three subsets may correspond to a matrix.
  • the weight data set used to perform the convolution operation on the feature data set may also be referred to as a filter. Therefore, the three weight data sets can be referred to as filter 1, filter 2, and filter 3.
  • Each of the three weight data sets includes three weight channels, namely channel 1, channel 2 and channel 3.
  • the weight data included in each of the three weight channels may correspond to a matrix.
  • the three weight channels correspond one-to-one with the three feature data subsets. For example, channel 1 corresponds to feature data subset 1, channel 2 corresponds to feature data subset 2, and channel 3 corresponds to feature data subset 3.
  • the weight channel can perform a convolution operation on the corresponding feature data subset.
  • the filter 1, filter 2, and filter 3 may each perform a three-dimensional convolution operation on the first feature data set. That is: channel 1 of filter 1 performs a convolution operation on feature data subset 1 of the first feature data set; channel 2 of filter 1 performs a convolution operation on feature data subset 2; channel 3 of filter 1 performs a convolution operation on feature data subset 3; channel 1 of filter 2 performs a convolution operation on feature data subset 1; channel 2 of filter 2 performs a convolution operation on feature data subset 2; channel 3 of filter 2 performs a convolution operation on feature data subset 3; channel 1 of filter 3 performs a convolution operation on feature data subset 1; channel 2 of filter 3 performs a convolution operation on feature data subset 2; and channel 3 of filter 3 performs a convolution operation on feature data subset 3.
  • the process of performing a three-dimensional convolution operation on the first feature data set by each of the three filters can be decomposed into three two-dimensional convolution operation processes.
  • the specific implementations of the three two-dimensional convolution operations are similar to the specific implementations of the two-dimensional convolution operation in the foregoing embodiment.
  • channel 1 for convolution operation on the characteristic data subset 1 can be considered as the weight data set shown in FIG. 1
  • the characteristic data subset 1 can be considered as the characteristic data set shown in FIG. 1.
  • the process of performing a convolution operation on the feature data subset by channel 1 is a process of performing a convolution operation on the feature data set by the weight data set shown in FIG. 1.
  • the convolution operation process can be decomposed into a multiplication operation and an addition operation. Therefore, the data processing apparatus shown in FIG. 2 can also perform a three-dimensional convolution operation.
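The decomposition just described can be written down directly. The sketch below (Python; assuming, as is standard for CNNs, that the per-channel 2D results are summed into one output map) is illustrative, not this application's hardware datapath:

```python
import numpy as np

def conv2d(feat, w):
    """Plain 2D convolution (stride 1, no padding), as in the
    two-dimensional embodiments above."""
    n, m = w.shape
    out = np.zeros((feat.shape[0] - n + 1, feat.shape[1] - m + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(feat[r:r + n, c:c + m] * w)
    return out

def conv3d_by_channels(subsets, filt):
    """One filter's 3D convolution, decomposed into one 2D convolution
    per (weight channel, feature data subset) pair, with the per-channel
    results summed."""
    return sum(conv2d(s, ch) for s, ch in zip(subsets, filt))
```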
  • the first feature data set referred to in the above embodiment may be considered as a feature data subset in the feature data set corresponding to the three-dimensional tensor.
  • the first weight data set may be considered as one weight data set among the multiple weight data sets.
  • the weight data set also corresponds to a three-dimensional tensor
  • the first weight data set can be considered as a channel in the weight data set of the three-dimensional tensor.
  • the first weight data set may also be a weight value obtained by performing thinning processing on multiple weight data sets.
  • FIG. 11 is a schematic diagram of a weight matrix with three filters and thinning processing provided in an embodiment of the present application.
  • Each of the three filters shown in FIG. 11 includes three weight channels, and each weight channel includes 3 ⁇ 3 weight data.
  • the weight data of weight data set 1 comes from the weight data in channel 1 of filter 1 and filter 2
  • the weight data of weight data set 4 comes from the weight data in channel 1 of filter 2 and filter 3
  • the weight data of weight data set 2 comes from the weight data in channel 2 of filter 1 and filter 2
  • the weight data of weight data set 5 comes from the weight data in channel 2 of filter 2 and filter 3
  • the weight data of weight data set 3 comes from the weight data in channel 3 of filter 1 and filter 2
  • the weight data of weight data set 6 comes from the weight data in channel 3 of filter 2 and filter 3
  • saying that weight data comes from the same channel of different filters means that the weight data can belong to different filters, but the channel index within those filters is the same.
  • the weight data of the weight data set 4 comes from the weight data in channel 1 of filter 2 and the weight data in channel 1 of filter 3.
  • the weight data set obtained by thinning the weight data in multiple filters is hereinafter referred to as the sparse weight data set
  • the weight data included in the sparse weight data set may come from the same filter.
  • the process of multiplying the feature data by the sparse weight data set, and the process of determining the result of the convolution operation of the sparse weight data set and the feature data according to the operation result of the multiplication, are the same as in the above embodiments and need not be repeated here.
  • the weight data included in the sparse weight data set may come from different filters.
  • the operation process of multiplying the feature data by the sparse weighted data set is the same as the above embodiment, and it is unnecessary to repeat it here.
  • when the weight data included in the thinned weight data set comes from different filters, the process of determining the convolution operation result of the thinned weight data set and the feature data according to the operation result of the multiplication operation is not exactly the same as in the above embodiments.
  • the weight data included in the thinning weight data set comes from P filters (P is a positive integer greater than or equal to 2).
  • the sparse weight data set can be divided into P sparse weight data subsets, and the p-th sparse weight data subset of the P sparse weight data subsets includes the weight data from the p-th filter of the P filters, p = 1, ..., P.
  • the p-th thinned weight data subset includes Num p weight data, where Num p is a positive integer greater than or equal to 1, and Num p is less than n × m.
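A small sketch of this grouping (Python; the bookkeeping list `filter_ids`, recording the source filter of each weight, is an assumption introduced for illustration):

```python
def split_by_filter(weights, filter_ids):
    """Group the weight data of a thinned weight data set into P
    subsets according to the source filter of each weight,
    numbered 1..P."""
    subsets = {}
    for w, p in zip(weights, filter_ids):
        subsets.setdefault(p, []).append(w)
    return [subsets[p] for p in sorted(subsets)]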
  • the weight data stored in the three data calculation arrays shown in FIG. 9 and FIG. 10 are again taken as an example. It is assumed that the weight data shown in FIG. 9 and FIG. 10 are obtained by thinning the weight data of channel 1 of filter 1 and the weight data of channel 1 of filter 2 shown in FIG. 12. Using the three data calculation arrays shown in FIG. 9 and FIG. 10, the operation results of a 11 × b 11, a 12 × b 12, a 13 × b 13, a 21 × b 21, a 22 × b 22, and a 23 × b 23 can be obtained; the sum of these results is a partial result of the weight data of channel 1 of filter 1 convolving the feature data, while the sum of the results that include a 13 × b 33 belongs to the weight data of channel 1 of filter 2 convolving {a 11, a 12, a 13, a 21, a 22, a 23, a 31, a 32, a 33}.
  • the compression module may also perform thinning processing on the target data set, deleting the zeros in the target data set.
  • a product of each feature data in the first feature data set and each weight data in the first weight data set can be obtained. After that, the corresponding product results can be added to obtain the convolution operation result of the first feature data set and the first weight data set.
  • each data calculation unit in the data calculation array adds the product of the weight data and the feature data to the data stored at the target address determined by the corresponding address calculation unit, and writes the sum back to the target address. In this way, the value finally saved at the target address is the result of the convolution operation.
  • alternatively, each data calculation unit in the data calculation array may perform only a multiplication operation, that is, multiply the weight data with the feature data and save the result to the target address determined by the corresponding address calculation unit; the multiplication results are then read from the corresponding target addresses and added to obtain the corresponding convolution operation result.
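Both modes can be sketched as follows (Python; `memory` is a hypothetical address-to-value store standing in for the storage module):

```python
def mac_into(memory, addr, weight, feature):
    """First mode: the data calculation unit multiplies and
    accumulates directly into the target address (read-modify-write),
    so the address finally holds the convolution result."""
    memory[addr] = memory.get(addr, 0) + weight * feature

def multiply_only(weight, feature):
    """Second mode: the unit only multiplies; the products are written
    out and summed later, e.g. by an addition unit in the storage
    module."""
    return weight * feature
```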
  • the result of a 11 × b 11 is stored at target address 1, the result of a 21 × b 21 at target address 2, the result of a 31 × b 31 at target address 3, the result of a 12 × b 12 at target address 4, the result of a 22 × b 22 at target address 5, the result of a 32 × b 32 at target address 6, the result of a 13 × b 13 at target address 7, the result of a 23 × b 23 at target address 8, and the result of a 33 × b 33 at target address 9.
  • the data stored in the target address 1 to the target address 9 can be added to obtain c 11 as shown in formula 1.1.
  • the storage module may include an addition unit.
  • Each data calculation unit in the data calculation array can only perform multiplication operations, that is, multiply the data with the characteristic data, and output the result of the multiplication to the storage module.
  • the storage module stores the received data at the target address determined by the corresponding address calculation unit; specifically, the received data is first added to the data already stored at the target address, and the sum is then saved to the target address. In this way, the value finally saved at the target address is the result of the convolution operation.
  • FIG. 13 is a schematic flowchart of a data processing method according to an embodiment of the present application. The method shown in FIG. 13 may be executed by the data processing apparatus shown in FIG. 2 or FIG. 14.
  • the method further includes: obtaining the addresses of the weight data in the first weight matrix and the second weight matrix; and performing an address operation using the addresses of the weight data in the first weight matrix and the second weight matrix and the addresses in the first feature data set. Determining the target data set according to the operation result of the multiplication operation includes: determining the target data set according to the operation result of the multiplication operation and the operation result of the address operation.
  • the method further includes: obtaining a third weight matrix to an n-th weight matrix in the first weight data set, where the third weight matrix to the n-th weight matrix are matrices in which the first weight matrix is rearranged in rows, and any two of the n row vectors located in the same row of the first weight matrix to the n-th weight matrix are different; obtaining the addresses of the weight data in the third to n-th weight matrices; and performing address operations using the addresses of the weight data in the third to n-th weight matrices and the addresses of the feature data in the first feature data set.
  • the target data set includes a result matrix, which is the result of a convolution operation performed on the first feature data set and the first weight data set, and the first feature data set is expressed as a first feature matrix.
  • the method further includes: determining a first target address according to the address of the weight data stored in each address calculation array, the address of the first feature data set, the size of the first feature matrix, the padding size, and the weight size, where the weight size is n rows and m columns, and the padding size is the difference between the size of the first feature data set and the size of the result matrix.
  • the method further includes: obtaining a second feature data set, removing elements with a value of 0 in the second feature data set to obtain the first feature data set; obtaining second weight data Set, removing elements with a value of 0 in the second weight data set to obtain the first weight data set; determining the address of each feature data in the first feature data set, and determining the first weight data set The address of each weight in.
  • FIG. 14 is a structural block diagram of a data processing apparatus according to an embodiment of the present application.
  • the data processing device 1400 shown in FIG. 14 includes a data processing module 1401 and a control module 1404.
  • the data processing module 1401 includes N data calculation units, where N is an integer greater than or equal to 2. The data processing module 1401 is used to: obtain a first weight matrix in a first weight data set, where the first weight matrix is represented as n rows and m columns of weight data and the data in the first weight data set comes from the same input channel, n being an integer greater than or equal to 2 and m an integer greater than or equal to 2; obtain a second weight matrix, where the second weight matrix is a matrix obtained after the first weight matrix is rearranged in rows; and perform multiplication operations of the first weight matrix and the second weight matrix with a first feature data set.
  • the control module 1404 is configured to determine a target data set according to an operation result of the multiplication operation.
  • the data processing device 1400 further includes an address processing module 1402, and the address processing module 1402 includes N address calculation units.
  • the data calculation units and the address calculation units correspond one-to-one, where the address processing module 1402 is configured to: obtain the addresses of the weight data in the first weight matrix and the second weight matrix; and perform address operations using the addresses of the weight data in the first weight matrix and the second weight matrix and the addresses in the first feature data set. The control module 1404 being configured to determine the target data set according to the operation result of the multiplication operation includes: determining the target data set according to the operation result of the multiplication operation and the operation result of the address operation.
  • the data processing module 1401 is further configured to obtain a third weight matrix to an n-th weight matrix in the first weight data set, where the third weight matrix to the n-th weight matrix are matrices in which the first weight matrix is rearranged in rows, and any two of the n row vectors located in the same row of the first weight matrix to the n-th weight matrix are different.
  • the address processing module 1402 is further configured to: obtain the addresses of the weight data in the third to n-th weight matrix; use the addresses of the weight data in the third to n-th weight matrix and The address of the feature data in the first feature data set is subjected to an address operation.
  • the target data set includes a result matrix, which is the result of a convolution operation performed on the first feature data set and the first weight data set, and the first feature data set is represented as a first feature matrix.
  • the address processing module 1402 is further configured to determine the first target address according to the address of the weight data stored in each address calculation array, the address of the first feature data set, the size of the first feature matrix, the padding size, and the weight size, where the weight size is n rows and m columns, and the padding size is the difference between the size of the first feature data set and the size of the result matrix.
  • the data processing device 1400 further includes a compression module 1403, configured to: obtain a second feature data set, and remove elements having a value of 0 in the second feature data set to obtain the first feature data A set; obtaining a second weight data set, removing elements with a value of 0 in the second weight data set to obtain the first weight data set; determining an address of each feature data in the first feature data set, An address for each weight in the first weight data set is determined.
  • the terminal device or the network device includes a hardware layer, an operating system layer running on the hardware layer, and an application layer running on the operating system layer.
  • This hardware layer includes hardware such as a central processing unit (CPU), a memory management unit (MMU), and a memory (also called main memory).
  • the operating system may be any one or more computer operating systems that implement business processing through processes, such as a Linux operating system, a Unix operating system, an Android operating system, an iOS operating system, or a Windows operating system.
  • This application layer contains applications such as browsers, address books, word processing software, and instant messaging software.
  • the embodiments of the present application do not specifically limit the specific structure of the execution subject of the method provided by the embodiments of the present application, as long as it can run a program recording the code of the method provided by the embodiments of the present application and thereby perform processing according to the method described above.
  • the method execution subject provided in the embodiments of the present application may be a terminal device or a network device, or a function module in the terminal device or the network device that can call a program and execute the program.
  • various aspects or features of the present application may be implemented as a method, apparatus, or article of manufacture using standard programming and / or engineering techniques.
  • article of manufacture encompasses a computer program accessible from any computer-readable device, carrier, or medium.
  • computer-readable media may include, but are not limited to: magnetic storage devices (e.g., hard disks, floppy disks, or magnetic tapes), optical discs (e.g., compact discs (CD), digital versatile discs (DVD), etc.), smart cards, and flash memory devices (e.g., erasable programmable read-only memory (EPROM), cards, sticks, or key drives).
  • various storage media described herein may represent one or more devices and / or other machine-readable media used to store information.
  • machine-readable medium may include, but is not limited to, wireless channels and various other media capable of storing, containing, and / or carrying instruction (s) and / or data.
  • the disclosed systems, devices, and methods may be implemented in other ways.
  • the device embodiments described above are only schematic.
  • the division of the units is only a logical function division; in actual implementation, there may be other divisions: for example, multiple units or components may be combined or integrated into another system, or some features may be ignored or not implemented.
  • the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, which may be electrical, mechanical or other forms.
  • the units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objective of the solution of this embodiment.
  • each functional unit in each embodiment of the present application may be integrated into one processing unit, or each of the units may exist separately physically, or two or more units may be integrated into one unit.
  • if the functions are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium.
  • based on this understanding, the technical solution of this application, in essence, or the part that contributes to the existing technology, or a part of the technical solution, can be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions used to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the method described in the embodiments of the present application.
  • the aforementioned storage media include: USB flash drives, mobile hard disks, read-only memories (ROMs), random access memories (RAMs), magnetic disks, optical discs, and other media that can store program code.

Abstract

Provided are a data processing method and a data processing apparatus. The data processing apparatus comprises a data processing module, and the data processing module is used for: acquiring a first weight matrix in a first weight data set, wherein the first weight matrix is represented as n rows and m columns of weight data, and data in the first weight data set are from the same input channel; acquiring a second weight matrix, wherein the second weight matrix is a matrix obtained after rearranging the first weight matrix in rows; using the first weight matrix to perform a multiplication operation with a first feature data set, wherein data in the first feature data set are from the same input channel; using the second weight matrix to perform a multiplication operation with the first feature data set; and according to the operation result of the multiplication operation, determining a target data set. The technical solution can reduce the number of times a storage device is accessed.

Description

Data processing method and device

Technical Field

The present application relates to the field of information technology, and more particularly, to a method and a data processing apparatus for processing data.

Background
Convolutional neural networks (CNNs) are the most widely used algorithms in deep learning. They are widely used in image classification, speech recognition, video understanding, face detection, and many other applications.

The core of convolutional neural network computation is the convolution operation. The amount of data that a convolution operation needs to process is usually large, so the convolution operation occupies considerable storage and computing resources. Current processors find it increasingly difficult to meet the demands of convolution operations. In addition, with the development of mobile smart devices, these devices also need to perform convolution operations, yet the computing power and storage capacity they can provide are limited. Therefore, how to improve the efficiency of the convolution operation is an urgent problem.
Summary of the Invention

The present application provides a method and a data processing apparatus for processing data, which can reduce the number of times a storage device is accessed.
In a first aspect, an embodiment of the present application provides a data processing apparatus. The data processing apparatus includes a data processing module configured to: obtain a first weight matrix in a first weight data set, where the first weight matrix is represented as n rows and m columns of weight data and the data in the first weight data set comes from the same input channel, n being an integer greater than or equal to 2 and m an integer greater than or equal to 2; obtain a second weight matrix according to the first weight matrix, where the second weight matrix is a matrix obtained after the first weight matrix is rearranged in rows; perform a first multiplication operation using the first weight matrix and a first feature data set; and perform a second multiplication operation using the second weight matrix and the first feature data set. The data processing apparatus further includes a control module configured to determine a target data set according to the operation results of the first multiplication operation and the second multiplication operation.
The target data set includes the product results between elements of the first feature data set and elements of the first weight matrix. Based on these product results, partial Cartesian products and partial convolution results of the first feature data set and the first weight matrix can be further obtained and output from the data processing apparatus, so that convolution results can be predicted with a small amount of computation at a fast rate. For example, assume the first weight matrix is a 3-row, 3-column matrix and the second weight matrix is the first weight matrix rearranged by rows. When certain 3 rows and 3 columns of data in the first feature data set are input to the data processing module and multiplied with the first weight matrix and the second weight matrix respectively, the target data set yields the convolution result of that feature data with the first weight matrix, as well as the partial convolution sum of the 3-row, 3-column feature data at the adjacent position with the first weight matrix. Because feature data at adjacent positions often exhibit continuity, the data processing apparatus can use the convolution results and partial convolution sums in the target data set to predict convolution results. For example, when the data processing apparatus uses feature data to perform object recognition according to the solution provided in this application, and the convolution results and partial convolution sums in the obtained target data set do not match the expected range of values, they can be excluded directly without subsequent computation, which saves computation. After the data processing apparatus implements object recognition according to the technical solution provided in this application, the recognition result can further be used to realize other functions, for example, sorting goods or monitoring targets.
In the above solution, the data processing apparatus obtains the second weight matrix according to the first weight matrix, where the second weight matrix is the first weight matrix rearranged by rows, and performs multiplication operations of the first weight matrix and the second weight matrix with the first feature data set. The feature data can thus be reused when obtaining the partial Cartesian products and partial convolution results of the first feature data set and the first weight matrix, which improves the efficiency of the operation.
Specifically, in the prior art, the convolution of a feature matrix and a weight matrix is computed by sliding the weight matrix over the feature matrix and multiplying the weight matrix elements with the corresponding feature data. Because the feature data in the same feature matrix often needs to be used in the multiplication operations of several sliding positions of the weight matrix, the feature data needs to be loaded multiple times in actual operation. That is, multiple read operations must be performed on the memory storing the feature data. Referring to FIG. 1, when calculating the Cartesian product of the feature data set and the weight data set, a multi-step convolution needs to be performed. When performing the first step of the convolution, the memory must be read to obtain the feature data a 21 in order to calculate the product of a 21 and b 21. When calculating the fourth step of the convolution (the weight matrix slides from top to bottom and from left to right), the memory must be read again to obtain the feature data a 21 and calculate the product of a 21 and b 11. In other words, multiple read operations are performed on the memory storing the feature data a 21, which increases overhead. In the technical solution provided by this application, by rearranging the weight matrix, the feature data can be multiplied with more weight matrix elements after being loaded once, which reduces the number of times the feature data is loaded. In addition, by calculating the products between the feature data and the elements of the first weight matrix, and between the feature data and the elements of the second weight matrix, the acquired feature data is reused. In summary, the above solution improves the efficiency of the operation.
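The repeated reads described above can be made concrete with a small counting sketch (Python; illustrative only):

```python
import numpy as np

def load_counts(feat_shape, w_shape):
    """Count how many times each feature element is read by a naive
    sliding-window convolution: once per window that covers it."""
    fh, fw = feat_shape
    n, m = w_shape
    loads = np.zeros(feat_shape, dtype=int)
    for r in range(fh - n + 1):
        for c in range(fw - m + 1):
            loads[r:r + n, c:c + m] += 1   # every element of the window is read
    return loads

# For the 5x5 feature data set and 3x3 kernel of FIG. 1, the central
# feature element is read 9 times.
```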
With reference to the first aspect, in a possible implementation of the first aspect, the data processing apparatus further includes an address processing module configured to: obtain the addresses of the weight data in the first weight matrix and the second weight matrix; and perform address operations using the addresses of the weight data in the first weight matrix and the second weight matrix and the addresses in the first feature data set. The data processing module being configured to determine the target data set according to the operation result of the multiplication operation includes: the control module being configured to determine the target data set according to the operation result of the multiplication operation and the operation result of the address operation.

This solution introduces an address processing module. By using the address processing module to calculate the addresses of the products of the weight data in the first weight matrix and the second weight matrix with the feature data in the first data set, the Cartesian product and the convolution result of the feature data and the weight matrix can further be obtained as the target data set, which extends the functions of the data processing apparatus.

With reference to the first aspect, in a possible implementation of the first aspect, the data processing module is further configured to: obtain a third weight matrix to an n-th weight matrix in the first weight data set, where the third weight matrix to the n-th weight matrix are matrices in which the first weight matrix is rearranged in rows, and any two of the n row vectors located in the same row of the first weight matrix to the n-th weight matrix are different. The address processing module is further configured to: obtain the addresses of the weight data in the third weight matrix to the n-th weight matrix; and perform address operations using the addresses of the weight data in the third to n-th weight matrices and the addresses of the feature data in the first feature data set.

In this solution, the first weight matrix with n rows is rearranged in rows to obtain n weight matrices, and any two of the n row vectors located in the same row of these n weight matrices are different. After the feature data is multiplied with these n weight matrices, the Cartesian product of the feature data and the first weight matrix is obtained, which increases the degree of reuse of the feature data and further improves the efficiency of the operation.

With reference to the first aspect, in a possible implementation of the first aspect, the target data set includes a result matrix, which is the result of a convolution operation performed on the first feature data set and the first weight data set, and the first feature data set is represented as a first feature matrix. The address processing module is further configured to determine a first target address according to the address of the weight data stored in each address calculation array, the address of the first feature data set, the size of the first feature matrix, the padding size, and the weight size, where the weight size is n rows and m columns, and the padding size includes a horizontal padding size and a vertical padding size, the horizontal padding size being (n-1)/2 and the vertical padding size being (m-1)/2.

This solution further refines the method of obtaining the target data address based on the address of the weight data and the address of the feature data, which improves the realizability of the data processing apparatus obtaining the convolution result through the Cartesian product.

With reference to the first aspect, in a possible implementation of the first aspect, the data processing apparatus further includes a compression module configured to: obtain a second feature data set, and remove elements with a value of 0 in the second feature data set to obtain the first feature data set; obtain a second weight data set, and remove elements with a value of 0 in the second weight data set to obtain the first weight data set; and determine the address of each feature data in the first feature data set and the address of each weight in the first weight data set.

This solution sparsifies the feature data and the weight data, that is, removes the elements with a value of 0 in the feature data set and the weight data set, which reduces the amount of computation of the convolution operation and thus improves the operating efficiency of the data processing apparatus.
In a second aspect, an embodiment of the present application provides a data processing method. The method includes: obtaining a first weight matrix in a first weight data set, where the first weight matrix is represented as n rows and m columns of weight data and the data in the first weight data set comes from the same input channel, n being an integer greater than or equal to 2 and m an integer greater than or equal to 2; obtaining a second weight matrix according to the first weight matrix, where the second weight matrix is a matrix obtained after the first weight matrix is rearranged in rows; performing a first multiplication operation using the first weight matrix and a first feature data set; performing a second multiplication operation using the second weight matrix and the first feature data set; and determining a target data set according to the operation results of the first multiplication operation and the second multiplication operation.

With reference to the second aspect, in a possible implementation of the second aspect, the method further includes: obtaining the addresses of the weight data in the first weight matrix and the second weight matrix; and performing address operations using the addresses of the weight data in the first weight matrix and the second weight matrix and the addresses in the first feature data set. Determining the target data set according to the operation results of the first multiplication operation and the second multiplication operation includes: determining the target data set according to the operation results of the first multiplication operation and the second multiplication operation and the operation results of the address operation.

With reference to the second aspect, in a possible implementation of the second aspect, the method further includes: obtaining a third weight matrix to an n-th weight matrix in the first weight data set, where the third weight matrix to the n-th weight matrix are matrices in which the first weight matrix is rearranged in rows, and any two of the n row vectors located in the same row of the first weight matrix to the n-th weight matrix are different; obtaining the addresses of the weight data in the third weight matrix to the n-th weight matrix; and performing address operations using the addresses of the weight data in the third to n-th weight matrices and the addresses of the feature data in the first feature data set.

With reference to the second aspect, in a possible implementation of the second aspect, the target data set includes a result matrix, which is the result of a convolution operation performed on the first feature data set and the first weight data set, and the first feature data set is represented as a first feature matrix. The method further includes: determining a first target address according to the address of the weight data stored in each address calculation array, the address of the first feature data set, the size of the first feature matrix, the padding size, and the weight size, where the weight size is n rows and m columns, and the padding size includes a horizontal padding size and a vertical padding size, the horizontal padding size being (n-1)/2 and the vertical padding size being (m-1)/2.

With reference to the second aspect, in a possible implementation of the second aspect, the method further includes: obtaining a second feature data set, and removing elements with a value of 0 in the second feature data set to obtain the first feature data set; obtaining a second weight data set, and removing elements with a value of 0 in the second weight data set to obtain the first weight data set; and determining the address of each feature data in the first feature data set and the address of each weight in the first weight data set.

In a third aspect, the present application provides a data processing apparatus. The data processing apparatus includes a processor and a memory storing program code, and the processor is configured to call the program code in the memory to execute the data processing method provided in the second aspect of this application.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of a convolution operation process in the prior art.

FIG. 2 is a structural block diagram of a data processing apparatus according to an embodiment of the present application.

FIG. 3 is a schematic diagram of a data calculation array according to an embodiment of the present application.

FIG. 4 is a structural block diagram of a data calculation unit in a data calculation array according to an embodiment of the present application.

FIG. 5 is a schematic diagram of performing a multiplication operation on a first feature data set according to an embodiment of the present application.

FIG. 6 is a schematic diagram of the addresses of a first feature data set and the addresses of a weight data set according to an embodiment of the present application.

FIG. 7 is a schematic diagram of an address calculation array according to an embodiment of the present application.

FIG. 8 is a structural block diagram of an address calculation unit in an address calculation array according to an embodiment of the present application.

FIG. 9 is a schematic diagram of weight data stored in two data calculation arrays according to an embodiment of the present application.

FIG. 10 is a schematic diagram of weight data stored in a data calculation array according to an embodiment of the present application.

FIG. 11 is a schematic diagram of a weight matrix with three filters after thinning processing according to an embodiment of the present application.

FIG. 12 is a schematic diagram of a weight matrix without thinning processing according to an embodiment of the present application.

FIG. 13 is a schematic flowchart of a data processing method according to an embodiment of the present application.

FIG. 14 is a structural block diagram of a data processing apparatus according to an embodiment of the present application.
Detailed Description

The technical solutions in this application will be described below with reference to the drawings.

In this application, "at least one" means one or more, and "multiple" means two or more. "And/or" describes the association relationship between associated objects and indicates that three relationships can exist; for example, A and/or B can represent: A alone, both A and B, and B alone, where A and B can be singular or plural. The character "/" generally indicates an "or" relationship between the associated objects. "At least one of the following" or similar expressions refer to any combination of these items, including any combination of a single item or plural items. For example, at least one of a, b, or c can represent: a, b, c, a-b, a-c, b-c, or a-b-c, where a, b, and c can each be single or multiple. In addition, in the embodiments of this application, the words "first", "second", and so on do not limit the quantity or the execution order.

It should be noted that, in this application, words such as "exemplary" or "for example" are used to mean serving as an example, illustration, or explanation. Any embodiment or design described as "exemplary" or "for example" in this application should not be construed as more preferred or more advantageous than other embodiments or designs. Rather, the use of words such as "exemplary" or "for example" is intended to present related concepts in a concrete manner.
FIG. 1 is a schematic diagram of a convolution operation process in the prior art.

FIG. 1 shows a feature data set, which includes 5×5 feature data in total. FIG. 1 also shows a weight data set, which includes 3×3 weight data in total. The weight data set can serve as a convolution kernel for a convolution operation with the feature data set.

FIG. 1 also shows two steps, with a stride of 1, of the convolution operation performed on the feature data set using the weight data set. As shown in FIG. 1, the 3×3 weight data in the weight data set needs to be multiplied with 3×3 data in the feature data set, respectively. Adding the results of the multiplication operations yields the value of one element of the convolution result. Specifically, as shown in FIG. 1, the convolution result c 11 can be expressed as formula 1.1, and the convolution result c 12 can be expressed as formula 1.2:
c 11 = a 11×b 11 + a 12×b 12 + a 13×b 13 + a 21×b 21 + a 22×b 22 + a 23×b 23 + a 31×b 31 + a 32×b 32 + a 33×b 33 (formula 1.1)

c 12 = a 12×b 11 + a 13×b 12 + a 14×b 13 + a 22×b 21 + a 23×b 22 + a 24×b 23 + a 32×b 31 + a 33×b 32 + a 34×b 33 (formula 1.2)
在完成了如图1所示的两步运算后,该特征数据集合继续向右滑动,继续下一步运算,直到遍历完整个特征数据集合。After the two-step operation shown in FIG. 1 is completed, the feature data set continues to slide to the right, and the next operation is continued until the entire feature data set is traversed.
假设集合E 1={a 11,a 12,a 13,a 21,a 22,a 23,a 31,a 32,a 33},集合F 1={b 11,b 12,b 13,b 21,b 22,b 23,b 31,b 32,b 33}。对集合E 1和集合F 1进行笛卡尔积运算,可以得到集合G 1,集合G 1可以包括如表1所示的多个乘法结果。 Suppose the set E 1 = {a 11 , a 12 , a 13 , a 21 , a 22 , a 23 , a 31 , a 32 , a 33 }, and the set F 1 = {b 11 , b 12 , b 13 , b 21 , B 22 , b 23 , b 31 , b 32 , b 33 }. Performing a Cartesian product operation on the sets E 1 and F 1 can obtain the set G 1 , and the set G 1 can include multiple multiplication results as shown in Table 1.
Table 1
a11×b11  a11×b12  a11×b13  a11×b21  a11×b22  a11×b23  a11×b31  a11×b32  a11×b33
a21×b11  a21×b12  a21×b13  a21×b21  a21×b22  a21×b23  a21×b31  a21×b32  a21×b33
a31×b11  a31×b12  a31×b13  a31×b21  a31×b22  a31×b23  a31×b31  a31×b32  a31×b33
a12×b11  a12×b12  a12×b13  a12×b21  a12×b22  a12×b23  a12×b31  a12×b32  a12×b33
a22×b11  a22×b12  a22×b13  a22×b21  a22×b22  a22×b23  a22×b31  a22×b32  a22×b33
a32×b11  a32×b12  a32×b13  a32×b21  a32×b22  a32×b23  a32×b31  a32×b32  a32×b33
a13×b11  a13×b12  a13×b13  a13×b21  a13×b22  a13×b23  a13×b31  a13×b32  a13×b33
a23×b11  a23×b12  a23×b13  a23×b21  a23×b22  a23×b23  a23×b31  a23×b32  a23×b33
a33×b11  a33×b12  a33×b13  a33×b21  a33×b22  a33×b23  a33×b31  a33×b32  a33×b33
As shown in Table 1, the Cartesian product of set E1 and set F1 includes all the multiplication results needed to calculate c11: a11×b11, a12×b12, a13×b13, a21×b21, a22×b22, a23×b23, a31×b31, a32×b32, and a33×b33. The Cartesian product of set E1 and set F1 also includes some of the multiplication results needed to calculate c12: a12×b11, a13×b12, a22×b21, a23×b22, a32×b31, and a33×b32.
Suppose set E2 = {a12, a13, a14, a22, a23, a24, a32, a33, a34}. Performing a Cartesian product operation on set E2 and set F1 yields set G2, which includes the multiplication results shown in Table 2.
Table 2
a12×b11  a12×b12  a12×b13  a12×b21  a12×b22  a12×b23  a12×b31  a12×b32  a12×b33
a22×b11  a22×b12  a22×b13  a22×b21  a22×b22  a22×b23  a22×b31  a22×b32  a22×b33
a32×b11  a32×b12  a32×b13  a32×b21  a32×b22  a32×b23  a32×b31  a32×b32  a32×b33
a13×b11  a13×b12  a13×b13  a13×b21  a13×b22  a13×b23  a13×b31  a13×b32  a13×b33
a23×b11  a23×b12  a23×b13  a23×b21  a23×b22  a23×b23  a23×b31  a23×b32  a23×b33
a33×b11  a33×b12  a33×b13  a33×b21  a33×b22  a33×b23  a33×b31  a33×b32  a33×b33
a14×b11  a14×b12  a14×b13  a14×b21  a14×b22  a14×b23  a14×b31  a14×b32  a14×b33
a24×b11  a24×b12  a24×b13  a24×b21  a24×b22  a24×b23  a24×b31  a24×b32  a24×b33
a34×b11  a34×b12  a34×b13  a34×b21  a34×b22  a34×b23  a34×b31  a34×b32  a34×b33
As shown in Table 2, the Cartesian product of set E2 and set F1 includes some of the multiplication results needed to calculate c12: a14×b13, a24×b23, and a34×b33.
The multiplication results in Table 1 and Table 2 that are not needed to calculate c11 or c12 can also be used in subsequent convolution operations.
From the foregoing analysis of the convolution operation and the Cartesian product operation, it can be seen that a convolution operation can be decomposed into Cartesian product operations. The results of one Cartesian product operation can be used in multiple steps of the convolution operation, and the result of one convolution step can be obtained by adding one or more Cartesian product results.
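A minimal sketch of this decomposition, with illustrative values: the Cartesian product G1 of Table 1 is built as a dictionary keyed by (feature position, weight position), and c11 is recovered by summing the nine entries whose feature and weight positions coincide, per Formula 1.1.

```python
import numpy as np
from itertools import product

A = np.arange(1, 26).reshape(5, 5)   # feature data set (illustrative values)
B = np.arange(1, 10).reshape(3, 3)   # weight data set (illustrative values)

# E1 is the 3x3 window of A at the top-left corner; F1 is the kernel.
E1 = {(i, j): A[i, j] for i in range(3) for j in range(3)}
F1 = {(i, j): B[i, j] for i in range(3) for j in range(3)}

# G1: every feature datum times every weight datum, as laid out in Table 1.
G1 = {(fa, fb): E1[fa] * F1[fb] for fa, fb in product(E1, F1)}

# c11 needs exactly the entries where the window position matches the
# kernel position (Formula 1.1); the other 72 products serve later steps.
c11 = sum(G1[(pos, pos)] for pos in F1)
assert c11 == (A[0:3, 0:3] * B).sum()
```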
FIG. 2 is a structural block diagram of a data processing apparatus according to an embodiment of this application. The data processing apparatus 200 shown in FIG. 2 includes a storage module 210, a data processing module 220, an address processing module 230, and a control module 240.
The storage module 210 is configured to store a first feature data set, an address of each feature data in the first feature data set, a first weight set, and an address of each weight in the first weight set.
The data processing module 220 includes N data calculation arrays. Each of the N data calculation arrays includes n×m data calculation units, where N is a positive integer greater than or equal to 2, n is a positive integer greater than or equal to 2, and m is a positive integer greater than or equal to 2.
The address processing module 230 includes N address calculation arrays. Each of the N address calculation arrays includes n×m address calculation units.
Each data calculation array is configured to obtain n×m weight data from the storage module 210 and save the obtained weight data into the n×m data calculation units of that data calculation array.
Each address calculation array is configured to obtain the addresses of n×m weight data from the storage module 210 and save the obtained addresses into the n×m address calculation units of that address calculation array. The addresses of the weight data saved by the N address calculation arrays are the addresses of the weight data saved by the N data calculation arrays. In other words, the N address calculation arrays are in one-to-one correspondence with the N data calculation arrays, and each of the N address calculation arrays saves the addresses of the weight data saved by its corresponding data calculation array. For example, assuming that the weight data saved by one of the N data calculation arrays are b11, b12, b13, b21, b22, b23, b31, b32, and b33, the address calculation array corresponding to that data calculation array saves the addresses of b11, b12, b13, b21, b22, b23, b31, b32, and b33.
The N data calculation arrays use the weight data they have saved to perform multiplication operations on the first feature data set. During the operations on the first feature data set, the weight data saved in the N data calculation arrays remain unchanged.
Similarly, the N address calculation arrays use the addresses of the weight data they have saved to perform address operations on the addresses of the first feature data set. During these address operations, the addresses of the weight data saved in the N address calculation arrays remain unchanged.
The control module 240 is configured to determine a target data set according to the results of the multiplication operations performed by the N data calculation arrays and the results of the address operations.
Therefore, a result of performing a convolution operation on the first feature data set with the weight data saved by the N data calculation arrays can be determined from the multiplication results and the address operation results. In other words, in some embodiments, the target data set may be the set of data obtained by performing a convolution operation on the first feature data set with the weight data saved by the N data calculation arrays.
The following describes, with reference to FIG. 1 and FIG. 3 to FIG. 5, how the N data calculation arrays use the saved weight data to operate on the first feature data set shown in FIG. 1.
FIG. 3 is a schematic diagram of a data calculation array according to an embodiment of this application. The data calculation array 300 shown in FIG. 3 includes nine data calculation units in total: data calculation units 311, 312, 313, 321, 322, 323, 331, 332, and 333.
It can be understood that, in addition to the data calculation units shown in FIG. 3, the data calculation array may further include an input/output unit (not shown in the figure). The input/output unit is configured to obtain the data that needs to be input into the data calculation array 300, and to deliver the data that the data calculation array 300 needs to output to the corresponding units and/or modules. For example, the input/output unit may obtain weight data and feature data from the storage module and send the obtained weight data and feature data to the corresponding data calculation units. The input/output unit is further configured to obtain the target data calculated by each data calculation unit and send the target data to the storage module.
Optionally, in some embodiments, data transfer between the calculation units in the data calculation array is unidirectional. Taking FIG. 3 as an example, the arrows connecting the data calculation units in FIG. 3 indicate the unidirectional direction of data transfer. Take the data calculation units 311, 312, and 313 as an example: the data calculation unit 311 can send data (for example, feature data) to the data calculation unit 312, but the data calculation unit 312 cannot send data to the data calculation unit 311; the data calculation unit 312 can send data to the data calculation unit 313, but the data calculation unit 313 cannot send data to the data calculation unit 312.
FIG. 4 is a structural block diagram of a data calculation unit in the data calculation array according to an embodiment of this application. As shown in FIG. 4, the data calculation unit 400 may include a storage subunit 401 and a data calculation subunit 402. It can be understood that the data calculation unit 400 may further include an input/output subunit, which is configured to obtain the data that the data calculation unit needs to obtain and to output the data that the data calculation unit needs to output.
Specifically, the data calculation array 300 shown in FIG. 3 may obtain the 3×3 weight data of the weight data set shown in FIG. 1 and save the 3×3 weight data into the 3×3 data calculation units of the data calculation array 300, respectively.
Specifically, the weight data b11 may be saved in the storage subunit of the data calculation unit 311, the weight data b12 may be saved in the storage subunit of the data calculation unit 312, the weight data b13 may be saved in the storage subunit of the data calculation unit 313, and so on. In this way, the data calculation array 300 saves the 3×3 weight data.
After the 3×3 weight data are saved, the data calculation array 300 may slide the first feature data set in one direction and use the saved weight data to perform multiplication operations on the first feature data set. While the data calculation array 300 performs the multiplication operations on the first feature data set, the weight data saved in the data calculation array 300 do not change. In other words, during these multiplication operations, the data calculation units in the data calculation array 300 do not delete the saved weight data, nor do they read and save new weight data from the storage module.
The one-directional sliding of the first feature data set may be understood with reference to FIG. 5. FIG. 5 is a schematic diagram of the multiplication process performed on the first feature data set according to an embodiment of this application. As shown in FIG. 5, the first feature data set may first be flipped by 180 degrees: the first column of the first feature data set becomes the fifth column after the flip, the second column becomes the fourth column, and so on. It should be noted that flipping the first feature data set by 180 degrees before sliding it to the right, as shown in FIG. 5, is only for ease of describing the computation of the feature data a11, a21, a31, a12, a22, a32, a13, a23, a33 with the weight data b11, b21, b31, b12, b22, b32, b13, b23, and b33. In an actual implementation, the first feature data set can be multiplied with the weight data saved in the data calculation array 300 by sliding to the right directly. The values produced when the first feature data set is multiplied by sliding to the right directly are the same as those produced when it is first flipped by 180 degrees and then slid to the right as shown in FIG. 5; only the order in which the final data are obtained differs.
The flipped first feature data set slides to the right in one direction and is multiplied with the weight data saved in the data calculation array 300. Specifically, in the first operation, the feature data a11, a21, and a31 are multiplied with the weight data b11, b21, and b31, respectively. After the first operation, the flipped first feature data set slides to the right for the second operation. In the second operation, the feature data a11, a21, and a31 are multiplied with the weight data b12, b22, and b32, respectively, and the feature data a12, a22, and a32 are multiplied with the weight data b11, b21, and b31, respectively. After the second operation, the flipped feature data set continues to slide to the right for the third operation, and so on. In the foregoing embodiment, the stride of each slide of the first feature data set is 1. Of course, in some other embodiments, the stride of each slide of the first feature data set may also be a positive integer greater than 1.
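The column-by-column schedule of this rightward slide can be sketched as follows (a behavioural model, not the hardware itself; values are illustrative): at step t, feature column c meets weight column w whenever c + w = t + 1 in one-based numbering, which reproduces the pairings of the first and second operations described above.

```python
import numpy as np

A = np.arange(1, 26).reshape(5, 5)   # feature data set (illustrative values)
B = np.arange(1, 10).reshape(3, 3)   # stationary weights held by array 300

def products_at_step(t):
    """Elementwise column products produced at slide step t (t = 1, 2, ...)."""
    prods = []
    for w in range(3):       # zero-based weight column index
        c = t - 1 - w        # feature column aligned with weight column w
        if 0 <= c < 5:
            prods.append(A[0:3, c] * B[:, w])
    return prods

step1 = products_at_step(1)  # [a11*b11, a21*b21, a31*b31]
step2 = products_at_step(2)  # column 2 meets weight column 1, column 1 meets column 2
```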
Taking the first operation as an example, the data calculation unit 311 may obtain the feature data a11 from the first feature data set saved in the storage module 210 and save the obtained feature data a11 in the storage subunit of the data calculation unit 311. In this case, the storage subunit of the data calculation unit 311 holds the weight data b11 and the feature data a11. The data calculation subunit of the data calculation unit 311 multiplies the weight data b11 and the feature data a11 saved in the storage subunit to obtain intermediate data k(11,11). The multiplication of the weight data b11 and the feature data a11 may be implemented by a multiplier in the data calculation subunit.
The data calculation unit 311 may further obtain, according to the target address determined by the address calculation unit corresponding to the data calculation unit 311, the cached data r(11,11) saved at the first target address. Specifically, the address calculation unit corresponding to the data calculation unit 311 may determine the first target address according to the address of the feature data a11 and the address of the weight data b11; the data calculation unit 311 may then obtain the current cached data r(11,11) saved at the first target address. The manner in which the address calculation unit determines the first target address is described later. The data calculation subunit adds the intermediate data k(11,11) and the current cached data r(11,11) to obtain target data d(11,11). The addition of the intermediate data k(11,11) and the current cached data r(11,11) may be implemented by an adder in the data calculation subunit. The target data d(11,11) may be saved to the first target address. In other words, the current cached data r(11,11) saved at the first target address is updated to the target data d(11,11).
Similarly, the data calculation unit 321 may determine, in the same manner, the product of the weight data b21 saved by the data calculation unit 321 and the feature data a21 (hereinafter referred to as intermediate data k(21,21)). The target address determined by the address calculation unit corresponding to the data calculation unit 321 is also the first target address. The data calculation unit 321 adds the intermediate data k(21,21) and the current cached data saved at the first target address (at this point the current cached data has been updated to the target data d(11,11)) to obtain target data d(21,21). The target data d(21,21) may be saved to the first target address. In other words, the current cached data d(11,11) saved at the first target address is updated to the target data d(21,21).
The data calculation unit 331 may determine, in the same manner, the product of the weight data b31 saved by the data calculation unit 331 and the feature data a31 (hereinafter referred to as intermediate data k(31,31)). The target address determined by the address calculation unit corresponding to the data calculation unit 331 is also the first target address. The data calculation unit 331 adds the intermediate data k(31,31) and the current cached data saved at the first target address (at this point the current cached data has been updated to the target data d(21,21)) to obtain target data d(31,31). The target data d(31,31) may be saved to the first target address. In other words, the current cached data d(21,21) saved at the first target address is updated to the target data d(31,31).
After the first operation, the target data saved at the first target address is a11×b11 + a21×b21 + a31×b31.
In a similar manner, the data calculation array 300 may continue to use the weight data saved by its data calculation units to operate on the first feature data set.
After the third operation, the data saved at the first target address is a11×b11 + a21×b21 + a31×b31 + a12×b12 + a22×b22 + a32×b32. That is, during the third operation, the target address determined by the address calculation units corresponding to the data calculation units 312, 322, and 332 is also the first target address. Therefore, after the third operation, the target data saved at the first target address is the sum of the data saved at the first target address after the first operation, a12×b12 determined by the data calculation unit 312, a22×b22 determined by the data calculation unit 322, and a32×b32 determined by the data calculation unit 332. After the fifth operation, the data saved at the first target address is a11×b11 + a21×b21 + a31×b31 + a12×b12 + a22×b22 + a32×b32 + a13×b13 + a23×b23 + a33×b33. That is, during the fifth operation, the target address determined by the address calculation units corresponding to the data calculation units 313, 323, and 333 is also the first target address. Therefore, after the fifth operation, the target data saved at the first target address is the sum of the data saved at the first target address after the third operation, a13×b13 determined by the data calculation unit 313, a23×b23 determined by the data calculation unit 323, and a33×b33 determined by the data calculation unit 333.
In this way, after five operations, the data saved at the first target address is exactly the convolution result c11 shown in Formula 1.1. Similarly, the convolution operation of the first feature data set and the weight data set can be completed by using the multiplication results together with the address operation results.
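This accumulate-by-address behaviour can be checked with a small software model (a sketch that assumes a 3×3 "valid" output region; values are illustrative): each product of a feature datum at position (fi, fj) with a weight at position (wi, wj) is added into a buffer entry keyed by the output position (fi−wi, fj−wj), and the entry standing in for the first target address ends up holding exactly c11.

```python
import numpy as np
from collections import defaultdict

A = np.arange(1, 26).reshape(5, 5)   # feature data set (illustrative values)
B = np.arange(1, 10).reshape(3, 3)   # weight data set (illustrative values)

# Accumulation buffer standing in for the target addresses.
buf = defaultdict(int)
for fi in range(5):
    for fj in range(5):
        for wi in range(3):
            for wj in range(3):
                oi, oj = fi - wi, fj - wj          # target position of this product
                if 0 <= oi <= 2 and 0 <= oj <= 2:  # keep the 3x3 valid outputs
                    buf[(oi, oj)] += A[fi, fj] * B[wi, wj]

# The first target address has gathered all nine products of Formula 1.1.
assert buf[(0, 0)] == (A[0:3, 0:3] * B).sum()   # c11
```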
The following describes, with reference to FIG. 1 and FIG. 3 to FIG. 8, how the N address calculation arrays use the addresses of the saved weight data to perform address operations on the addresses of the first feature data set shown in FIG. 1.
FIG. 6 is a schematic diagram of the addresses of the first feature data set and the addresses of the weight data set according to an embodiment of this application. The addresses of the first feature data set shown in FIG. 6 are the addresses of the first feature data set shown in FIG. 1. Specifically, the address Add_a11 is the address of the feature data a11, the address Add_a12 is the address of the feature data a12, and so on. The addresses of the weight data set shown in FIG. 6 are the addresses of the weight data set shown in FIG. 1. Specifically, the address Add_b11 is the address of the weight data b11, the address Add_b12 is the address of the weight data b12, and so on.
FIG. 7 is a schematic diagram of an address calculation array according to an embodiment of this application. The address calculation array 700 shown in FIG. 7 includes nine address calculation units in total: address calculation units 711, 712, 713, 721, 722, 723, 731, 732, and 733.
It can be understood that, in addition to the address calculation units shown in FIG. 7, the address calculation array may further include an input/output unit (not shown in the figure). The input/output unit is configured to obtain the data that needs to be input into the address calculation array 700, and to deliver the data that the address calculation array 700 needs to output to the corresponding units and/or modules. For example, the input/output unit may obtain the addresses of the weight data and the addresses of the feature data from the storage module and send the obtained addresses to the corresponding address calculation units. The input/output unit is further configured to obtain the target address calculated by each address calculation unit and send the target address to the corresponding data calculation unit.
The N address calculation arrays are in one-to-one correspondence with the N data calculation arrays. One-to-one correspondence here means that each of the N data calculation arrays corresponds to one of the N address calculation arrays, and different data calculation arrays correspond to different address calculation arrays. For example, suppose N equals 3, the three data calculation arrays are data calculation arrays 1, 2, and 3, and the three address calculation arrays are address calculation arrays 1, 2, and 3. Data calculation array 1 corresponds to address calculation array 1, data calculation array 2 corresponds to address calculation array 2, and data calculation array 3 corresponds to address calculation array 3. The address calculation array corresponding to a data calculation array is used to calculate the target address of each target data in that data calculation array. Further, the data calculation units in a data calculation array are also in one-to-one correspondence with the address calculation units in the corresponding address calculation array. Assuming that the data calculation array shown in FIG. 3 corresponds to the address calculation array shown in FIG. 7, the data calculation unit 311 corresponds to the address calculation unit 711, the data calculation unit 312 corresponds to the address calculation unit 712, the data calculation unit 313 corresponds to the address calculation unit 713, and so on. An address calculation unit is used to determine the address of the target data of its corresponding data calculation unit. Specifically, the first target address at which the cached data r(11,11) obtained by the data calculation unit 311 is located, as described above, is obtained by the address calculation unit 711 through an address operation.
FIG. 8 is a structural block diagram of an address calculation unit in the address calculation array according to an embodiment of this application. As shown in FIG. 8, the address calculation unit 800 may include a storage subunit 801 and an address calculation subunit 802. It can be understood that the address calculation unit 800 may further include an input/output subunit, which is configured to obtain the data that the address calculation unit needs to obtain and to output the data that the address calculation unit needs to output.
Specifically, the address calculation array 700 shown in FIG. 7 may obtain the addresses of the 3×3 weight data from the addresses of the weight data set shown in FIG. 6 and save the addresses of the 3×3 weight data into the 3×3 address calculation units of the address calculation array 700, respectively.
Specifically, the address Add_b11 may be saved in the storage subunit of the address calculation unit 711, the address Add_b12 may be saved in the storage subunit of the address calculation unit 712, the address Add_b13 may be saved in the storage subunit of the address calculation unit 713, and so on. In this way, the address calculation array 700 saves the addresses of the 3×3 weight data.
After the addresses of the 3×3 weight data are saved, the address calculation array 700 may slide the addresses of the first feature data set in one direction and use the saved addresses of the weight data to perform address operations on the addresses of the first feature data set. While the address calculation array 700 performs the address operations, the addresses of the weight data saved in the address calculation array 700 do not change. In other words, during these address operations, the address calculation units in the address calculation array 700 do not delete the saved addresses of the weight data, nor do they read and save the addresses of new weight data from the storage module.
The process of sliding the addresses of the first feature data set to the right in one direction for address calculation is similar to the process of sliding the first feature data set to the right for the multiplication operations, and is not repeated here.
The following describes how an address calculation unit performs an address operation.
For ease of description, the address of the weight obtained by the address calculation unit 800 is referred to below as the address of the first weight, the address of the feature data obtained by the address calculation unit 800 is referred to as the address of the first feature data, and the address obtained after the address calculation unit 800 performs the address operation is referred to as the first target address.
In addition to obtaining the address of the first feature data and the address of the first weight data from the storage module, the input/output subunit of the address calculation unit 800 may further obtain the following information: the size of the input data corresponding to the first feature data set, the padding size, and the weight size. The weight size is the size of the address calculation array to which the address calculation unit 800 belongs, and the padding size is a preset size. In this example, the weight size is 3×3. The size of the input data corresponding to the first feature data set, the padding size, and the weight size may also be saved in the storage subunit 801 of the address calculation unit 800. The address calculation subunit 802 may determine the first target address according to the address of the first weight data, the address of the first feature data, the size of the input data corresponding to the first feature data set, the padding size, and the weight size.
Assume that the size of the input picture is a rows and b columns, and the size of the convolution kernel is n rows and m columns; then the size of the output picture after convolution is (a−n+1)×(b−m+1). This causes two problems: first, the size of the output picture shrinks after each convolution operation; second, the pixels at the corners and edges of the original picture are used fewer times in the output, so the output picture loses much of the information at the edge positions.
To solve these problems, the original picture can be padded at its boundary before the convolution operation to increase the size of the matrix. Usually 0 is used as the padding value.
Let the numbers of pixels extended in the horizontal and vertical directions be p and q, respectively; then the size of the padded picture is (a+2p)×(b+2q). With the kernel size kept at n rows and m columns, the output picture size becomes (a+2p−n+1)×(b+2q−m+1). The numbers p and q of pixels extended in each direction are the padding sizes. For the output picture to keep the same size as the input picture, the horizontal padding size p must equal (n−1)/2 and the vertical padding size q must equal (m−1)/2.
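A short sketch of these size relations (the helper names are illustrative; odd kernel dimensions are assumed so that (n−1)/2 and (m−1)/2 are integers):

```python
def padding_for_same_size(n, m):
    """Padding (p, q) that keeps the output the same size as the input
    for an n x m kernel: p = (n - 1) / 2, q = (m - 1) / 2."""
    return (n - 1) // 2, (m - 1) // 2

def output_size(a, b, n, m, p, q):
    """Output size (a + 2p - n + 1) x (b + 2q - m + 1) derived above."""
    return a + 2 * p - n + 1, b + 2 * q - m + 1

p, q = padding_for_same_size(3, 3)               # (1, 1) for the 3x3 kernel
assert output_size(5, 5, 3, 3, p, q) == (5, 5)   # a 5x5 input stays 5x5
assert output_size(5, 5, 3, 3, 0, 0) == (3, 3)   # without padding it shrinks
```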
The address calculation subunit 802 may determine the target address specifically according to the following formula:
result_cord = (input_cord / input_size_x − w_cord / kernel_size_x + padding_size_x) × input_size_y + (input_cord % input_size_y − w_cord % kernel_size_y + padding_size_y)    (Formula 1.3)
where % denotes the remainder operation, result_cord denotes the target address, input_cord denotes the address of the feature data, input_size_x denotes the horizontal dimension of the input data corresponding to the first feature data set, input_size_y denotes the vertical dimension of the input data corresponding to the first feature data set, w_cord denotes the address of the weight data, kernel_size_x denotes the horizontal dimension of the weight size, kernel_size_y denotes the vertical dimension of the weight size, padding_size_x denotes the horizontal padding size, and padding_size_y denotes the vertical padding size.
The address of the feature data and the address of the weight data in Formula 1.3 are absolute addresses. An absolute address is the absolute position of the feature data/weight data in the corresponding feature data set/weight data set. Assuming that the feature data set includes X feature data, the absolute address of the x-th feature data among the X feature data is x−1, where X is a positive integer greater than 1 and x is a positive integer greater than or equal to 1 and less than or equal to X. For example, if the feature data set is 5, 0, 0, 32, 0, 0, 0, 0, 23, the absolute addresses of the feature data 5, 32, and 23 are 0, 3, and 8, respectively. The absolute addresses listed above refer to the positions of the feature data in the feature data set and can be converted into an address composed of a horizontal coordinate and a vertical coordinate according to the dimensions of the feature matrix. Similarly, the absolute address of the weight data can also be converted into an address composed of a horizontal coordinate and a vertical coordinate.
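The conversion from an absolute address to coordinates follows directly from the row-major layout; a sketch (reading the nine-element example above as a 3×3 matrix is an assumption made here for illustration):

```python
def abs_to_coord(addr, num_cols):
    """Row-major absolute address -> (row, column) coordinates."""
    return addr // num_cols, addr % num_cols

# Feature data set 5, 0, 0, 32, 0, 0, 0, 0, 23 read as a 3x3 matrix:
# 5, 32 and 23 sit at absolute addresses 0, 3 and 8.
assert abs_to_coord(0, 3) == (0, 0)   # feature datum 5
assert abs_to_coord(3, 3) == (1, 0)   # feature datum 32
assert abs_to_coord(8, 3) == (2, 2)   # feature datum 23
```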
Optionally, in some embodiments, the address calculation subunit 802 may also determine the target address according to the following formula:
result_cord = ((base_input + input_cord) / input_size_x − (base_w + w_cord) / kernel_size_x + padding_size_x) × input_size_y + ((base_input + input_cord) % input_size_y − (base_w + w_cord) % kernel_size_y + padding_size_y)    (Formula 1.4)
where % denotes the remainder operation, result_cord denotes the target address, input_cord denotes the address of the feature data, input_size_x denotes the horizontal dimension of the input data corresponding to the first feature data set, input_size_y denotes the vertical dimension of the input data corresponding to the first feature data set, w_cord denotes the address of the weight data, kernel_size_x denotes the horizontal dimension of the weight size, kernel_size_y denotes the vertical dimension of the weight size, padding_size_x denotes the horizontal padding size, padding_size_y denotes the vertical padding size, base_input denotes the base address of the addresses of the feature data, and base_w denotes the base address of the addresses of the weight data.
The address of the feature data and the address of the weight data in Formula 1.4 are relative addresses. A relative address is the position of the feature data/weight data in the corresponding feature data set/weight data set relative to the address of the first feature data/weight data. Assuming that the address of the first feature data in the feature data set is Y, the address of the y-th feature data in the feature data set is Y+y−1, where Y and y are both positive integers greater than or equal to 1.
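Formulas 1.3 and 1.4 transcribe directly into Python (a sketch: "/" is read as integer division and "%" as the remainder, consistent with the row/column decomposition of a row-major absolute address). The sanity check below uses the 5×5 input and 3×3 kernel of FIG. 1 with padding (1, 1): the products a11×b11, a22×b22, and a33×b33 all belong to c11, so their target addresses coincide.

```python
def target_address(input_cord, w_cord,
                   input_size_x, input_size_y,
                   kernel_size_x, kernel_size_y,
                   padding_size_x, padding_size_y):
    """Formula 1.3 on absolute addresses."""
    return ((input_cord // input_size_x
             - w_cord // kernel_size_x
             + padding_size_x) * input_size_y
            + (input_cord % input_size_y
               - w_cord % kernel_size_y
               + padding_size_y))

def target_address_relative(base_input, input_cord, base_w, w_cord, **sizes):
    """Formula 1.4: the same computation on relative addresses, offset by
    the base addresses of the feature data and of the weight data."""
    return target_address(base_input + input_cord, base_w + w_cord, **sizes)

sizes = dict(input_size_x=5, input_size_y=5,
             kernel_size_x=3, kernel_size_y=3,
             padding_size_x=1, padding_size_y=1)
# a11 (address 0) with b11 (address 0), a22 (6) with b22 (4),
# a33 (12) with b33 (8): all three products map to the same target address.
assert (target_address(0, 0, **sizes)
        == target_address(6, 4, **sizes)
        == target_address(12, 8, **sizes))
assert target_address_relative(0, 6, 0, 4, **sizes) == target_address(6, 4, **sizes)
```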
Optionally, in some embodiments, after determining the target address, the address calculation unit may directly send the target address to the corresponding data calculation unit. The data calculation unit may determine the cached data at the target address according to the target address.
Optionally, in other embodiments, after determining the target address, the address calculation unit may determine the cached data at the target address and then send the cached data together with the target address to the corresponding data calculation unit.
The foregoing describes how one data calculation array performs multiplication operations and how one address calculation array performs address operations.
As described above, the data processing apparatus may include two or more data calculation arrays and the corresponding address calculation arrays.
The weight data set shown in FIG. 1 includes only 3×3 weight data, and only one weight data set is used for the convolution operation on the feature data set. Optionally, in other embodiments, two or more weight data sets may also be used for the convolution operation on the feature data set.
Optionally, in some embodiments, each of the N data calculation arrays may obtain and save one weight data set and use the saved weight data to perform multiplication operations on the first feature data set. Correspondingly, each of the N address calculation arrays may obtain and save the addresses of the corresponding weight data and use the saved addresses of the weight data to perform address operations on the addresses of the first feature data set.
If the number of weight data sets used for the convolution operation on the feature data set is greater than N, the N data calculation arrays may obtain N weight data sets at a time to perform multiplication operations on the first feature data set. If the number of weight data sets that can be obtained at one time is less than N, all the remaining weight data sets are obtained to perform multiplication operations on the first feature data set. Suppose N is 4 and the number of weight data sets is 9. In this case, the four data calculation arrays may first obtain the first to fourth weight data sets and multiply them with the first feature data set, then obtain the fifth to eighth weight data sets and multiply them with the first feature data set, and finally obtain the ninth weight data set and multiply it with the first feature data set. The N address calculation arrays perform the address operations in a similar manner, which is not repeated here.
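A sketch of this batching (the schedule function and zero-based indices are illustrative, not part of this application):

```python
def schedule(num_weight_sets, N):
    """Split the weight data sets into passes of at most N sets; each pass
    loads its sets into the N data calculation arrays and slides over the
    whole first feature data set."""
    return [list(range(start, min(start + N, num_weight_sets)))
            for start in range(0, num_weight_sets, N)]

# N = 4 arrays and 9 weight data sets: sets 1-4, then 5-8, then set 9.
assert schedule(9, 4) == [[0, 1, 2, 3], [4, 5, 6, 7], [8]]
```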
Optionally, in other embodiments, the weight data saved by different data calculation arrays among the N data calculation arrays may be the result of rearranging the same weight data by rows. For example, suppose that the N data calculation arrays include a first data calculation array and a second data calculation array; the n×m weight data saved by the second data calculation array are the n×m weight data obtained by rearranging, by rows, the n×m weight data saved by the first data calculation array.
FIG. 9 is a schematic diagram of the weight data saved by two data calculation arrays according to an embodiment of this application.
As shown in FIG. 9, data calculation array 1 saves 3×3 weight data, where the first row of weight data is b11, b12, and b13; the second row is b21, b22, and b23; and the third row is b31, b32, and b33. Data calculation array 2 saves 3×3 weight data, where the first row of weight data is b31, b32, and b33; the second row is b11, b12, and b13; and the third row is b21, b22, and b23. It can be seen that the weight data saved by data calculation array 2 is the result of rearranging, by rows, the weight data saved by data calculation array 1. Correspondingly, the weight data saved by data calculation array 1 can also be regarded as the result of rearranging, by rows, the weight data saved by data calculation array 2. For ease of description, the weight data obtained after such row rearrangement is referred to below as rearranged weight data, and the weight data saved by the two data calculation arrays shown in FIG. 9 are said to be row rearrangements of each other.
FIG. 9 shows the relationship between the weight data saved by two data calculation arrays. Optionally, in some embodiments, the weight data saved by any two of three or more data calculation arrays are also row rearrangements of each other. For example, the N data calculation arrays further include data calculation array 3 shown in FIG. 10, which saves 3×3 weight data, where the first row of weight data is b21, b22, and b23; the second row is b31, b32, and b33; and the third row is b11, b12, and b13. It can be seen that the weight data saved by data calculation array 1 shown in FIG. 9 and the weight data saved by data calculation array 3 are row rearrangements of each other, and the weight data saved by data calculation array 2 and data calculation array 3 are also row rearrangements of each other. In summary, if the value of N is greater than or equal to n and the weight data includes n rows in total, the weight data can be rearranged at most n−1 times. Among n data calculation arrays of the N data calculation arrays, the weight data saved by the 2nd to n-th data calculation arrays are all weight data obtained by rearranging, by rows, the weight data saved by the 1st data calculation array, where among the n weight data sets saved in the n data calculation arrays, any two row vectors located in the same row position are different. N is a positive integer greater than or equal to n. In this case, the first data calculation array and the second data calculation array are any two of the n data calculation arrays. In other words, the first row of weight data saved by each of the n data calculation arrays appears in the remaining n−1 data calculation arrays as the second row of weight data through the n-th row of weight data, respectively.
Optionally, in some embodiments, data calculation array 2 and data calculation array 3 may first obtain the 3×3 weight data shown in FIG. 1 and then perform data rearrangement to obtain the rearranged weight data.
Optionally, in other embodiments, the storage module may save the rearranged weight data, and data calculation array 2 and data calculation array 3 obtain the rearranged weight data directly from the storage module.
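The row rearrangement itself amounts to a cyclic rotation of the rows; a sketch (with numeric labels standing in for b11 through b33) that reproduces the layouts of FIG. 9 and FIG. 10:

```python
import numpy as np

def row_rearrangements(W):
    """The n cyclic row rearrangements of an n-row weight matrix; any two
    of them place a different row vector at each row position."""
    n = W.shape[0]
    return [np.roll(W, shift, axis=0) for shift in range(n)]

B = np.array([[11, 12, 13],
              [21, 22, 23],
              [31, 32, 33]])    # labels standing in for b11 ... b33

arr1, arr2, arr3 = row_rearrangements(B)
# arr1: rows b1*, b2*, b3*  (data calculation array 1, FIG. 9)
# arr2: rows b3*, b1*, b2*  (data calculation array 2, FIG. 9)
# arr3: rows b2*, b3*, b1*  (data calculation array 3, FIG. 10)
```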
It can be understood that, since the data calculation arrays correspond to the address calculation arrays, the addresses of the weight data saved by the second address calculation array corresponding to the second data calculation array are also the result of rearranging, by rows, the addresses of the weight data saved by the first address calculation array corresponding to the first data calculation array.
Similarly, if the value of N is greater than or equal to n and the weight data includes n rows in total, the addresses of the weight data also include n rows. The addresses of the weight data can be rearranged at most n−1 times. Among n address calculation arrays of the N address calculation arrays, the addresses of the weight data saved by the 2nd to n-th address calculation arrays are all addresses obtained by rearranging, by rows, the addresses of the weight data saved by the 1st address calculation array. N is a positive integer greater than or equal to n. In this case, the first address calculation array and the second address calculation array are any two of the n address calculation arrays. In other words, the addresses of the first row of weight data saved by each of the n address calculation arrays appear in the remaining n−1 address calculation arrays as the addresses of the second row of weight data through the addresses of the n-th row of weight data, respectively.
After the weight data and the corresponding addresses of the weight data are rearranged by rows, the feature data can be reused, further reducing the number of times the data calculation arrays and the address calculation arrays access the storage module.
For example, in the process of performing the convolution operation on the feature data set shown in FIG. 1 using the weight data set shown in FIG. 1, the operation result shown in Formula 1.5 also needs to be determined:
c21 = a21×b11 + a22×b12 + a23×b13 + a31×b21 + a32×b22 + a33×b23 + a41×b31 + a42×b32 + a43×b33    (Formula 1.5)
If the weight data saved by the second data calculation array after the rearrangement is as shown in FIG. 9, a partial result of Formula 1.5 can be obtained after one access to the storage module.
Specifically, when data calculation array 2 shown in FIG. 9 uses its saved weight data to multiply with the feature data set, the operation results a21×b11, a22×b12, a23×b13, a31×b21, a32×b22, and a33×b23 can be obtained. According to the operation rules described earlier, the sum of these six operation results is saved to the same target address.
Suppose that the data processing apparatus includes only data calculation array 1 and data calculation array 2, and the weight data saved by data calculation array 1 and data calculation array 2 are as shown in FIG. 9. In the process of using data calculation array 1 and data calculation array 2 to multiply with the feature data set shown in FIG. 1, after data calculation array 1 and data calculation array 2 have multiplied with the first to third rows of feature data of the feature data set, they can multiply with the third to fifth rows of feature data of the feature data set. In other words, in the process of traversing the feature data set for the multiplication operations, the downward sliding stride can be 2. If the weight data were not rearranged (in other words, if the data processing apparatus had only data calculation array 1 shown in FIG. 9), then to obtain the operation results a21×b11, a22×b12, a23×b13, and so on, data calculation array 1 would have to multiply with the second to fourth rows of feature data after completing the multiplication with the first to third rows. This multiplication would require fetching the second to third rows of feature data of the feature data set again. In other words, the second to third rows of feature data would have to be read a second time to obtain the operation results a21×b11, a22×b12, a23×b13, and so on, so the same feature data would have to be read multiple times.
由于对该权值数据进行重排，该数据计算阵列2对该特征数据集合的第二行至第三行特征数据进行乘法运算的运算结果就相当于使用该数据计算阵列1以步长为1向下滑动后对第二行至第三行的特征数据进行乘法运算的运算结果。换句话说，该特征数据集合的第二行至第三行特征数据只要被读取一次，就可以实现两个权值数据集合对该第二行至第三行特征数据的乘法运算。这样可以通过对特征数据的一次读取即可获得更多的部分笛卡尔积。由于在实践中，也有利用特征数据集合与权值数据集合的部分笛卡尔积来进行预测的做法，因此通过将权值数据按行重排，将特征数据集合分别与原权值数据和重排后的权值数据进行乘法运算，并根据其结果得到包括部分笛卡尔积在内的目标数据集合，可以减少对存储模块的访问次数，并提高数据处理的速度。Because the weight data is rearranged, the result of data calculation array 2 multiplying the second and third rows of feature data of the feature data set is equivalent to the result obtained by sliding data calculation array 1 down with a step size of 1 and then multiplying the second and third rows of feature data. In other words, as long as the second and third rows of feature data are read once, the multiplication of two weight data sets with those rows can be realized. In this way, more partial Cartesian products can be obtained with a single read of the feature data. Since in practice predictions are also made using partial Cartesian products of the feature data set and the weight data set, rearranging the weight data by rows, multiplying the feature data set by both the original weight data and the rearranged weight data, and obtaining from the results a target data set that includes partial Cartesian products can reduce the number of accesses to the storage module and increase the speed of data processing.
而当具有n行的第一权值矩阵重排(n-1)次，所得到的n个权值矩阵的位于同一行的n个行向量中的任意两个行向量不相同时，将特征数据集合与该n个权值矩阵进行乘法运算后，则可以得到特征数据集合与第一权值矩阵的笛卡尔积，并可以进一步得到特征数据集合与第一权值矩阵的卷积，而特征数据集合中的每个特征数据都只要被加载到数据处理单元中一次即可。When the first weight matrix with n rows is rearranged (n-1) times, and any two of the n row vectors located in the same row of the resulting n weight matrices are different, then after the feature data set is multiplied with the n weight matrices, the Cartesian product of the feature data set and the first weight matrix can be obtained, and further the convolution of the feature data set and the first weight matrix can be obtained, while each feature data in the feature data set needs to be loaded into the data processing unit only once.
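To make this property concrete, here is a minimal NumPy sketch (an illustrative reading of the scheme, not the patented hardware; the matrix values are invented). It builds the n weight matrices as cyclic row shifts of the first one, checks that the row vectors occupying any given row position are pairwise distinct, and verifies that one multiplication pass over a block of n feature rows covers every pairing of feature row and weight row, so each feature row is loaded once:

```python
import numpy as np

n, m = 3, 3  # weight matrix of n rows and m columns, as in the description

# Hypothetical first weight matrix W0 (values are arbitrary illustrations).
W0 = np.arange(1, n * m + 1, dtype=np.int64).reshape(n, m)

# The n weight matrices: W0 plus its (n-1) row rearrangements, taken here
# as cyclic row shifts, so that the row vectors occupying any given row
# position are pairwise distinct.
Ws = [np.roll(W0, -k, axis=0) for k in range(n)]
for r in range(n):
    rows = [tuple(W[r]) for W in Ws]
    assert len(set(rows)) == n  # any two row vectors in row r differ

# A feature block of n rows (one "window" that is read from memory once).
F = np.arange(10, 10 + n * m, dtype=np.int64).reshape(n, m)

# One elementwise multiplication of the block with each of the n matrices
# yields every (feature row, weight row) pairing, i.e. the row-level
# Cartesian product, although each feature row was loaded only once.
pairs = set()
for k, W in enumerate(Ws):
    for r in range(n):
        # W[r] is row (r + k) mod n of W0.
        pairs.add((r, (r + k) % n))
        _ = F[r] * W[r]  # the products a_{r,*} x b_{(r+k) mod n,*}
assert pairs == {(i, j) for i in range(n) for j in range(n)}
print("all", n * n, "row pairings covered with one load per feature row")
```

This is also why the sliding step size grows with the number of rearrangements, as summarized further below.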
图10是本申请实施例提供的一个数据计算阵列保存的权值数据的示意图。FIG. 10 is a schematic diagram of weight data stored in a data calculation array according to an embodiment of the present application.
在使用如图10所示的权值数据对如图1所示的特征数据集合的第一行至第三行特征数据进行乘法运算的过程中，可以获得a31×b11、a32×b12和a33×b13的运算结果。该第一数据计算阵列、该第二数据计算阵列和该第三数据计算阵列在对该特征数据集合的第一行至第三行特征数据进行乘法运算后，可以对该特征数据集合的第四行至第五行特征数据进行乘法运算。换句话说，在遍历该特征数据集合进行乘法运算的过程中，向下滑动的步长可以为3。In the process of using the weight data shown in FIG. 10 to multiply the first to third rows of feature data of the feature data set shown in FIG. 1, the operation results of a31×b11, a32×b12 and a33×b13 can be obtained. After the first data calculation array, the second data calculation array and the third data calculation array multiply the first to third rows of feature data of the feature data set, they can multiply the fourth and fifth rows of feature data. In other words, in the process of traversing the feature data set for multiplication, the step size for sliding down can be 3.
假设该N个数据计算阵列中存在三个数据计算阵列，该三个数据计算阵列分别为如图9所示的数据计算阵列1、数据计算阵列2以及图10所示的数据计算阵列3，则该三个数据计算阵列可以完成对特征数据集合的笛卡尔积运算。Assuming that among the N data calculation arrays there are three data calculation arrays, namely data calculation array 1 and data calculation array 2 shown in FIG. 9 and data calculation array 3 shown in FIG. 10, the three data calculation arrays can complete the Cartesian product operation on the feature data set.
还以特征数据a11、a21、a31、a12、a22、a32、a13、a23、a33为例。这三个数据计算阵列可以分别与特征数据a11、a21、a31、a12、a22、a32、a13、a23、a33进行如图5所示的乘法运算过程。这三个数据计算阵列使用各自保存的权值数据完成对特征数据a11、a21、a31、a12、a22、a32、a13、a23、a33的乘法运算的运算结果如表1所示。The feature data a11, a21, a31, a12, a22, a32, a13, a23 and a33 are again taken as an example. The three data calculation arrays can each perform the multiplication process shown in FIG. 5 with the feature data a11, a21, a31, a12, a22, a32, a13, a23, a33. The operation results of the multiplications that the three data calculation arrays complete on this feature data using their respectively stored weight data are shown in Table 1.
综上所述，若权值数据共包括n行，则最多可以对该权值数据进行n-1次重排。若对权值数据进行一次重排，在遍历该特征数据集合进行乘法运算的过程中，向下滑动的步长可以为2；若对权值数据进行两次重排，向下滑动的步长可以为3；若对权值数据进行n-1次重排，向下滑动的步长可以为n。In summary, if the weight data includes a total of n rows, the weight data can be rearranged at most n-1 times. If the weight data is rearranged once, the step size for sliding down while traversing the feature data set for multiplication can be 2; if the weight data is rearranged twice, the step size for sliding down can be 3; and if the weight data is rearranged n-1 times, the step size for sliding down can be n.
可选的，在一些实施例中，该第一特征数据集合是第二特征数据集合经过稀疏化处理后得到的特征数据集合。该第一权值数据集合是经过稀疏化处理后得到的权值数据集合。如图2所示的数据处理装置200还可以包括压缩模块。该压缩模块用于获取第二特征数据集合，并对该第二特征数据集合进行稀疏化处理得到该第一特征数据集合，该第二特征数据集合包括对应于输入数据的特征数据。该压缩模块还用于获取第二权值数据集合，对该第二权值数据集合进行稀疏化处理得到该第一权值数据集合。该压缩模块还用于确定该第一特征数据集合中的每个特征数据的地址，确定该第一权值数据集合中的每个权值数据的地址。该压缩模块将获取到的第一特征数据集合、第一权值数据集合、该第一特征数据集合中的每个特征数据的地址以及该第一权值数据集合中的每个权值数据的地址发送至存储模块，由该存储模块保存。若稀疏化后的权值数据个数少于n×m，则将剩余位补0。Optionally, in some embodiments, the first feature data set is a feature data set obtained by thinning the second feature data set, and the first weight data set is a weight data set obtained after thinning. The data processing apparatus 200 shown in FIG. 2 may further include a compression module. The compression module is configured to obtain a second feature data set and perform thinning processing on the second feature data set to obtain the first feature data set, where the second feature data set includes feature data corresponding to the input data. The compression module is further configured to obtain a second weight data set and perform thinning processing on the second weight data set to obtain the first weight data set. The compression module is further configured to determine the address of each feature data in the first feature data set and the address of each weight data in the first weight data set. The compression module sends the obtained first feature data set, first weight data set, address of each feature data in the first feature data set, and address of each weight data in the first weight data set to the storage module, which saves them. If the number of weight data after thinning is less than n×m, the remaining positions are padded with zeros.
本申请实施例中所称的输入数据可以是任何能够进行乘法运算、笛卡尔积运算和/或卷积运算的数据。例如，可以是图像数据、语音数据等。输入数据是输入到数据处理装置的全部数据的统称。输入数据可以由特征数据组成。对应于该输入数据的特征数据可以是该输入数据包括的全部数据，也可以是该输入数据的部分特征数据。以图像数据为例，假设输入数据是一整幅图像，该图像的所有数据被称为特征数据。该第二特征数据集合可以包括该输入数据的全部特征数据，也可以是该图像经过一些处理后的全部或部分特征数据。例如，该第二特征数据可以是该图像经过分割后得到的部分图像的特征数据。The input data referred to in the embodiments of the present application may be any data on which a multiplication operation, a Cartesian product operation and/or a convolution operation can be performed, for example, image data or speech data. The input data is a collective term for all the data input to the data processing device, and may consist of feature data. The feature data corresponding to the input data may be all the data included in the input data, or part of the feature data of the input data. Taking image data as an example, assume that the input data is an entire image; all the data of the image is called feature data. The second feature data set may include all the feature data of the input data, or all or part of the feature data of the image after some processing. For example, the second feature data may be the feature data of a partial image obtained after the image is segmented.
假设第二特征数据集合包括：5,0,0,32,0,0,0,0,23,0,0,0,0,0,43,54,0,0,0,1,4,9,34,0,0,0,0,0,0,87,0,0,0,0,5,8，则稀疏化后得到的第一特征数据集合包括：5,32,23,43,54,1,4,9,34,87,5,8。假设该第二特征数据集合中的第一个特征数据的地址为0，第二个特征数据的地址为1，第三个特征数据的地址为2，第n个特征数据的地址为n-1，则该第一特征数据集合的地址（绝对地址）为：0,3,8,14,15,19,20,21,22,29,34,35。Assume that the second feature data set includes: 5, 0, 0, 32, 0, 0, 0, 0, 23, 0, 0, 0, 0, 0, 43, 54, 0, 0, 0, 1, 4, 9, 34, 0, 0, 0, 0, 0, 0, 87, 0, 0, 0, 0, 5, 8. The first feature data set obtained after thinning then includes: 5, 32, 23, 43, 54, 1, 4, 9, 34, 87, 5, 8. Assume that the address of the first feature data in the second feature data set is 0, the address of the second is 1, the address of the third is 2, and the address of the n-th is n-1. The addresses (absolute addresses) of the first feature data set are then: 0, 3, 8, 14, 15, 19, 20, 21, 22, 29, 34, 35.
假设第二权值数据集合包括：8,4,0,0,0,0,2,0,0,0,0,0,0,0,0,0,24,54,0,0,0,0,0,12,0,0,22,3,45,0,0,0,0,67,44,0,0,0,0,0,0,0,0,35,65,75，则稀疏化后的第二权值数据集合包括：8,4,2,24,54,12,22,3,45,67,44,35,65,75。可以看出稀疏化后的第二权值数据集合包括14个权值数据。假设每个数据计算阵列包括3×3个数据计算单元，因此稀疏化后的第二权值数据集合的权值数据数目少于2个数据计算阵列包括的数据计算单元数目。因此，在稀疏化后的第二权值数据集合最后补充4个0，得到该第一权值数据集合。因此，对应于该第二权值数据集合的第一权值数据集合为：8,4,2,24,54,12,22,3,45,67,44,35,65,75,0,0,0,0。假设该第二权值数据集合中的第一个权值数据的地址为0，第二个权值数据的地址为1，第三个权值数据的地址为2，第n个权值数据的地址为n-1，则该第一权值数据集合的地址（绝对地址）为：0,1,6,16,17,23,26,27,28,33,34,43,44,45。Assume that the second weight data set includes: 8, 4, 0, 0, 0, 0, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 24, 54, 0, 0, 0, 0, 0, 12, 0, 0, 22, 3, 45, 0, 0, 0, 0, 67, 44, 0, 0, 0, 0, 0, 0, 0, 0, 35, 65, 75. The second weight data set after thinning then includes: 8, 4, 2, 24, 54, 12, 22, 3, 45, 67, 44, 35, 65, 75. It can be seen that the thinned second weight data set includes 14 weight data. Assume that each data calculation array includes 3×3 data calculation units; the number of weight data in the thinned second weight data set is therefore smaller than the number of data calculation units included in two data calculation arrays. Accordingly, 4 zeros are appended at the end of the thinned second weight data set to obtain the first weight data set. The first weight data set corresponding to the second weight data set is thus: 8, 4, 2, 24, 54, 12, 22, 3, 45, 67, 44, 35, 65, 75, 0, 0, 0, 0. Assume that the address of the first weight data in the second weight data set is 0, the address of the second is 1, the address of the third is 2, and the address of the n-th is n-1. The addresses (absolute addresses) of the first weight data set are then: 0, 1, 6, 16, 17, 23, 26, 27, 28, 33, 34, 43, 44, 45.
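The compression step can be illustrated with a short, self-contained sketch (an illustrative model rather than the patented circuit; the helper names sparsify and pad_weights are invented). It removes the zeros, records the absolute address of each retained element, and pads the retained weight data with zeros up to a whole number of 3×3 arrays, reproducing the numbers of the two examples above:

```python
def sparsify(data):
    """Drop zero elements and keep (values, absolute addresses); the address
    of the k-th element of the dense set is k (0-based), as in the examples."""
    values = [v for v in data if v != 0]
    addresses = [i for i, v in enumerate(data) if v != 0]
    return values, addresses

def pad_weights(values, array_size=9):
    """Pad the sparsified weight data with zeros so its length is a whole
    multiple of the n*m data calculation units of one array (3*3 = 9 here)."""
    remainder = len(values) % array_size
    if remainder:
        values = values + [0] * (array_size - remainder)
    return values

features = [5,0,0,32,0,0,0,0,23,0,0,0,0,0,43,54,0,0,0,1,
            4,9,34,0,0,0,0,0,0,87,0,0,0,0,5,8]
weights  = [8,4,0,0,0,0,2,0,0,0,0,0,0,0,0,0,24,54,0,0,
            0,0,0,12,0,0,22,3,45,0,0,0,0,67,44,0,0,0,0,0,
            0,0,0,35,65,75]

f_vals, f_addr = sparsify(features)
w_vals, w_addr = sparsify(weights)
assert f_addr == [0,3,8,14,15,19,20,21,22,29,34,35]
assert w_addr == [0,1,6,16,17,23,26,27,28,33,34,43,44,45]
assert len(w_vals) == 14
assert pad_weights(w_vals) == w_vals + [0,0,0,0]  # 14 -> 18 = 2 arrays of 9
```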
在一些实施例中,该第一特征数据集合也可以是未经过稀疏化处理的特征数据集合。换句话说,该第一特征数据集合可以与第二特征数据集合相等。In some embodiments, the first feature data set may also be a feature data set that has not been thinned. In other words, the first feature data set may be equal to the second feature data set.
上述实施例中的第一特征数据集合对应于一个矩阵,相应的,用于对该第一特征数据集合进行卷积运算的权值数据也是对应于一个矩阵。换句话说,以上实施例所描述的卷积运算是二维卷积运算。The first feature data set in the above embodiment corresponds to a matrix, and accordingly, the weight data used to perform the convolution operation on the first feature data set also corresponds to a matrix. In other words, the convolution operation described in the above embodiment is a two-dimensional convolution operation.
本申请实施例的技术方案也可以应用到T维乘法运算、笛卡尔积运算和/或卷积运算(T为大于或等于3的正整数)。此外,用于对第一特征数据集合进行卷积运算的权值数据集合也可以是多个。The technical solutions of the embodiments of the present application can also be applied to T-dimensional multiplication operations, Cartesian product operations, and / or convolution operations (T is a positive integer greater than or equal to 3). In addition, there may be multiple weight data sets for performing a convolution operation on the first feature data set.
下面以三维卷积运算为例对本申请的技术方案进行描述。The technical solution of the present application is described below by taking a three-dimensional convolution operation as an example.
若对应于第一特征数据集合的输入数据是一个彩色图像数据，则该第一特征数据集合可以是一个三维张量。可以对该第一特征数据集合进行三维卷积运算。If the input data corresponding to the first feature data set is color image data, the first feature data set may be a three-dimensional tensor. A three-dimensional convolution operation may be performed on the first feature data set.
该第一特征数据集合包括三个子集合:特征数据子集合1、特征数据子集合2和特征数据子集合3。该三个子集合的特征数据分别对应于红、绿、蓝三个输入通道。该三个子集合中的每个子集合中的特征数据可以对应于一个矩阵。The first feature data set includes three subsets: feature data subset 1, feature data subset 2, and feature data subset 3. The feature data of the three subsets correspond to the three input channels of red, green, and blue, respectively. The feature data in each of the three subsets may correspond to a matrix.
假设使用三个权值数据集合对该第一特征数据集合进行三维卷积运算。用于对特征数据集合进行卷积运算的权值数据集合也可以称为过滤器(Filter)。因此,该三个权值数据集合可以称为过滤器1、过滤器2和过滤器3。该三个权值数据集合中的每个权值数据集合包括三个权值通道,分别为通道1、通道2和通道3。三个权值通道中的每个权值通道所包括的权值数据可以对应于一个矩阵。该三个权值通道与三个特征数据子集一一对应。例如,通道1对应于特征数据子集合1,通道2对应于特征数据子集合2,通道 3对应于特征数据子集合3。权值通道可以对对应的特征数据子集合进行卷积运算。过滤器1、过滤器2和过滤器3可以分别对该第一特征数据集合进行三维卷积运算。也就是说,过滤器1的通道1对该第一特征数据集合的特征数据子集合1进行卷积运算,过滤器1的通道2对该第一特征数据集合的特征数据子集合2进行卷积运算,过滤器1的通道3对该第一特征数据集合的特征数据子集合3进行卷积运算;过滤器2的通道1对该第一特征数据集合的特征数据子集合1进行卷积运算,过滤器2的通道2对该第一特征数据集合的特征数据子集合2进行卷积运算,过滤器2的通道3对该第一特征数据集合的特征数据子集合3进行卷积运算;过滤器3的通道1对该第一特征数据集合的特征数据子集合1进行卷积运算,过滤器3的通道2对该第一特征数据集合的特征数据子集合2进行卷积运算,过滤器3的通道3对该第一特征数据集合的特征数据子集合3进行卷积运算。It is assumed that a three-dimensional convolution operation is performed on the first feature data set using three weight data sets. The weight data set used to perform the convolution operation on the feature data set may also be referred to as a filter. Therefore, the three weight data sets can be referred to as filter 1, filter 2, and filter 3. Each of the three weight data sets includes three weight channels, namely channel 1, channel 2 and channel 3. The weight data included in each of the three weight channels may correspond to a matrix. The three weight channels correspond one-to-one with the three feature data subsets. For example, channel 1 corresponds to feature data subset 1, channel 2 corresponds to feature data subset 2, and channel 3 corresponds to feature data subset 3. The weight channel can perform a convolution operation on the corresponding feature data subset. The filter 1, filter 2 and filter 3 may respectively perform a three-dimensional convolution operation on the first feature data set. That is, channel 1 of filter 1 performs a convolution operation on the characteristic data subset 1 of the first characteristic data set, and channel 2 of filter 1 performs a convolution on the characteristic data subset 2 of the first characteristic data set. Operation, channel 3 of filter 1 performs a convolution operation on the characteristic data subset 3 of the first characteristic data set; channel 1 of filter 2 performs a convolution operation on the characteristic data subset 1 of the first characteristic data set, Channel 2 of filter 2 performs a convolution operation on the feature data subset 2 of the first feature data set, and channel 3 of filter 2 performs a convolution operation on the feature data subset 3 of the first feature data set; filter Channel 1 of 3 performs a convolution operation on the feature data subset 1 of the first feature data set, and channel 2 of filter 3 performs a convolution operation on the feature data subset 2 of the first feature data set. Channel 3 performs a convolution operation on the feature data subset 3 of the first feature data set.
由此可见,三个过滤器中的每个过滤器对该第一特征数据集合进行三维卷积运算的过程可以分解为三个二维卷积运算过程。这三个二维卷积运算的具体实现方式与上述实施例中二维卷积运算的具体实现方式类似。以通道1对特征数据子集合1进行卷积运算为例,通道1可以认为是如图1所示的权值数据集合,特征数据子集合1可以认为是如图1所示的特征数据集合。通道1对特征数据子集合进行卷积运算的过程就是如图1所示的权值数据集合对特征数据集合进行卷积运算的过程。如上所述,卷积运算过程可以分解为乘法运算和加法运算。因此,如图2所示的数据处理装置也可以进行三维卷积运算。在特征数据集合对应的输入数据是一个三维张量的情况下,上述实施例中所称的第一特征数据集合可以认为是对应于三维张量的特征数据集合中的一个特征数据子集合。在使用多个权值数据集合对该特征数据集合进行卷积运算的情况下,该第一权值数据集合可以认为多个权值数据集合中的一个权值数据集合。在该权值数据集合也对应于三维张量的情况下,该第一权值数据集合可以认为是三维张量的权值数据集合中的一个通道。It can be seen that the process of performing a three-dimensional convolution operation on the first feature data set by each of the three filters can be decomposed into three two-dimensional convolution operation processes. The specific implementations of the three two-dimensional convolution operations are similar to the specific implementations of the two-dimensional convolution operation in the foregoing embodiment. Taking channel 1 for convolution operation on the characteristic data subset 1 as an example, channel 1 can be considered as the weight data set shown in FIG. 1, and the characteristic data subset 1 can be considered as the characteristic data set shown in FIG. 1. The process of performing a convolution operation on the feature data subset by channel 1 is a process of performing a convolution operation on the feature data set by the weight data set shown in FIG. 1. As described above, the convolution operation process can be decomposed into a multiplication operation and an addition operation. Therefore, the data processing apparatus shown in FIG. 2 can also perform a three-dimensional convolution operation. In the case where the input data corresponding to the feature data set is a three-dimensional tensor, the first feature data set referred to in the above embodiment may be considered as a feature data subset in the feature data set corresponding to the three-dimensional tensor. In a case where a convolution operation is performed on the feature data set using multiple weight data sets, the first weight data set may be considered as one weight data set among the multiple weight data sets. In the case where the weight data set also corresponds to a three-dimensional tensor, the first weight data set can be considered as a channel in the weight data set of the three-dimensional tensor.
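The decomposition just described can be sketched compactly. The following NumPy sketch (for illustration only; all shapes and values are invented) convolves a three-channel feature tensor with each of three filters by performing one two-dimensional convolution per channel and summing the per-channel results:

```python
import numpy as np

def conv2d_valid(feature, weight):
    """Plain 2-D convolution (cross-correlation, 'valid' mode, stride 1)."""
    fh, fw = feature.shape
    wh, ww = weight.shape
    out = np.zeros((fh - wh + 1, fw - ww + 1), dtype=feature.dtype)
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(feature[r:r + wh, c:c + ww] * weight)
    return out

rng = np.random.default_rng(0)
# Hypothetical three-channel feature tensor (e.g. red/green/blue), 5x5 each.
features = rng.integers(0, 5, size=(3, 5, 5))
# Three filters, each with three weight channels of 3x3 weight data.
filters = rng.integers(0, 3, size=(3, 3, 3, 3))

# A 3-D convolution of one filter decomposes into three 2-D convolutions,
# one per channel, whose results are added elementwise.
outputs = []
for f in filters:
    channel_results = [conv2d_valid(features[ch], f[ch]) for ch in range(3)]
    outputs.append(sum(channel_results))

print(np.stack(outputs).shape)  # (3, 3, 3): one 3x3 result per filter
```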
三维以上的多维卷积运算的过程与三维卷积运算过程类似,在此就不必重复描述。The process of multi-dimensional convolution operations above 3D is similar to the process of 3D convolution operations, and it is unnecessary to repeat the description here.
可选的，在使用多个权值数据集合对该特征数据集合进行卷积运算的情况下，该第一权值数据集合还可以是对多个权值数据集合进行稀疏化处理后得到的权值数据集合。具体地，该第一权值数据集合所包括的非0权值数据来自于同一个权值数据集合的一个通道，或者不同权值数据集合的同一个通道。Optionally, in the case of performing a convolution operation on the feature data set using multiple weight data sets, the first weight data set may also be a weight data set obtained by performing thinning processing on the multiple weight data sets. Specifically, the non-zero weight data included in the first weight data set comes from one channel of the same weight data set, or from the same channel of different weight data sets.
下面将结合图11对多个权值数据集合进行稀疏化处理进行描述。The following describes the thinning processing performed on multiple weight data sets with reference to FIG. 11.
图11是本申请实施例提供的具有3个过滤器并进行稀疏化处理的权值矩阵示意图。如图11所示的3个过滤器中每个过滤器包括3个权值通道,每个权值通道包括3×3个权值数据。FIG. 11 is a schematic diagram of a weight matrix with three filters and thinning processing provided in an embodiment of the present application. Each of the three filters shown in FIG. 11 includes three weight channels, and each weight channel includes 3 × 3 weight data.
如图11所示，权值数据集合1的权值数据来自于过滤器1和过滤器2的通道1中的权值数据，权值数据集合4的权值数据来自于过滤器2和过滤器3的通道1中的权值数据。权值数据集合2的权值数据来自于过滤器1和过滤器2的通道2中的权值数据，权值数据集合5的权值数据来自于过滤器2和过滤器3的通道2中的权值数据。权值数据集合3的权值数据来自于过滤器1和过滤器2的通道3中的权值数据，权值数据集合6的权值数据来自于过滤器2和过滤器3的通道3中的权值数据。As shown in FIG. 11, the weight data of weight data set 1 comes from the weight data in channel 1 of filter 1 and filter 2, and the weight data of weight data set 4 comes from the weight data in channel 1 of filter 2 and filter 3. The weight data of weight data set 2 comes from the weight data in channel 2 of filter 1 and filter 2, and the weight data of weight data set 5 comes from the weight data in channel 2 of filter 2 and filter 3. The weight data of weight data set 3 comes from the weight data in channel 3 of filter 1 and filter 2, and the weight data of weight data set 6 comes from the weight data in channel 3 of filter 2 and filter 3.
如图11所示，权值数据来自于不同权值数据集合的同一个通道是指权值数据可以是属于不同过滤器的，但是在不同的过滤器中的通道是相同的。如权值数据集合4的权值数据来自于过滤器2的通道1中的权值数据以及过滤器3的通道1中的权值数据。As shown in FIG. 11, weight data coming from the same channel of different weight data sets means that the weight data may belong to different filters, but the channel within each filter is the same. For example, the weight data of weight data set 4 comes from the weight data in channel 1 of filter 2 and the weight data in channel 1 of filter 3.
为便于描述，以下将对多个过滤器中的权值数据进行稀疏化处理后得到的权值数据集合称为稀疏化权值数据集合。For ease of description, the weight data set obtained by thinning the weight data in multiple filters is hereinafter referred to as the sparsified weight data set.
在一些实施例中,稀疏化权值数据集合包括的权值数据可以来自于同一个过滤器。该稀疏化权值数据集合对特征数据进行乘法运算的运算过程,以及根据乘法运算的运算结果确定该稀疏化权值数据集合与特征数据的卷积运算结果的过程与上述实施例相同,在此就不必赘述。In some embodiments, the weight data included in the sparse weight data set may come from the same filter. The process of multiplying feature data by the sparse weighted data set and the process of determining the result of the convolution operation of the set of sparsely weighted data set and feature data according to the operation result of the multiplication are the same as the above embodiments, here No need to repeat them.
在一些实施例中，稀疏化权值数据集合包括的权值数据可以来自于不同的过滤器。该稀疏化权值数据集合对特征数据进行乘法运算的运算过程与上述实施例相同，在此就不必赘述。在稀疏化权值数据集合包括的权值数据来自于不同的过滤器的情况下，根据乘法运算的运算结果确定该稀疏化权值数据集合与特征数据的卷积运算结果的过程与上述实施例并不完全相同。In some embodiments, the weight data included in the sparsified weight data set may come from different filters. The operation process of multiplying the feature data by the sparsified weight data set is the same as in the above embodiments and need not be repeated here. However, in the case where the weight data included in the sparsified weight data set comes from different filters, the process of determining the convolution operation result of the sparsified weight data set and the feature data according to the operation results of the multiplications is not exactly the same as in the above embodiments.
具体地，假设该稀疏化权值数据集合包括的权值数据来自于P个过滤器（P为大于或等于2的正整数）。该稀疏化权值数据集合可以划分为P个稀疏化权值数据子集合，该P个稀疏化权值数据子集合中的第p个稀疏化权值数据子集合包括来自于该P个过滤器中的第p个过滤器的权值数据，p=1,……,P。假设第p个稀疏化权值数据子集合包括Num_p个权值数据，其中Num_p为大于或等于1的正整数，且Num_p小于n×m。Specifically, assume that the weight data included in the sparsified weight data set comes from P filters (P is a positive integer greater than or equal to 2). The sparsified weight data set can be divided into P sparsified weight data subsets, where the p-th of the P subsets includes the weight data from the p-th of the P filters, p = 1, ..., P. Assume that the p-th sparsified weight data subset includes Num_p weight data, where Num_p is a positive integer greater than or equal to 1 and Num_p is less than n×m.
利用该N个数据计算阵列对该稀疏化权值数据集合与该特征数据集合进行笛卡尔积运算，可以得到每个过滤器与该特征数据集合进行卷积运算所需的乘法结果，再将相应的乘法结果相加，则可以得到每个过滤器与该特征数据集合的卷积运算结果。By using the N data calculation arrays to perform a Cartesian product operation between the sparsified weight data set and the feature data set, the multiplication results required for each filter to perform a convolution operation with the feature data set can be obtained; adding the corresponding multiplication results then yields the convolution operation result of each filter with the feature data set.
还以图9和图10所示的三个数据计算阵列保存的权值数据为例。假设图9和图10所示的权值数据是基于图12所示的过滤器1的通道1的权值数据与过滤器2的通道1的权值数据进行稀疏化后得到的。利用图9和图10所示的三个数据计算阵列对{a11,a12,a13,a21,a22,a23,a31,a32,a33}进行笛卡尔积运算，可以得到如下运算结果：a11×b11、a12×b12、a13×b13、a21×b21、a22×b22、a23×b23、a11×b31、a12×b32、a13×b33。可以看出a11×b11、a12×b12、a13×b13、a21×b21、a22×b22、a23×b23的和为过滤器1的通道1的权值数据对{a11,a12,a13,a21,a22,a23,a31,a32,a33}进行卷积运算的运算结果；a11×b31、a12×b32、a13×b33的和为过滤器2的通道1的权值数据对{a11,a12,a13,a21,a22,a23,a31,a32,a33}进行卷积运算的运算结果。The weight data stored in the three data calculation arrays shown in FIG. 9 and FIG. 10 is again taken as an example. Assume that the weight data shown in FIG. 9 and FIG. 10 is obtained by sparsifying the weight data of channel 1 of filter 1 and the weight data of channel 1 of filter 2 shown in FIG. 12. Using the three data calculation arrays shown in FIG. 9 and FIG. 10 to perform a Cartesian product operation on {a11, a12, a13, a21, a22, a23, a31, a32, a33}, the following operation results can be obtained: a11×b11, a12×b12, a13×b13, a21×b21, a22×b22, a23×b23, a11×b31, a12×b32, a13×b33. It can be seen that the sum of a11×b11, a12×b12, a13×b13, a21×b21, a22×b22 and a23×b23 is the result of the weight data of channel 1 of filter 1 convolving {a11, a12, a13, a21, a22, a23, a31, a32, a33}, and the sum of a11×b31, a12×b32 and a13×b33 is the result of the weight data of channel 1 of filter 2 convolving {a11, a12, a13, a21, a22, a23, a31, a32, a33}.
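The per-filter bookkeeping in this example can be sketched as follows (an illustrative model only; the tags and numeric values are invented, and only the assignment of b11–b23 to filter 1 and b31–b33 to filter 2 follows the example above). Each sparsified weight carries its filter of origin, the Cartesian-product terms are computed once, and the partial sums are then reduced per filter:

```python
# Sparsified channel-1 weight data of two hypothetical filters (values are
# invented; positions b11..b23 belong to filter 1 and b31..b33 to filter 2).
weights = {
    "b11": ("filter1", 2), "b12": ("filter1", 3), "b13": ("filter1", 4),
    "b21": ("filter1", 5), "b22": ("filter1", 6), "b23": ("filter1", 7),
    "b31": ("filter2", 8), "b32": ("filter2", 9), "b33": ("filter2", 1),
}
# Invented feature data values for a11..a33.
features = {"a11": 1, "a21": 2, "a31": 3, "a12": 4, "a22": 5,
            "a32": 6, "a13": 7, "a23": 8, "a33": 9}

# The nine products listed in the example above, reduced per filter so that
# each filter's convolution result is the sum of its own products only.
products = [("a11", "b11"), ("a12", "b12"), ("a13", "b13"),
            ("a21", "b21"), ("a22", "b22"), ("a23", "b23"),
            ("a11", "b31"), ("a12", "b32"), ("a13", "b33")]

partial_sums = {}
for a, b in products:
    filt, w = weights[b]
    partial_sums[filt] = partial_sums.get(filt, 0) + features[a] * w

print(partial_sums)  # e.g. {'filter1': ..., 'filter2': ...}
```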
进一步，该压缩模块还可以对目标数据集合进行稀疏化处理，将该目标数据集合中的0删除。Further, the compression module may also perform thinning processing on the target data set, removing the zeros from the target data set.
通过上述技术方案,可以得到第一特征数据集合中的每个特征数据与第一权值数据集合中的每个权值数据的乘积。在此之后,可以将对应的乘积结果相加就可以得到该第一特征数据集合与该第一权值数据集合的卷积运算结果。Through the above technical solution, a product of each feature data in the first feature data set and each weight data in the first weight data set can be obtained. After that, the corresponding product results can be added to obtain the convolution operation result of the first feature data set and the first weight data set.
此外，上述实施例中，在根据笛卡尔积的运算结果和地址运算结果确定该第一特征数据集合与该第一权值数据集合的卷积运算结果的过程中，数据计算阵列中的每个数据计算单元是将权值数据与特征数据的乘积与对应的地址计算单元确定的目标地址中保存的数据相加，并将相加后的数据写回到该目标地址。这样，该目标地址最终保存的结果就是卷积运算结果。In addition, in the above embodiments, in the process of determining the convolution operation result of the first feature data set and the first weight data set according to the Cartesian product operation results and the address operation results, each data calculation unit in the data calculation array adds the product of the weight data and the feature data to the data stored at the target address determined by the corresponding address calculation unit, and writes the sum back to that target address. In this way, the value finally stored at the target address is the convolution operation result.
在另一些实施例中，数据计算阵列中的每个数据计算单元可以只进行乘法运算，即将权值数据与特征数据相乘，并将相乘的结果保存到对应的地址计算单元确定的目标地址，然后从对应的目标地址获取乘法结果，并将获取到的乘法结果相加，得到对应的卷积运算结果。例如，a11×b11的结果保存在目标地址1，a21×b21的结果保存在目标地址2，a31×b31的结果保存在目标地址3，a12×b12的结果保存在目标地址4，a22×b22的结果保存在目标地址5，a32×b32的结果保存在目标地址6，a13×b13的结果保存在目标地址7，a23×b23的结果保存在目标地址8，a33×b33的结果保存在目标地址9。在计算卷积结果时，可以将保存在目标地址1至目标地址9中的数据相加，即得到如公式1.1所示的c11。In other embodiments, each data calculation unit in the data calculation array may perform only a multiplication operation, that is, multiply the weight data with the feature data and save the multiplied result to the target address determined by the corresponding address calculation unit; the multiplication results are then fetched from the corresponding target addresses and added to obtain the corresponding convolution operation result. For example, the result of a11×b11 is stored at target address 1, the result of a21×b21 at target address 2, the result of a31×b31 at target address 3, the result of a12×b12 at target address 4, the result of a22×b22 at target address 5, the result of a32×b32 at target address 6, the result of a13×b13 at target address 7, the result of a23×b23 at target address 8, and the result of a33×b33 at target address 9. When calculating the convolution result, the data stored at target address 1 to target address 9 can be added to obtain c11 as shown in Formula 1.1.
在另一些实施例中，存储模块可以包括一个加法单元。数据计算阵列中的每个数据计算单元可以只进行乘法运算，即将权值数据与特征数据相乘，并将相乘的结果输出到存储模块。存储模块在将接收到的数据存储到与该数据计算单元对应的地址计算单元确定的目标地址时，先将接收到的数据与该目标地址中保存的数据相加，再将相加后的数据保存到该目标地址。这样，该目标地址最终保存的结果就是卷积运算的结果。In other embodiments, the storage module may include an addition unit. Each data calculation unit in the data calculation array may perform only multiplication, that is, multiply the weight data with the feature data and output the result to the storage module. When storing the received data at the target address determined by the address calculation unit corresponding to that data calculation unit, the storage module first adds the received data to the data stored at that target address and then saves the sum to the target address. In this way, the value finally stored at the target address is the result of the convolution operation.
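All three accumulation variants above share the same index arithmetic: the product of feature datum a(r+k-1, c+l-1) and weight b(k, l) is accumulated into result position (r, c). The following is a minimal sketch of the read-modify-write variant (an illustrative model inferred from Formulas 1.1 and 1.4, not the hardware implementation):

```python
import numpy as np

def scatter_accumulate_conv(feature, weight):
    """Model of the read-modify-write accumulation: each product of a weight
    and a feature datum is added into the data already stored at its target
    address, so each target address ends up holding one convolution result."""
    fh, fw = feature.shape
    wh, ww = weight.shape
    oh, ow = fh - wh + 1, fw - ww + 1
    out = np.zeros((oh, ow), dtype=feature.dtype)  # the target addresses
    for fr in range(fh):
        for fc in range(fw):
            for wr in range(wh):
                for wc in range(ww):
                    # Product a_{fr,fc} x b_{wr,wc} targets output position
                    # (fr-wr, fc-wc) (0-based); products falling outside the
                    # result matrix are dropped.
                    tr, tc = fr - wr, fc - wc
                    if 0 <= tr < oh and 0 <= tc < ow:
                        out[tr, tc] += feature[fr, fc] * weight[wr, wc]
    return out

rng = np.random.default_rng(1)
A = rng.integers(0, 5, size=(5, 5))
B = rng.integers(0, 5, size=(3, 3))
# Cross-check against a direct sliding-window computation.
ref = np.array([[np.sum(A[r:r+3, c:c+3] * B) for c in range(3)]
                for r in range(3)])
assert np.array_equal(scatter_accumulate_conv(A, B), ref)
```

The store-then-sum variant differs only in buffering each product at its target address and deferring the additions, and the adder-in-storage variant moves the same addition into the storage module.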
图13是根据本申请实施例提供的一种数据处理方法的示意性流程图。图13所示的方法可以由图2或图14所示的数据处理装置执行。FIG. 13 is a schematic flowchart of a data processing method according to an embodiment of the present application. The method shown in FIG. 13 may be executed by the data processing apparatus shown in FIG. 2 or FIG. 14.
1301,获取第一权值数据集合中的第一权值矩阵,其中,该第一权值矩阵被表示为n行m列个权值数据,该第一权值数据集合中的数据来自相同的输入通道,其中,n为大于或等于2的整数,m为大于或等于2的整数。1301: Obtain a first weight matrix in a first weight data set, where the first weight matrix is represented as n rows and m columns of weight data, and the data in the first weight data set is from the same Input channel, where n is an integer greater than or equal to 2 and m is an integer greater than or equal to 2.
1302,获取第二权值矩阵,其中,该第二权值矩阵是对该第一权值矩阵进行按行重排后的矩阵。1302: Obtain a second weight matrix, where the second weight matrix is a matrix after the first weight matrix is rearranged in rows.
1303,使用第一权值矩阵与第一特征数据集合进行乘法运算,其中,该第一特征数据集合中的数据来自相同的输入通道。1303. Use a first weight matrix to perform a multiplication operation with a first feature data set, where the data in the first feature data set comes from the same input channel.
1304,使用该第二权值矩阵与该第一特征数据集合进行乘法运算。1304. Perform a multiplication operation using the second weight matrix and the first feature data set.
1305,根据该乘法运算的运算结果,确定目标数据集合。1305. Determine a target data set according to an operation result of the multiplication operation.
图13所示方法的各个步骤的具体实现方式,可以参见图2至图12的描述,在此就不必赘述。For a specific implementation manner of each step of the method shown in FIG. 13, reference may be made to the description of FIGS. 2 to 12, and details are not described herein again.
可选的,在一些实施例中,该方法还包括:获取该第一权值矩阵和第二权值矩阵中的权值数据的地址;使用该第一权值矩阵和第二权值矩阵中的权值数据的地址与该第一特征数据集合中的地址进行地址运算;该根据该乘法运算的运算结果,确定目标数据集合,包括:根据该乘法运算的运算结果以及该地址运算的运算结果,确定目标数据集合。上述各个步骤的具体实现方式,也可以参见图2至图12的描述,在此就不必赘述。Optionally, in some embodiments, the method further includes: obtaining addresses of weight data in the first weight matrix and the second weight matrix; using the first weight matrix and the second weight matrix An address operation is performed on the address of the weight data and the address in the first characteristic data set; the determining the target data set according to the operation result of the multiplication operation includes: according to the operation result of the multiplication operation and the operation result of the address operation To determine the target data set. For specific implementation manners of the foregoing steps, reference may also be made to the description of FIG. 2 to FIG. 12, and details are not described herein again.
可选的,在一些实施例中,该方法还包括:获取该第一权值数据集合中的第三权值矩阵至第n权值矩阵,其中,该第三权值矩阵至第n权值矩阵为对该第一权值矩阵按行重排后的矩阵,且该第一权值矩阵至第n权值矩阵的位于同一行的n个行向量中的任意两个行向量不相同;获取该第三权值矩阵至第n权值矩阵中的权值数据的地址;使用该第三至第n权值矩阵的权值数据的地址与该第一特征数据集合中的特征数据的地址进行地址运算。上述各个步骤的具体实现方式,也可以参见图2至图12的描述,在此就不必赘述。Optionally, in some embodiments, the method further includes: obtaining a third weight matrix to an n-th weight matrix in the first weight data set, wherein the third weight matrix to the n-th weight matrix The matrix is a matrix in which the first weight matrix is rearranged in rows, and any two row vectors of n row vectors in the same row of the first weight matrix to the nth weight matrix are different; obtain The addresses of the weight data in the third to n-th weight matrices; using the addresses of the weight data in the third to n-th weight matrices and the addresses of the feature data in the first feature data set Address calculation. For specific implementation manners of the foregoing steps, reference may also be made to the description of FIG. 2 to FIG. 12, and details are not described herein again.
可选的，在一些实施例中，该目标数据集合包括结果矩阵，该结果矩阵是该第一特征数据集合与该第一权值数据集合进行卷积运算的结果，该第一特征数据集合被表示为第一特征矩阵，该方法还包括：根据每个地址计算阵列保存的权值数据的地址、第一特征数据集合的地址、对应于该第一特征矩阵的尺寸、填充尺寸和权值尺寸，确定第一目标地址，其中，该权值尺寸为n行m列，该填充尺寸为该第一特征数据集合的尺寸与该结果矩阵的尺寸的差值。Optionally, in some embodiments, the target data set includes a result matrix, the result matrix being the result of a convolution operation between the first feature data set and the first weight data set, and the first feature data set is represented as a first feature matrix. The method further includes: determining a first target address according to the address of the weight data stored in each address calculation array, the address of the first feature data set, the size corresponding to the first feature matrix, the padding size and the weight size, where the weight size is n rows and m columns, and the padding size is the difference between the size of the first feature data set and the size of the result matrix. For the specific implementation of the above steps, reference may also be made to the description of FIG. 2 to FIG. 12, and details are not described herein again.
可选的,在一些实施例中,该方法还包括:获取第二特征数据集合,将该第二特征数据集合中值为0的元素去除得到该第一特征数据集合;获取第二权值数据集合,将该第二权值数据集合中值为0的元素去除得到该第一权值数据集合;确定该第一特征数据集合中的每个特征数据的地址,确定该第一权值数据集合中的每个权值的地址。上述各个步骤的具体实现方式,也可以参见图2至图12的描述,在此就不必赘述。Optionally, in some embodiments, the method further includes: obtaining a second feature data set, removing elements with a value of 0 in the second feature data set to obtain the first feature data set; obtaining second weight data Set, removing elements with a value of 0 in the second weight data set to obtain the first weight data set; determining the address of each feature data in the first feature data set, and determining the first weight data set The address of each weight in. For specific implementation manners of the foregoing steps, reference may also be made to the description of FIG. 2 to FIG. 12, and details are not described herein again.
图14是本申请实施例提供一种数据处理装置的结构框图。如图14所示的数据处理装置1400包括:数据处理模块1401和控制模块1404,数据处理模块1401包括N个数据计算单元,N为大于或等于2的整数,其中:数据处理模块1401,用于获取第一权值数据集合中的第一权值矩阵,其中,该第一权值矩阵被表示为n行m列个权值数据,该第一权值数据集合中的数据来自相同的输入通道,其中,n为大于或等于2的整数,m为大于或等于2的整数;获取第二权值矩阵,其中,该第二权值矩阵是对该第一权值矩阵进行按行重排后的矩阵;使用第一权值矩阵与第一特征数据集合进行乘法运算,其中,该第一特征数据集合中的数据来自相同的输入通道;使用该第二权值矩阵与该第一特征数据集合进行乘法运算;控制模块1404用于,根据该乘法运算的运算结果,确定目标数据集合。FIG. 14 is a structural block diagram of a data processing apparatus according to an embodiment of the present application. The data processing device 1400 shown in FIG. 14 includes a data processing module 1401 and a control module 1404. The data processing module 1401 includes N data calculation units, where N is an integer greater than or equal to 2, where: the data processing module 1401 is used for: Obtain a first weight matrix in a first weight data set, where the first weight matrix is represented as n rows and m columns of weight data, and the data in the first weight data set is from the same input channel Where n is an integer greater than or equal to 2 and m is an integer greater than or equal to 2; a second weight matrix is obtained, where the second weight matrix is after the first weight matrix is rearranged in rows. Matrix; use the first weight matrix to multiply with the first feature data set, where the data in the first feature data set comes from the same input channel; use the second weight matrix and the first feature data set Perform a multiplication operation; the control module 1404 is configured to determine a target data set according to an operation result of the multiplication operation.
可选的,在一些实施例中,数据处理装置1400还包括地址处理模块1402,地址处理模块1402包括N个地址计算单元,该数据计算单元和地址计算单元一一对应,其中:地址处理模块1402用于:获取该第一权值矩阵和第二权值矩阵中的权值数据的地址;使用该第一权值矩阵和第二权值矩阵中的权值数据的地址与该第一特征数据集合中的地址进行地址运算;控制模块1404用于根据该乘法运算的运算结果,确定目标数据集合包括:根据该乘法运算的运算结果以及该地址运算的运算结果,确定目标数据集合。Optionally, in some embodiments, the data processing device 1400 further includes an address processing module 1402, and the address processing module 1402 includes N address calculation units. The data calculation unit and the address calculation unit correspond one-to-one, where: the address processing module 1402 Configured to: obtain the addresses of the weight data in the first weight matrix and the second weight matrix; use the addresses of the weight data in the first weight matrix and the second weight matrix and the first feature data The address in the set performs an address operation; the control module 1404 is configured to determine the target data set according to the operation result of the multiplication operation, and includes: determining the target data set according to the operation result of the multiplication operation and the operation result of the address operation.
可选的，在一些实施例中，数据处理模块1401还用于：获取该第一权值数据集合中的第三权值矩阵至第n权值矩阵，其中，该第三权值矩阵至第n权值矩阵为对该第一权值矩阵按行重排后的矩阵，且该第一权值矩阵至第n权值矩阵的位于同一行的n个行向量中的任意两个行向量不相同；地址处理模块1402还用于：获取该第三权值矩阵至第n权值矩阵中的权值数据的地址；使用该第三至第n权值矩阵的权值数据的地址与该第一特征数据集合中的特征数据的地址进行地址运算。Optionally, in some embodiments, the data processing module 1401 is further configured to obtain a third weight matrix to an n-th weight matrix in the first weight data set, where the third to n-th weight matrices are matrices obtained by rearranging the first weight matrix by rows, and any two of the n row vectors located in the same row of the first to n-th weight matrices are different. The address processing module 1402 is further configured to: obtain the addresses of the weight data in the third to n-th weight matrices; and perform an address operation using the addresses of the weight data of the third to n-th weight matrices and the addresses of the feature data in the first feature data set.
可选的,在一些实施例中,该目标数据集合包括结果矩阵,该结果矩阵是该第一特征数据集合与该第一权值数据集合进行卷积运算的结果,该第一特征数据集合被表示为第一特征矩阵,地址处理模块1402,还用于根据该每个地址计算阵列保存的权值数据的地址、第一特征数据集合的地址、对应于该第一特征矩阵的尺寸、填充尺寸和权值尺寸,确定第一目标地址,其中,该权值尺寸为n行m列,该填充尺寸为该第一特征数据集合的尺寸与该结果矩阵的尺寸的差值。Optionally, in some embodiments, the target data set includes a result matrix, which is a result of a convolution operation performed on the first feature data set and the first weight data set, and the first feature data set is Represented as a first feature matrix, the address processing module 1402 is further configured to calculate an address of the weight data stored in the array, an address of a first feature data set, a size corresponding to the first feature matrix, and a padding size according to each address. And the weight size to determine the first target address, where the weight size is n rows and m columns, and the padding size is the difference between the size of the first feature data set and the size of the result matrix.
可选的,在一些实施例中,数据处理装置1400还包括压缩模块1403,用于:获取第二特征数据集合,将该第二特征数据集合中值为0的元素去除得到该第一特征数据集合;获取第二权值数据集合,将该第二权值数据集合中值为0的元素去除得到该第一权值数据集合;确定该第一特征数据集合中的每个特征数据的地址,确定该第一权值数据集合中的每个权值的地址。Optionally, in some embodiments, the data processing device 1400 further includes a compression module 1403, configured to: obtain a second feature data set, and remove elements having a value of 0 in the second feature data set to obtain the first feature data A set; obtaining a second weight data set, removing elements with a value of 0 in the second weight data set to obtain the first weight data set; determining an address of each feature data in the first feature data set, An address for each weight in the first weight data set is determined.
图14所示的数据处理装置1400中各个模块的具体功能和有益效果,可以参见图2至图12的描述,在此就不必赘述。For specific functions and beneficial effects of each module in the data processing apparatus 1400 shown in FIG. 14, reference may be made to the description of FIGS. 2 to 12, and details are not described herein.
在本申请实施例中,终端设备或网络设备包括硬件层、运行在硬件层之上的操作系统层,以及运行在操作系统层上的应用层。该硬件层包括中央处理器(central processing unit,CPU)、内存管理单元(memory management unit,MMU)和内存(也称为主存)等硬件。该操作系统可以是任意一种或多种通过进程(process)实现业务处理的计算机操作系统,例如,Linux操作系统、Unix操作系统、Android操作系统、iOS操作系统或windows操作系统等。该应用层包含浏览器、通讯录、文字处理软件、即时通信软件等应用。并且,本申请实施例并未对本申请实施例提供的方法的执行主体的具体结构特别限定,只要能够通过运行记录有本申请实施例的提供的方法的代码的程序,以根据本申请实施例提供的方法进行通信即可,例如,本申请实施例提供的方法的执行主体可以是终端设备或网络设备,或者,是终端设备或网络设备中能够调用程序并执行程序的功能模块。In the embodiment of the present application, the terminal device or the network device includes a hardware layer, an operating system layer running on the hardware layer, and an application layer running on the operating system layer. This hardware layer includes hardware such as a central processing unit (CPU), a memory management unit (MMU), and a memory (also called main memory). The operating system may be any one or more computer operating systems that implement business processing through processes, such as a Linux operating system, a Unix operating system, an Android operating system, an iOS operating system, or a windows operating system. This application layer contains applications such as browsers, address books, word processing software, and instant messaging software. In addition, the embodiment of the present application does not specifically limit the specific structure of the execution subject of the method provided by the embodiment of the present application, as long as the program that records the code of the method provided by the embodiment of the application can be run to provide the program according to the embodiment of the application. The communication may be performed by using the method described above. For example, the method execution subject provided in the embodiments of the present application may be a terminal device or a network device, or a function module in the terminal device or the network device that can call a program and execute the program.
另外,本申请的各个方面或特征可以实现成方法、装置或使用标准编程和/或工程技术的制品。本申请中使用的术语“制品”涵盖可从任何计算机可读器件、载体或介质访问的计算机程序。例如,计算机可读介质可以包括,但不限于:磁存储器件(例如,硬盘、软盘或磁带等),光盘(例如,压缩盘(compact disc,CD)、数字通用盘(digital versatile disc,DVD)等),智能卡和闪存器件(例如,可擦写可编程只读存储器(erasable programmable read-only memory,EPROM)、卡、棒或钥匙驱动器等)。另外,本文描述的各种存储介质可代表用于存储信息的一个或多个设备和/或其它机器可读介质。术语“机器可读介质”可包括但不限于,无线信道和能够存储、包含和/或承载指令和/或数据的各种其它介质。In addition, various aspects or features of the present application may be implemented as a method, apparatus, or article of manufacture using standard programming and / or engineering techniques. The term "article of manufacture" as used in this application encompasses a computer program accessible from any computer-readable device, carrier, or medium. For example, computer-readable media may include, but are not limited to: magnetic storage devices (eg, hard disks, floppy disks, or magnetic tapes, etc.), optical disks (eg, compact discs (CD), digital versatile discs (DVD) Etc.), smart cards and flash memory devices (for example, erasable programmable read-only memory (EPROM), cards, sticks or key drives, etc.). In addition, the various storage media described herein may represent one or more devices and / or other machine-readable media used to store information. The term "machine-readable medium" may include, but is not limited to, wireless channels and various other media capable of storing, containing, and / or carrying instruction (s) and / or data.
本领域普通技术人员可以意识到,结合本文中所公开的实施例描述的各示例的单元及算法步骤,能够以电子硬件、或者计算机软件和电子硬件的结合来实现。这些功能究竟以硬件还是软件方式来执行,取决于技术方案的特定应用和设计约束条件。专业技术人员可以对每个特定的应用来使用不同方法来实现所描述的功能,但是这种实现不应认为超出本申请的范围。Those of ordinary skill in the art may realize that the units and algorithm steps of each example described in connection with the embodiments disclosed herein can be implemented by electronic hardware, or a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Professional technicians can use different methods to implement the described functions for each specific application, but such implementation should not be considered to be beyond the scope of this application.
所属领域的技术人员可以清楚地了解到,为描述的方便和简洁,上述描述的系统、装置和单元的具体工作过程,可以参考前述方法实施例中的对应过程,在此不再赘述。Those skilled in the art can clearly understand that, for the convenience and brevity of description, the specific working processes of the systems, devices, and units described above can refer to the corresponding processes in the foregoing method embodiments, and are not repeated here.
在本申请所提供的几个实施例中,应该理解到,所揭露的系统、装置和方法,可以通过其它的方式实现。例如,以上所描述的装置实施例仅仅是示意性的,例如,所述单元的划分,仅仅为一种逻辑功能划分,实际实现时可以有另外的划分方式,例如多个单元或组件可以结合或者可以集成到另一个系统,或一些特征可以忽略,或不执行。另一点,所显示或讨论的相互之间的耦合或直接耦合或通信连接可以是通过一些接口,装置 或单元的间接耦合或通信连接,可以是电性,机械或其它的形式。In the several embodiments provided in this application, it should be understood that the disclosed systems, devices, and methods may be implemented in other ways. For example, the device embodiments described above are only schematic. For example, the division of the unit is only a logical function division. In actual implementation, there may be another division manner. For example, multiple units or components may be combined or Can be integrated into another system, or some features can be ignored or not implemented. In addition, the displayed or discussed mutual coupling or direct coupling or communication connection may be indirect coupling or communication connection through some interfaces, devices or units, which may be electrical, mechanical or other forms.
所述作为分离部件说明的单元可以是或者也可以不是物理上分开的,作为单元显示的部件可以是或者也可以不是物理单元,即可以位于一个地方,或者也可以分布到多个网络单元上。可以根据实际的需要选择其中的部分或者全部单元来实现本实施例方案的目的。The units described as separate components may or may not be physically separated, and the components displayed as units may or may not be physical units, may be located in one place, or may be distributed on multiple network units. Some or all of the units may be selected according to actual needs to achieve the objective of the solution of this embodiment.
另外,在本申请各个实施例中的各功能单元可以集成在一个处理单元中,也可以是各个单元单独物理存在,也可以两个或两个以上单元集成在一个单元中。In addition, each functional unit in each embodiment of the present application may be integrated into one processing unit, or each of the units may exist separately physically, or two or more units may be integrated into one unit.
所述功能如果以软件功能单元的形式实现并作为独立的产品销售或使用时,可以存储在一个计算机可读取存储介质中。基于这样的理解,本申请的技术方案本质上或者说对现有技术做出贡献的部分或者该技术方案的部分可以以软件产品的形式体现出来,该计算机软件产品存储在一个存储介质中,包括若干指令用以使得一台计算机设备(可以是个人计算机,服务器,或者网络设备等)执行本申请各个实施例所述方法的全部或部分步骤。而前述的存储介质包括:U盘、移动硬盘、只读存储器(read-only memory,ROM)、随机存取存储器(random access memory,RAM)、磁碟或者光盘等各种可以存储程序代码的介质。If the functions are implemented in the form of software functional units and sold or used as independent products, they can be stored in a computer-readable storage medium. Based on this understanding, the technical solution of this application is essentially a part that contributes to the existing technology or a part of the technical solution can be embodied in the form of a software product. The computer software product is stored in a storage medium, including Several instructions are used to cause a computer device (which may be a personal computer, a server, or a network device, etc.) to perform all or part of the steps of the method described in the embodiments of the present application. The aforementioned storage media include: U disks, mobile hard disks, read-only memories (ROMs), random access memories (RAMs), magnetic disks or compact discs and other media that can store program codes .
以上所述,仅为本申请的具体实施方式,但本申请的保护范围并不局限于此,任何熟悉本技术领域的技术人员在本申请揭露的技术范围内,可轻易想到变化或替换,都应涵盖在本申请的保护范围之内。因此,本申请的保护范围应以所述权利要求的保护范围为准。The above is only a specific implementation of this application, but the scope of protection of this application is not limited to this. Any person skilled in the art can easily think of changes or replacements within the technical scope disclosed in this application. It should be covered by the protection scope of this application. Therefore, the protection scope of this application shall be subject to the protection scope of the claims.

Claims (11)

  1. 一种数据处理装置,其特征在于,所述数据处理装置包括:A data processing device, characterized in that the data processing device includes:
    数据处理模块,用于:获取第一权值数据集合中的第一权值矩阵,其中,所述第一权值矩阵被表示为n行m列个权值数据,所述第一权值数据集合中的数据来自相同的输入通道,其中,n为大于或等于2的整数,m为大于或等于2的整数;A data processing module, configured to obtain a first weight matrix in a first weight data set, where the first weight matrix is represented as n rows and m columns of weight data, and the first weight data The data in the set comes from the same input channel, where n is an integer greater than or equal to 2 and m is an integer greater than or equal to 2;
    根据所述第一权值矩阵获取第二权值矩阵,其中,所述第二权值矩阵是对所述第一权值矩阵进行按行重排后的矩阵;Obtaining a second weight matrix according to the first weight matrix, wherein the second weight matrix is a matrix in which the first weight matrix is rearranged in rows;
使用第一权值矩阵与第一特征数据集合进行第一乘法运算；Perform a first multiplication operation using the first weight matrix and the first feature data set;
    使用所述第二权值矩阵与所述第一特征数据集合进行第二乘法运算;Perform a second multiplication operation using the second weight matrix and the first feature data set;
    控制模块,用于根据所述第一乘法运算和所述第二乘法运算的运算结果,确定目标数据集合。The control module is configured to determine a target data set according to an operation result of the first multiplication operation and the second multiplication operation.
  2. 根据权利要求1所述的数据处理装置,其特征在于,所述数据处理装置还包括:The data processing device according to claim 1, wherein the data processing device further comprises:
    地址处理模块,用于:获取所述第一权值矩阵和第二权值矩阵中的权值数据的地址;An address processing module, configured to obtain addresses of weight data in the first weight matrix and the second weight matrix;
    使用所述第一权值矩阵和第二权值矩阵中的权值数据的地址与所述第一特征数据集合中的地址进行地址运算;Perform an address operation using the addresses of the weight data in the first weight matrix and the second weight matrix and the addresses in the first feature data set;
    所述控制模块用于,The control module is used for:
    根据所述乘法运算的运算结果以及所述地址运算的运算结果,确定目标数据集合。A target data set is determined according to an operation result of the multiplication operation and an operation result of the address operation.
  3. 根据权利要求2所述的数据处理装置,其特征在于,The data processing device according to claim 2, wherein:
所述数据处理模块，还用于：获取所述第一权值数据集合中的第三权值矩阵至第n权值矩阵，其中，所述第三权值矩阵至所述第n权值矩阵为对所述第一权值矩阵按行重排后的矩阵，且所述第一权值矩阵至所述第n权值矩阵的位于同一行的n个行向量中的任意两个行向量不相同；The data processing module is further configured to: obtain a third weight matrix to an n-th weight matrix in the first weight data set, wherein the third weight matrix to the n-th weight matrix are matrices obtained by rearranging the first weight matrix by rows, and any two of the n row vectors located in the same row of the first weight matrix to the n-th weight matrix are different;
    所述地址处理模块,还用于:The address processing module is further configured to:
    获取所述第三权值矩阵至第n权值矩阵中的权值数据的地址;Obtaining addresses of weight data in the third weight matrix to the n-th weight matrix;
    使用所述第三至第n权值矩阵的权值数据的地址与所述第一特征数据集合中的特征数据的地址进行地址运算。An address operation is performed using the addresses of the weight data of the third to n-th weight matrixes and the addresses of the feature data in the first feature data set.
  4. 根据权利要求2或3所述的数据处理装置,其特征在于,所述目标数据集合包括结果矩阵,所述结果矩阵是所述第一特征数据集合与所述第一权值数据集合进行卷积运算的结果,所述第一特征数据集合被表示为第一特征矩阵;The data processing device according to claim 2 or 3, wherein the target data set includes a result matrix, and the result matrix is a convolution of the first feature data set and the first weight data set A result of the operation, the first feature data set is represented as a first feature matrix;
所述地址处理模块，还用于根据所述每个地址计算阵列保存的权值数据的地址、第一特征数据集合的地址、所述第一特征矩阵的尺寸、填充尺寸和权值尺寸，确定第一目标地址，其中，所述权值尺寸为n行m列，所述填充尺寸包括横向填充尺寸和纵向填充尺寸，所述横向填充尺寸是(n-1)/2，所述纵向填充尺寸是(m-1)/2。The address processing module is further configured to determine a first target address according to the address of the weight data stored in each address calculation array, the address of the first feature data set, the size of the first feature matrix, the padding size and the weight size, wherein the weight size is n rows and m columns, the padding size includes a horizontal padding size and a vertical padding size, the horizontal padding size is (n-1)/2, and the vertical padding size is (m-1)/2.
5. 根据权利要求1-4任一项所述的数据处理装置，其特征在于，所述数据处理装置还包括压缩模块，用于：获取第二特征数据集合，将所述第二特征数据集合中值为0的元素去除得到所述第一特征数据集合；The data processing device according to any one of claims 1-4, wherein the data processing device further comprises a compression module, configured to: obtain a second feature data set, and remove elements with a value of 0 from the second feature data set to obtain the first feature data set;
    获取第二权值数据集合,将所述第二权值数据集合中值为0的元素去除得到所述第一权值数据集合;Acquiring a second weight data set, and removing elements with a value of 0 in the second weight data set to obtain the first weight data set;
    确定所述第一特征数据集合中的每个特征数据的地址,确定所述第一权值数据集合中的每个权值的地址。Determining an address of each feature data in the first feature data set, and determining an address of each weight in the first weight data set.
  6. 一种数据处理方法,其特征在于,所述方法包括:A data processing method, characterized in that the method includes:
    获取第一权值数据集合中的第一权值矩阵,其中,所述第一权值矩阵被表示为n行m列个权值数据,所述第一权值数据集合中的数据来自相同的输入通道,其中,n为大于或等于2的整数,m为大于或等于2的整数;Obtain a first weight matrix in a first weight data set, where the first weight matrix is represented as n rows and m columns of weight data, and the data in the first weight data set is from the same Input channel, where n is an integer greater than or equal to 2 and m is an integer greater than or equal to 2;
    根据所述第一权值矩阵获取第二权值矩阵,其中,所述第二权值矩阵是对所述第一权值矩阵进行按行重排后的矩阵;Obtaining a second weight matrix according to the first weight matrix, wherein the second weight matrix is a matrix in which the first weight matrix is rearranged in rows;
    使用第一权值矩阵与第一特征数据集合进行第一乘法运算;Perform a first multiplication operation using a first weight matrix and a first feature data set;
    使用所述第二权值矩阵与所述第一特征数据集合进行第二乘法运算;Perform a second multiplication operation using the second weight matrix and the first feature data set;
    根据所述第一乘法运算和所述第二乘法运算的运算结果,确定目标数据集合。A target data set is determined according to an operation result of the first multiplication operation and the second multiplication operation.
  7. 根据权利要求6所述的方法,其特征在于,所述方法还包括:The method according to claim 6, further comprising:
    获取所述第一权值矩阵和第二权值矩阵中的权值数据的地址;Obtaining addresses of weight data in the first weight matrix and the second weight matrix;
    使用所述第一权值矩阵和第二权值矩阵中的权值数据的地址与所述第一特征数据集合中的地址进行地址运算;Perform an address operation using the addresses of the weight data in the first weight matrix and the second weight matrix and the addresses in the first feature data set;
    所述根据所述第一乘法运算和所述第二乘法运算的运算结果,确定目标数据集合,包括:The determining a target data set according to an operation result of the first multiplication operation and the second multiplication operation includes:
    根据所述第一乘法运算的运算结果、所述第二乘法运算的运算结果以及所述地址运算的运算结果,确定目标数据集合。A target data set is determined according to an operation result of the first multiplication operation, an operation result of the second multiplication operation, and an operation result of the address operation.
  8. 根据权利要求7所述的方法,其特征在于,所述方法还包括:获取所述第一权值数据集合中的第三权值矩阵至第n权值矩阵,其中,所述第三权值矩阵至第n权值矩阵为对所述第一权值矩阵按行重排后的矩阵,且所述第一权值矩阵至第n权值矩阵的位于同一行的n个行向量中的任意两个行向量不相同;The method according to claim 7, further comprising: obtaining a third weight matrix to an n-th weight matrix in the first weight data set, wherein the third weight The matrix to the n-th weight matrix is a matrix in which the first weight matrix is rearranged in rows, and any of the n row vectors of the first to n-th weight matrices located in the same row is in the same row. The two row vectors are not the same;
    获取所述第三权值矩阵至第n权值矩阵中的权值数据的地址;Obtaining addresses of weight data in the third weight matrix to the n-th weight matrix;
    使用所述第三至第n权值矩阵的权值数据的地址与所述第一特征数据集合中的特征数据的地址进行地址运算。An address operation is performed using the addresses of the weight data of the third to n-th weight matrixes and the addresses of the feature data in the first feature data set.
  9. 根据权利要求7或8所述的方法,其特征在于,所述目标数据集合包括结果矩阵,所述结果矩阵是所述第一特征数据集合与所述第一权值数据集合进行卷积运算的结果,所述第一特征数据集合被表示为第一特征矩阵,The method according to claim 7 or 8, wherein the target data set includes a result matrix, and the result matrix is a convolution operation performed on the first feature data set and the first weight data set. As a result, the first feature data set is represented as a first feature matrix,
    所述方法还包括:The method further includes:
根据所述每个地址计算阵列保存的权值数据的地址、第一特征数据集合的地址、对应于所述第一特征矩阵的尺寸、填充尺寸和权值尺寸，确定第一目标地址，其中，所述权值尺寸为n行m列，所述填充尺寸包括横向填充尺寸和纵向填充尺寸，所述横向填充尺寸是(n-1)/2，所述纵向填充尺寸是(m-1)/2。determining a first target address according to the address of the weight data stored in each address calculation array, the address of the first feature data set, the size corresponding to the first feature matrix, the padding size and the weight size, wherein the weight size is n rows and m columns, the padding size includes a horizontal padding size and a vertical padding size, the horizontal padding size is (n-1)/2, and the vertical padding size is (m-1)/2.
  10. The method according to any one of claims 5 to 9, wherein the method further comprises:
    obtaining a second feature data set, and removing elements with a value of 0 from the second feature data set to obtain the first feature data set;
    obtaining a second weight data set, and removing elements with a value of 0 from the second weight data set to obtain the first weight data set; and
    determining an address of each feature data in the first feature data set, and determining an address of each weight in the first weight data set.
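(The zero-removal of claim 10 amounts to compressing a dense tensor into (value, address) pairs so that the address operations of claims 7-9 can still locate each surviving element; a coordinate-list sketch, with hypothetical names:)

    import numpy as np

    def remove_zeros(dense):
        """Return (values, addresses) for the nonzero elements of dense."""
        rows, cols = np.nonzero(dense)
        return dense[rows, cols], list(zip(rows.tolist(), cols.tolist()))

    x2 = np.array([[0., 5., 0.],
                   [3., 0., 0.],
                   [0., 0., 7.]], dtype=np.float32)  # second feature data set
    x1_values, x1_addresses = remove_zeros(x2)       # first feature data set plus addresses
    # The same compression applies to the second weight data set.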
  11. A data processing apparatus, wherein the data processing apparatus comprises:
    a processor and a memory, where the memory stores program code, and the processor is configured to invoke the program code in the memory to perform the data processing method according to any one of claims 6 to 10.
PCT/CN2019/102252 2018-09-29 2019-08-23 Data processing method and apparatus WO2020063225A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201811148307.0 2018-09-29
CN201811148307.0A CN110968832B (en) 2018-09-29 2018-09-29 Data processing method and device

Publications (1)

Publication Number Publication Date
WO2020063225A1 true WO2020063225A1 (en) 2020-04-02

Family

ID=69951080

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/102252 WO2020063225A1 (en) 2018-09-29 2019-08-23 Data processing method and apparatus

Country Status (2)

Country Link
CN (1) CN110968832B (en)
WO (1) WO2020063225A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
TWI799169B (en) * 2021-05-19 2023-04-11 神盾股份有限公司 Data processing method and circuit based on convolution computation

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107402905B (en) * 2016-05-19 2021-04-09 北京旷视科技有限公司 Neural network-based computing method and device
US10515302B2 (en) * 2016-12-08 2019-12-24 Via Alliance Semiconductor Co., Ltd. Neural network unit with mixed data and weight size computation capability

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106326985A (en) * 2016-08-18 2017-01-11 北京旷视科技有限公司 Neural network training method, neural network training device, data processing method and data processing device
US20180096226A1 (en) * 2016-10-04 2018-04-05 Magic Leap, Inc. Efficient data layouts for convolutional neural networks
CN108122030A (en) * 2016-11-30 2018-06-05 华为技术有限公司 A kind of operation method of convolutional neural networks, device and server
CN107844827A (en) * 2017-11-28 2018-03-27 北京地平线信息技术有限公司 The method and apparatus for performing the computing of the convolutional layer in convolutional neural networks

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
YANG, WEIKE ET AL.: "Design and Implementation of CNN Acceleration Module based on Rocket-Chip Open Source Processor", MICROELECTRONICS & COMPUTER, vol. 35, no. 4, 30 April 2018 (2018-04-30) *
ZHENG, SHIXUAN: "An Efficient Kernel Transformation Architecture for Binary- and Ternary-Weight Neural Network Inference", 2018 55TH ACM/ESDA/IEEE DESIGN AUTOMATION CONFERENCE (DAC), 20 September 2018 (2018-09-20), XP033405809, DOI: 10.1109/DAC.2018.8465573 *

Also Published As

Publication number Publication date
CN110968832A (en) 2020-04-07
CN110968832B (en) 2023-10-20

Similar Documents

Publication Publication Date Title
US10482156B2 (en) Sparsity-aware hardware accelerators
KR20190066473A (en) Method and apparatus for processing convolution operation in neural network
CN109754359B (en) Pooling processing method and system applied to convolutional neural network
US20220083857A1 (en) Convolutional neural network operation method and device
CN112840356A (en) Operation accelerator, processing method and related equipment
US20200327185A1 (en) Signal Processing Method and Apparatus
CN111382867A (en) Neural network compression method, data processing method and related device
US11763150B2 (en) Method and system for balanced-weight sparse convolution processing
US11775807B2 (en) Artificial neural network and method of controlling fixed point in the same
US20200218777A1 (en) Signal Processing Method and Apparatus
TWI775210B (en) Data dividing method and processor for convolution operation
WO2021147276A1 (en) Data processing method and apparatus, and chip, electronic device and storage medium
US20230196113A1 (en) Neural network training under memory restraint
WO2022041188A1 (en) Accelerator for neural network, acceleration method and device, and computer storage medium
WO2020063225A1 (en) Data processing method and apparatus
CN112200310B (en) Intelligent processor, data processing method and storage medium
US11435941B1 (en) Matrix transpose hardware acceleration
US20180150741A1 (en) Accelerated Convolution in Convolutional Neural Networks
CN112966729A (en) Data processing method and device, computer equipment and storage medium
US20210224632A1 (en) Methods, devices, chips, electronic apparatuses, and storage media for processing data
US11636569B1 (en) Matrix transpose hardware acceleration
JP2021005242A (en) Information processing device, information processing program, and information processing method
US20220318604A1 (en) Sparse machine learning acceleration
KR20200023154A (en) Method and apparatus for processing convolution neural network
WO2021179117A1 (en) Method and apparatus for searching number of neural network channels

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application
    Ref document number: 19866662
    Country of ref document: EP
    Kind code of ref document: A1
NENP Non-entry into the national phase
    Ref country code: DE
122 Ep: pct application non-entry in european phase
    Ref document number: 19866662
    Country of ref document: EP
    Kind code of ref document: A1