CN108984115A

CN108984115A - Data parallel writing and reading methods, apparatus and system

Info

Publication number: CN108984115A
Application number: CN201810614178.3A
Authority: CN
Inventors: 刘大可; 苗志东
Original assignee: Beijing Institute of Technology BIT
Current assignee: Beijing Institute of Technology BIT
Priority date: 2018-06-14
Filing date: 2018-06-14
Publication date: 2018-12-11
Anticipated expiration: 2038-06-14
Also published as: CN108984115B

Abstract

The present invention provides data parallel writing and reading methods, apparatus and a system. The writing method comprises: transforming the write data index of a data vector to be written into a one-dimensional write address; obtaining, according to the one-dimensional write address and a preset number of write data, a write data enable vector, a first storage index vector and a first storage address vector of the data vector to be written; and reordering the write data enable vector, the first storage address vector and the data vector to be written according to the first storage index vector, then storing the reordered data vector to be written in a parallel memory according to the reordered write data enable vector and the reordered first storage address vector. The invention supports parallel writing of data along one or more dimensions, which improves the flexibility and efficiency of data writing.

Description

Data parallel writing and reading methods, apparatus and system

Technical field

The invention belongs to the technical field of data access, and more particularly relates to data parallel writing and reading methods, apparatus and a system.

Background art

In recent years, artificial intelligence has been widely applied in many fields. Artificial intelligence algorithms usually process very large volumes of data, so accelerating their execution requires optimizing not only the computing system but also the storage system.

The GPU is a widely used hardware platform for artificial intelligence algorithms, and its storage system is optimized with a multi-level cache structure designed specifically for matrix computation. In embedded applications, because of constraints such as power consumption, customized programmable chips are usually used instead of GPUs to run artificial intelligence algorithms. One class of these embedded chips, vector computer chips, is well suited to accelerating artificial intelligence algorithms, and such chips usually use a vector memory as the storage system.

However, GPUs are large and consume a great deal of power, which severely limits their use in embedded applications. Artificial intelligence algorithms perform large numbers of matrix operations, so the data they process is usually organized as data blocks of one or more dimensions. Different algorithms need to read and write data continuously in parallel along one or more dimensions. A vector memory, however, can only access a vector of one fixed length at a time; its data access is not flexible enough to meet the demands of complex and varied artificial intelligence algorithms.

Summary of the invention

To overcome, or at least partially solve, the problems that existing data access devices are large and power-hungry and that their data access is inflexible, the present invention provides data parallel writing and reading methods, apparatus and a system.

According to a first aspect of the invention, a data parallel writing method is provided, comprising:

Transforming the write data index of a data vector to be written into a one-dimensional write address; wherein the data vector to be written is a one-dimensional or multidimensional vector in a multidimensional data matrix to be written, and the write data index is the index, in the multidimensional data matrix to be written, of the first element to be written among all elements of the data vector to be written;

Obtaining, according to the one-dimensional write address and a preset number of write data, a write data enable vector, a first storage index vector and a first storage address vector of the data vector to be written; wherein each element of the write data enable vector indicates whether the element at the corresponding position in the data vector to be written is written; the first storage index vector is the vector formed by the indices of the storage sub-units of the parallel memory that correspond to the elements of the data vector to be written; and the first storage address vector is the vector formed by the addresses, within the corresponding storage sub-units, of the elements of the data vector to be written;

Reordering the write data enable vector, the first storage address vector and the data vector to be written according to the first storage index vector, and storing the reordered data vector to be written in the parallel memory according to the reordered write data enable vector and the reordered first storage address vector.

According to a second aspect of the invention, a data parallel reading method is provided, comprising:

Transforming the read data index of a data vector to be read into a one-dimensional read address; wherein the data vector to be read is a one-dimensional or multidimensional vector in a multidimensional data matrix to be read, and the read data index is the index, in the multidimensional data matrix to be read, of the first element to be read among all elements of the data vector to be read;

Obtaining, according to the one-dimensional read address and a preset number of read data, a read data enable vector, a second storage index vector and a second storage address vector of the data vector to be read; wherein each element of the read data enable vector indicates whether the element at the corresponding position in the data vector to be read is read; the second storage index vector is the vector formed by the indices of the storage sub-units of the parallel memory in which the elements of the data vector to be read are held; and the second storage address vector is the vector formed by the addresses of the elements of the data vector to be read within those storage sub-units;

Reordering the read data enable vector and the second storage address vector according to the second storage index vector, reading a stored data vector from the parallel memory according to the reordered read data enable vector and the reordered second storage address vector, and reordering the stored data vector according to the second storage index vector to obtain the data vector to be read.

According to a third aspect of the invention, a data parallel writing apparatus is provided, comprising:

A first transformation module, configured to transform the write data index of a data vector to be written into a one-dimensional write address; wherein the data vector to be written is a one-dimensional or multidimensional vector in a multidimensional data matrix to be written, and the write data index is the index, in the multidimensional data matrix to be written, of the first element to be written among all elements of the data vector to be written;

A first obtaining module, configured to obtain, according to the one-dimensional write address and a preset number of write data, a write data enable vector, a first storage index vector and a first storage address vector of the data vector to be written; wherein each element of the write data enable vector indicates whether the element at the corresponding position in the data vector to be written is written; the first storage index vector is the vector formed by the indices of the storage sub-units of the parallel memory that correspond to the elements of the data vector to be written; and the first storage address vector is the vector formed by the addresses, within the corresponding storage sub-units, of the elements of the data vector to be written;

A storing module, configured to reorder the write data enable vector, the first storage address vector and the data vector to be written according to the first storage index vector, and to store the reordered data vector to be written in the parallel memory according to the reordered write data enable vector and the reordered first storage address vector.

According to a fourth aspect of the invention, a data parallel reading apparatus is provided, comprising:

A second transformation module, configured to transform the read data index of a data vector to be read into a one-dimensional read address; wherein the data vector to be read is a one-dimensional or multidimensional vector in a multidimensional data matrix to be read, and the read data index is the index, in the multidimensional data matrix to be read, of the first element to be read among all elements of the data vector to be read;

A second obtaining module, configured to obtain, according to the one-dimensional read address and a preset number of read data, a read data enable vector, a second storage index vector and a second storage address vector of the data vector to be read; wherein each element of the read data enable vector indicates whether the element at the corresponding position in the data vector to be read is read; the second storage index vector is the vector formed by the indices of the storage sub-units of the parallel memory that correspond to the elements of the data vector to be read; and the second storage address vector is the vector formed by the addresses, within the corresponding storage sub-units, of the elements of the data vector to be read;

A reading module, configured to reorder the read data enable vector and the second storage address vector according to the second storage index vector, to read a stored data vector from the parallel memory according to the reordered read data enable vector and the reordered second storage address vector, and to reorder the stored data vector according to the second storage index vector to obtain the data vector to be read.

According to a fifth aspect of the invention, a data parallel read-write system is provided, comprising:

A parallel memory, the above data parallel writing apparatus and the above data parallel reading apparatus.

The present invention provides data parallel writing and reading methods, apparatus and a system. The writing method transforms the write data index of a data vector to be written into a one-dimensional write address; obtains, according to the one-dimensional write address and a preset number of write data, the write data enable vector, the first storage index vector and the first storage address vector of the data vector to be written; reorders the write data enable vector, the first storage address vector and the data vector to be written according to the first storage index vector; and stores the reordered data vector to be written in the parallel memory according to the reordered write data enable vector and the reordered first storage address vector. Data can therefore be written in parallel along one or more dimensions, which improves the flexibility and efficiency of data writing.

Brief description of the drawings

Fig. 1 is a schematic overall flowchart of the data parallel writing method provided by an embodiment of the present invention;

Fig. 2 is a schematic diagram of writing a four-dimensional data vector to be written in parallel along two dimensions in the data parallel writing method provided by an embodiment of the present invention;

Fig. 3 is a schematic diagram of writing a four-dimensional data vector to be written in parallel along one dimension in the data parallel writing method provided by an embodiment of the present invention;

Fig. 4 is a schematic diagram of transforming the write data index into a one-dimensional write address in the data parallel writing method provided by an embodiment of the present invention;

Fig. 5 is a schematic structural diagram of the write reordering network in the data parallel writing method provided by an embodiment of the present invention;

Fig. 6 is a schematic overall flowchart of the data parallel reading method provided by an embodiment of the present invention;

Fig. 7 is a schematic overall structural diagram of the data parallel writing apparatus provided by an embodiment of the present invention;

Fig. 8 is a schematic overall structural diagram of the data parallel reading apparatus provided by an embodiment of the present invention.

Detailed description of the embodiments

Specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings and examples. The following embodiments are intended to illustrate the invention, not to limit its scope.

An embodiment of the present invention provides a data parallel writing method. Fig. 1 is a schematic overall flowchart of the data parallel writing method provided by this embodiment. The method comprises: S101, transforming the write data index of a data vector to be written into a one-dimensional write address; wherein the data vector to be written is a one-dimensional or multidimensional vector in a multidimensional data matrix to be written, and the write data index is the index, in the multidimensional data matrix to be written, of the first element to be written among all elements of the data vector to be written;

Here, the data vector to be written is the data vector that is written in parallel. The write data index is the index, in the multidimensional data matrix to be written, of the first element to be written in the data vector to be written, where an element to be written is an element that needs to be written. The write data index is transformed under the control of the control signals w_ctrl0 and w_ctrl1 to generate a one-dimensional write address. For example, the data vector to be written is a four-dimensional vector indexed by [dim3, dim2, dim1, dim0], where dim3, dim2, dim1 and dim0 denote different dimensions, and its sizes in these dimensions are DIM3, DIM2, DIM1 and DIM0 respectively. The maximum number of data written in parallel is S, and the data vector to be written W_DATA is a vector of length N. Fig. 2 shows continuous writing along the two dimensions dim1 and dim0. When writing continuously in parallel along two dimensions, two parameters must be defined: the maximum parallel read/write count K of dimension dim1 and the maximum parallel read/write count L of dimension dim0, with K*L = S. Fig. 3 shows continuous parallel writing along the single dimension dim0; in this case L = S and K = 0.

S102, obtaining, according to the one-dimensional write address and a preset number of write data, a write data enable vector, a first storage index vector and a first storage address vector of the data vector to be written; wherein each element of the write data enable vector indicates whether the element at the corresponding position in the data vector to be written is written; the first storage index vector is the vector formed by the indices of the storage sub-units of the parallel memory that correspond to the elements of the data vector to be written; and the first storage address vector is the vector formed by the addresses, within the corresponding storage sub-units, of the elements of the data vector to be written;

Here, the preset number of write data W_M is the number of elements in the data vector to be written that need to be written. The write data enable vector, the first storage index vector and the first storage address vector of the data vector to be written are calculated from the one-dimensional write address w_base and the preset number of write data W_M. Each of these three vectors has length N. Each element of the write data enable vector W_BE is 0 or 1 and indicates whether the element at the corresponding position in the data vector to be written W_DATA is written, where 1 means written and 0 means not written. The first storage index vector W_BI is the vector formed by the indices of the storage sub-units of the parallel memory into which the elements of W_DATA are to be stored. The first storage address vector W_BA is the vector formed by the addresses, within those storage sub-units, at which the elements of W_DATA are to be stored.

S103, reordering the write data enable vector, the first storage address vector and the data vector to be written according to the first storage index vector, and storing the reordered data vector to be written in the parallel memory according to the reordered write data enable vector and the reordered first storage address vector.

Specifically, the write data enable vector, the first storage address vector and the data vector to be written are input into a write reordering network. According to the first storage index vector, the write data enable vector W_BE, the first storage address vector W_BA and the data vector to be written W_DATA are reordered to obtain the reordered write data enable vector W_BE_R, the reordered first storage address vector W_BA_R and the reordered data vector to be written W_DATA_R. W_BA_R gives, for each storage sub-unit, the address at which the corresponding element of the data vector to be written is to be stored; W_BE_R indicates whether each storage sub-unit is enabled; W_DATA_R gives, for each storage sub-unit, the element of the data vector to be written that it is to store.

This embodiment transforms the write data index of the data vector to be written into a one-dimensional write address; obtains, according to the one-dimensional write address and the preset number of write data, the write data enable vector, the first storage index vector and the first storage address vector of the data vector to be written; reorders the write data enable vector, the first storage address vector and the data vector to be written according to the first storage index vector; and stores the reordered data vector to be written in the parallel memory according to the reordered write data enable vector and the reordered first storage address vector. Data can therefore be written in parallel along one or more dimensions, which improves the flexibility and efficiency of data writing.

On the basis of the above embodiment, step S101 in this embodiment specifically comprises: reordering the write data index, and splitting the index value of each preset parallel-write dimension in the reordered write data index into multiple index values;

For example, as shown in Fig. 4, the write data index [dim0, dim1, dim2, dim3] is reordered under the control of w_ctrl0 to obtain the reordered write data index [dimnp1, dimnp0, dimp1, dimp0]. Here dimp1 and dimp0 are the two dimensions written in parallel, and dimnp1 and dimnp0 are the two dimensions that are not written in parallel. The sizes of the data vector to be written in the four dimensions dimnp1, dimnp0, dimp1 and dimp0 are DIMNP1, DIMNP0, DIMP1 and DIMP0 respectively.

Let the maximum parallel write counts of the two parallel-write dimensions dimp1 and dimp0 be K and L respectively. K and L can be configured according to the performance of the parallel memory. dimp1 is split into two dimensions according to K, namely dimp1_p and dimp1_b, where dimp1_p = dimp1 % K and dimp1_b = dimp1 // K. The size of the data vector to be written in dimension dimp1_p is DIMP1_P = K, and its size in dimension dimp1_b is DIMP1_B = DIMP1 // K. dimp0 is split into two dimensions according to L, namely dimp0_p and dimp0_b, where dimp0_p = dimp0 % L and dimp0_b = dimp0 // L. The size of the data vector to be written in dimension dimp0_p is DIMP0_P = L, and its size in dimension dimp0_b is DIMP0_B = DIMP0 // L. The data vector to be written is thus split from four dimensions into six, and the split write data index is [dimnp1, dimnp0, dimp1_b, dimp1_p, dimp0_b, dimp0_p]. The sizes of the data vector to be written in these dimensions are DIMNP1, DIMNP0, DIMP1_B, DIMP1_P, DIMP0_B and DIMP0_P respectively.
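The split of one parallel dimension can be sketched in Python as follows. This is only an illustration of the modulo and integer-division arithmetic described above, not the hardware implementation; the function name split_parallel_dim and the sample values are hypothetical, and the sketch assumes the dimension size is an exact multiple of the parallel count.

```python
def split_parallel_dim(index, size, par):
    """Split one parallel dimension into a block part and an in-block part.

    index: index value in the parallel dimension (e.g. dimp1)
    size:  size of that dimension (e.g. DIMP1), assumed divisible by par
    par:   maximum parallel count of the dimension (K or L)
    """
    idx_b, idx_p = index // par, index % par   # e.g. dimp1_b, dimp1_p
    size_b, size_p = size // par, par          # e.g. DIMP1_B, DIMP1_P
    return (idx_b, idx_p), (size_b, size_p)


# Hypothetical values: DIMP1 = 12, K = 2, index dimp1 = 5
(dimp1_b, dimp1_p), (DIMP1_B, DIMP1_P) = split_parallel_dim(5, 12, 2)
print(dimp1_b, dimp1_p, DIMP1_B, DIMP1_P)  # 2 1 6 2
```

The "_p" part selects a position inside one group of K (or L) elements that are accessed together, while the "_b" part selects which group is accessed.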

The split write data index is reordered again under the control of w_ctrl1, and the one-dimensional write address is calculated from the re-reordered write data index.

For example, the split write data index [dimnp1, dimnp0, dimp1_b, dimp1_p, dimp0_b, dimp0_p] is reordered again to obtain the re-reordered write data index [dnp3, dnp2, dnp1, dnp0, dp1, dp0], where dp1 = dimp1_p and dp0 = dimp0_p, and dnp3, dnp2, dnp1 and dnp0 are obtained by reordering dimnp1, dimnp0, dimp1_b and dimp0_b. After this second reordering, the sizes of the data vector to be written in the corresponding dimensions are DNP3, DNP2, DNP1, DNP0, DP1 and DP0, which are obtained by applying the same reordering to DIMNP1, DIMNP0, DIMP1_B, DIMP1_P, DIMP0_B and DIMP0_P. The one-dimensional write address w_base is calculated from the re-reordered write data index by the formula:

w_base = dp0 + dp1*DP0 + dnp0*DP0*DP1 + dnp1*DP0*DP1*DNP0 + dnp2*DP0*DP1*DNP0*DNP1 + dnp3*DP0*DP1*DNP0*DNP1*DNP2.

This embodiment does not limit the number of dimensions of the data vector to be written, which dimensions are split or how many dimensions they are split into, or the order of the dimensions in either reordering.
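The address calculation can be sketched as a mixed-radix linearization. The sketch below assumes that the stride of each dimension is the product of the sizes of all lower dimensions; the function name linear_address and the sample index are hypothetical and are not taken from the patent.

```python
def linear_address(indices, sizes):
    """Mixed-radix linearization, lowest dimension first.

    indices: [dp0, dp1, dnp0, dnp1, dnp2, dnp3]
    sizes:   [DP0, DP1, DNP0, DNP1, DNP2, DNP3]
    Assumption: each stride is the product of all lower dimension sizes.
    """
    address, stride = 0, 1
    for idx, size in zip(indices, sizes):
        address += idx * stride
        stride *= size
    return address


# Hypothetical sizes DP0, DP1, DNP0, DNP1, DNP2, DNP3 and a hypothetical index
sizes = [2, 2, 6, 8, 5, 6]
w_base = linear_address([1, 0, 3, 0, 0, 0], sizes)
print(w_base)  # 1 + 3*(2*2) = 13
```

Placing dp0 and dp1 in the lowest positions means that elements adjacent in the parallel dimensions receive consecutive addresses, so they fall into different storage sub-units.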

On the basis of the above embodiment, in this embodiment the step of obtaining the write data enable vector, the first storage index vector and the first storage address vector of the data vector to be written according to the one-dimensional write address and the preset number of write data specifically comprises: determining, according to the preset number of write data, the number of elements with value 1 in the write data enable vector, and determining, according to the difference between the length of the data vector to be written and the preset number of write data, the number of elements with value 0 in the write data enable vector; and obtaining the first storage index vector and the first storage address vector according to the index of each element within the data vector to be written, the one-dimensional write address and the length of the data vector to be written.

Specifically, the write data enable vector W_BE can be expressed in terms of W_M as W_BE = [W_M{1}, (N-W_M){0}], that is, W_M ones followed by N-W_M zeros: the first W_M elements of the data vector to be written W_DATA are written and the remaining elements of W_DATA are not. The first storage index vector is calculated as W_BI = (w_base + [0, 1, 2, ..., N-1]) % N. The first storage address vector is calculated as W_BA = (w_base + [0, 1, 2, ..., N-1]) // N.
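The three formulas can be sketched in plain Python as follows. The function name write_vectors is hypothetical; the values w_base = 2, W_M = 4 and N = 4 are chosen only for illustration.

```python
def write_vectors(w_base, w_m, n):
    """Return (W_BE, W_BI, W_BA) for a length-n vector starting at w_base."""
    w_be = [1] * w_m + [0] * (n - w_m)             # first W_M elements enabled
    w_bi = [(w_base + i) % n for i in range(n)]    # storage sub-unit of each element
    w_ba = [(w_base + i) // n for i in range(n)]   # address inside that sub-unit
    return w_be, w_bi, w_ba


print(write_vectors(2, 4, 4))
# ([1, 1, 1, 1], [2, 3, 0, 1], [0, 0, 1, 1])
```

Because consecutive one-dimensional addresses map to consecutive sub-units modulo N, the N elements of one vector always land in N different sub-units and can be accessed in the same cycle.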

On the basis of the above embodiments, in this embodiment the step of reordering the write data enable vector, the first storage address vector and the data vector to be written according to the first storage index vector specifically comprises: obtaining, according to the first storage index vector, a write data index vector formed by the indices, within the data vector to be written, of the elements corresponding to the indices of the individual storage sub-units; and reordering the write data enable vector, the first storage address vector and the data vector to be written according to the write data index vector.

Specifically, the write data index vector W_BI_R corresponds to W_BI: it is the vector formed by the indices, within the data vector to be written W_DATA, of the elements that the individual storage sub-units of the parallel memory are to store. It can be calculated from W_BI by the formula W_BI_R = (N - W_BI[0] + [0, 1, 2, ..., N-1]) % N. The write data enable vector W_BE, the first storage address vector W_BA and the data vector to be written W_DATA are reordered according to W_BI_R. In the write reordering network, the same reordering is applied independently to W_BE, W_BA and W_DATA according to W_BI_R. The structure of the write reordering network is shown in Fig. 5.
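The reordering can be sketched as a gather operation in which storage sub-unit b receives the element whose position in the input vector is W_BI_R[b]. This is a software illustration only; the reordering network of Fig. 5 is hardware whose internal structure is not modeled here, and the function name reorder_for_write is hypothetical.

```python
def reorder_for_write(w_bi, *vectors):
    """Compute W_BI_R from W_BI and apply the same gather to every vector."""
    n = len(w_bi)
    # W_BI_R[b] = position, in the input vectors, of the element for sub-unit b
    w_bi_r = [(n - w_bi[0] + b) % n for b in range(n)]
    reordered = [[v[src] for src in w_bi_r] for v in vectors]
    return w_bi_r, reordered


w_bi_r, (w_be_r, w_ba_r, w_data_r) = reorder_for_write(
    [2, 3, 0, 1],   # W_BI
    [1, 1, 1, 1],   # W_BE
    [0, 0, 1, 1],   # W_BA
    [6, 7, 8, 9],   # W_DATA
)
print(w_bi_r, w_be_r, w_ba_r, w_data_r)
# [2, 3, 0, 1] [1, 1, 1, 1] [1, 1, 0, 0] [8, 9, 6, 7]
```

Since W_BI is a cyclic shift of [0, 1, ..., N-1], its inverse permutation is also a cyclic shift, which is why W_BI_R can be derived from W_BI[0] alone.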

For example, a data vector to be written indexed by [dim3, dim2, dim1, dim0] is to be written into a parallel memory composed of 4 storage sub-units, i.e. N = 4. The sizes of the data vector to be written in the respective dimensions are DIM3 = 12, DIM2 = 10, DIM1 = 8 and DIM0 = 6. The index values of the write data index in the 4 dimensions are permuted under the control of w_ctrl0, with the outputs related to the inputs by dimnp1 = dim1, dimnp0 = dim0, dimp1 = dim3 and dimp0 = dim2. The sizes of the data vector to be written in these dimensions are DIMNP1 = DIM1 = 8, DIMNP0 = DIM0 = 6, DIMP1 = DIM3 = 12 and DIMP0 = DIM2 = 10. In the two dimensions dimp1 and dimp0, the maximum parallel read/write counts are K = 2 and L = 2 respectively. dimp1 and dimp0 are each split into two dimensions according to K and L, giving dimp1_b = dimp1 // K, dimp1_p = dimp1 % K, dimp0_b = dimp0 // L and dimp0_p = dimp0 % L. The corresponding sizes of the data vector to be written in the dimensions obtained by the split are DIMP1_B = DIMP1 // K = 6, DIMP1_P = K = 2, DIMP0_B = DIMP0 // L = 5 and DIMP0_P = L = 2. Of the six dimensions dimnp1, dimnp0, dimp1_b, dimp1_p, dimp0_b and dimp0_p obtained after the split, dimp1_p and dimp0_p are used directly as dp1 and dp0 of the final six dimensions; dimnp1, dimnp0, dimp1_b and dimp0_b are permuted under the control of w_ctrl1, with the outputs related to the inputs by dnp3 = dimp1_b, dnp2 = dimp0_b, dnp1 = dimnp1 and dnp0 = dimnp0. The sizes of the data vector to be written in the final dimensions are DNP3 = DIMP1_B = 6, DNP2 = DIMP0_B = 5, DNP1 = DIMNP1 = 8, DNP0 = DIMNP0 = 6, DP1 = DIMP1_P = 2 and DP0 = DIMP0_P = 2.
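The size bookkeeping of this example can be checked with a few lines of Python. The variable names follow the text; the final product check is an added sanity test that the split and the permutations preserve the total number of elements.

```python
from math import prod

DIM3, DIM2, DIM1, DIM0 = 12, 10, 8, 6
K, L = 2, 2

# First permutation (w_ctrl0)
DIMNP1, DIMNP0, DIMP1, DIMP0 = DIM1, DIM0, DIM3, DIM2

# Split of the two parallel dimensions
DIMP1_B, DIMP1_P = DIMP1 // K, K
DIMP0_B, DIMP0_P = DIMP0 // L, L

# Second permutation (w_ctrl1)
DNP3, DNP2, DNP1, DNP0 = DIMP1_B, DIMP0_B, DIMNP1, DIMNP0
DP1, DP0 = DIMP1_P, DIMP0_P

sizes = [DNP3, DNP2, DNP1, DNP0, DP1, DP0]
print(sizes)  # [6, 5, 8, 6, 2, 2]

# The six-dimensional layout holds exactly as many elements as the original
assert prod(sizes) == DIM3 * DIM2 * DIM1 * DIM0
```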

When the write data index is w_index = [0, 2, 0, 0], the preset number of write data is W_M = 4 and the data vector to be written is W_DATA = [6, 7, 8, 9], the data parallel writing process is as follows:

w_base is calculated from the write data index w_index, that is:

w_base = dp0 + dp1*DP0 + dnp0*DP0*DP1 + dnp1*DP0*DP1*DNP0 + dnp2*DP0*DP1*DNP0*DNP1 + dnp3*DP0*DP1*DNP0*DNP1*DNP2 = 2.

W_BE, W_BA and W_BI_R are calculated from w_base and W_M, that is:

W_BE = [W_M{1}, (N-W_M){0}] = [1, 1, 1, 1];

W_BI = (w_base + [0, 1, 2, ..., N-1]) % N = [2, 3, 0, 1];

W_BI_R = (N - W_BI[0] + [0, 1, 2, ..., N-1]) % N = [2, 3, 0, 1];

W_BA = (w_base + [0, 1, 2, ..., N-1]) // N = [0, 0, 1, 1];

W_BE, W_BA and W_DATA are input into the write reordering network; under the control of W_BI_R, the outputs are:

W_BE_R = [1, 1, 1, 1];

W_BA_R = [1, 1, 0, 0];

W_DATA_R = [8, 9, 6, 7].

The data vector to be written is written in parallel according to W_BE_R, W_BA_R and W_DATA_R. Since every element of W_BE_R is 1, all storage sub-units are enabled; W_BA_R gives the address used in each storage sub-unit, and W_DATA_R gives the data that each storage sub-unit is to store.
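The final store can be sketched by modeling the parallel memory as a list of N independent storage sub-units. The list-of-lists model, the sub-unit depth of 2 and the function name parallel_write are assumptions made for illustration; in hardware all sub-units are written in the same cycle rather than in a loop.

```python
def parallel_write(banks, w_be_r, w_ba_r, w_data_r):
    """Write one value per storage sub-unit, gated by the enable bits."""
    for bank, enable, addr, value in zip(banks, w_be_r, w_ba_r, w_data_r):
        if enable:
            bank[addr] = value


# 4 storage sub-units, each with 2 addresses (hypothetical depth)
banks = [[None] * 2 for _ in range(4)]
parallel_write(banks, [1, 1, 1, 1], [1, 1, 0, 0], [8, 9, 6, 7])
print(banks)  # [[None, 8], [None, 9], [6, None], [7, None]]
```

Reading the result back as one-dimensional address = address * N + sub-unit index, the values 6, 7, 8 and 9 occupy one-dimensional addresses 2, 3, 4 and 5, which matches w_base = 2.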

Another embodiment of the present invention provides a data parallel reading method. Fig. 6 is a schematic overall flowchart of the data parallel reading method provided by this embodiment. The method comprises: S601, transforming the read data index of a data vector to be read into a one-dimensional read address; wherein the data vector to be read is a one-dimensional or multidimensional vector in a multidimensional data matrix to be read, and the read data index is the index, in the multidimensional data matrix to be read, of the first element to be read among all elements of the data vector to be read;

Here, the data vector to be read is the data vector that needs to be read in parallel. The read data index is the index, in the multidimensional data matrix to be read, of the first element to be read in the data vector to be read, where an element to be read is an element that needs to be read. The read data index is transformed under the control of the control signals r_ctrl0 and r_ctrl1 to generate a one-dimensional read address. The one-dimensional read address is generated in the same way as the one-dimensional write address.

S602, obtaining, according to the one-dimensional read address and a preset number of read data, a read data enable vector, a second storage index vector and a second storage address vector of the data vector to be read; wherein each element of the read data enable vector indicates whether the element at the corresponding position in the data vector to be read is read; the second storage index vector is the vector formed by the indices of the storage sub-units of the parallel memory in which the elements of the data vector to be read are held; and the second storage address vector is the vector formed by the addresses of the elements of the data vector to be read within those storage sub-units;

Here, the preset number of read data is the number of elements in the data vector to be read that need to be read. The read data enable vector, the second storage index vector and the second storage address vector of the data vector to be read are calculated from the one-dimensional read address R_base and the preset number of read data R_M. Each of these three vectors has length N. Each element of the read data enable vector R_BE is 0 or 1 and indicates whether the element at the corresponding position in the data vector to be read R_DATA is read, where 1 means read and 0 means not read. The second storage index vector R_BI is the vector formed by the indices of the storage sub-units of the parallel memory in which the elements of R_DATA are held. The second storage address vector R_BA is the vector formed by the addresses of the elements of R_DATA within those storage sub-units.

S603, reordering the read data enable vector and the second storage address vector according to the second storage index vector, reading a stored data vector from the parallel memory according to the reordered read data enable vector and the reordered second storage address vector, and reordering the stored data vector according to the second storage index vector to obtain the data vector to be read.

Specifically, the read reordering network handles the read data enable vector, the second storage address vector and the data vector to be read. According to the second storage index vector R_BI, the read data enable vector R_BE and the second storage address vector R_BA are reordered to obtain the reordered read data enable vector R_BE_R and the reordered second storage address vector R_BA_R. R_BA_R gives, for each storage sub-unit, the address of the element of the data vector to be read that is held in that sub-unit; R_BE_R indicates, for each storage sub-unit, whether the element held in it is read. The stored data vector R_DATA_R is read from the parallel memory according to R_BE_R and R_BA_R. The elements of R_DATA_R are arranged in the order of the storage sub-units; they have the same values as the elements of the data vector to be read R_DATA but appear in a different order. R_DATA_R is reordered according to the second storage index vector R_BI to obtain R_DATA.
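The read path can be sketched in the same software model. This sketch assumes that R_BE, R_BI and R_BA are computed with the same formulas as their write-side counterparts and that the sub-unit-side index vector mirrors W_BI_R; neither formula is stated for the read side in this passage, so both are assumptions. The function name parallel_read and the memory contents are hypothetical.

```python
def parallel_read(banks, r_base, r_m, n):
    """Read up to n elements starting at one-dimensional address r_base."""
    # Assumed to mirror the write-side formulas
    r_be = [1] * r_m + [0] * (n - r_m)
    r_bi = [(r_base + i) % n for i in range(n)]
    r_ba = [(r_base + i) // n for i in range(n)]
    r_bi_r = [(n - r_bi[0] + b) % n for b in range(n)]

    # Reorder enables and addresses into storage sub-unit order
    r_be_r = [r_be[j] for j in r_bi_r]
    r_ba_r = [r_ba[j] for j in r_bi_r]

    # One access per sub-unit; disabled sub-units return None
    r_data_r = [banks[b][r_ba_r[b]] if r_be_r[b] else None for b in range(n)]

    # Reorder back into element order using R_BI
    return [r_data_r[b] for b in r_bi]


banks = [[None, 8], [None, 9], [6, None], [7, None]]
print(parallel_read(banks, 2, 4, 4))  # [6, 7, 8, 9]
print(parallel_read(banks, 2, 3, 4))  # [6, 7, 8, None]
```

The second reordering is needed because the memory returns values in sub-unit order, whereas the consumer expects them in the order of the data vector to be read.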

In this embodiment, the read data index of the data vector to be read is transformed into a one-dimensional read address; the read data enable vector, the second storage index vector and the second storage address vector of the data vector to be read are obtained from the one-dimensional read address and the preset read data count; the read data enable vector and the second storage address vector are reordered according to the second storage index vector; the stored data vector is read from the parallel storage according to the reordered read data enable vector and the reordered second storage address vector; and the stored data vector is reordered according to the second storage index vector to obtain the data vector to be read. Data can thus be read in parallel along one or more dimensions, which improves the flexibility and efficiency of data reading.

On the basis of the above embodiment, step S601 of this embodiment, that is, transforming the read data index of the data vector to be read into a one-dimensional read address, specifically includes: reordering the read data index; splitting the index value of each preset dimension that is read in parallel in the reordered read data index into multiple index values; reordering the split read data index again; and computing the one-dimensional read address from the re-reordered read data index.

For example, the read data index [dim0, dim1, dim2, dim3] is reordered under the control of r_ctrl0 to give the reordered read data index [dimnp1, dimnp0, dimp1, dimp0]. Here dimp1 and dimp0 are the two dimensions that are read in parallel, and dimnp1 and dimnp0 are the two dimensions that are not read in parallel. The sizes of the data to be read in the four dimensions dimnp1, dimnp0, dimp1 and dimp0 are DIMNP1, DIMNP0, DIMP1 and DIMP0, respectively.

Let the maximum read parallelism of the two parallel read dimensions dimp1 and dimp0 be K and L, respectively. K and L can be configured according to the performance of the parallel storage. dimp1 is split according to K into two dimensions, dimp1_p and dimp1_b, where dimp1_p=dimp1%K and dimp1_b=dimp1//K. The size of the data to be read in dimension dimp1_p is DIMP1_P=K, and its size in dimension dimp1_b is DIMP1_B=DIMP1//K. dimp0 is split according to L into two dimensions, dimp0_p and dimp0_b, where dimp0_p=dimp0%L and dimp0_b=dimp0//L. The size of the data to be read in dimension dimp0_p is DIMP0_P=L, and its size in dimension dimp0_b is DIMP0_B=DIMP0//L. The data to be read is thereby split from four dimensions into six, and the split read data index is [dimnp1, dimnp0, dimp1_b, dimp1_p, dimp0_b, dimp0_p]. The sizes of the data to be read in these dimensions are DIMNP1, DIMNP0, DIMP1_B, DIMP1_P, DIMP0_B and DIMP0_P, respectively.
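The split of one parallel dimension can be sketched in Python as follows. The function name split_parallel_index is hypothetical, and the sketch assumes that the dimension size is an exact multiple of the parallelism, which the integer division in the text suggests but does not state.

```python
def split_parallel_index(index, size, parallelism):
    """Split one parallel dimension into a block part and a lane part.

    Returns (block_index, lane_index, block_count, lane_count), i.e. the
    analogue of (dimp1_b, dimp1_p, DIMP1_B, DIMP1_P) for parallelism K.
    """
    if size % parallelism != 0:
        # Assumption of this sketch, not a requirement stated in the text.
        raise ValueError("size must be a multiple of parallelism")
    block_index = index // parallelism  # dimp1_b = dimp1 // K
    lane_index = index % parallelism    # dimp1_p = dimp1 % K
    return block_index, lane_index, size // parallelism, parallelism
```

The lane part selects a position inside one parallel access, while the block part counts how many full parallel accesses precede it.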

The split read data index is then reordered again, and the one-dimensional read address is computed from the re-reordered read data index.

For example, the split read data index [dimnp1, dimnp0, dimp1_b, dimp1_p, dimp0_b, dimp0_p] is reordered again under the control of r_ctrl1 to give the re-reordered read data index [dnp3, dnp2, dnp1, dnp0, dp1, dp0], where dp1=dimp1_p and dp0=dimp0_p, and dnp3, dnp2, dnp1 and dnp0 are obtained by reordering dimnp1, dimnp0, dimp1_b and dimp0_b. The sizes of the data to be read in the corresponding dimensions after this reordering are DNP3, DNP2, DNP1, DNP0, DP1 and DP0, obtained by applying the same reordering to DIMNP1, DIMNP0, DIMP1_B, DIMP1_P, DIMP0_B and DIMP0_P. The one-dimensional read address r_base is computed from the re-reordered read data index by the formula:

r_base=dp0+dp1*DP0+dnp0*DP0*DP1+dnp1*DP0*DP1*DNP0+dnp2*DP0*DP1*DNP0*DNP1+dnp3*DP0*DP1*DNP0*DNP1*DNP2. This embodiment does not limit the number of dimensions of the data vector to be read, which dimensions are split and how many dimensions each is split into, or the order of the dimensions in either reordering step.
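The address formula is an ordinary mixed-radix linearization: each index is multiplied by the product of the sizes of all faster-varying dimensions. A minimal Python sketch follows; linear_address is a hypothetical helper that takes the indices and sizes ordered from dp0 (fastest-varying) to dnp3 (slowest-varying).

```python
def linear_address(indices, sizes):
    """Flatten a multi-dimensional index into a one-dimensional address.

    indices and sizes are ordered from the fastest-varying dimension
    (dp0, DP0) to the slowest-varying one (dnp3, DNP3).
    """
    if len(indices) != len(sizes):
        raise ValueError("indices and sizes must have the same length")
    address = 0
    stride = 1  # product of the sizes of all faster-varying dimensions
    for index, size in zip(indices, sizes):
        address += index * stride
        stride *= size
    return address
```

Because the stride is accumulated in a loop, the same helper covers any number of dimensions, in line with the statement that the embodiment is not limited to a particular dimension count.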

On the basis of the above embodiment, in this embodiment the step of obtaining the read data enable vector, the second storage index vector and the second storage address vector of the data vector to be read according to the one-dimensional read address and the preset read data count specifically includes:

determining the number of elements with value 1 in the read data enable vector according to the preset read data count, and determining the number of elements with value 0 in the read data enable vector according to the difference between the length of the data vector to be read and the preset read data count; and obtaining the second storage index vector and the second storage address vector according to the index of each element within the data vector to be read, the one-dimensional read address and the length of the data vector to be read.

Specifically, the read data enable vector R_BE can be expressed in terms of R_M as R_BE=[R_M{1},(N-R_M){0}], that is, R_M ones followed by N-R_M zeros. In other words, the first R_M elements of the data vector to be read R_DATA are read, and the other elements of R_DATA are not read. The second storage index vector R_BI is computed as R_BI=(r_base+[0,1,2,...,N-1])%N. The second storage address vector R_BA is computed as R_BA=(r_base+[0,1,2,...,N-1])//N.
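The three formulas can be written directly in Python. In this sketch the function name read_control_vectors is hypothetical; r_base, r_m and n correspond to r_base, R_M and N in the text.

```python
def read_control_vectors(r_base, r_m, n):
    """Compute R_BE, R_BI and R_BA for one parallel read of width n."""
    if not 0 <= r_m <= n:
        raise ValueError("r_m must be between 0 and n")
    r_be = [1] * r_m + [0] * (n - r_m)            # R_M ones, then N-R_M zeros
    r_bi = [(r_base + i) % n for i in range(n)]   # storage sub-unit per element
    r_ba = [(r_base + i) // n for i in range(n)]  # address inside that sub-unit
    return r_be, r_bi, r_ba
```

Consecutive linear addresses land in consecutive storage sub-units, so the N elements of one access always fall in N different sub-units and can be fetched in the same cycle.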

On the basis of the above embodiments, in this embodiment the step of reordering the read data enable vector and the second storage address vector according to the second storage index vector specifically includes: obtaining, according to the second storage index vector, a read data index vector formed by the indices, within the data vector to be read, of the elements corresponding to the indices of the storage sub-units; and reordering the read data enable vector and the second storage address vector according to the read data index vector.

Specifically, the read data index vector R_BI_R is the counterpart of R_BI: it is the vector formed by the indices, within the data vector to be read R_DATA, of the elements that the individual storage sub-units of the parallel storage are to read. It can be computed from R_BI by the formula R_BI_R=(N-R_BI[0]+[0,1,2,...,N-1])%N. The read data enable vector R_BE and the second storage address vector R_BA are reordered according to R_BI_R. In the read reordering network, R_BE and R_BA are reordered independently but in the same way according to R_BI_R; the structure of the read reordering network is the same as that of the write reordering network. The stored data vector R_DATA_R is read from the parallel storage according to R_BE_R and R_BA_R. The elements of R_DATA_R are arranged in storage sub-unit order: they have the same values as the elements of the data vector to be read R_DATA, but in a different order. R_DATA_R is reordered according to R_BI to obtain R_DATA.
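A small Python sketch of this step is given below. The names read_index_vector and reorder are hypothetical, and the sketch assumes that the reordering network behaves as a gather, so that output position b takes the input element at position perm[b]; this reading reproduces the vectors given in the description's example.

```python
def read_index_vector(r_bi):
    """R_BI_R: for each storage sub-unit, the index of its element in R_DATA."""
    n = len(r_bi)
    return [(n - r_bi[0] + bank) % n for bank in range(n)]


def reorder(vector, perm):
    """Gather-style reordering: result[b] = vector[perm[b]]."""
    return [vector[p] for p in perm]


r_bi = [2, 3, 0, 1]
r_bi_r = read_index_vector(r_bi)
r_ba_r = reorder([0, 0, 1, 1], r_bi_r)  # reorder R_BA into sub-unit order
print(r_bi_r)  # [2, 3, 0, 1]
print(r_ba_r)  # [1, 1, 0, 0]
```

Applying reorder with R_BI to data that is in storage sub-unit order undoes the rearrangement, which is how R_DATA is recovered from R_DATA_R.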

For example, the data vector W_DATA=[6,7,8,9] that was written into the parallel storage in the above embodiment is read back, that is, the data vector to be read is R_DATA=[6,7,8,9]. With read data index r_index=[0,2,0,0] and preset read data count R_M=4, the parallel read proceeds as follows:

r_base is computed from the read data index r_index:

r_base=dp0+dp1*DP0+dnp0*DP0*DP1+dnp1*DP0*DP1*DNP0+dnp2*DP0*DP1*DNP0*DNP1+dnp3*DP0*DP1*DNP0*DNP1*DNP2=2.

R_BE, R_BI, R_BI_R and R_BA are computed from r_base and R_M (here N=4):

R_BE=[R_M{1},(N-R_M){0}]=[1,1,1,1];

R_BI=(r_base+[0,1,2,...,N-1])%N=[2,3,0,1];

R_BI_R=(N-R_BI[0]+[0,1,2,...,N-1])%N=[2,3,0,1];

R_BA=(r_base+[0,1,2,...,N-1])//N=[0,0,1,1];

R_BE and R_BA are input to the read reordering network, which, under the control of R_BI_R, outputs:

R_BE_R=[1,1,1,1];

R_BA_R=[1,1,0,0];

The stored data vector R_DATA_R is read from the parallel storage according to R_BE_R and R_BA_R. Since every element of R_BE_R is 1, all storage sub-units are enabled, and R_BA_R gives the address for each storage sub-unit; the data read is R_DATA_R=[8,9,6,7]. R_DATA_R has the same element values as R_DATA but in a different order, so it must be reordered. Reordering R_DATA_R under the control of R_BI gives R_DATA.
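The whole read path of this example can be checked with a short Python simulation. This is a sketch under stated assumptions rather than the patented circuit: the parallel storage is modeled as a list of N lists (one per storage sub-unit), the function name parallel_read is hypothetical, the reordering is treated as a gather, and the zeros in the storage contents are placeholders for cells the example does not describe.

```python
def parallel_read(banks, r_base, r_m):
    """Simulate one parallel read; returns (R_DATA_R, R_DATA)."""
    n = len(banks)
    r_be = [1] * r_m + [0] * (n - r_m)
    r_bi = [(r_base + i) % n for i in range(n)]
    r_ba = [(r_base + i) // n for i in range(n)]
    r_bi_r = [(n - r_bi[0] + b) % n for b in range(n)]
    # Reorder enable and address vectors into storage sub-unit order.
    r_be_r = [r_be[j] for j in r_bi_r]
    r_ba_r = [r_ba[j] for j in r_bi_r]
    # Each enabled sub-unit returns one word; disabled ones return None.
    r_data_r = [banks[b][r_ba_r[b]] if r_be_r[b] else None for b in range(n)]
    # Reorder the stored data back into element order using R_BI.
    r_data = [r_data_r[b] for b in r_bi]
    return r_data_r, r_data


# Linear addresses 2..5 hold 6, 7, 8, 9; address a lives in banks[a % 4][a // 4].
banks = [[0, 8], [0, 9], [6, 0], [7, 0]]
r_data_r, r_data = parallel_read(banks, r_base=2, r_m=4)
print(r_data_r)  # [8, 9, 6, 7]
print(r_data)    # [6, 7, 8, 9]
```

With a smaller r_m the trailing positions of the result stay None, mirroring the elements that R_BE marks as not read.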

Another embodiment of the present invention provides a data parallel writing device; see Fig. 7. The device is used to implement the data parallel writing method embodiments described above. The descriptions and definitions given for the data parallel writing method in the foregoing embodiments therefore also apply to the modules of this embodiment.

The data parallel writing device includes a first transformation module 701, a first obtaining module 702 and a storing module 703. The first transformation module 701 is configured to transform the write data index of the data vector to be written into a one-dimensional write address, wherein the data vector to be written is a one-dimensional or multi-dimensional vector in the multi-dimensional data matrix to be written, and the write data index is the index, in the multi-dimensional data matrix to be written, of the first element to be written among all elements of the data vector to be written. The first obtaining module 702 is configured to obtain the write data enable vector, the first storage index vector and the first storage address vector of the data vector to be written according to the one-dimensional write address and the preset write data count, wherein each element of the write data enable vector indicates whether the element at the corresponding position of the data vector to be written is written; the first storage index vector is the vector formed by the indices of the storage sub-units of the parallel storage corresponding to the elements of the data vector to be written; and the first storage address vector is the vector formed by the addresses, within the corresponding storage sub-units, of the elements of the data vector to be written. The storing module 703 is configured to reorder the write data enable vector, the first storage address vector and the data vector to be written according to the first storage index vector, and to store the reordered data vector to be written into the parallel storage according to the reordered write data enable vector and the reordered first storage address vector.

On the basis of the above embodiment, the first transformation module in this embodiment is specifically configured to: reorder the write data index; split the index value of each preset dimension that is written in parallel in the reordered write data index into multiple index values; reorder the split write data index again; and compute the one-dimensional write address from the re-reordered write data index.

On the basis of the above embodiment, the first obtaining module in this embodiment is specifically configured to: determine the number of elements with value 1 in the write data enable vector according to the preset write data count, and determine the number of elements with value 0 in the write data enable vector according to the difference between the length of the data vector to be written and the preset write data count; and obtain the first storage index vector and the first storage address vector according to the index of each element within the data vector to be written, the one-dimensional write address and the length of the data vector to be written.

On the basis of the above embodiments, the storing module in this embodiment is specifically configured to: obtain, according to the first storage index vector, a write data index vector formed by the indices, within the data vector to be written, of the elements corresponding to the indices of the storage sub-units; and reorder the write data enable vector, the first storage address vector and the data vector to be written according to the write data index vector.

In this embodiment, the write data index of the data vector to be written is transformed into a one-dimensional write address; the write data enable vector, the first storage index vector and the first storage address vector of the data vector to be written are obtained from the one-dimensional write address and the preset write data count; the write data enable vector, the first storage address vector and the data vector to be written are reordered according to the first storage index vector; and the reordered data vector to be written is stored into the parallel storage according to the reordered write data enable vector and the reordered first storage address vector. Data can thus be written in parallel along one or more dimensions, which improves the flexibility and efficiency of data writing.
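The write path can be sketched in the same style as the read formulas. The sketch assumes that the first storage index vector and the first storage address vector are computed as (w_base+i)%N and (w_base+i)//N, mirroring the read side; the function name parallel_write, the list-of-lists storage model and the gather-style reordering are likewise assumptions of this illustration.

```python
def parallel_write(banks, w_base, w_data):
    """Simulate one parallel write of w_data starting at linear address w_base."""
    n = len(banks)
    w_m = len(w_data)  # preset write data count
    if w_m > n:
        raise ValueError("at most one element per storage sub-unit")
    w_be = [1] * w_m + [0] * (n - w_m)
    w_bi = [(w_base + i) % n for i in range(n)]
    w_ba = [(w_base + i) // n for i in range(n)]
    # For each storage sub-unit, the index of the element it receives.
    w_bi_r = [(n - w_bi[0] + b) % n for b in range(n)]
    padded = list(w_data) + [None] * (n - w_m)
    for bank, j in enumerate(w_bi_r):
        if w_be[j]:  # only enabled elements are stored
            banks[bank][w_ba[j]] = padded[j]
    return banks


banks = [[0, 0] for _ in range(4)]
parallel_write(banks, w_base=2, w_data=[6, 7, 8, 9])
print(banks)  # [[0, 8], [0, 9], [6, 0], [7, 0]]
```

The resulting layout places 8 and 9 at address 1 of the first two storage sub-units and 6 and 7 at address 0 of the last two, which matches the stored data vector [8,9,6,7] read back in the read example.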

Another embodiment of the present invention provides a data parallel reading device; see Fig. 8. The device is used to implement the data parallel reading method embodiments described above. The descriptions and definitions given for the data parallel reading method in the foregoing embodiments therefore also apply to the modules of this embodiment.

The data parallel reading device includes a second transformation module 801, a second obtaining module 802 and a reading module 803. The second transformation module 801 is configured to transform the read data index of the data vector to be read into a one-dimensional read address, wherein the data vector to be read is a one-dimensional or multi-dimensional vector in the multi-dimensional data matrix to be read, and the read data index is the index, in the multi-dimensional data matrix to be read, of the first element to be read among all elements of the data vector to be read. The second obtaining module 802 is configured to obtain the read data enable vector, the second storage index vector and the second storage address vector of the data vector to be read according to the one-dimensional read address and the preset read data count, wherein each element of the read data enable vector indicates whether the element at the corresponding position of the data vector to be read is read; the second storage index vector is the vector formed by the indices of the storage sub-units of the parallel storage corresponding to the elements of the data vector to be read; and the second storage address vector is the vector formed by the addresses, within the corresponding storage sub-units, of the elements of the data vector to be read. The reading module 803 is configured to reorder the read data enable vector and the second storage address vector according to the second storage index vector, to read the stored data vector from the parallel storage according to the reordered read data enable vector and the reordered second storage address vector, and to reorder the stored data vector according to the second storage index vector to obtain the data vector to be read.

On the basis of the above embodiment, the second transformation module in this embodiment is specifically configured to: reorder the read data index; split the index value of each preset dimension that is read in parallel in the reordered read data index into multiple index values; reorder the split read data index again; and compute the one-dimensional read address from the re-reordered read data index.

On the basis of the above embodiment, the second obtaining module in this embodiment is specifically configured to: determine the number of elements with value 1 in the read data enable vector according to the preset read data count, and determine the number of elements with value 0 in the read data enable vector according to the difference between the length of the data vector to be read and the preset read data count; and obtain the second storage index vector and the second storage address vector according to the index of each element within the data vector to be read, the one-dimensional read address and the length of the data vector to be read.

On the basis of the above embodiments, the reading module in this embodiment is specifically configured to: obtain, according to the second storage index vector, a read data index vector formed by the indices, within the data vector to be read, of the elements corresponding to the indices of the storage sub-units; and reorder the read data enable vector and the second storage address vector according to the read data index vector.

Another embodiment of the present invention provides a data parallel read-write system. The system includes a parallel storage, the data parallel writing device of any of the above data parallel writing device embodiments, and the data parallel reading device of any of the above data parallel reading device embodiments.

Finally, the methods of the present application are only preferred embodiments and are not intended to limit the scope of protection of the present invention. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within the scope of protection of the present invention.

Claims

1. A data parallel writing method, characterized by comprising:

transforming a write data index of a data vector to be written into a one-dimensional write address; wherein the data vector to be written is a one-dimensional or multi-dimensional vector in a multi-dimensional data matrix to be written, and the write data index is the index, in the multi-dimensional data matrix to be written, of the first element to be written among all elements of the data vector to be written;

obtaining a write data enable vector, a first storage index vector and a first storage address vector of the data vector to be written according to the one-dimensional write address and a preset write data count; wherein each element of the write data enable vector indicates whether the element at the corresponding position of the data vector to be written is written; the first storage index vector is the vector formed by the indices of the storage sub-units of a parallel storage corresponding to the elements of the data vector to be written; and the first storage address vector is the vector formed by the addresses, within the corresponding storage sub-units, of the elements of the data vector to be written;

reordering the write data enable vector, the first storage address vector and the data vector to be written according to the first storage index vector, and storing the reordered data vector to be written into the parallel storage according to the reordered write data enable vector and the reordered first storage address vector.

2. The method according to claim 1, characterized in that the step of transforming the write data index of the data vector to be written into the one-dimensional write address specifically comprises:

reordering the write data index, and splitting the index value of each preset dimension that is written in parallel in the reordered write data index into multiple index values;

reordering the split write data index again, and computing the one-dimensional write address from the re-reordered write data index.

3. The method according to claim 1, characterized in that the step of obtaining the write data enable vector, the first storage index vector and the first storage address vector of the data vector to be written according to the one-dimensional write address and the preset write data count specifically comprises:

determining the number of elements with value 1 in the write data enable vector according to the preset write data count, and determining the number of elements with value 0 in the write data enable vector according to the difference between the length of the data vector to be written and the preset write data count;

obtaining the first storage index vector and the first storage address vector according to the index of each element within the data vector to be written, the one-dimensional write address and the length of the data vector to be written.

4. The method according to any one of claims 1 to 3, characterized in that the step of reordering the write data enable vector, the first storage address vector and the data vector to be written according to the first storage index vector specifically comprises:

obtaining, according to the first storage index vector, a write data index vector formed by the indices, within the data vector to be written, of the elements corresponding to the indices of the storage sub-units;

reordering the write data enable vector, the first storage address vector and the data vector to be written according to the write data index vector.

5. A data parallel reading method, characterized by comprising:

transforming a read data index of a data vector to be read into a one-dimensional read address; wherein the data vector to be read is a one-dimensional or multi-dimensional vector in a multi-dimensional data matrix to be read, and the read data index is the index, in the multi-dimensional data matrix to be read, of the first element to be read among all elements of the data vector to be read;

obtaining a read data enable vector, a second storage index vector and a second storage address vector of the data vector to be read according to the one-dimensional read address and a preset read data count; wherein each element of the read data enable vector indicates whether the element at the corresponding position of the data vector to be read is read; the second storage index vector is the vector formed by the indices of the storage sub-units of a parallel storage in which the elements of the data vector to be read are located; and the second storage address vector is the vector formed by the addresses of the elements of the data vector to be read within the storage sub-units;

reordering the read data enable vector and the second storage address vector according to the second storage index vector, reading a stored data vector from the parallel storage according to the reordered read data enable vector and the reordered second storage address vector, and reordering the stored data vector according to the second storage index vector to obtain the data vector to be read.

6. The method according to claim 5, characterized in that the step of transforming the read data index of the data vector to be read into the one-dimensional read address specifically comprises:

reordering the read data index, and splitting the index value of each preset dimension that is read in parallel in the reordered read data index into multiple index values;

reordering the split read data index again, and computing the one-dimensional read address from the re-reordered read data index.

7. The method according to claim 5, characterized in that the step of obtaining the read data enable vector, the second storage index vector and the second storage address vector of the data vector to be read according to the one-dimensional read address and the preset read data count specifically comprises:

determining the number of elements with value 1 in the read data enable vector according to the preset read data count, and determining the number of elements with value 0 in the read data enable vector according to the difference between the length of the data vector to be read and the preset read data count;

obtaining the second storage index vector and the second storage address vector according to the index of each element within the data vector to be read, the one-dimensional read address and the length of the data vector to be read.

8. The method according to any one of claims 5 to 7, characterized in that the step of reordering the read data enable vector and the second storage address vector according to the second storage index vector specifically comprises:

obtaining, according to the second storage index vector, a read data index vector formed by the indices, within the data vector to be read, of the elements corresponding to the indices of the storage sub-units;

reordering the read data enable vector and the second storage address vector according to the read data index vector.

9. A data parallel writing device, characterized by comprising:

a first transformation module, configured to transform a write data index of a data vector to be written into a one-dimensional write address; wherein the data vector to be written is a one-dimensional or multi-dimensional vector in a multi-dimensional data matrix to be written, and the write data index is the index, in the multi-dimensional data matrix to be written, of the first element to be written among all elements of the data vector to be written;

a first obtaining module, configured to obtain a write data enable vector, a first storage index vector and a first storage address vector of the data vector to be written according to the one-dimensional write address and a preset write data count; wherein each element of the write data enable vector indicates whether the element at the corresponding position of the data vector to be written is written; the first storage index vector is the vector formed by the indices of the storage sub-units of a parallel storage corresponding to the elements of the data vector to be written; and the first storage address vector is the vector formed by the addresses, within the corresponding storage sub-units, of the elements of the data vector to be written;

a storing module, configured to reorder the write data enable vector, the first storage address vector and the data vector to be written according to the first storage index vector, and to store the reordered data vector to be written into the parallel storage according to the reordered write data enable vector and the reordered first storage address vector.

10. A data parallel reading device, characterized by comprising:

a second transformation module, configured to transform a read data index of a data vector to be read into a one-dimensional read address; wherein the data vector to be read is a one-dimensional or multi-dimensional vector in a multi-dimensional data matrix to be read, and the read data index is the index, in the multi-dimensional data matrix to be read, of the first element to be read among all elements of the data vector to be read;

a second obtaining module, configured to obtain a read data enable vector, a second storage index vector and a second storage address vector of the data vector to be read according to the one-dimensional read address and a preset read data count; wherein each element of the read data enable vector indicates whether the element at the corresponding position of the data vector to be read is read; the second storage index vector is the vector formed by the indices of the storage sub-units of a parallel storage corresponding to the elements of the data vector to be read; and the second storage address vector is the vector formed by the addresses, within the corresponding storage sub-units, of the elements of the data vector to be read;

a reading module, configured to reorder the read data enable vector and the second storage address vector according to the second storage index vector, to read a stored data vector from the parallel storage according to the reordered read data enable vector and the reordered second storage address vector, and to reorder the stored data vector according to the second storage index vector to obtain the data vector to be read.

11. A data parallel read-write system, characterized by comprising a parallel storage and the devices according to claims 9 and 10.