WO2021120646A1

WO2021120646A1 - Data processing system

Info

Publication number: WO2021120646A1
Application number: PCT/CN2020/108985
Authority: WO
Inventors: 蒋文
Original assignee: 深圳云天励飞技术股份有限公司
Priority date: 2019-12-16
Filing date: 2020-08-13
Publication date: 2021-06-24
Also published as: CN111145075B; CN111145075A

Abstract

A data processing system. The data processing system comprises: a data transmission module, a control module, a calculation module, a storage control module, and a storage module. The data transmission module is configured to receive a first feature value and a second feature value, the control module is configured to control the calculation module to calculate data of the first feature value and the second feature value, the storage control module is configured to perform conjugate multiplication operation and perform calculation of correlation degree of the first feature value and the second feature value, and the storage module is configured to store a calculation result. The system completes the calculation of correlation degree by means of cooperation among a plurality of hardware modules, and the calculation efficiency is high.

Description

A data processing system

Technical field

The present invention relates to the technical field of image processing, in particular to a data processing system.

This application claims the priority of a Chinese patent application filed with the Chinese Patent Office on December 16, 2019, with an application number of 201911296752.6 and an invention title of "a data processing system", the entire content of which is incorporated into this application by reference.

Background technique

When performing target tracking, it is usually necessary to detect whether a target is present at the predicted position in the next frame by means of the correlation between the current frame image and the next frame image. Calculating the correlation between the feature value of the current frame image and the feature value of the next frame image is therefore a key step. At present, this calculation is usually implemented in software (that is, a general-purpose ARM processor or a digital signal processor (DSP) executes software program instructions). However, when the correlation between the feature values of two frames is calculated purely in software, the calculation efficiency is low.

Technical solutions

The embodiment of the present invention provides a data processing system, which improves the efficiency of calculating the correlation between the first feature value and the second feature value through cooperation between multiple hardware modules.

The embodiment of the present invention provides a data processing system, which includes a data transmission module, a control module, an arithmetic module, a storage control module, and a storage module;

The data transmission module is configured to receive a first feature value and a second feature value, each of which includes D×W×H feature vectors, where D stands for dimension, W stands for width, H stands for height, and D, W, and H are all positive integers;

The control module is configured to use the acquired W×H feature vectors of the i-th dimension of the first feature value as a first operation value, control the arithmetic module to perform fast Fourier transform on each row of feature vectors of the first operation value to obtain W×H first results, and control the arithmetic module to perform fast Fourier transform on each column of the first results to obtain H×W second results, where 1≤i≤D;

The control module is further configured to use the acquired W×H feature vectors of the i-th dimension of the second feature value as a second operation value, control the arithmetic module to perform fast Fourier transform on each row of feature vectors of the second operation value to obtain W×H third results, and control the arithmetic module to perform fast Fourier transform on each column of the third results to obtain H×W fourth results;

The storage control module is configured to perform a conjugate multiplication operation on the second result and the fourth result to obtain W×H fifth results;

The control module is also configured to control the arithmetic module to perform inverse fast Fourier transform on each row of feature vectors of the fifth results to obtain W×H sixth results, and to control the arithmetic module to perform inverse fast Fourier transform on each column of the sixth results to obtain H×W seventh results;

The storage control module is further configured to accumulate the real parts at the same row and the same column in the seventh results of each of the D dimensions to obtain W×H eighth results, and to use the eighth results as the correlation degree between the first feature value and the second feature value;

The storage control module is further configured to control the storage module to store the first to seventh results and the correlation degree.

In a possible design, the arithmetic module includes M butterfly arithmetic units, and each butterfly arithmetic unit performs fast Fourier transform or inverse fast Fourier transform on the received data.

In a possible design, the control module includes an arithmetic controller, a register unit, and a data selection unit. The register unit is used to register the data to be calculated, and the arithmetic controller is used to control the data selection unit, according to the magnitude relationship between W and M or between H and M, to select corresponding data from the register unit and output the selected data to the M butterfly arithmetic units.

In a possible design, the arithmetic controller is further configured to control the data selection unit to select W data from the register unit if W is less than 2M, and output the selected data to the M butterfly arithmetic units;

The arithmetic controller is further configured to control the data selection unit to select 2M data from the register unit each time if W is greater than or equal to 2M, and output the selected data to the M butterfly arithmetic units;

The arithmetic controller is further configured to control the data selection unit to select H data from the register unit if H is less than 2M, and output the selected data to the M butterfly arithmetic units;

The arithmetic controller is further configured to control the data selection unit to select 2M data from the register unit each time if H is greater than or equal to 2M, and output the selected data to the M butterfly arithmetic units.

In a possible design, the data selection unit includes data selection arbitration logic and data selection logic;

The data selection arbitration logic determines the twiddle factor according to the number of data selected from the register unit each time and the current stage of the fast Fourier transform or inverse fast Fourier transform being performed by the butterfly arithmetic units;

The data selection logic determines, according to the twiddle factor, the serial numbers of the two data input to each of the M butterfly arithmetic units.

In a possible design, the register unit includes two register sets, each register set includes X register groups, one register group includes P registers, and the P registers in one register group are used to store one row or one column of feature vectors, where both X and P are positive integers;

The arithmetic controller is configured to store X rows or columns of feature vectors in one of the two register sets;

The arithmetic controller is also used to control the data selection unit to select corresponding data from that register set and output the selected data to the M butterfly arithmetic units, while at the same time the arithmetic controller stores, in the other of the two register sets, feature vectors other than the X rows or columns of feature vectors.

In a possible design, the system further includes a register interface;

The register interface is used to obtain register configuration information, and transmit the register configuration information to the operation controller;

The arithmetic controller is used for storing feature vectors in the M register sets according to the register configuration information.

In a possible design, the storage module includes a first memory, a second memory, and a third memory;

The first memory is used to store the first result, the third result, and the sixth result;

The second memory is used to store the second result, the fourth result, the fifth result, and the seventh result;

The third memory is used to store the eighth result.

In a possible design, the system further includes a task controller. The task controller is configured to send a task start signal to the arithmetic controller when a correlation calculation instruction is detected, and the task start signal is used to instruct the arithmetic controller to perform fast Fourier transform or inverse fast Fourier transform on the input data.

In the embodiment of the present invention, the data processing system includes a data transmission module, a control module, a calculation module, a storage control module, and a storage module. These five modules are all hardware modules. The calculation of the correlation degree is realized by mutual cooperation, and the calculation process is performed in the unit of row or column. Therefore, compared with pure software calculation, the calculation efficiency of the data processing system is higher.

Description of the drawings

In order to illustrate the technical solutions in the embodiments of the present invention or the prior art, the following will briefly introduce the drawings that need to be used in the description of the embodiments or the prior art.

Fig. 1 is a functional block diagram of a data processing system provided by an embodiment of the present invention.

FIG. 2 is a schematic diagram of a first characteristic value and a second characteristic value provided by an embodiment of the present invention.

Fig. 3 is a functional block diagram of an arithmetic module provided by an embodiment of the present invention.

FIG. 4 is a schematic diagram of a butterfly operation unit provided by an embodiment of the present invention.

Fig. 5 is a functional block diagram of another data processing system provided by an embodiment of the present invention.

Fig. 6 is a schematic diagram of an FFT operation provided by an embodiment of the present invention.

FIG. 7 is a schematic diagram of a register unit provided by an embodiment of the present invention.

FIG. 8 is a schematic diagram of each storage unit in the storage module provided by an embodiment of the present invention.

FIG. 9 is a schematic diagram of a pipeline processing provided by an embodiment of the present invention.

Embodiments of the present invention

Please refer to FIG. 1, which is a functional block diagram of a data processing system according to an embodiment of the present invention.

As shown in Figure 1, the data processing system may include a data transmission module, a control module, an arithmetic module, a storage control module, and a storage module. The data transmission module, the control module, the calculation module, the storage control module, and the storage module are all hardware modules.

Please refer to FIGS. 1 and 2 together. The data transmission module is configured to receive a first feature value F1 and a second feature value F2. The first feature value F1 and the second feature value F2 each include D×W×H feature vectors, where D stands for dimension, W stands for width, H stands for height, and D, W, and H are all positive integers.

The control module is configured to use the acquired W×H feature vectors of the i-th dimension of the first feature value F1 as the first operation value, control the arithmetic module to perform fast Fourier transform on each row of feature vectors of the first operation value to obtain W×H first results, and control the arithmetic module to perform fast Fourier transform on each column of the first results to obtain H×W second results, where 1≤i≤D.

Specifically, for the W×H feature vectors of each dimension of the first feature value F1, the control module first has fast Fourier transform performed on each row of feature vectors to obtain W×H first results, and then has fast Fourier transform performed on each column of the first results to obtain H×W second results.

The control module is further configured to use the acquired W×H feature vectors of the i-th dimension of the second feature value F2 as a second operation value, control the arithmetic module to perform fast Fourier transform on each row of feature vectors of the second operation value to obtain W×H third results, and control the arithmetic module to perform fast Fourier transform on each column of the third results to obtain H×W fourth results.

Specifically, for the W×H feature vectors of each dimension of the second feature value F2, the control module first has fast Fourier transform performed on each row of feature vectors to obtain W×H third results, and then has fast Fourier transform performed on each column of the third results to obtain H×W fourth results.

The storage control module is configured to perform a conjugate multiplication operation on the second result and the fourth result to obtain W×H fifth results.

The control module is also configured to control the arithmetic module to perform inverse fast Fourier transform on each row of feature vectors of the fifth results to obtain W×H sixth results, and to control the arithmetic module to perform inverse fast Fourier transform on each column of the sixth results to obtain H×W seventh results.

The storage control module is further configured to accumulate the real parts at the same row and the same column in the seventh results of each of the D dimensions to obtain W×H eighth results, and to use the eighth results as the correlation degree between the first feature value and the second feature value.
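As a software reference for the data flow described above, the following is a minimal sketch rather than the hardware implementation. It assumes that each feature value is given as D planes of H rows by W columns of numbers, that the conjugate is taken of the transform of the second feature value (the text does not say which operand is conjugated), and it uses a naive discrete Fourier transform in place of the butterfly-based FFT. The function names dft, transform_2d, and correlation are hypothetical.

```python
import cmath

def dft(values, inverse=False):
    """Naive O(n^2) DFT of a 1-D list; a stand-in for the hardware FFT/IFFT."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        acc = sum(v * cmath.exp(sign * 2j * cmath.pi * k * t / n)
                  for t, v in enumerate(values))
        out.append(acc / n if inverse else acc)
    return out

def transform_2d(plane, inverse=False):
    """Transform every row, then every column, of a 2-D list."""
    rows = [dft(row, inverse) for row in plane]
    cols = [dft(list(col), inverse) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]  # back to row-major order

def correlation(f1, f2):
    """f1, f2: D planes of H rows by W columns. Returns an H x W real map."""
    height, width = len(f1[0]), len(f1[0][0])
    total = [[0.0] * width for _ in range(height)]
    for plane1, plane2 in zip(f1, f2):
        second = transform_2d(plane1)              # second result
        fourth = transform_2d(plane2)              # fourth result
        # Assumption: the conjugate is applied to the second operand.
        fifth = [[a * b.conjugate() for a, b in zip(r1, r2)]
                 for r1, r2 in zip(second, fourth)]
        seventh = transform_2d(fifth, inverse=True)
        for y in range(height):
            for x in range(width):
                total[y][x] += seventh[y][x].real  # running eighth result
    return total
```

For a single 2×2 plane correlated with itself, the entry at row 0, column 0 equals the sum of the squared inputs, which gives a quick sanity check of the row-then-column ordering.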

Referring to FIG. 3, the arithmetic module may include M butterfly arithmetic units, and M is a positive integer. Each butterfly operation unit is used to perform fast Fourier transform or inverse fast Fourier transform on the received data.

Referring to FIG. 4, each butterfly operation unit includes two input interfaces, which receive input 1 and input 2 respectively. A multiplier multiplies input 2 by the twiddle factor to obtain the rotated input 2. An adder adds input 1 and the rotated input 2 to obtain result 1, and a subtractor subtracts the rotated input 2 from input 1 to obtain result 2. For an IFFT operation, the sum and the difference are each additionally divided by 2 to obtain result 1 and result 2.
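The butterfly operation described for FIG. 4 can be sketched in a few lines. This is an illustrative model only; the function name butterfly and its parameters are hypothetical, and Python complex numbers stand in for whatever number format the hardware uses.

```python
def butterfly(input1, input2, twiddle, inverse=False):
    """Model of one butterfly operation unit."""
    rotated = input2 * twiddle      # multiplier: rotate input 2
    result1 = input1 + rotated      # adder
    result2 = input1 - rotated      # subtractor
    if inverse:
        # For an IFFT, each output is additionally divided by 2.
        result1 /= 2
        result2 /= 2
    return result1, result2
```

Dividing by 2 at every stage of an N-point IFFT yields the overall 1/N scaling after log2 N stages.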

The M butterfly arithmetic units provide 2M input interfaces and can process one stage of a 2M-point FFT (fast Fourier transform) or IFFT (inverse fast Fourier transform) operation at a time. For an X-point FFT or IFFT operation, when X≤2M, the arithmetic module can process one stage of the X-point operation in one clock cycle; when X>2M and X is Y times 2M, the arithmetic module can process one stage of the X-point operation in Y clock cycles. It should be noted that "2M points" means that the arithmetic module can accept 2M feature vectors as input at a time.

Specifically, when M=8 and X=8, the arithmetic module can complete one stage of the 8-point FFT or IFFT operation in one clock cycle. When M=8 and X=16, the arithmetic module can complete one stage of the 16-point FFT or IFFT operation in one clock cycle. When M=8 and X=32, the arithmetic module can complete one stage of the 32-point FFT or IFFT operation in two clock cycles. When M=8 and X=64, the arithmetic module can complete one stage of the 64-point FFT or IFFT operation in four clock cycles.
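The clock-cycle counts in these examples follow from dividing the number of points by the 2M inputs available per cycle. A small sketch, with the hypothetical function name cycles_per_stage, and with rounding up assumed for values of X that are not a multiple of 2M (a case the text does not cover):

```python
import math

def cycles_per_stage(points, units):
    """Clock cycles for one stage of a `points`-point FFT/IFFT on `units` butterflies."""
    inputs_per_cycle = 2 * units
    # Assumption: round up when `points` is not a multiple of 2M.
    return max(1, math.ceil(points / inputs_per_cycle))

for x in (8, 16, 32, 64):
    print(x, cycles_per_stage(x, units=8))
```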

Please refer to FIG. 5, which is a functional block diagram of another data processing system according to an embodiment of the present invention.

As shown in FIG. 5, the data processing system may include a data transmission module, a control module, an arithmetic module, a storage control module, and a storage module. The functions of these modules are similar to those of the corresponding modules in the embodiment provided in FIG. 1 and are not repeated here.

The data transmission module can be connected to an external memory to transmit the image feature values stored in the external memory to the control module, where the image feature values can include a first feature value F1 of a first image and a second feature value F2 of a second image. Each of the first feature value F1 and the second feature value F2 may include multiple feature vectors. As shown in FIG. 2, each feature value may include D×W×H feature vectors, where D stands for dimension, W stands for width, and H stands for height. The data transmission module transmits the first feature value and the second feature value from the external memory to the control module, which facilitates subsequent calculations and thereby improves calculation efficiency.

The control module may include an arithmetic controller, a register unit, and a data selection unit.

The register unit is used to register the data to be operated on. The arithmetic controller is used to control the data selection unit to select corresponding data from the register unit according to the magnitude relationship between W and M or between H and M, and to output the selected data to the M butterfly operation units. For example, if W is less than 2M, the data selection unit is controlled to select W data from the register unit and output the selected data to the M butterfly operation units. For instance, if W is 4 and the arithmetic module integrates 4 butterfly operation units, which together can process an 8-point FFT at the same time, then W is less than 8, the processing can be completed in one time period, and the 4 feature vectors are input directly to the arithmetic module for operation. It can be understood that the feature vectors input to the arithmetic module occupy only two butterfly operation units.

For another example, if W is greater than or equal to 2M, the data selection unit is controlled to select 2M data from the register unit each time and output the selected data to the M butterfly operation units. For instance, if W is 16 and the arithmetic module integrates 4 butterfly operation units, which together can process an 8-point FFT at the same time, then W is greater than 2M, so the W data cannot all be input at once; only 2M feature vectors can be input to the arithmetic module for operation each time.
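The selection rule in the two examples above can be sketched as follows. The sketch only models how many data are selected per batch and how many butterfly operation units each batch occupies; which serial numbers travel together in a batch depends on the stage pairing and is not modeled. The function names batch_sizes and units_occupied are hypothetical, and the handling of a trailing partial batch is an assumption, since the text only discusses lengths that are a multiple of 2M.

```python
def batch_sizes(length, units):
    """Number of data selected from the register unit in each batch."""
    capacity = 2 * units                 # 2M inputs across M butterfly units
    if length < capacity:
        return [length]                  # everything fits in one batch
    full, rest = divmod(length, capacity)
    # Assumption: a trailing partial batch is allowed if length % 2M != 0.
    return [capacity] * full + ([rest] if rest else [])

def units_occupied(batch):
    """Each butterfly operation unit consumes two inputs."""
    return batch // 2

print(batch_sizes(4, 4), units_occupied(4))   # W=4, M=4
print(batch_sizes(16, 4))                     # W=16, M=4
```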

The data selection unit includes data selection arbitration logic and data selection logic. The data selection arbitration logic determines the twiddle factor according to the number of data selected from the register unit each time and the current stage of the fast Fourier transform or inverse fast Fourier transform being performed by the butterfly operation units.

The data selection logic determines, according to the twiddle factor, the serial numbers of the two data input to each of the M butterfly operation units, and inputs the data to the corresponding butterfly operation unit according to the serial numbers.

As shown in FIG. 6, in the twiddle factor W_N^k, W denotes the weight, N denotes the number of FFT or IFFT points (generally 4, 8, 16, and so on), and k denotes the index of the weight. It can be seen from the schematic diagram of the butterfly operation process that the values of N and k in the twiddle factor W_N^k depend on the number of points involved in the FFT operation (the value of N) and on the stage of the fast Fourier transform or inverse fast Fourier transform that the butterfly operation unit is currently performing. For example, in an 8-point FFT operation, N=8 and k=0, 1, 2, and 3 in the first stage; N=8 and k=0 or 2 in the second stage; and N=8 and k=0 in the third stage.
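The k values listed for the 8-point case match a radix-2 scheme in which stage s uses every 2^(s-1)-th exponent. The sketch below reproduces them under that assumption; it also uses the conventional definition of the twiddle factor as exp(-2πik/N) for the forward transform, which the text does not spell out. The function names twiddle_exponents and twiddle are hypothetical.

```python
import cmath

def twiddle_exponents(points, stage):
    """Exponents k used in the given stage (1-based) of a `points`-point FFT."""
    step = 2 ** (stage - 1)
    count = points // (2 ** stage)
    return [j * step for j in range(count)]

def twiddle(points, k, inverse=False):
    """Conventional twiddle factor; the sign flips for the inverse transform."""
    sign = 1 if inverse else -1
    return cmath.exp(sign * 2j * cmath.pi * k / points)

for stage in (1, 2, 3):
    print(stage, twiddle_exponents(8, stage))
```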

Therefore, the data selection arbitration logic determines the twiddle factor, that is, the values of N and k, according to the number of data selected from the register unit each time (that is, the number of points involved in the FFT operation) and the current stage of the fast Fourier transform or inverse fast Fourier transform being performed by the butterfly operation units.

According to the twiddle factor, the data selection logic can determine the serial numbers of the two data input to each of the M butterfly operation units. The serial numbers can be obtained by numbering the data sequentially as they are taken out of the register unit or, for intermediate results of the FFT operation, by numbering sequentially the intermediate results obtained after a stage of operation. For example, the numbers are 0-7 for an 8-point FFT operation and 0-15 for a 16-point FFT operation. The data input to each butterfly operation unit are determined according to the serial numbers. For example, in the first stage of an 8-point FFT operation, N=8 and k=0, 1, 2, and 3, so the spacing between the two data input to the same butterfly operation unit is 4 (k has four values); that is, serial numbers 0 and 4, 1 and 5, 2 and 6, and 3 and 7 are selected. In the second stage of an 8-point FFT operation, N=8 and k=0 or 2, so the spacing is 2 (k has two values); that is, serial numbers 0 and 2, 1 and 3, 4 and 6, and 5 and 7 are selected. In the third stage of an 8-point FFT operation, N=8 and k=0, so the spacing is 1 (k has one value); that is, serial numbers 0 and 1, 2 and 3, 4 and 5, and 6 and 7 are selected.
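The pairing of serial numbers in the 8-point example can be generated from the stage number alone, since the spacing halves at each stage. The following sketch assumes that pattern generalizes to other power-of-two sizes; the function name butterfly_pairs is hypothetical.

```python
def butterfly_pairs(points, stage):
    """Serial-number pairs fed to the butterfly units in a given stage (1-based)."""
    spacing = points >> stage            # N/2, N/4, ..., 1
    pairs = []
    for start in range(0, points, 2 * spacing):
        for offset in range(spacing):
            pairs.append((start + offset, start + offset + spacing))
    return pairs

for stage in (1, 2, 3):
    print(stage, butterfly_pairs(8, stage))
```

Each stage produces N/2 pairs, one per butterfly operation, so an 8-point stage keeps four butterfly operation units busy.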

It can be understood that different numbers of points correspond to different numbers of stages. For example, an X-point operation is divided into log2 X stages.

Optionally, the above register unit may include a first register set and a second register set. The first register set and the second register set each include X register groups, and one register group includes P registers, where X and P are both positive integers. The P registers in a register group are used to store one row or one column of feature vectors, so one register set can store X rows or columns of feature vectors.

The arithmetic controller is used to control the first register set to store X rows or columns of feature vectors.

The arithmetic controller is also used to control the data selection unit to select corresponding data from the first register set and output the selected data to the M butterfly arithmetic units, while the arithmetic controller controls the second register set to store feature vectors other than those X rows or columns of feature vectors. For example, if the first register set stores rows or columns 1 to X of the feature vectors, the second register set stores rows or columns X+1 to 2X.

In one embodiment, as shown in FIG. 7, the register unit may include two register sets: the first register set includes register groups 1-4, and the second register set includes register groups 5-8. One register group can be used to store one row of feature vectors; for example, register groups 1-4 store rows 1-4 of the feature value F1, and register groups 5-8 store rows 5-8 of the feature value F1. To improve data processing efficiency, register groups 1-4 and register groups 5-8 can operate in ping-pong fashion: while the feature vectors of rows 1-4 are being processed, the feature vectors of rows 5-8 are stored into the second register set. Once rows 1-4 have been processed, rows 5-8 are already in the second register set and can be processed directly. This removes the wait for rows 5-8 to be stored into the second register set and thereby improves processing efficiency.

It should be noted that register groups 1-4 can also be used to store rows 9-12, 17-20, and so on, and register groups 5-8 can also be used to store rows 13-16, 21-24, and so on.
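The row-to-register assignment in this ping-pong arrangement reduces to modular arithmetic. A minimal sketch, assuming rows are numbered from 1 and each register set holds four register groups as in FIG. 7; the function name register_slot is hypothetical.

```python
def register_slot(row, groups_per_set=4):
    """Return (register set, register group) that holds a 1-based row number."""
    index = row - 1
    register_set = (index // groups_per_set) % 2 + 1   # alternates 1, 2, 1, 2, ...
    group = index % (2 * groups_per_set) + 1           # groups 1-8, then wraps
    return register_set, group

for row in (1, 4, 5, 8, 9, 13, 17, 21):
    print(row, register_slot(row))
```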

It can be understood that the feature value F2 is stored in register groups 1-4 and 5-8 in the same way as the feature value F1, which is not repeated here.

Optionally, the storage control module is configured to perform a conjugate multiplication operation on the second result and the fourth result obtained by the above calculation to obtain W×H fifth results.

For the way in which the arithmetic module performs inverse fast Fourier transform on each row of feature vectors of the W×H fifth results, and performs inverse fast Fourier transform on each column of the W×H sixth results to obtain the H×W seventh results, refer to the descriptions of the row-direction and column-direction fast Fourier transform processes in the foregoing embodiment; details are not repeated here.

The storage control module is further configured to accumulate the real parts at the same row and the same column in the seventh results of each of the D dimensions to obtain W×H eighth results, and to use the eighth results as the correlation degree between the first feature value and the second feature value. The correlation degree can be used in the kernel correlation filtering algorithm (Kernel Correlation Filter, KCF) to calculate the correlation between the current image frame and the previous image frame.

Optionally, each time the H×W seventh results of one dimension are calculated, they are added, row by row and column by column, to the accumulated result that has already been stored. When the H×W seventh results of all dimensions have been calculated, the W×H eighth results are obtained, and the eighth results are used as the correlation degree between the first feature value and the second feature value.
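The running accumulation can be sketched as an element-wise update of a stored real-valued map. This is an illustration with made-up input values; the function name accumulate is hypothetical.

```python
def accumulate(stored, seventh):
    """Add the real parts of one dimension's seventh results to the stored sums."""
    return [[acc + value.real for acc, value in zip(acc_row, row)]
            for acc_row, row in zip(stored, seventh)]

# Two dimensions of 2 x 2 seventh results (illustrative values only).
dimension1 = [[1 + 1j, 2 + 0j], [3 + 0j, 4 - 2j]]
dimension2 = [[10 + 0j, 20j], [30 + 0j, 40 + 0j]]

eighth = [[0.0, 0.0], [0.0, 0.0]]
for seventh in (dimension1, dimension2):
    eighth = accumulate(eighth, seventh)
print(eighth)
```

Only one accumulated map needs to be kept, so the seventh results of earlier dimensions do not have to remain in storage.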

Optionally, the storage module includes a first memory, a second memory, and a third memory.

The first memory is used to store the first result, the third result, and the sixth result.

The second memory is used to store the second result, the fourth result, the fifth result, and the seventh result.

The third memory is used to store the eighth result.

Specifically, and optionally, the first result obtained by performing fast Fourier transform on each row of feature vectors of the first feature value F1 is stored in the first memory. The arithmetic controller then reads the first result from the first memory, controls the arithmetic module to perform fast Fourier transform on each column of the first result to obtain the second result, and stores the second result in the second memory.

Since the first result stored in the first memory has already been used for the fast Fourier transform in the column direction, the third result obtained by performing fast Fourier transform on each row of the second feature value can be stored in the first memory in order to save memory. It should be noted that this storage can be overwriting, that is, the third result overwrites the first result, or it can be non-overwriting.

Further, the arithmetic controller reads the third result from the first memory and controls the arithmetic module to perform fast Fourier transform on each column of the third result to obtain the fourth result. The fourth result can be stored in the second memory, but it must not overwrite the second result, because the second result stored in the second memory is still needed for the conjugate multiplication operation.

The second result and the fourth result are then read from the second memory, and the storage control module performs the conjugate multiplication of the second result and the fourth result to obtain the fifth result, which is stored in the second memory.

The fifth result is further read from the second memory, and the inverse fast Fourier transform is performed on each row of the fifth result to obtain the sixth result. Store the sixth result in the first memory.

The sixth result is further read from the first memory, and the inverse fast Fourier transform is performed on each column of the sixth result to obtain the seventh result.

The seventh result is stored in the second memory and then read from the second memory. The real parts at the same row and the same column of the seventh results of all dimensions are added to obtain the eighth result, which is the correlation degree. The correlation degree is stored in the third memory. For example, for feature values with two dimensions, two rows, and two columns, the real parts of the two results of the second (column-direction) inverse fast Fourier transform at the first row and first column of the two dimensions are added together, the real parts of the two results at the first row and second column are added together, and so on, yielding the sums of the real parts for the two rows and two columns.
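The memory traffic described in the preceding paragraphs can be summarized as a table of steps, each reading earlier results and writing one new result to a given memory. The sketch below encodes that table and checks that every result is written before it is read; the names STEPS and placement are hypothetical, and the memories are simply numbered 1 to 3.

```python
STEPS = [
    # (results read, operation, result written, destination memory)
    ((), "row FFT of F1", "first", 1),
    (("first",), "column FFT", "second", 2),
    ((), "row FFT of F2", "third", 1),       # may overwrite the first result
    (("third",), "column FFT", "fourth", 2), # must not overwrite the second result
    (("second", "fourth"), "conjugate multiplication", "fifth", 2),
    (("fifth",), "row IFFT", "sixth", 1),
    (("sixth",), "column IFFT", "seventh", 2),
    (("seventh",), "accumulate real parts", "eighth", 3),
]

def placement(steps):
    """Map each result name to the memory it is written to, checking read order."""
    where = {}
    for inputs, _operation, output, memory in steps:
        for name in inputs:
            assert name in where, f"{name} is read before it is written"
        where[output] = memory
    return where

print(placement(STEPS))
```

Row-direction results land in the first memory and column-direction results in the second, so each transform reads from one memory while writing to the other.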

As shown in FIG. 8, in one embodiment, each memory (such as the first memory, the second memory, and the third memory) in the storage module may include multiple storage units (such as Bank0-Bank31). Each storage unit has an address number, and each storage unit is used to store one row or one column of data (FIG. 8 shows the row case as an example). The number of storage units included in each memory may be the same or different, and the address numbers of the storage units in each memory may be the same or different.

In one embodiment, each memory includes the same number of storage units, and the address numbers of the storage units in each memory are also the same. For example, each memory includes 32 storage units, and the address numbers of the 32 storage units are address 0 to address 31. A storage unit stores the results of one row in sequence. When fast Fourier transform is performed on each column of data, the results with the same address number in the multiple storage units are taken out to form one column of data. For example, all the data at address 0 are taken out to form the first column of data.
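The row-per-storage-unit layout makes column access a matter of reading the same address from every storage unit. A minimal model, in which a storage unit is a Python list and the address is the position within it; the function names store_rows and read_column are hypothetical.

```python
def store_rows(rows):
    """Place each row in its own storage unit; the index within it is the address."""
    return [list(row) for row in rows]

def read_column(banks, address):
    """Gather the entry at the same address from every storage unit."""
    return [bank[address] for bank in banks]

banks = store_rows([[1, 2, 3], [4, 5, 6]])
print(read_column(banks, 0))   # first column
```

Because the entries of a column sit in different storage units, they can in principle be fetched in parallel, which suits the column-direction transform.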

In one embodiment, the above fast Fourier transform of each row or each column of data, and the inverse fast Fourier transform of each row or each column of data, each go through the following four processes:

Data loading: The arithmetic controller loads into the register unit the first feature value F1 and the second feature value F2 transmitted by the data transmission module, or the intermediate result output by the arithmetic module, or the first result and the third result output by the storage module;

Data selection: The data selection unit selects the data loaded into the register unit, and transmits the selected data to the arithmetic module;

Data calculation: The arithmetic module performs fast Fourier transform or inverse fast Fourier transform on the selected data, and transmits the calculation result to the storage module;

Data storage: The storage module stores the received calculation results (such as the first to seventh results).

It can be seen that, for a given row or column of data, the above four processes are executed in sequence: data loading first, then data selection, then data calculation, and finally data storage. The four processes can therefore be executed as a pipeline.

Fig. 9 is a schematic diagram of pipeline processing provided by an embodiment of the present invention. As shown in Fig. 9, one cycle includes four sub-cycles, and each sub-cycle corresponds to one of the above processes. In the first sub-cycle, the arithmetic controller loads the corresponding data into register group 1. In the second sub-cycle, the data selection unit selects the data loaded into register group 1, while the arithmetic controller loads the corresponding data into register group 2. In the third sub-cycle, the arithmetic module performs the FFT or IFFT operation on the data selected from register group 1, the data selection unit selects the data loaded into register group 2, and the arithmetic controller loads the corresponding data into register group 3. In the fourth sub-cycle, the storage module stores the result output by the arithmetic module for the data of register group 1, the arithmetic module performs the FFT or IFFT operation on the data selected from register group 2, the data selection unit selects the data loaded into register group 3, and the arithmetic controller loads the corresponding data into register group 4. By analogy, each cycle completes the FFT or IFFT operation of one row or one column of data, and because four register groups participate in each cycle, 4 rows or 4 columns of data can be processed. After 4 time periods, the calculation of the fast Fourier transform of the 4 rows of feature values is completed, and the fast Fourier transform results of the 4 rows of feature values are obtained. It should be noted that, in order to avoid waiting for the feature values of rows 5-8 to be read from the outside, that data can be read into the second register set while the fast Fourier transform is being calculated on the feature values of rows 1-4 held in the first register set.
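The overlap of the four processes across register groups can be tabulated with a short script. This is an illustrative sketch only: the function name `pipeline_schedule` and the stage labels are hypothetical, and the model assumes every process takes exactly one sub-cycle.

```python
STAGES = ["load", "select", "compute", "store"]

def pipeline_schedule(num_groups):
    """For each sub-cycle, map each active stage to the register group it works on."""
    schedule = []
    total_sub_cycles = num_groups + len(STAGES) - 1
    for t in range(total_sub_cycles):
        active = {}
        for stage_index, stage in enumerate(STAGES):
            group = t - stage_index + 1  # register groups are numbered from 1
            if 1 <= group <= num_groups:
                active[stage] = group
        schedule.append(active)
    return schedule

schedule = pipeline_schedule(4)
for sub_cycle, active in enumerate(schedule, start=1):
    print(sub_cycle, active)
```

In this model the fourth sub-cycle is the first in which all four stages are busy (load on group 4, select on group 3, compute on group 2, store on group 1), which matches the description of Fig. 9 above.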

In one embodiment, the number of processes required to perform the FFT or IFFT operation on one row or one column of data is the same as the number of sub-cycles included in one cycle and the number of register groups included in one register set. For example, when the FFT or IFFT operation on one row or one column of data requires four processes, one cycle includes four sub-cycles, and one register set includes four register groups.

Referring again to FIG. 5, the data processing system may also include a register interface. The register interface is used to obtain register configuration information and transmit the register configuration information to the arithmetic controller. The arithmetic controller is configured to store feature vectors in the first register set and the second register set according to the register configuration information.

For example, the register configuration information indicates that data is stored in the first register set first, and that, while the data stored in the first register set is being processed, data is stored in the second register set.
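This alternation between the two register sets resembles double buffering, and can be sketched as a simple schedule. The sketch is an assumption-laden illustration: the function name `ping_pong_schedule` is hypothetical, set index 0 stands for the first register set and index 1 for the second, and rows are numbered from 0.

```python
def ping_pong_schedule(num_rows, rows_per_set):
    """Return a list of (load, process) steps.

    Each of load and process is either None or a pair
    (register_set_index, list_of_row_numbers).
    """
    batches = [list(range(start, min(start + rows_per_set, num_rows)))
               for start in range(0, num_rows, rows_per_set)]
    steps = []
    for step in range(len(batches) + 1):
        # Load the next batch into one set while the previous batch,
        # held in the other set, is being processed.
        load = (step % 2, batches[step]) if step < len(batches) else None
        process = ((step - 1) % 2, batches[step - 1]) if step >= 1 else None
        steps.append((load, process))
    return steps

for load, process in ping_pong_schedule(num_rows=8, rows_per_set=4):
    print("load:", load, "| process:", process)
```

With 8 rows and 4 rows per register set, the middle step loads rows 4 to 7 into the second set while rows 0 to 3 in the first set are processed, so the arithmetic module does not have to wait for the next batch.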

Referring again to FIG. 5, the data processing system may further include a task controller. The task controller is configured to send a task start signal to the arithmetic controller when a correlation calculation instruction is detected, and the task start signal is used to instruct the arithmetic controller to perform fast Fourier transform or inverse fast Fourier transform on the input data.

A person of ordinary skill in the art can understand that all or part of the processes in the above-mentioned method embodiments can be implemented by instructing relevant hardware through a computer program. The program can be stored in a computer-readable storage medium and, when executed, may include the procedures of the above-mentioned method embodiments. The storage medium can be a magnetic disk, an optical disk, a read-only memory (Read-Only Memory, ROM), a random access memory (Random Access Memory, RAM), or the like.

Claims

1. A data processing system, characterized in that the data processing system includes a data transmission module, a control module, an arithmetic module, a storage control module, and a storage module;

The data transmission module is configured to receive a first feature value and a second feature value, the first feature value and the second feature value each including D×W×H feature vectors, where D stands for dimension, W stands for width, H stands for height, and D, W, and H are all positive integers;

The control module is configured to use the acquired W×H feature vectors of the i-th dimension of the first feature value as a first operation value, control the arithmetic module to perform fast Fourier transform on each row of feature vectors of the first operation value to obtain W×H first results, and control the arithmetic module to perform fast Fourier transform on each column of the first results to obtain H×W second results, where 1≤i≤D;

The control module is further configured to use the acquired W×H feature vectors of the i-th dimension of the second feature value as a second operation value, control the arithmetic module to perform fast Fourier transform on each row of feature vectors of the second operation value to obtain W×H third results, and control the arithmetic module to perform fast Fourier transform on each column of the third results to obtain H×W fourth results;

The storage control module is configured to perform a conjugate multiplication operation on the second result and the fourth result to obtain W×H fifth results;

The control module is further configured to control the arithmetic module to perform inverse fast Fourier transform on each row of feature vectors of the fifth results to obtain W×H sixth results, and to control the arithmetic module to perform inverse fast Fourier transform on each column of the sixth results to obtain H×W seventh results;

The storage control module is further configured to accumulate, across the D dimensions, the real parts at the same row and the same column of the seventh results to obtain W×H eighth results, and to use the eighth results as the degree of correlation between the first feature value and the second feature value;

The storage control module is further configured to control the storage module to store the first to seventh results and the degree of correlation.

2. The data processing system according to claim 1, wherein the arithmetic module includes M butterfly operation units, M is a positive integer, and each butterfly operation unit performs fast Fourier transform or inverse fast Fourier transform on the received data.

3. The data processing system according to claim 2, wherein the control module includes an arithmetic controller, a register unit, and a data selection unit, the register unit is used to register data to be calculated, and the arithmetic controller is used to control the data selection unit, according to the magnitude relationship between W and M or between H and M, to select corresponding data from the register unit and output the selected data to the M butterfly operation units.

4. The data processing system of claim 3, wherein the arithmetic controller is further configured to control the data selection unit to select W data from the register unit if W is less than 2M, and output the selected data to the M butterfly operation units;

The arithmetic controller is further configured to control the data selection unit to select 2M data from the register unit each time if W is greater than or equal to 2M, and output the selected data to the M butterfly operation units;

The arithmetic controller is further configured to control the data selection unit to select H data from the register unit if H is less than 2M, and output the selected data to the M butterfly operation units;

The arithmetic controller is further configured to control the data selection unit to select 2M data from the register unit each time if H is greater than or equal to 2M, and output the selected data to the M butterfly operation units.

5. The data processing system of claim 4, wherein the data selection unit comprises data selection arbitration logic and data selection logic;

The data selection arbitration logic determines the rotation factor (twiddle factor) according to the number of data selected from the register unit each time and the current stage of the fast Fourier transform or inverse fast Fourier transform being performed by the butterfly operation units;

The data selection logic determines the serial numbers of the two data input to each butterfly operation unit of the M butterfly operation units according to the rotation factor.

6. The data processing system according to claim 3, wherein the register unit includes a first register set and a second register set, each of the first register set and the second register set includes X register groups, each register group includes P registers, the P registers in a register group are used to store one row or one column of feature vectors, and X and P are both positive integers;

The arithmetic controller is used to store X rows or columns of feature vectors in the first register set;

The arithmetic controller is also used to control the data selection unit to select corresponding data from the first register set and output the selected data to the M butterfly operation units, while the arithmetic controller stores, in the second register set, feature vectors other than the X rows or columns of feature vectors.

7. The data processing system according to claim 6, wherein the system further comprises a register interface;

The register interface is used to obtain register configuration information, and transmit the register configuration information to the arithmetic controller;

The arithmetic controller is configured to store feature vectors in the first register set and the second register set according to the register configuration information.

8. The data processing system according to claim 1, wherein the storage module includes a first memory, a second memory, and a third memory;

The first memory is used to store the first result, the third result, and the sixth result;

The second memory is used to store the second result, the fourth result, the fifth result, and the seventh result;

The third memory is used to store the eighth result.

9. The data processing system according to claim 3, wherein the system further comprises a task controller, the task controller is configured to send a task start signal to the arithmetic controller when a correlation calculation instruction is detected, and the task start signal is used to instruct the arithmetic controller to perform fast Fourier transform or inverse fast Fourier transform on the input data.