WO2018018412A1 - Mixed-radix DFT/IDFT parallel reading and computing methods and devices - Google Patents

Mixed-radix DFT/IDFT parallel reading and computing methods and devices

Info

Publication number
WO2018018412A1
Authority
WO
WIPO (PCT)
Prior art keywords
parallel
data
stages
read
product
Prior art date
Application number
PCT/CN2016/091707
Other languages
English (en)
French (fr)
Inventor
李桓
王晓琴
郭晨
Original Assignee
中国科学院自动化研究所
Application filed by 中国科学院自动化研究所
Priority to PCT/CN2016/091707 (WO2018018412A1)
Priority to US16/068,914 (US10698973B2)
Publication of WO2018018412A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/10 Complex mathematical operations
    • G06F17/14 Fourier, Walsh or analogous domain transformations, e.g. Laplace, Hilbert, Karhunen-Loeve, transforms
    • G06F17/141 Discrete Fourier transforms
    • G06F17/142 Fast Fourier transforms, e.g. using a Cooley-Tukey type algorithm
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F7/00 Methods or arrangements for processing data by operating upon the order or content of the data handled
    • G06F7/38 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
    • G06F7/48 Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
    • G06F7/4806 Computations with complex numbers
    • G06F7/4812 Complex multiplication

Definitions

  • Embodiments of the present invention relate to the field of mobile communications technologies, and in particular, but not exclusively, to a mixed-radix DFT/IDFT data parallel reading method, a mixed-radix DFT/IDFT parallel computing method, a mixed-radix DFT/IDFT data parallel reading device, and a mixed-radix DFT/IDFT parallel computing device.
  • DFT: Discrete Fourier Transform.
  • A general DFT usually adopts a mixed-radix algorithm based on the Cooley-Tukey algorithm.
  • The radix-2 family of FFT algorithms is also derived from it.
  • The basic idea is to decompose a large-point DFT into multiple small-point DFTs, each of which is called one stage of operation; the stages are executed in turn to complete the whole DFT.
  • The small-point sizes are set to prime numbers, that is, 3, 5, ...; during the operation, the radix-3, radix-5, ... stages are nested in succession.
  • Each radix-N stage is executed several times, each time on different data.
  • Equation (1) is the radix-3 expression of the algorithm. It contains an input twiddle factor, which depends on k, and an output twiddle factor, which is independent of k.
  • First, because the number of points handled by a general DFT stage is not an integer multiple of 2, a general-purpose processor cannot read or write a whole number of data groups in a single access, which reduces the degree of parallelism.
  • Second, the general DFT process first multiplies the data by the input twiddle factors and accumulates, and then multiplies by the output twiddle factors and accumulates, so the operations depend heavily on one another.
  • Third, the general DFT process interleaves multiplications and additions, which introduces further computational dependencies. As a result, operators wait longer on data dependencies and pipeline utilization falls, which lowers the processing rate of the whole DFT operation.
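As a hedged illustration of the decomposition described above, the following Python sketch combines three length-N_1 sub-DFTs into one length-3·N_1 DFT with a single radix-3 stage, applying the input twiddle factor (which depends on k) first and the output twiddle factor (which does not) second. Equation (1) is not reproduced in this text, so the decimation-in-time form used here and the names `naive_dft` and `radix3_stage` are assumptions chosen for illustration, not the patent's own formula.

```python
import cmath

def naive_dft(x):
    """Direct O(n^2) DFT, used here only as a reference."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]

def radix3_stage(f0, f1, f2):
    """Combine three length-N1 sub-DFTs into one length-3*N1 DFT."""
    n1 = len(f0)
    n2 = 3 * n1
    out = [0j] * n2
    for k in range(n1):
        # Input twiddle factors: they depend on k.
        b = f1[k] * cmath.exp(-2j * cmath.pi * k / n2)
        c = f2[k] * cmath.exp(-2j * cmath.pi * 2 * k / n2)
        for m in range(3):
            # Output twiddle factors: they depend only on m, not on k.
            w1 = cmath.exp(-2j * cmath.pi * m / 3)
            w2 = cmath.exp(-2j * cmath.pi * 2 * m / 3)
            out[k + m * n1] = f0[k] + w1 * b + w2 * c
    return out

# Hypothetical 12-point input, split into three decimated 4-point sequences.
x = [complex(t, -t % 4) for t in range(12)]
subs = [naive_dft(x[j::3]) for j in range(3)]
y = radix3_stage(*subs)
```

Note how each output needs a multiply, then an add, then another multiply and add; this interleaving is the dependency chain that the drawbacks above refer to.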
  • A main object of embodiments of the present invention is to provide a mixed-radix DFT/IDFT data parallel reading method that at least partially solves the technical problem of how to improve computing efficiency.
  • A mixed-radix DFT/IDFT parallel computing method, a mixed-radix DFT/IDFT data parallel reading device, and a mixed-radix DFT/IDFT parallel computing device are also provided.
  • A mixed-radix DFT/IDFT data parallel reading method can include:
  • configuring double-loop parameters according to the number of points of the stage to be computed and the product of the numbers of points of the completed stages;
  • determining the size relation between the maximum number of data that can be read in parallel and the product of the numbers of points of the completed stages;
  • calculating the double-loop parameters corresponding to the determination result, and reading the data in parallel based on the calculated double-loop parameters.
  • Configuring the double-loop parameters according to the number of points of the stage to be computed and the product of the numbers of points of the completed stages may specifically include:
  • the first-level loop step is N_1;
  • the first-level loop count is N_0;
  • the second-level loop step is N_2, and the second-level loop count is N/N_2, where N is the total number of DFT points;
  • N_0 represents the number of points of the stage to be computed;
  • N_1 represents the product of the numbers of points of the completed stages;
  • N_2 is the product of N_1 and N_0.
  • Calculating the double-loop parameters corresponding to the determination result, and reading the data in parallel based on the calculated double-loop parameters, may specifically include:
  • the first-level loop step is M;
  • the first-level loop count is […];
  • the second-level loop step is N_2, and the second-level loop count is […];
  • M represents the maximum number of data that the processor can read in parallel;
  • N_0 represents the number of points of the stage to be computed;
  • N_1 represents the product of the numbers of points of the completed stages;
  • N_2 is the product of N_1 and N_0;
  • the data are read in parallel according to the above double-loop parameters, M data at a time, until all N_1 data have been read.
  • The method may further include:
  • the first-level loop step is […], and the first-level loop count is N_0;
  • the second-level loop step is […], and the second-level loop count is […];
  • M represents the maximum number of data that the processor can read in parallel;
  • N_0 represents the number of points of the stage to be computed;
  • N_1 represents the product of the numbers of points of the completed stages;
  • N_2 is the product of N_1 and N_0.
  • The parallel computing method can include:
  • Step 1: reading the input twiddle factors and the output twiddle factors in parallel, multiplying the corresponding items, and taking the products together with the input twiddle factors as the equivalent twiddle factors;
  • Step 2: multiplying the equivalent twiddle factors by the input data, and buffering the products;
  • Step 3: in the second-level loop, while the multiplication of Step 2 is being executed, reading out the results buffered in Step 2 and performing the corresponding addition or subtraction.
  • Multiplying the equivalent twiddle factors by the input data and buffering the products may include:
  • if the processor does not include a complex arithmetic unit, buffering the products of the equivalent twiddle factor with the real part and with the imaginary part of the input data.
  • Step 3 may specifically include:
  • if the processor is provided with a complex arithmetic unit, reading out the results buffered in Step 2 while the multiplication of Step 2 is being executed, and performing the corresponding addition.
  • Step 3 may further specifically include:
  • performing a subtraction between the product of the equivalent twiddle factor with the real part of the input data and the product of the equivalent twiddle factor with the imaginary part of the input data.
  • A mixed-radix DFT/IDFT data parallel reading device is also provided.
  • The parallel reading device can include:
  • a point calculation unit configured to configure double-loop parameters according to the number of points of the stage to be computed and the product of the numbers of points of the completed stages;
  • a group number determining unit configured to determine the size relation between the maximum number of data that can be read in parallel and the product of the numbers of points of the completed stages;
  • a reading unit configured to calculate the double-loop parameters corresponding to the determination result obtained by the group number determining unit, and to read data in parallel based on the calculated double-loop parameters.
  • The point calculation unit may specifically include:
  • a configuration module configured to configure, according to the number of points of the stage to be computed and the product of the numbers of points of the completed stages, the following double-loop parameters: the first-level loop step is N_1, the first-level loop count is N_0, the second-level loop step is N_2, and the second-level loop count is N/N_2, where N is the total number of DFT points;
  • N_0 represents the number of points of the stage to be computed;
  • N_1 represents the product of the numbers of points of the completed stages;
  • N_2 is the product of N_1 and N_0.
  • The reading unit may specifically include:
  • a first calculation module configured to calculate the following double-loop parameters, without processing the read twiddle factors, if M is less than or equal to N_1:
  • the first-level loop step is M, and the first-level loop count is […];
  • the second-level loop step is N_2, and the second-level loop count is […];
  • M represents the maximum number of data that the processor can read in parallel;
  • N_0 represents the number of points of the stage to be computed;
  • N_1 represents the product of the numbers of points of the completed stages;
  • N_2 is the product of N_1 and N_0;
  • a first reading module configured to read the data in parallel according to the above double-loop parameters, M data at a time, until all N_1 data have been read.
  • The reading unit may further specifically include:
  • a second calculation module configured to calculate the value of […] when M is greater than N_1;
  • the first-level loop step is […], the first-level loop count is N_0, the second-level loop step is […], and the second-level loop count is […];
  • M represents the maximum number of data that the processor can read in parallel;
  • N_0 represents the number of points of the stage to be computed;
  • N_1 represents the product of the numbers of points of the completed stages;
  • N_2 is the product of N_1 and N_0.
  • A mixed-radix DFT/IDFT parallel computing device based on the parallel reading device described above is also provided.
  • The parallel computing device can include:
  • an equivalent twiddle factor calculation unit configured to read the input twiddle factors and the output twiddle factors in parallel, multiply the corresponding items, and take the products together with the input twiddle factors as the equivalent twiddle factors;
  • a buffer unit configured to multiply the equivalent twiddle factors obtained by the equivalent twiddle factor calculation unit by the input data, and to cache the products;
  • a data processing unit configured to read out the results cached in the buffer unit and perform the corresponding addition or subtraction while the buffer unit performs the multiplication in the second-level loop.
  • The equivalent twiddle factor calculation unit may specifically include:
  • a parallel read-in module configured to read the input twiddle factors and the output twiddle factors in parallel;
  • a caching module configured to multiply the input twiddle factors by the corresponding output twiddle factors to obtain a first and a second set of equivalent twiddle factors, and to store the first and second sets in the cache together with the input twiddle factors, which serve as a third set of equivalent twiddle factors.
  • The data processing unit may further include:
  • a complex arithmetic unit configured to read out the results cached in the buffer unit and perform the corresponding addition.
  • Embodiments of the present invention first configure the double-loop parameters according to the number of points of the stage to be computed and the product of the numbers of points of the completed stages; then determine the size relation between the maximum number of data that can be read in parallel and the product of the numbers of points of the completed stages; and finally calculate the double-loop parameters corresponding to the determination result and read the data in parallel based on them. The double-loop parameters are thus configured from the point-count information. For a fixed processor bit width, the data are read at the maximum degree of parallelism allowed by the number of points and the stage being computed; the data read together are mutually independent, no dedicated rearrangement of the data is needed for the operation, and no horizontal operation is required, which improves processing parallelism and shortens the calculation cycle.
  • FIG. 1 is a schematic flow chart of a mixed-radix DFT/IDFT data parallel reading method according to an exemplary embodiment;
  • FIG. 2 is a schematic flow chart of a mixed-radix DFT/IDFT parallel computing method according to another exemplary embodiment;
  • FIG. 3 is a flow chart of reading the input twiddle factors and the output twiddle factors in parallel, multiplying the corresponding items, and taking the products together with the input twiddle factors as the equivalent twiddle factors, according to an exemplary embodiment;
  • FIG. 4 is a schematic structural diagram of a mixed-radix DFT/IDFT data parallel reading device according to an exemplary embodiment;
  • FIG. 5 is a schematic structural diagram of a mixed-radix DFT/IDFT parallel computing device according to an exemplary embodiment.
  • The environment to which embodiments of the present invention are applied is an LTE system in the field of mobile communications, in which the transform precoding module at the uplink transmitting end is a DFT process and the corresponding module at the receiving end is an IDFT (Inverse Discrete Fourier Transform) process.
  • The number of points N for performing the DFT/IDFT satisfies the following relationship:
  • N = 2^α × 3^β × 5^γ, 12 ≤ N ≤ 1536, α ≥ 2, β ≥ 1, γ ≥ 0
  • The DFT of the 2^α points can be completed by an FFT, and the remaining radix-3 and radix-5 DFT stages are completed by the mixed-radix DFT.
  • The mixed-radix DFT needs to perform β radix-3 stages and γ radix-5 stages, in the order radix-3 first, then radix-5.
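The size constraint above can be checked mechanically. The sketch below factors a candidate N into powers of 2, 3 and 5 and returns the exponents, or `None` when N is not an admissible size. The exponent names α, β, γ and the direction of the inequalities are read from context (the smallest size, 12, equals 2^2 × 3), and the function name `radix_exponents` is hypothetical.

```python
def radix_exponents(n):
    """Return (alpha, beta, gamma) with n == 2**alpha * 3**beta * 5**gamma,
    or None if n has another prime factor or violates the stated bounds."""
    if not 12 <= n <= 1536:
        return None
    exps = []
    for p in (2, 3, 5):
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        exps.append(e)
    alpha, beta, gamma = exps
    # Assumed bounds: alpha >= 2, beta >= 1, gamma >= 0.
    if n != 1 or alpha < 2 or beta < 1:
        return None
    return alpha, beta, gamma

print(radix_exponents(1200))  # (4, 1, 2)
```

For N = 1200 this gives a 16-point FFT part, one radix-3 stage and two radix-5 stages.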
  • FIG. 1 exemplarily shows a mixed-radix DFT/IDFT data parallel reading method. As shown in FIG. 1, the method can include:
  • S100: Configure the double-loop parameters according to the number of points of the stage to be computed and the product of the numbers of points of the completed stages.
  • S110: Determine the size relation between the maximum number of data that can be read in parallel and the product of the numbers of points of the completed stages.
  • S120: Calculate the double-loop parameters corresponding to the determination result, and read the data in parallel based on the calculated double-loop parameters.
  • In this embodiment, the double-loop parameters are configured from the point-count information and, for a fixed processor bit width, the data are read at the maximum degree of parallelism, which improves processing parallelism.
  • Configuring the double-loop parameters according to the number of points of the stage to be computed and the product of the numbers of points of the completed stages may include configuring the following double-loop parameters: the first-level loop step is N_1, the first-level loop count is N_0, the second-level loop step is N_2, and the second-level loop count is N/N_2, where N is the total number of DFT points, N_0 represents the number of points of the stage to be computed, N_1 represents the product of the numbers of points of the completed stages, and N_2 is the product of N_1 and N_0.
  • Calculating the double-loop parameters corresponding to the determination result, and reading the data in parallel according to the calculated double-loop parameters, may include:
  • the first-level loop step is M;
  • M represents the maximum number of data that the processor can read in parallel;
  • N_0 represents the number of points of the stage to be computed;
  • N_1 represents the product of the numbers of points of the completed stages;
  • N_2 is the product of N_1 and N_0;
  • the data are read in parallel according to the above loop parameters, M data at a time, until all N_1 data have been read.
  • In this embodiment, the double-loop parameters are configured from the point-count information and, for a fixed processor bit width, the data are read at the maximum degree of parallelism; the data read together are mutually independent and no horizontal operation is required, which improves processing parallelism.
  • Calculating the double-loop parameters corresponding to the determination result, and reading the data in parallel based on the calculated double-loop parameters, may further include:
  • the first-level loop step is […], the first-level loop count is N_0, the second-level loop step is […], and the second-level loop count is […];
  • M represents the maximum number of data that the processor can read in parallel;
  • N_0 represents the number of points of the stage to be computed;
  • N_1 represents the product of the numbers of points of the completed stages;
  • N_2 is the product of N_1 and N_0.
  • In this embodiment, the double-loop parameters are configured from the point-count information and, for a fixed processor bit width, the data are read at the maximum degree of parallelism; the data read together are mutually independent and no horizontal operation is required, which improves processing parallelism.
  • Embodiments of the present invention can be based on any mixed-radix process. Since a mixed-radix decomposition may in principle use any radix, the cases cannot be listed exhaustively; the invention is therefore described in detail below taking radix 3 as a preferred example.
  • N_0 represents the number of points of the stage to be computed;
  • N_1 represents the product of the numbers of points of the completed stages;
  • M represents the maximum number of data that the processor can read in parallel (for example, 16); N represents the number of DFT points (for example, 1200).
  • N_2 is the product of N_1 and N_0.
  • The first-level loop step is N_1;
  • the first-level loop count is N_0;
  • the second-level loop step is N_2.
  • For the 1200-point example, the first-level loop step is 16;
  • the first-level loop count is 3;
  • the second-level loop step is 48;
  • the second-level loop count is 25.
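The numbers in this example can be reproduced with a few lines of Python. The sketch below is an illustration under stated assumptions: it takes the second-level loop count to be N divided by N_2 (which matches the value 25 for N = 1200 and N_2 = 48), treats the second-level loop as the outer loop, and assumes M equals N_1 so that one read fetches exactly N_1 contiguous data. The function names are hypothetical.

```python
def double_loop_params(n, n0, n1):
    """Return (first_step, first_count, second_step, second_count)."""
    n2 = n0 * n1
    if n % n2 != 0:
        raise ValueError("N must be a multiple of N2 = N0 * N1")
    return n1, n0, n2, n // n2

def read_addresses(n, n0, n1, m):
    """Yield (start address, width) of each parallel read, assuming m == n1."""
    first_step, first_count, second_step, second_count = double_loop_params(n, n0, n1)
    for outer in range(second_count):       # second-level loop (assumed outer)
        for inner in range(first_count):    # first-level loop (assumed inner)
            yield outer * second_step + inner * first_step, m

params = double_loop_params(1200, 3, 16)
reads = list(read_addresses(1200, 3, 16, 16))
print(params)  # (16, 3, 48, 25)
```

Under these assumptions the 75 reads of 16 data each cover all 1200 points exactly once, with the three reads of one outer iteration supplying the three inputs of the radix-3 stage.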
  • Step S210: determine the size relation between M and N_1. If M is less than or equal to N_1, step S211 is performed; otherwise, step S212 is performed.
  • The first-level loop step is M, and the first-level loop count is […];
  • the second-level loop step is N_2, and the second-level loop count is […].
  • The degree of parallelism is 16, and the bandwidth utilization is 1.
  • The second-level loop parameters remain unchanged.
  • The first-level and second-level loop parameters may be adjusted according to the bit width of the processor.
  • An embodiment of the present invention further provides a mixed-radix DFT/IDFT parallel computing method. As shown in FIG. 2, the method can be implemented by steps S300 to S320.
  • Step S300 may include step S301 and step S302.
  • S302: Multiply the input twiddle factors by the corresponding output twiddle factors to obtain a first and a second set of equivalent twiddle factors, and store the first and second sets in the cache together with the input twiddle factors, which serve as a third set of equivalent twiddle factors.
  • the buffer can be performed in such a manner that the input rotation factor and the output rotation factor are always stored as a factor of 1.
  • The input twiddle factor takes different values for different data, so (N_0 - 1) × N_1 distinct values need to be stored.
  • The output twiddle factor has only (N_0 - 1) × (N_0 - 1) distinct values, and the corresponding products amount to (N_0 - 1) × (N_0 - 1) × N_1 distinct values.
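As a quick check of these storage counts, the following sketch evaluates the three expressions for given N_0 and N_1. The function name is hypothetical, and the sample values N_0 = 3 and N_1 = 16 are taken from the 1200-point radix-3 example.

```python
def twiddle_table_sizes(n0, n1):
    """Return the number of stored input, output and product twiddle factors."""
    n_input = (n0 - 1) * n1
    n_output = (n0 - 1) * (n0 - 1)
    n_product = (n0 - 1) * (n0 - 1) * n1
    return n_input, n_output, n_product

print(twiddle_table_sizes(3, 16))  # (32, 4, 64)
```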
  • In this step, the two sets of equivalent twiddle factors obtained in step S302, […] and […], together with the input twiddle factor […], are multiplied by the input data as three sets of equivalent twiddle factors.
  • B and C represent input data.
  • This step caches the products of the equivalent twiddle factors with the real and imaginary parts of the input data.
  • Step S320: In the second-level loop, while the multiplication of step S310 is being executed, the results buffered in step S310 are read out and the corresponding addition or subtraction is performed.
  • The addition may be […], where A, B and C represent input data.
  • In this embodiment, the input and output twiddle factors are multiplied together in advance, and the products with the input data are buffered during the calculation. The multiplications and additions are thereby completely separated, which reduces the dependencies across the whole operation, improves pipeline utilization, and increases the operation rate.
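The separation of multiplications from additions can be sketched for one radix-3 stage as follows. This is an illustration, not the patent's implementation: the decimation-in-time indexing, the table layout `e[m][j-1][k]`, and the function names are assumptions. The rows m = 1 and m = 2 play the role of the first and second sets of equivalent twiddle factors, and the row m = 0 (output twiddle factor equal to 1) is the plain input twiddle factor, that is, the third set.

```python
import cmath

def naive_dft(x):
    """Direct O(n^2) DFT, used only to check the result."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * cmath.pi * t * k / n) for t in range(n))
            for k in range(n)]

def equivalent_twiddles(n1):
    """e[m][j-1][k] = W_3^(j*m) * W_(3*n1)^(j*k) for m = 0, 1, 2 and j = 1, 2."""
    def w(num, den):
        return cmath.exp(-2j * cmath.pi * num / den)
    return [[[w(j * m, 3) * w(j * k, 3 * n1) for k in range(n1)] for j in (1, 2)]
            for m in range(3)]

def radix3_stage_two_phase(a, b, c):
    """One radix-3 stage with all multiplications done before any addition."""
    n1 = len(a)
    e = equivalent_twiddles(n1)
    # Phase 1: multiplications only; the products are buffered.
    buf = [[(e[m][0][k] * b[k], e[m][1][k] * c[k]) for k in range(n1)]
           for m in range(3)]
    # Phase 2: additions only, reading the buffered products.
    out = [0j] * (3 * n1)
    for m in range(3):
        for k in range(n1):
            pb, pc = buf[m][k]
            out[k + m * n1] = a[k] + pb + pc
    return out

# Hypothetical 15-point input split into three decimated 5-point sequences.
x = [complex(t % 5, t % 3) for t in range(15)]
a, b, c = (naive_dft(x[j::3]) for j in range(3))
y = radix3_stage_two_phase(a, b, c)
```

Because every output is A plus two buffered products, the addition phase never waits on a multiplication issued in the same iteration.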
  • This step includes a subtraction between the product of the equivalent twiddle factor with the real part of the input data and the product of the equivalent twiddle factor with the imaginary part of the input data.
  • The multiplications and subtractions are thereby completely separated, which raises the pipeline utilization of each component and hence the operation rate.
  • In each operation, the multiplications of the equivalent twiddle factors with the input data are performed first, and all product results are stored in the cache.
  • The product results are then read out of the cache for the additions and subtractions, which avoids the idle operator cycles caused by dependencies between data.
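For a processor without a complex arithmetic unit, the same two-phase idea applies to the real and imaginary parts. The sketch below is a minimal illustration with hypothetical function names: it buffers the four real products of one complex multiplication, then forms the result with one subtraction (real part) and one addition (imaginary part).

```python
def multiply_phase(w, x):
    """Phase 1: four real multiplications, buffered without any addition."""
    return (w.real * x.real, w.imag * x.imag, w.real * x.imag, w.imag * x.real)

def combine_phase(buffered):
    """Phase 2: one subtraction for the real part, one addition for the imaginary part."""
    rr, ii, ri, ir = buffered
    return complex(rr - ii, ri + ir)

# Hypothetical equivalent twiddle factor and input sample.
w = complex(0.5, -0.8660254037844386)
x = complex(3.0, 4.0)
result = combine_phase(multiply_phase(w, x))
```

The value of `result` agrees with the built-in product `w * x` up to rounding.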
  • The apparatus 40 may include a point calculation unit 42, a group number judgment unit 44, and a reading unit 46.
  • The point calculation unit 42 is configured to configure the double-loop parameters according to the number of points of the stage to be computed and the product of the numbers of points of the completed stages.
  • The group number judgment unit 44 is configured to determine the size relation between the maximum number of data that can be read in parallel and the product of the numbers of points of the completed stages.
  • The reading unit 46 is configured to calculate the double-loop parameters corresponding to the determination result obtained by the group number judgment unit 44, and to read data in parallel based on the calculated double-loop parameters.
  • This embodiment of the mixed-radix DFT/IDFT data parallel reading device configures the double-loop parameters from the point-count information.
  • For a fixed processor bit width, the data are read at the maximum degree of parallelism allowed by the number of points and the stage being computed, and the data read together are mutually independent, which improves processing parallelism and shortens the computation cycle.
  • The point calculation unit 42 may further include a configuration module.
  • The configuration module is configured to configure the following double-loop parameters according to the number of points of the stage to be computed and the product of the numbers of points of the completed stages: the first-level loop step is N_1, the first-level loop count is N_0, the second-level loop step is N_2, and the second-level loop count is N/N_2, where N is the total number of DFT points, N_0 represents the number of points of the stage to be computed, N_1 represents the product of the numbers of points of the completed stages, and N_2 is the product of N_1 and N_0.
  • The reading unit 46 may further include a first calculation module and a first reading module.
  • The first calculation module is configured to calculate the following double-loop parameters, without processing the read twiddle factors, if M is less than or equal to N_1:
  • the first-level loop step is M, and the first-level loop count is […];
  • the second-level loop step is N_2, and the second-level loop count is […];
  • M represents the maximum number of data that the processor can read in parallel;
  • N_0 represents the number of points of the stage to be computed;
  • N_1 represents the product of the numbers of points of the completed stages;
  • N_2 is the product of N_1 and N_0.
  • The first reading module is configured to read the data in parallel according to the above double-loop parameters, M data at a time, until all N_1 data have been read.
  • The reading unit 46 may further include a second calculation module, a copy module, and a second reading module.
  • The second calculation module is configured to calculate the value of […] when M is greater than N_1.
  • The copy module is configured to make […] copies of the read twiddle factors.
  • The second reading module is configured to read […] groups of data in parallel, with N_2 as the step, according to the following double-loop parameters: the first-level loop step is […], the first-level loop count is N_0, the second-level loop step is […], and the second-level loop count is […]; where M represents the maximum number of data that the processor can read in parallel, N_0 represents the number of points of the stage to be computed, N_1 represents the product of the numbers of points of the completed stages, and N_2 is the product of N_1 and N_0.
  • The mixed-radix DFT/IDFT data parallel reading device provided by the above embodiment is illustrated only by the division into the functional modules described above. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above.
  • Embodiments of the present invention also provide a mixed-radix DFT/IDFT parallel computing device based on the parallel reading device embodiment described above.
  • The parallel computing device can perform the parallel computing method embodiments described above.
  • The apparatus 50 can include an equivalent twiddle factor calculation unit 52, a cache unit 54, and a data processing unit 56.
  • The equivalent twiddle factor calculation unit 52 is configured to read the input twiddle factors and the output twiddle factors in parallel, multiply the corresponding items, and take the products together with the input twiddle factors as the equivalent twiddle factors.
  • The cache unit 54 is configured to multiply the equivalent twiddle factors obtained by the equivalent twiddle factor calculation unit 52 by the input data, and to buffer the products.
  • The data processing unit 56 is configured to read out the results buffered in the cache unit 54 and perform the corresponding addition or subtraction while the cache unit 54 performs the multiplication in the second-level loop.
  • This embodiment of the mixed-radix DFT/IDFT parallel computing device processes the twiddle factors first and separates the multiplications from the additions and subtractions. This reduces the dependencies between data, reduces idle cycles across the operation, and raises overall pipeline utilization, which in turn can effectively increase the rate of mixed-radix DFT and IDFT operations.
  • The equivalent twiddle factor calculation unit 52 may further include a parallel read-in module and a cache module. The parallel read-in module is configured to read in the input twiddle factors and the output twiddle factors in parallel.
  • The cache module is configured to multiply the input twiddle factors by the corresponding output twiddle factors to obtain the first and second sets of equivalent twiddle factors, and to store the first and second sets in the cache together with the input twiddle factors, which serve as the third set of equivalent twiddle factors.
  • The data processing unit may further include a complex arithmetic unit.
  • The complex arithmetic unit is configured to read out the results buffered in the cache unit and perform the corresponding addition.
  • The mixed-radix DFT/IDFT parallel computing device provided by the foregoing embodiment is illustrated only by the division into the functional modules described above. In practical applications, the functions may be assigned to different functional modules as needed; that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above.
  • The mixed-radix DFT/IDFT parallel computing device also includes some other well-known structures, such as a processor, a controller, and a memory. The memory includes but is not limited to random access memory, flash memory, read-only memory, programmable read-only memory, volatile memory, non-volatile memory, serial memory, parallel memory, or registers; the processor includes but is not limited to CPLD/FPGA, DSP, ARM processors, and MIPS processors. In order not to unnecessarily obscure the embodiments of the present disclosure, these well-known structures are not shown in FIGS. 4-5.
  • module may refer to a software object or routine that is executed on a computing system.
  • the different modules described herein can be implemented as an object or process executing on a computing system (eg, as a separate thread).
  • systems and methods described herein are preferably implemented in software, implementation in hardware or a combination of software and hardware is also possible and can be envisioned.
  • the various steps of the present invention can be implemented with general-purpose computing devices. For example, they can be concentrated on a single computing device, such as a personal computer, a server computer, a handheld or portable device, a tablet device, or a multi-processor device, or distributed over a network of multiple computing devices. The steps may be performed in an order different from that shown or described herein, each step may be made into an individual integrated circuit module, or several of the modules or steps may be made into a single integrated circuit module. Thus, the invention is not limited to any specific hardware, software, or combination thereof.
  • the method provided by the present invention may be implemented using a programmable logic device, or as computer program software or program modules (including routines, programs, objects, components, or data structures that perform specific tasks or implement specific abstract data types). For example, an embodiment of the invention may be a computer program product that, when run, causes a computer to perform the demonstrated method.
  • the computer program product comprises a computer-readable storage medium containing computer program logic or code portions for implementing the method.
  • the computer-readable storage medium may be a built-in medium installed in a computer or a removable medium detachable from the computer main body (for example, a storage device using hot-plug technology).
  • the built-in medium includes, but is not limited to, a rewritable non-volatile memory such as a RAM, a ROM, a flash memory, and a hard disk.
  • the removable medium includes, but is not limited to, optical storage media (e.g., CD-ROM and DVD), magneto-optical storage media (e.g., MO), magnetic storage media (e.g., magnetic tape or removable hard disk), media with built-in rewritable non-volatile memory (e.g., a memory card), and media with built-in ROM (e.g., a ROM cartridge).

Landscapes

  • Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Analysis (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Discrete Mathematics (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Complex Calculations (AREA)
  • Computing Systems (AREA)

Abstract

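A mixed-radix DFT/IDFT parallel data reading method, a mixed-radix DFT/IDFT parallel computing method, a mixed-radix DFT/IDFT parallel data reading apparatus, and a mixed-radix DFT/IDFT parallel computing apparatus. The parallel reading method comprises: first, configuring two-level loop parameters according to the number of points corresponding to the stage to be computed and the product of the numbers of points corresponding to the completed stages (S100); then, comparing the maximum number of data items that can be read in parallel with the product of the numbers of points corresponding to the completed stages (S110); and finally, calculating the two-level loop parameters corresponding to the comparison result and reading data in parallel based on the calculated parameters (S120). This increases processing parallelism and reduces dependencies between data, so that idle cycles in the overall computation are reduced, pipeline utilization is improved, and the mixed-radix DFT/IDFT computation rate can be effectively increased.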

Description

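Method and apparatus for parallel reading and computation of mixed-radix DFT/IDFT
Technical Field
The embodiments of the present invention relate to the field of mobile communication technology, and in particular to a mixed-radix DFT/IDFT parallel data reading method, a mixed-radix DFT/IDFT parallel computing method, a mixed-radix DFT/IDFT parallel data reading apparatus, and a mixed-radix DFT/IDFT parallel computing apparatus, but are by no means limited thereto.
Background
In digital signal processing systems, especially for finite-length sequences, the DFT (discrete Fourier transform) is an extremely important mathematical transform. In essence it is a finite-point discrete sampling of the Fourier transform of a finite-length sequence. It allows digital signal processing to be carried out in the frequency domain with numerical methods, which makes such processing more flexible, and it is widely used in digital communication, image processing, power spectrum estimation, and other fields. A DFT whose number of points is a power of 2 can be computed with a radix-2 FFT algorithm. For any other number of points, where the FFT algorithm cannot be used, the transform is called a general-length DFT.
At present, a general-length DFT is usually computed with a mixed-radix algorithm based on the Cooley-Tukey algorithm. The radix-2 FFT algorithms are also derived from this basis. The basic idea is to convert a DFT with a large number of points into multiple DFTs with small numbers of points; each such pass is called a stage, and the stages are executed one after another to complete the whole DFT. The small point counts are usually chosen as prime numbers, i.e. 3, 5, and so on, and the computation is nested successively as radix 3, radix 5, and so on. Each radix-N stage is executed a number of times, each time on different data.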
Figure PCTCN2016091707-appb-000001
Figure PCTCN2016091707-appb-000002
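Equation (1) is the radix-3 algorithm expression, in which
Figure PCTCN2016091707-appb-000003
is the input twiddle factor, which depends on k, and
Figure PCTCN2016091707-appb-000004
is the output twiddle factor, which is independent of k.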
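Equation (1) survives here only as an image reference. For orientation, a standard radix-3 decimation-in-time butterfly that matches the description (input twiddle factors depending on k, output twiddle factors independent of k) can be written as follows. This form, the sign convention of W, and the use of the symbols A, B, C, N_1, and N_2 from the later embodiment are assumptions, not a transcription of the original equation.

```latex
\begin{aligned}
Y_q(k) &= A(k) + W_{N_2}^{k}\,W_{3}^{q}\,B(k) + W_{N_2}^{2k}\,W_{3}^{2q}\,C(k),
  \qquad q = 0, 1, 2,\\
W_{N}^{m} &= e^{-j 2\pi m / N}, \qquad N_2 = 3 N_1, \qquad k = 0, 1, \dots, N_1 - 1 .
\end{aligned}
```

In this reading, W_{N_2}^{k} and W_{N_2}^{2k} play the role of the input twiddle factors, and the powers of W_3 play the role of the output twiddle factors.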
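Because a general-length DFT does not proceed in integer multiples of 2, a general-purpose processor cannot read in or write out a whole group of data at once, which lowers parallelism. In addition, a typical DFT first performs the multiply and add operations between the data and the input twiddle factors, and then the multiply and add operations with the output twiddle factors, so the data are strongly interdependent. Furthermore, a typical DFT interleaves multiplications and additions, which introduces further computational dependencies. As a result, the arithmetic units wait longer on data dependencies, pipeline utilization drops, and the processing rate of the whole DFT operation is reduced.
In view of this, the present invention is proposed.
Summary of the Invention
The main object of the embodiments of the present invention is to provide a mixed-radix DFT/IDFT parallel data reading method that at least partially solves the technical problem of how to improve computational efficiency. A mixed-radix DFT/IDFT parallel computing method, a mixed-radix DFT/IDFT parallel data reading apparatus, and a mixed-radix DFT/IDFT parallel computing apparatus are also provided.
To achieve the above object, according to one aspect of the present invention, the following technical solution is provided:
A mixed-radix DFT/IDFT parallel data reading method. The method may include:
configuring two-level loop parameters according to the number of points corresponding to the stage to be computed and the product of the numbers of points corresponding to the completed stages;
comparing the maximum number of data items that can be read in parallel with the product of the numbers of points corresponding to the completed stages;
calculating the two-level loop parameters corresponding to the comparison result, and reading data in parallel based on the calculated two-level loop parameters.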
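Further, configuring the two-level loop parameters according to the number of points corresponding to the stage to be computed and the product of the numbers of points corresponding to the completed stages may specifically include:
configuring the following two-level loop parameters according to the number of points corresponding to the stage to be computed and the product of the numbers of points corresponding to the completed stages: the first-level loop stride is N1, the first-level loop count is N0, the second-level loop stride is N2, and the second-level loop count is
Figure PCTCN2016091707-appb-000005
where N0 denotes the number of points corresponding to the stage to be computed, N1 denotes the product of the numbers of points corresponding to the completed stages, and N2 is the product of N1 and N0.
Further, calculating the two-level loop parameters corresponding to the comparison result and reading data in parallel based on the calculated two-level loop parameters may specifically include:
when M is less than or equal to N1, leaving the read twiddle factors unprocessed and calculating the following two-level loop parameters:
the first-level loop stride is M, and the first-level loop count is
Figure PCTCN2016091707-appb-000006
the second-level loop stride is N2, and the second-level loop count is
Figure PCTCN2016091707-appb-000007
where M denotes the maximum number of data items the processor can read in parallel, N0 denotes the number of points corresponding to the stage to be computed, N1 denotes the product of the numbers of points corresponding to the completed stages, and N2 is the product of N1 and N0;
reading the data in parallel according to the above two-level loop parameters, M data items at a time, until all N1 data items have been read.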
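Further, calculating the two-level loop parameters corresponding to the comparison result and reading data in parallel based on the calculated two-level loop parameters may also specifically include:
when M is greater than N1, calculating the following value:
Figure PCTCN2016091707-appb-000008
copying the read twiddle factors the following number of times:
Figure PCTCN2016091707-appb-000009
reading in parallel, with a stride of N2 and according to the two-level loop parameters given here, the first groups of data, the number of groups being
Figure PCTCN2016091707-appb-000010
where the first-level loop stride is
Figure PCTCN2016091707-appb-000011
the first-level loop count is N0, the second-level loop stride is
Figure PCTCN2016091707-appb-000012
and the second-level loop count is
Figure PCTCN2016091707-appb-000013
where M denotes the maximum number of data items the processor can read in parallel, N0 denotes the number of points corresponding to the stage to be computed, N1 denotes the product of the numbers of points corresponding to the completed stages, and N2 is the product of N1 and N0.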
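To achieve the above object, according to another aspect of the present invention, a mixed-radix DFT/IDFT parallel computing method based on the above method is also provided. The parallel computing method may include:
Step 1: reading the input twiddle factors and the output twiddle factors in parallel, multiplying their corresponding terms, and taking the products together with the input twiddle factors as the equivalent twiddle factors;
Step 2: multiplying the equivalent twiddle factors by the input data and buffering the products;
Step 3: in the second-level loop, while the multiplication of Step 2 is being performed, reading out the results buffered in Step 2 and performing the corresponding addition or subtraction operations.
Further, multiplying the equivalent twiddle factors by the input data and buffering the products may specifically include:
when the processor has no complex arithmetic unit, buffering the results of cross-multiplying the equivalent twiddle factors with the real and imaginary parts of the input data.
Further, Step 3 may specifically include:
when the processor has a complex arithmetic unit, while the multiplication of Step 2 is being performed, reading out the results buffered in Step 2 and performing the corresponding addition operations.
Further, Step 3 may also specifically include:
when the processor has no complex arithmetic unit, while the multiplication of Step 2 is being performed, reading out the results buffered in Step 2 and performing the following subtraction:
subtracting the product of the equivalent twiddle factor and the imaginary part of the input data from the product of the equivalent twiddle factor and the real part of the input data.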
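To achieve the above object, according to a further aspect of the present invention, a mixed-radix DFT/IDFT parallel data reading apparatus is also provided. The parallel reading apparatus may include:
a point-count calculation unit, configured to configure two-level loop parameters according to the number of points corresponding to the stage to be computed and the product of the numbers of points corresponding to the completed stages;
a group-count determination unit, configured to compare the maximum number of data items that can be read in parallel with the product of the numbers of points corresponding to the completed stages;
a reading unit, configured to calculate the two-level loop parameters corresponding to the comparison result obtained by the group-count determination unit, and to read data in parallel based on the calculated two-level loop parameters.
Further, the point-count calculation unit may specifically include:
a configuration module, configured to configure the following two-level loop parameters according to the number of points corresponding to the stage to be computed and the product of the numbers of points corresponding to the completed stages: the first-level loop stride is N1, the first-level loop count is N0, the second-level loop stride is N2, and the second-level loop count is
Figure PCTCN2016091707-appb-000014
where N0 denotes the number of points corresponding to the stage to be computed, N1 denotes the product of the numbers of points corresponding to the completed stages, and N2 is the product of N1 and N0.
Further, the reading unit may specifically include:
a first calculation module, configured to, when M is less than or equal to N1, leave the read twiddle factors unprocessed and calculate the following two-level loop parameters:
the first-level loop stride is M, and its repetition count is
Figure PCTCN2016091707-appb-000015
the second-level loop stride is N2, and its repetition count is
Figure PCTCN2016091707-appb-000016
where M denotes the maximum number of data items the processor can read in parallel, N0 denotes the number of points corresponding to the stage to be computed, N1 denotes the product of the numbers of points corresponding to the completed stages, and N2 is the product of N1 and N0;
a first reading module, configured to read the data in parallel according to the above two-level loop parameters, M data items at a time, until all N1 data items have been read.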
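Further, the reading unit may also specifically include:
a second calculation module, configured to, when M is greater than N1, calculate the following value:
Figure PCTCN2016091707-appb-000017
a copying module, configured to copy the read twiddle factors the following number of times:
Figure PCTCN2016091707-appb-000018
a second reading module, configured to read in parallel, with a stride of N2 and according to the two-level loop parameters given here, the first groups of data, the number of groups being
Figure PCTCN2016091707-appb-000019
where the first-level loop stride is
Figure PCTCN2016091707-appb-000020
the first-level loop count is N0, the second-level loop stride is
Figure PCTCN2016091707-appb-000021
and the second-level loop count is
Figure PCTCN2016091707-appb-000022
where M denotes the maximum number of data items the processor can read in parallel, N0 denotes the number of points corresponding to the stage to be computed, N1 denotes the product of the numbers of points corresponding to the completed stages, and N2 is the product of N1 and N0.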
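To achieve the above object, according to yet another aspect of the present invention, a mixed-radix DFT/IDFT parallel computing apparatus based on the above parallel reading apparatus is also provided. The parallel computing apparatus may include:
an equivalent twiddle factor calculation unit, configured to read the input twiddle factors and the output twiddle factors in parallel, multiply their corresponding terms, and take the products together with the input twiddle factors as the equivalent twiddle factors;
a cache unit, configured to multiply the equivalent twiddle factors obtained by the equivalent twiddle factor calculation unit by the input data and to buffer the products;
a data processing unit, configured to, in the second-level loop and while the cache unit performs the multiplication, read out the results buffered in the cache unit and perform the corresponding addition or subtraction operations.
Further, the equivalent twiddle factor calculation unit may specifically include:
a parallel read-in module, configured to read in the input twiddle factors and the output twiddle factors in parallel;
a cache module, configured to multiply corresponding terms of the input twiddle factors and the output twiddle factors to obtain the first and second groups of equivalent twiddle factors, and to store the first and second groups, together with the input twiddle factors serving as the third group of equivalent twiddle factors, in the cache.
Further, the data processing unit may also include:
a complex arithmetic unit, configured to read out the results buffered in the cache unit and perform the corresponding addition operations.
Compared with the prior art, the above technical solution has at least the following beneficial effects:
The embodiments of the present invention configure two-level loop parameters according to the number of points corresponding to the stage to be computed and the product of the numbers of points corresponding to the completed stages; then compare the maximum number of data items that can be read in parallel with the product of the numbers of points corresponding to the completed stages; and finally calculate the two-level loop parameters corresponding to the comparison result and read data in parallel based on them. By computing the point-count information and configuring the two-level loop parameters in this way, data are read with maximum parallelism for a given processor bit width according to the point count and the stage being computed, and the data read are mutually independent. No dedicated data reordering and no horizontal (cross-lane) operations are needed during computation, which increases processing parallelism and reduces the number of computation cycles.
Of course, a product implementing the present invention does not necessarily need to achieve all of the above advantages at the same time.
Other features and advantages of the present invention will be set forth in the description that follows, and in part will become apparent from the description or be learned by practicing the invention. The objects and other advantages of the invention may be realized and obtained by the methods particularly pointed out in the written description, the claims, and the accompanying drawings.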
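Brief Description of the Drawings
The accompanying drawings, which form part of the present invention, are provided for a further understanding of the invention. The illustrative embodiments of the invention and their description serve to explain the invention and do not unduly limit it. Obviously, the drawings described below are only some embodiments, and a person of ordinary skill in the art can derive other drawings from them without creative effort. In the drawings:
Figure 1 is a schematic flowchart of a mixed-radix DFT and IDFT parallel data reading method according to an exemplary embodiment;
Figure 2 is a schematic flowchart of a mixed-radix DFT and IDFT parallel computing method according to another exemplary embodiment;
Figure 3 is a schematic flowchart, according to an exemplary embodiment, of reading the input and output twiddle factors in parallel, multiplying their corresponding terms, and taking the products together with the input twiddle factors as the equivalent twiddle factors;
Figure 4 is a schematic structural diagram of a mixed-radix DFT and IDFT parallel data reading apparatus according to an exemplary embodiment;
Figure 5 is a schematic structural diagram of a mixed-radix DFT and IDFT parallel computing apparatus according to an exemplary embodiment.
These drawings and the written description are not intended to limit the scope of protection of the invention in any way, but to explain the concept of the invention to those skilled in the art by reference to specific embodiments.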
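Detailed Description
The technical problems solved, the technical solutions adopted, and the technical effects achieved by the embodiments of the present invention are described clearly and completely below with reference to the drawings and specific embodiments. Obviously, the described embodiments are only some of the embodiments of the present application, not all of them. All other equivalent or obviously modified embodiments obtained by a person of ordinary skill in the art from the embodiments in this application without creative effort fall within the scope of protection of the invention. The embodiments of the invention may be embodied in many different ways as defined and covered by the claims.
It should be noted that many specific details are given in the following description to aid understanding. It is clear, however, that the invention can be practiced without these specific details.
It should also be noted that, unless explicitly restricted or in conflict, the embodiments of the present invention and the technical features in them can be combined with one another to form technical solutions.
The embodiments of the present invention are applied in the LTE system in the field of mobile communication, where the transmit-side transform precoding module of the uplink performs a DFT and the corresponding receive side performs an IDFT (inverse discrete Fourier transform).
Depending on the number of allocated resources, the number of points N of the DFT/IDFT satisfies the following relation:
N = 2^α × 3^β × 5^γ, 12 ≤ N ≤ 1536, α ≥ 2, β ≥ 1, γ ≥ 0
In a specific implementation, the 2^α-point DFT can be computed with an FFT, while the remaining radix-3 and radix-5 DFT passes must be computed with a mixed-radix DFT. The mixed-radix DFT performs β radix-3 operations and γ radix-5 operations, with the radix-3 operations carried out before the radix-5 operations.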
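The length constraint N = 2^α × 3^β × 5^γ can be checked, and the resulting stage order listed, with a few lines of code. The following is a minimal sketch; the function names `factor_235`, `is_valid_length`, and `stage_plan` are hypothetical, and treating the whole 2^α part as a single FFT stage followed by radix-3 and then radix-5 stages is an assumption based on the ordering described in the text.

```python
def factor_235(n):
    """Return (alpha, beta, gamma) with n == 2**alpha * 3**beta * 5**gamma, or None."""
    if n < 1:
        return None
    exps = []
    for p in (2, 3, 5):
        e = 0
        while n % p == 0:
            n //= p
            e += 1
        exps.append(e)
    # Any leftover factor means n has a prime factor other than 2, 3, 5.
    return tuple(exps) if n == 1 else None


def is_valid_length(n):
    """Check 12 <= N <= 1536 with alpha >= 2, beta >= 1, gamma >= 0."""
    f = factor_235(n)
    if f is None:
        return False
    alpha, beta, gamma = f
    return 12 <= n <= 1536 and alpha >= 2 and beta >= 1 and gamma >= 0


def stage_plan(n):
    """List the stage sizes: one 2**alpha FFT stage, then radix-3, then radix-5 stages."""
    alpha, beta, gamma = factor_235(n)
    return [2 ** alpha] + [3] * beta + [5] * gamma
```

For N = 1200 this gives the exponents (4, 1, 2), which is consistent with the values N1 = 16 and N0 = 3 used in the radix-3 example of the embodiment.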
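Figure 1 shows an example of a mixed-radix DFT/IDFT parallel data reading method. As shown in Figure 1, the method may include:
S100: configuring two-level loop parameters according to the number of points corresponding to the stage to be computed and the product of the numbers of points corresponding to the completed stages.
S110: comparing the maximum number of data items that can be read in parallel with the product of the numbers of points corresponding to the completed stages.
S120: calculating the two-level loop parameters corresponding to the comparison result, and reading data in parallel based on the calculated two-level loop parameters.
By computing the point-count information and configuring the two-level loop parameters, the embodiments of the present invention read data with maximum parallelism for a given processor bit width, thereby increasing processing parallelism.
As an optional implementation of this embodiment, configuring the two-level loop parameters according to the number of points corresponding to the stage to be computed and the product of the numbers of points corresponding to the completed stages may specifically include configuring the following two-level loop parameters: the first-level loop stride is N1, the first-level loop count is N0, the second-level loop stride is N2, and the second-level loop count is
Figure PCTCN2016091707-appb-000023
where N0 denotes the number of points corresponding to the stage to be computed, N1 denotes the product of the numbers of points corresponding to the completed stages, and N2 is the product of N1 and N0.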
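As an optional implementation of this embodiment, calculating the two-level loop parameters corresponding to the comparison result and reading data in parallel based on the calculated two-level loop parameters may specifically include:
when M is less than or equal to N1, leaving the read twiddle factors unprocessed and calculating the following loop parameters:
the first-level loop stride is M, and the first-level loop count is
Figure PCTCN2016091707-appb-000024
the second-level loop stride is N2, and the second-level loop count is
Figure PCTCN2016091707-appb-000025
where M denotes the maximum number of data items the processor can read in parallel, N0 denotes the number of points corresponding to the stage to be computed, N1 denotes the product of the numbers of points corresponding to the completed stages, and N2 is the product of N1 and N0;
reading data in parallel according to the above loop parameters, M data items at a time, until all N1 data items have been read.
By computing the point-count information and configuring the two-level loop parameters, the embodiments of the present invention read data with maximum parallelism for a given processor bit width; the data read are mutually independent and no horizontal (cross-lane) operations are needed, which increases processing parallelism.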
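As an optional implementation of this embodiment, calculating the two-level loop parameters corresponding to the comparison result and reading data in parallel based on the calculated two-level loop parameters may further specifically include:
when M is greater than N1, calculating the following value:
Figure PCTCN2016091707-appb-000026
copying the read twiddle factors the following number of times:
Figure PCTCN2016091707-appb-000027
and reading in parallel, with a stride of N2 and according to the loop parameters given here, the first groups of data, the number of groups being
Figure PCTCN2016091707-appb-000028
where the first-level loop stride is
Figure PCTCN2016091707-appb-000029
the first-level loop count is N0, the second-level loop stride is
Figure PCTCN2016091707-appb-000030
and the second-level loop count is
Figure PCTCN2016091707-appb-000031
where M denotes the maximum number of data items the processor can read in parallel, N0 denotes the number of points corresponding to the stage to be computed, N1 denotes the product of the numbers of points corresponding to the completed stages, and N2 is the product of N1 and N0.
By computing the point-count information and configuring the two-level loop parameters, the embodiments of the present invention read data with maximum parallelism for a given processor bit width; the data read are mutually independent and no horizontal (cross-lane) operations are needed, which increases processing parallelism.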
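The embodiments of the present invention can be based on any mixed-radix process. Since the radix can in theory be any number and the examples cannot be exhaustive, the invention is described in detail below using radix 3 as a preferred example.
Assume that N0 denotes the number of points corresponding to the stage to be computed; N1 denotes the product of the numbers of points corresponding to the completed stages; M denotes the maximum number of data items the processor can read in parallel (for example, 16); and N denotes the number of DFT points (for example, 1200).
S200: calculating N0 and N1, here N0 = 3 and N1 = 16, and determining the loop parameters from N0 and N1, where the loop parameters comprise the first-level loop stride and count and the second-level loop stride and count.
In this step, N2 is the product of N1 and N0, the first-level loop stride is N1, the first-level loop count is N0, the second-level loop stride is N2, and the second-level loop count is
Figure PCTCN2016091707-appb-000032
The calculation therefore gives N2 = 48; the first-level loop stride is 16 and the first-level loop count is 3; the second-level loop stride is 48 and the second-level loop count is 25.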
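The worked values of step S200 can be reproduced with a short helper. This is a minimal sketch: the names `LoopParams` and `base_loop_params` are hypothetical, and the second-level loop count is assumed to be N / N2, which is not legible in the formula image but matches the stated result 25 = 1200 / 48.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoopParams:
    first_stride: int   # stride of the first-level loop
    first_count: int    # iteration count of the first-level loop
    second_stride: int  # stride of the second-level loop
    second_count: int   # iteration count of the second-level loop


def base_loop_params(n, n0, n1):
    """Two-level loop parameters for a radix-n0 stage after n1 points are complete.

    Assumes the second-level count is N / N2 with N2 = N0 * N1.
    """
    n2 = n0 * n1
    if n % n2 != 0:
        raise ValueError("N must be a multiple of N2 = N0 * N1")
    return LoopParams(first_stride=n1, first_count=n0,
                      second_stride=n2, second_count=n // n2)
```

Calling `base_loop_params(1200, 3, 16)` yields strides 16 and 48 with counts 3 and 25, the same figures as in the text.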
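S210: comparing M with N1. If M is less than or equal to N1, step S211 is performed; otherwise, step S212 is performed.
S211: leaving the read twiddle factors unprocessed, and reading data in parallel according to the following loop parameters, M data items at a time, until all N1 data items have been read:
the first-level loop stride is M, and its repetition count is
Figure PCTCN2016091707-appb-000033
the second-level loop stride is N2, and its repetition count is
Figure PCTCN2016091707-appb-000034
In this case the parallelism is 16 and the bandwidth utilization is 1. In this step the second-level loop parameters remain unchanged. In practical applications, the first-level and second-level loop parameters can be adjusted according to the bit width of the processor.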
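One possible reading of the M ≤ N1 case can be sketched as an address generator. Several points here are assumptions rather than statements from the text: the first-level count is taken to be N1 / M, the second-level count N / N2, the N0 butterfly inputs are taken to lie N1 apart inside each block of N2 items, and M is assumed to divide N1. The function name `read_bursts` is hypothetical.

```python
def read_bursts(n, n0, n1, m):
    """Yield lists of M contiguous addresses for the M <= N1 case (assumed layout)."""
    if m > n1 or n1 % m != 0:
        raise ValueError("this sketch assumes M <= N1 and that M divides N1")
    n2 = n0 * n1
    for block in range(n // n2):          # second-level loop, stride N2
        for chunk in range(n1 // m):      # first-level loop, stride M
            for branch in range(n0):      # the N0 butterfly inputs, N1 apart
                start = block * n2 + branch * n1 + chunk * m
                # One full-width parallel read of M consecutive items.
                yield list(range(start, start + m))
```

Under these assumptions, N = 1200, N0 = 3, N1 = 16, M = 16 produces 75 reads of 16 items each, every address is read exactly once, and every read uses the full width, which is in line with the stated parallelism of 16 and bandwidth utilization of 1.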
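S212: calculating the following value:
Figure PCTCN2016091707-appb-000035
copying the read twiddle factors the following number of times:
Figure PCTCN2016091707-appb-000036
and reading in parallel, with a stride of N2 and according to the loop parameters given here, the first groups of data, the number of groups being
Figure PCTCN2016091707-appb-000037
where the first-level loop stride is
Figure PCTCN2016091707-appb-000038
the first-level loop count is N0, the second-level loop stride is
Figure PCTCN2016091707-appb-000039
and the second-level loop count is
Figure PCTCN2016091707-appb-000040
The parallelism in this case is
Figure PCTCN2016091707-appb-000041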
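Based on the above embodiments, the embodiments of the present invention also provide a mixed-radix DFT/IDFT parallel computing method. As shown in Figure 2, the method can be implemented through steps S300 to S320.
S300: reading the input twiddle factors and the output twiddle factors in parallel, multiplying their corresponding terms, and taking the products together with the input twiddle factors as the equivalent twiddle factors.
Specifically, as shown in Figure 3, this step may include steps S301 and S302.
S301: reading in the input twiddle factors and the output twiddle factors in parallel.
S302: multiplying corresponding terms of the input twiddle factors and the output twiddle factors to obtain the first and second groups of equivalent twiddle factors, and storing the first and second groups, together with the input twiddle factors serving as the third group of equivalent twiddle factors, in the cache.
The process of obtaining the equivalent twiddle factors is described in detail below using radix 3 as a preferred example.
S401: reading in, in parallel, the input twiddle factors
Figure PCTCN2016091707-appb-000042
and the output twiddle factors
Figure PCTCN2016091707-appb-000043
Figure PCTCN2016091707-appb-000044
where W denotes a twiddle factor and k indexes the data on which the radix-N operation is performed, taking the values 0, 1, ..., N-1.
S402: multiplying corresponding terms of the input twiddle factors and the output twiddle factors to obtain the first and second groups of equivalent twiddle factors:
Figure PCTCN2016091707-appb-000045
Figure PCTCN2016091707-appb-000046
and storing the first and second groups, together with the input twiddle factors serving as the third group of equivalent twiddle factors, in the cache.
The caching may be done as follows: input and output twiddle factors that are identically 1 are not stored separately. Depending on the data, the input twiddle factors require (N0-1)×N1 distinct values to be stored, whereas the output twiddle factors comprise only (N0-1)×(N0-1) distinct values; their corresponding products give (N0-1)×(N0-1)×N1 distinct values.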
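The table sizes quoted above can be checked by building the tables explicitly. The sketch below is illustrative only: the function names are hypothetical, the forward-DFT sign convention exp(-2πj·m/N) is assumed, and indexing the input factors by (r, k) and the output factors by (r, q) is one plausible arrangement, not necessarily the layout used in the patent figures.

```python
import cmath


def w(num, den):
    """Twiddle factor W_den^num = exp(-2*pi*j*num/den) (assumed sign convention)."""
    return cmath.exp(-2j * cmath.pi * num / den)


def equivalent_twiddles(n0, n1):
    """Build input, output, and product twiddle tables, skipping factors fixed at 1."""
    n2 = n0 * n1
    # Input factors depend on k; the r == 0 branch is always 1 and is not stored.
    inputs = {(r, k): w(r * k, n2) for r in range(1, n0) for k in range(n1)}
    # Output factors do not depend on k; r == 0 and q == 0 are always 1.
    outputs = {(r, q): w(r * q, n0) for r in range(1, n0) for q in range(1, n0)}
    # Products of corresponding input and output factors.
    products = {(r, q, k): inputs[r, k] * outputs[r, q]
                for (r, q) in outputs for k in range(n1)}
    return inputs, outputs, products
```

For N0 = 3 and N1 = 16 the three tables hold 32, 4, and 64 entries, matching (N0-1)×N1, (N0-1)×(N0-1), and (N0-1)×(N0-1)×N1.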
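S310: multiplying the equivalent twiddle factors by the input data and buffering the products.
Specifically, taking radix 3 as an example, this step multiplies the input data by three groups of equivalent twiddle factors, namely the two groups obtained in step S302
Figure PCTCN2016091707-appb-000047
Figure PCTCN2016091707-appb-000048
and the input twiddle factors
Figure PCTCN2016091707-appb-000049
which serve as the third group.
The multiplication results are
Figure PCTCN2016091707-appb-000050
Figure PCTCN2016091707-appb-000051
where B and C denote input data.
In an optional embodiment, if the processor has no complex arithmetic unit, this step buffers the results of cross-multiplying the equivalent twiddle factors with the real and imaginary parts of the input data.
In this step, because the twiddle factors used in each group of operations are the equivalent twiddle factors held in the cache, each group of operations consists only of multiply and add operations between the input data and the twiddle factors, there are no data dependencies between successive groups of operations, and across the 25 iterations of the second-level loop this process needs to be carried out only once.
S320: in the second-level loop, while the multiplication of step S310 is being performed, reading out the results buffered in step S310 and performing the corresponding addition or subtraction operations.
As one preferred embodiment, taking radix 3 as an example, when the processor has a complex arithmetic unit, the addition operations may be
Figure PCTCN2016091707-appb-000052
Figure PCTCN2016091707-appb-000053
Figure PCTCN2016091707-appb-000054
where A, B, and C denote input data.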
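The split into a multiplication phase and an addition phase can be illustrated with a radix-3 combining stage that is checked against a direct DFT. This is a sketch under assumptions: the exact formulas in the figures are not legible here, so the standard decimation-in-time radix-3 butterfly is used, with the forward-DFT sign convention; `radix3_stage` and `dft_via_radix3` are hypothetical names, and the inputs A, B, C are taken to be the k-th outputs of three N1-point sub-transforms.

```python
import cmath


def w(num, den):
    """Twiddle factor W_den^num = exp(-2*pi*j*num/den) (assumed sign convention)."""
    return cmath.exp(-2j * cmath.pi * num / den)


def dft(x):
    """Direct O(n^2) DFT used as a reference."""
    n = len(x)
    return [sum(x[t] * w(t * k, n) for t in range(n)) for k in range(n)]


def radix3_stage(f0, f1, f2):
    """Combine three N1-point DFTs into one 3*N1-point DFT."""
    n1 = len(f0)
    n2 = 3 * n1
    # Equivalent twiddle factors: input factor times output factor.
    # For q == 0 the output factor is 1, so the entry is the input factor itself.
    eq = [[(w(k, n2) * w(q, 3), w(2 * k, n2) * w(2 * q, 3)) for k in range(n1)]
          for q in range(3)]
    # Multiplication phase: buffer every product before any addition is done.
    buffered = [[(f1[k] * eq[q][k][0], f2[k] * eq[q][k][1]) for k in range(n1)]
                for q in range(3)]
    # Addition phase: only additions, operating on the buffered products.
    out = [0j] * n2
    for q in range(3):
        for k in range(n1):
            pb, pc = buffered[q][k]
            out[k + q * n1] = f0[k] + pb + pc
    return out


def dft_via_radix3(x):
    """Length must be a multiple of 3; sub-transforms use the reference DFT."""
    return radix3_stage(dft(x[0::3]), dft(x[1::3]), dft(x[2::3]))
```

Because the addition phase touches only buffered products, it has no dependency on the multiplications of the same group, which is the property the embodiment relies on to overlap the two phases in a pipeline.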
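The embodiments of the present invention multiply the input and output twiddle factors together and then buffer the multiplication results produced during the computation, thereby completely separating the multiply and add operations. This reduces dependencies throughout the computation, improves pipeline utilization, and thus increases the computation rate.
In an optional embodiment, if the processor has no complex arithmetic unit, this step includes subtracting the product of the equivalent twiddle factor and the imaginary part of the input data from the product of the equivalent twiddle factor and the real part of the input data.
By completely separating the multiply and subtract operations, the embodiments of the present invention improve the pipeline utilization of each component and thus increase the computation rate.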
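For a processor without a complex arithmetic unit, the cross products and the later subtraction can be sketched with real arithmetic only. The function names are hypothetical, and the text mentions only the subtraction; the addition that forms the imaginary part is included here as the standard counterpart in complex multiplication, which is an assumption about the full procedure.

```python
def cross_products(w_re, w_im, x_re, x_im):
    """Multiplication phase: the four real cross products to be buffered."""
    return (w_re * x_re, w_im * x_im, w_re * x_im, w_im * x_re)


def combine(products):
    """Add/subtract phase: rebuild (real, imag) of the complex product w * x."""
    rr, ii, ri, ir = products
    real = rr - ii  # the subtraction described in the text
    imag = ri + ir  # standard complex-multiplication counterpart (assumed)
    return real, imag
```

For example, with w = 1 + 2j and x = 3 + 4j, `combine(cross_products(1.0, 2.0, 3.0, 4.0))` gives the real and imaginary parts of (1 + 2j)(3 + 4j).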
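In summary, during computation, each group of operations first multiplies the equivalent twiddle factors by the input data and then stores all the products in the cache. While the next group of operations performs its multiplications, the products are read from the cache for addition and subtraction, which avoids the idle arithmetic-unit cycles caused by data dependencies.
Although the steps in the above embodiments are described in the order given, those skilled in the art will understand that, to achieve the effect of the embodiments, the steps need not be performed in that order. They may be performed simultaneously (in parallel) or in reverse order, and such simple variations fall within the scope of protection of the invention.
Based on the same technical concept as the above parallel reading method embodiments, the embodiments of the present invention also provide a mixed-radix DFT/IDFT parallel data reading apparatus. As shown in Figure 4, the apparatus 40 may include a point-count calculation unit 42, a group-count determination unit 44, and a reading unit 46. The point-count calculation unit 42 is configured to configure two-level loop parameters according to the number of points corresponding to the stage to be computed and the product of the numbers of points corresponding to the completed stages. The group-count determination unit 44 is configured to compare the maximum number of data items that can be read in parallel with the product of the numbers of points corresponding to the completed stages. The reading unit 46 is configured to calculate the two-level loop parameters corresponding to the comparison result obtained by the group-count determination unit 44 and to read data in parallel based on the calculated two-level loop parameters.
By computing the point-count information and configuring the two-level loop parameters, this embodiment of the mixed-radix DFT/IDFT parallel data reading apparatus reads data with maximum parallelism for a given processor bit width according to the point count and the stage being computed; the data read are mutually independent, which increases processing parallelism and reduces the number of computation cycles.
On the basis of the above embodiment, the point-count calculation unit 42 may further include a configuration module. The configuration module is configured to configure the following two-level loop parameters according to the number of points corresponding to the stage to be computed and the product of the numbers of points corresponding to the completed stages: the first-level loop stride is N1, the first-level loop count is N0, the second-level loop stride is N2, and the second-level loop count is
Figure PCTCN2016091707-appb-000055
where N0 denotes the number of points corresponding to the stage to be computed, N1 denotes the product of the numbers of points corresponding to the completed stages, and N2 is the product of N1 and N0.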
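On the basis of the embodiment shown in Figure 4, the reading unit 46 may further include a first calculation module and a first reading module. The first calculation module is configured to, when M is less than or equal to N1, leave the read twiddle factors unprocessed and calculate the following two-level loop parameters:
the first-level loop stride is M, and its repetition count is
Figure PCTCN2016091707-appb-000056
the second-level loop stride is N2, and its repetition count is
Figure PCTCN2016091707-appb-000057
where M denotes the maximum number of data items the processor can read in parallel, N0 denotes the number of points corresponding to the stage to be computed, N1 denotes the product of the numbers of points corresponding to the completed stages, and N2 is the product of N1 and N0. The first reading module is configured to read data in parallel according to the above two-level loop parameters, M data items at a time, until all N1 data items have been read.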
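On the basis of the embodiment shown in Figure 4, the reading unit 46 may further include a second calculation module, a copying module, and a second reading module. The second calculation module is configured to, when M is greater than N1, calculate the following value:
Figure PCTCN2016091707-appb-000058
The copying module is configured to copy the read twiddle factors the following number of times:
Figure PCTCN2016091707-appb-000059
The second reading module is configured to read in parallel, with a stride of N2 and according to the two-level loop parameters given here, the first groups of data, the number of groups being
Figure PCTCN2016091707-appb-000060
where the first-level loop stride is
Figure PCTCN2016091707-appb-000061
the first-level loop count is N0, the second-level loop stride is
Figure PCTCN2016091707-appb-000062
and the second-level loop count is
Figure PCTCN2016091707-appb-000063
where M denotes the maximum number of data items the processor can read in parallel, N0 denotes the number of points corresponding to the stage to be computed, N1 denotes the product of the numbers of points corresponding to the completed stages, and N2 is the product of N1 and N0.
For details of this parallel reading apparatus embodiment, reference may be made to the description of the related parallel reading method embodiments, which is not repeated here.
It should be noted that the mixed-radix DFT/IDFT parallel data reading apparatus provided by the above embodiment is illustrated only by the above division into functional modules when reading data. In practical applications, the functions may be assigned to different functional modules as required; that is, the internal structure of the apparatus may be divided into different functional modules to perform all or part of the functions described above.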
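In addition, the embodiments of the present invention also provide a mixed-radix DFT/IDFT parallel computing apparatus based on the above parallel reading apparatus embodiment. The parallel computing apparatus can carry out the above parallel computing method embodiments. As shown in Figure 5, the apparatus 50 may include an equivalent twiddle factor calculation unit 52, a cache unit 54, and a data processing unit 56. The equivalent twiddle factor calculation unit 52 is configured to read the input twiddle factors and the output twiddle factors in parallel, multiply their corresponding terms, and take the products together with the input twiddle factors as the equivalent twiddle factors. The cache unit 54 is configured to multiply the equivalent twiddle factors obtained by the equivalent twiddle factor calculation unit 52 by the input data and to buffer the products. The data processing unit 56 is configured to, in the second-level loop and while the cache unit 54 performs the multiplication, read out the results buffered in the cache unit 54 and perform the corresponding addition or subtraction operations.
This embodiment of the mixed-radix DFT/IDFT parallel computing apparatus processes the twiddle factors first and separates the multiplications from the additions and subtractions. This reduces dependencies between data, so that idle cycles in the overall computation are reduced, pipeline utilization is improved, and the mixed-radix DFT and IDFT computation rate can be effectively increased.
On the basis of the above embodiment, the equivalent twiddle factor calculation unit 52 may further include a parallel read-in module and a cache module. The parallel read-in module is configured to read in the input twiddle factors and the output twiddle factors in parallel. The cache module is configured to multiply corresponding terms of the input twiddle factors and the output twiddle factors to obtain the first and second groups of equivalent twiddle factors, and to store the first and second groups, together with the input twiddle factors serving as the third group of equivalent twiddle factors, in the cache.
On the basis of the embodiment shown in Figure 5, the data processing unit may also include a complex arithmetic unit. The complex arithmetic unit is configured to read out the results buffered in the cache unit and perform the corresponding addition operations.
For details of this parallel computing apparatus embodiment, reference may be made to the description of the related parallel computing method embodiments, which is not repeated here.
It should be noted that the mixed-radix DFT/IDFT parallel computing apparatus provided by the above embodiment is illustrated only by the above division into functional modules when performing parallel computation. In practical applications, the functions may be assigned to different functional modules as required; that is, the internal structure of the apparatus may be divided into different functional modules to perform all or part of the functions described above.
Those skilled in the art will understand that the above mixed-radix DFT/IDFT parallel data reading apparatus and mixed-radix DFT/IDFT parallel computing apparatus also include other well-known structures, such as a processor, a controller, and a memory. The memory includes, but is not limited to, random access memory, flash memory, read-only memory, programmable read-only memory, volatile memory, non-volatile memory, serial memory, parallel memory, or registers; the processor includes, but is not limited to, a CPLD/FPGA, a DSP, an ARM processor, or a MIPS processor. To avoid unnecessarily obscuring the embodiments of the present disclosure, these well-known structures are not shown in Figures 4-5.
It should be understood that the number of each module in Figures 4-5 is merely illustrative. There may be any number of each module as actually required.
The above apparatus embodiments can be used to carry out the corresponding method embodiments; their technical principles, the technical problems they solve, and the technical effects they produce are similar. Those skilled in the art will clearly understand that, for convenience and brevity of description, the specific working process of the apparatus described above and the related explanations can be found in the corresponding processes of the foregoing method embodiments and are not repeated here.
It should be pointed out that, although the apparatus embodiments and method embodiments of the present invention are described separately above, details described for one embodiment may also apply to another. The names of the modules and steps in the embodiments of the invention serve only to distinguish the modules or steps and are not to be regarded as unduly limiting the invention. Those skilled in the art will understand that the modules or steps in the embodiments can be further decomposed or combined. For example, the modules of the above embodiments may be merged into one module or further split into several sub-modules.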
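The technical solutions provided by the embodiments of the present invention have been described in detail above. Although specific examples are used herein to explain the principles and implementations of the invention, the description of the above embodiments is intended only to help in understanding the principles of the embodiments; at the same time, those skilled in the art may, based on the embodiments, make changes to the specific implementations and the scope of application.
It should be noted that the flowcharts and block diagrams referred to herein are not limited to the forms shown and may be divided and/or combined in other ways.
It should also be noted that the reference signs and text in the drawings serve only to explain the invention more clearly and are not to be regarded as unduly limiting its scope of protection.
It should further be noted that the terms "first", "second", and the like in the description, the claims, and the above drawings are used to distinguish similar objects and not to describe or indicate a particular order or sequence. It should be understood that terms so used are interchangeable where appropriate, so that the embodiments of the invention described here can be carried out in orders other than those illustrated or described.
The terms "include", "comprise", and any similar terms are intended to cover a non-exclusive inclusion, so that a process, method, article, or device/apparatus that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to that process, method, article, or device/apparatus.
As used herein, the terms "module" and "unit" may refer to software objects or routines executed on a computing system. The different modules described herein may be implemented as objects or processes executing on a computing system (for example, as separate threads). Although the systems and methods described herein are preferably implemented in software, implementation in hardware or in a combination of software and hardware is also possible and contemplated.
The steps of the present invention can be implemented with general-purpose computing devices. For example, they can be concentrated on a single computing device, such as a personal computer, a server computer, a handheld or portable device, a tablet device, or a multi-processor device, or distributed over a network of multiple computing devices. The steps may be performed in an order different from that shown or described here, each may be made into an individual integrated circuit module, or several of the modules or steps may be made into a single integrated circuit module. The invention is therefore not limited to any particular hardware, software, or combination thereof.
The method provided by the present invention may be implemented using a programmable logic device, or as computer program software or program modules (including routines, programs, objects, components, or data structures that perform specific tasks or implement specific abstract data types). For example, an embodiment of the invention may be a computer program product that, when run, causes a computer to perform the demonstrated method. The computer program product includes a computer-readable storage medium containing computer program logic or code portions for implementing the method. The computer-readable storage medium may be a built-in medium installed in the computer or a removable medium detachable from the computer main body (for example, a storage device using hot-plug technology). The built-in medium includes, but is not limited to, rewritable non-volatile memory such as RAM, ROM, flash memory, and hard disks. The removable medium includes, but is not limited to, optical storage media (for example, CD-ROM and DVD), magneto-optical storage media (for example, MO), magnetic storage media (for example, magnetic tape or removable hard disks), media with built-in rewritable non-volatile memory (for example, memory cards), and media with built-in ROM (for example, ROM cartridges).
The present invention is not limited to the above embodiments. Any variation, improvement, or substitution that a person of ordinary skill in the art can conceive of without departing from the substance of the invention falls within its scope of protection.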

Claims (15)

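  1. A mixed-radix DFT/IDFT parallel data reading method, characterized in that the method comprises at least:
    configuring two-level loop parameters according to the number of points corresponding to the stage to be computed and the product of the numbers of points corresponding to the completed stages;
    comparing the maximum number of data items that can be read in parallel with the product of the numbers of points corresponding to the completed stages;
    calculating the two-level loop parameters corresponding to the comparison result, and reading data in parallel based on the calculated two-level loop parameters.
  2. The method according to claim 1, characterized in that configuring the two-level loop parameters according to the number of points corresponding to the stage to be computed and the product of the numbers of points corresponding to the completed stages specifically comprises:
    configuring the following two-level loop parameters according to the number of points corresponding to the stage to be computed and the product of the numbers of points corresponding to the completed stages: the first-level loop stride is N1, the first-level loop count is N0, the second-level loop stride is N2, and the second-level loop count is
    Figure PCTCN2016091707-appb-100001
    where N0 denotes the number of points corresponding to the stage to be computed, N1 denotes the product of the numbers of points corresponding to the completed stages, and N2 is the product of N1 and N0.
  3. The method according to claim 1, characterized in that calculating the two-level loop parameters corresponding to the comparison result and reading data in parallel based on the calculated two-level loop parameters specifically comprises:
    when M is less than or equal to N1, leaving the read twiddle factors unprocessed and calculating the following two-level loop parameters:
    the first-level loop stride is M, and the first-level loop count is
    Figure PCTCN2016091707-appb-100002
    the second-level loop stride is N2, and the second-level loop count is
    Figure PCTCN2016091707-appb-100003
    where M denotes the maximum number of data items the processor can read in parallel, N0 denotes the number of points corresponding to the stage to be computed, N1 denotes the product of the numbers of points corresponding to the completed stages, and N2 is the product of N1 and N0;
    reading the data in parallel according to the above two-level loop parameters, M data items at a time, until all N1 data items have been read.
  4. The method according to claim 1, characterized in that calculating the two-level loop parameters corresponding to the comparison result and reading data in parallel based on the calculated two-level loop parameters further specifically comprises:
    when M is greater than N1, calculating the following value:
    Figure PCTCN2016091707-appb-100004
    copying the read twiddle factors the following number of times:
    Figure PCTCN2016091707-appb-100005
    reading in parallel, with a stride of N2 and according to the two-level loop parameters given here, the first groups of data, where the first-level loop stride is
    Figure PCTCN2016091707-appb-100007
    the first-level loop count is N0, the second-level loop stride is
    Figure PCTCN2016091707-appb-100008
    and the second-level loop count is
    Figure PCTCN2016091707-appb-100009
    where M denotes the maximum number of data items the processor can read in parallel, N0 denotes the number of points corresponding to the stage to be computed, N1 denotes the product of the numbers of points corresponding to the completed stages, and N2 is the product of N1 and N0.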
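  5. A mixed-radix DFT/IDFT parallel computing method based on the method of any one of claims 1-4, characterized in that the parallel computing method comprises at least:
    Step 1: reading the input twiddle factors and the output twiddle factors in parallel, multiplying their corresponding terms, and taking the products together with the input twiddle factors as the equivalent twiddle factors;
    Step 2: multiplying the equivalent twiddle factors by the input data and buffering the products;
    Step 3: in the second-level loop, while the multiplication of Step 2 is being performed, reading out the results buffered in Step 2 and performing the corresponding addition or subtraction operations.
  6. The parallel computing method according to claim 5, characterized in that multiplying the equivalent twiddle factors by the input data and buffering the products specifically comprises:
    when the processor has no complex arithmetic unit, buffering the results of cross-multiplying the equivalent twiddle factors with the real and imaginary parts of the input data.
  7. The parallel computing method according to claim 5, characterized in that Step 3 specifically comprises:
    when the processor has a complex arithmetic unit, while the multiplication of Step 2 is being performed, reading out the results buffered in Step 2 and performing the corresponding addition operations.
  8. The parallel computing method according to claim 5, characterized in that Step 3 further specifically comprises:
    when the processor has no complex arithmetic unit, while the multiplication of Step 2 is being performed, reading out the results buffered in Step 2 and performing the following subtraction:
    subtracting the product of the equivalent twiddle factor and the imaginary part of the input data from the product of the equivalent twiddle factor and the real part of the input data.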
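  9. A mixed-radix DFT/IDFT parallel data reading apparatus, characterized in that the parallel reading apparatus comprises at least:
    a point-count calculation unit, configured to configure two-level loop parameters according to the number of points corresponding to the stage to be computed and the product of the numbers of points corresponding to the completed stages;
    a group-count determination unit, configured to compare the maximum number of data items that can be read in parallel with the product of the numbers of points corresponding to the completed stages;
    a reading unit, configured to calculate the two-level loop parameters corresponding to the comparison result obtained by the group-count determination unit, and to read data in parallel based on the calculated two-level loop parameters.
  10. The parallel reading apparatus according to claim 9, characterized in that the point-count calculation unit specifically comprises:
    a configuration module, configured to configure the following two-level loop parameters according to the number of points corresponding to the stage to be computed and the product of the numbers of points corresponding to the completed stages: the first-level loop stride is N1, the first-level loop count is N0, the second-level loop stride is N2, and the second-level loop count is
    Figure PCTCN2016091707-appb-100010
    where N0 denotes the number of points corresponding to the stage to be computed, N1 denotes the product of the numbers of points corresponding to the completed stages, and N2 is the product of N1 and N0.
  11. The parallel reading apparatus according to claim 9, characterized in that the reading unit specifically comprises:
    a first calculation module, configured to, when M is less than or equal to N1, leave the read twiddle factors unprocessed and calculate the following two-level loop parameters:
    the first-level loop stride is M, and its repetition count is
    Figure PCTCN2016091707-appb-100011
    the second-level loop stride is N2, and its repetition count is
    Figure PCTCN2016091707-appb-100012
    where M denotes the maximum number of data items the processor can read in parallel, N0 denotes the number of points corresponding to the stage to be computed, N1 denotes the product of the numbers of points corresponding to the completed stages, and N2 is the product of N1 and N0;
    a first reading module, configured to read the data in parallel according to the above two-level loop parameters, M data items at a time, until all N1 data items have been read.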
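  12. The parallel reading apparatus according to claim 9, characterized in that the reading unit further specifically comprises:
    a second calculation module, configured to, when M is greater than N1, calculate the following value:
    Figure PCTCN2016091707-appb-100013
    a copying module, configured to copy the read twiddle factors the following number of times:
    Figure PCTCN2016091707-appb-100014
    a second reading module, configured to read in parallel, with a stride of N2 and according to the two-level loop parameters given here, the first groups of data, the number of groups being
    Figure PCTCN2016091707-appb-100015
    where the first-level loop stride is
    Figure PCTCN2016091707-appb-100016
    the first-level loop count is N0, the second-level loop stride is
    Figure PCTCN2016091707-appb-100017
    and the second-level loop count is
    Figure PCTCN2016091707-appb-100018
    where M denotes the maximum number of data items the processor can read in parallel, N0 denotes the number of points corresponding to the stage to be computed, N1 denotes the product of the numbers of points corresponding to the completed stages, and N2 is the product of N1 and N0.
  13. A mixed-radix DFT/IDFT parallel computing apparatus based on the parallel reading apparatus of any one of claims 9-12, characterized in that the parallel computing apparatus comprises at least:
    an equivalent twiddle factor calculation unit, configured to read the input twiddle factors and the output twiddle factors in parallel, multiply their corresponding terms, and take the products together with the input twiddle factors as the equivalent twiddle factors;
    a cache unit, configured to multiply the equivalent twiddle factors obtained by the equivalent twiddle factor calculation unit by the input data and to buffer the products;
    a data processing unit, configured to, in the second-level loop and while the cache unit performs the multiplication, read out the results buffered in the cache unit and perform the corresponding addition or subtraction operations.
  14. The parallel computing apparatus according to claim 13, characterized in that the equivalent twiddle factor calculation unit specifically comprises:
    a parallel read-in module, configured to read in the input twiddle factors and the output twiddle factors in parallel;
    a cache module, configured to multiply corresponding terms of the input twiddle factors and the output twiddle factors to obtain the first and second groups of equivalent twiddle factors, and to store the first and second groups, together with the input twiddle factors serving as the third group of equivalent twiddle factors, in the cache.
  15. The parallel computing apparatus according to claim 13, characterized in that the data processing unit further comprises:
    a complex arithmetic unit, configured to read out the results buffered in the cache unit and perform the corresponding addition operations.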
PCT/CN2016/091707 2016-07-26 2016-07-26 Method and apparatus for parallel reading and computation of mixed-radix DFT/IDFT WO2018018412A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
PCT/CN2016/091707 WO2018018412A1 (zh) 2016-07-26 2016-07-26 Method and apparatus for parallel reading and computation of mixed-radix DFT/IDFT
US16/068,914 US10698973B2 (en) 2016-07-26 2016-07-26 Method and apparatus for concurrent reading and calculation of mixed radix DFT/IDFT

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/091707 WO2018018412A1 (zh) 2016-07-26 2016-07-26 Method and apparatus for parallel reading and computation of mixed-radix DFT/IDFT

Publications (1)

Publication Number Publication Date
WO2018018412A1 true WO2018018412A1 (zh) 2018-02-01

Family

ID=61015683

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/091707 WO2018018412A1 (zh) 2016-07-26 2016-07-26 Method and apparatus for parallel reading and computation of mixed-radix DFT/IDFT

Country Status (2)

Country Link
US (1) US10698973B2 (zh)
WO (1) WO2018018412A1 (zh)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
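CN101763337A (zh) * 2008-12-25 2010-06-30 上海明波通信技术有限公司 N-point FFT/IFFT/IFFT/IFFT method and apparatus
CN101840393A (zh) * 2010-02-26 2010-09-22 联芯科技有限公司 Processing method and apparatus for mixed-radix discrete Fourier transform
CN102023963A (zh) * 2010-12-13 2011-04-20 西安电子科技大学 High-speed multi-mode time-frequency domain transform method
CN103294642A (zh) * 2012-02-28 2013-09-11 中兴通讯股份有限公司 Discrete Fourier transform/inverse discrete Fourier transform processing method and processor
CN106201999A (zh) * 2016-07-26 2016-12-07 中国科学院自动化研究所 Method and apparatus for parallel reading and computation of mixed-radix DFT/IDFT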

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7836116B1 (en) * 2006-06-15 2010-11-16 Nvidia Corporation Fast fourier transforms and related transforms using cooperative thread arrays
US8386552B2 (en) * 2008-09-17 2013-02-26 Freescale Semiconductor, Inc. Fourier transform processing and twiddle factor generation
US8549059B2 (en) * 2009-01-08 2013-10-01 Texas Instruments Incorporated In-place fast fourier transform processor
US8990281B2 (en) * 2012-09-21 2015-03-24 International Business Machines Corporation Techniques for improving the efficiency of mixed radix fast fourier transform
US20150331634A1 (en) * 2013-01-09 2015-11-19 Sergei I. SALISHCHEV Continuous-flow conflict-free mixed-radix fast fourier transform in multi-bank memory

Also Published As

Publication number Publication date
US20190042537A1 (en) 2019-02-07
US10698973B2 (en) 2020-06-30

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16909993

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16909993

Country of ref document: EP

Kind code of ref document: A1