Digital signal processor with a reconfigurable low-power data interleaving network
Technical field
The present invention relates to the field of digital signal processors and microprocessors, and in particular to a digital signal processor with a reconfigurable low-power data interleaving network.
Background technology
With the rapid development of computer and information science, digital signal processing (DSP) technology has emerged and advanced quickly. In today's digital era, application products of all kinds, particularly real-time communication, consumer electronics and portable electronic products, need high-performance, low-power digital signal processors to carry out their signal processing tasks.
Traditional scalar digital signal processors cannot fully exploit the abundant parallelism present in digital signal processing algorithms. With the rapid development of integrated circuit technology, ever more transistors can be integrated on a single chip, which makes it possible to use vector and multi-core techniques to exploit that parallelism. Vector processing is therefore increasingly widely used in digital signal processors.
In a processor microarchitecture based on vector processing, data transfer must be managed effectively in order to take full advantage of the vector operation units. For digital signal processing algorithms that involve a large amount of data interleaving, conventional data transfer management has limitations, mainly in the following respects:
1) Low efficiency. Conventional data transfer is based on single-address transfers and is ill-suited to non-contiguous data. For algorithms that move data in interleaved patterns it is inefficient and offers little parallelism, so operands cannot be supplied to the arithmetic units, and results cannot be written back, in time.
2) High power consumption. In the conventional approach, data must be transferred frequently between the arithmetic units and memory, which consumes a great deal of energy, raises power consumption and harms the operating stability of the chip. High energy overhead not only limits the use of products in portable and embedded applications, but also causes problems such as higher maintenance costs and reduced system stability on large-scale supercomputing platforms.
Summary of the invention
(1) Technical problem to be solved
In view of this, the main object of the present invention is to provide a digital signal processor with a reconfigurable low-power data interleaving network that overcomes the limitations of conventional data transfer management, improves data transfer efficiency, reduces power consumption and meets the needs of interleaving data of different widths.
(2) Technical solution
To achieve the above object, the present invention provides a digital signal processor with a reconfigurable low-power data interleaving network. The digital signal processor comprises an N-way parallel vector operation unit 10, an N-way parallel vector register file 20, an N-way parallel vector memory 40 and an N-way reconfigurable parallel data interleaving network 30, wherein:
The N-way parallel vector operation unit 10 performs arithmetic processing on the data supplied by the N-way parallel vector register file 20, produces an operation result and outputs that result to the N-way parallel vector register file 20;
The N-way parallel vector register file 20 temporarily stores the operands used by, and the operation results of, the arithmetic units of the N-way parallel vector operation unit 10;
The N-way parallel vector memory 40 stores large amounts of input data and the operation results of the arithmetic units of the N-way parallel vector operation unit 10;
The N-way reconfigurable parallel data interleaving network 30 connects the N-way parallel vector operation unit 10, the N-way parallel vector register file 20 and the N-way parallel vector memory 40, and manages the data transfers among them.
In the above solution, each way of the N-way parallel vector operation unit 10 comprises a plurality of arithmetic units, which perform arithmetic processing on the operands supplied by the N-way parallel vector register file 20 and hold the operation results produced.
In the above solution, the N-way reconfigurable parallel data interleaving network 30 comprises an interleaving path 301 and an interleaving controller 302, wherein:
The interleaving path 301 interleaves the input data and delivers it to the output;
The interleaving controller 302 sends control signals to the interleaving path 301 to control the selection of the data source for the interleaving path and the data interleaving pattern.
In the above solution, the interleaving path 301 consists of N groups of N-to-1 multiplexers, and the input signals of each N-to-1 multiplexer comprise two parts: data signals and a control signal.
In the above solution, the data signals of the N-to-1 multiplexers come from the output of the N-way parallel vector register file 20, the operation results of the N-way parallel vector operation unit 10 or the output of the N-way parallel vector memory 40, and are delivered to the input of the N-way parallel vector register file 20, the operand inputs of the N-way parallel vector operation unit 10 or the input of the N-way parallel vector memory 40.
In the above solution, the control signal of each N-to-1 multiplexer comes from the interleaving controller 302 and controls which input data is gated to the output.
In the above solution, the selection of the data source for the interleaving path, as controlled by the interleaving controller 302, is determined by the bus request signals: the component whose bus request is granted serves as the data source. The data interleaving pattern controlled by the interleaving controller 302 is determined by the high-order segment number of the address sent by the N-way parallel vector operation unit 10, the N-way parallel vector register file 20 or the N-way parallel vector memory 40; the low-order bits of the address are sent to the N-way parallel vector memory 40 as an offset.
In the above solution, the digital signal processor further comprises a bypass selector 80, which is connected to the N-way parallel vector operation unit 10 through the N-way reconfigurable parallel data interleaving network 30. After an operation result of the N-way parallel vector operation unit 10 has been interleaved by the N-way reconfigurable parallel data interleaving network 30, the bypass selector 80 forwards it directly to the N-way parallel vector operation unit 10 as an operand of the next operation, without passing through the N-way parallel vector register file 20 and/or the N-way parallel vector memory 40.
In the above solution, the digital signal processor further comprises a program memory 70, an instruction decoder 60 and an address generator 50, wherein:
The program memory 70 stores the program instructions needed to run the digital signal processor and outputs them to the instruction decoder 60;
The instruction decoder 60 decodes the program instructions sent by the program memory 70 and generates the control signals that control the digital signal processor, thereby controlling the N-way parallel vector operation unit 10, the N-way parallel vector register file 20, the N-way reconfigurable parallel data interleaving network 30 and the address generator 50;
The address generator 50 receives the control signals sent by the instruction decoder 60, generates the access addresses and read/write control signals needed to access the N-way parallel vector memory 40 and sends them to that memory, and at the same time generates control signals from the generated addresses and sends them to the N-way reconfigurable parallel data interleaving network 30.
(3) Beneficial effects
As can be seen from the above technical solution, the present invention has the following beneficial effects:
1) The digital signal processor with a reconfigurable low-power data interleaving network provided by the present invention offers a data transfer management scheme suited to parallel architectures. It overcomes the limitations of conventional data transfer management and is both efficient and low in power consumption. In addition, the interleaving network of the processor is reconfigurable and can therefore meet the needs of interleaving data of different widths.
2) Compared with the prior art, the digital signal processor with a reconfigurable low-power data interleaving network provided by the present invention also offers high data throughput and high speed.
Description of drawings
To further illustrate the technical features of the present invention, the invention is described in detail below with reference to the accompanying drawing, in which:
Fig. 1 is a structural schematic diagram of the digital signal processor with a reconfigurable low-power data interleaving network provided by the present invention.
Embodiment
To make the purpose, technical solution and advantages of the present invention clearer, the invention is described in more detail below in conjunction with a specific embodiment and with reference to the accompanying drawing.
Referring to Fig. 1, which is a structural schematic diagram of the digital signal processor with a reconfigurable low-power data interleaving network provided by the present invention, the digital signal processor comprises an N-way parallel vector operation unit 10, an N-way parallel vector register file 20, an N-way parallel vector memory 40 and an N-way reconfigurable parallel data interleaving network 30.
The N-way parallel vector operation unit 10 performs arithmetic processing on the data supplied by the N-way parallel vector register file 20, produces an operation result and outputs that result to the N-way parallel vector register file 20. The N-way parallel vector register file 20 temporarily stores the operands used by, and the operation results of, the arithmetic units of the N-way parallel vector operation unit 10. The N-way parallel vector memory 40 stores large amounts of input data and the operation results of the arithmetic units of the N-way parallel vector operation unit 10. The N-way reconfigurable parallel data interleaving network 30 connects the N-way parallel vector operation unit 10, the N-way parallel vector register file 20 and the N-way parallel vector memory 40, and manages the data transfers among them.
Each way of the N-way parallel vector operation unit 10 comprises a plurality of arithmetic units, which perform arithmetic processing on the operands supplied by the N-way parallel vector register file 20 and hold the operation results produced.
The N-way reconfigurable parallel data interleaving network 30 comprises an interleaving path 301 and an interleaving controller 302. The interleaving path 301 interleaves the input data and delivers it to the output; the interleaving controller 302 sends control signals to the interleaving path 301 to control the selection of the data source for the interleaving path and the data interleaving pattern.
The interleaving path 301 consists of N groups of N-to-1 multiplexers, and the input signals of each N-to-1 multiplexer comprise two parts: data signals and a control signal. The data signals of the N-to-1 multiplexers come from the output of the N-way parallel vector register file 20, the operation results of the N-way parallel vector operation unit 10 or the output of the N-way parallel vector memory 40, and are delivered to the input of the N-way parallel vector register file 20, the operand inputs of the N-way parallel vector operation unit 10 or the input of the N-way parallel vector memory 40. The control signal of each N-to-1 multiplexer comes from the interleaving controller 302 and controls which input data is gated to the output.
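As a purely illustrative behavioural sketch (not the circuit of the invention), the interleaving path described above can be modelled in Python as N multiplexers that all read the same N input lanes, with one select value per output lane standing in for the control signals from the interleaving controller 302. The function names `mux` and `interleave` and the example lane values are assumptions introduced here.

```python
from typing import List, Sequence


def mux(inputs: Sequence[int], select: int) -> int:
    """One N-to-1 multiplexer: forward the input lane chosen by `select`."""
    return inputs[select]


def interleave(inputs: Sequence[int], selects: Sequence[int]) -> List[int]:
    """Model of the interleaving path: one multiplexer per output lane.

    `selects[i]` plays the role of the control signal for output lane i
    (hypothetical encoding: the index of the input lane to forward).
    """
    n = len(inputs)
    if len(selects) != n:
        raise ValueError("need exactly one select value per output lane")
    if any(not 0 <= s < n for s in selects):
        raise ValueError("select value out of range")
    return [mux(inputs, s) for s in selects]


# Example with N = 4 lanes (values are arbitrary test data).
lanes = [10, 11, 12, 13]
reversed_lanes = interleave(lanes, [3, 2, 1, 0])  # lane reversal
broadcast = interleave(lanes, [0, 0, 0, 0])       # broadcast lane 0
```

Because every output lane has its own select value, the same structure covers permutations as well as broadcasts, which is what makes a full multiplexer array more general than a fixed shuffle.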
The selection of the data source for the interleaving path, as controlled by the interleaving controller 302, is determined by the bus request signals: the component whose bus request is granted serves as the data source. The data interleaving pattern controlled by the interleaving controller 302 is determined by the high-order segment number of the address sent by the N-way parallel vector operation unit 10, the N-way parallel vector register file 20 or the N-way parallel vector memory 40; the low-order bits of the address are sent to the N-way parallel vector memory 40 as an offset.
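The two decisions made by the interleaving controller can be sketched as follows. This is a minimal model under stated assumptions: the text does not specify the arbitration policy or the address field widths, so the fixed priority order in `PRIORITY` and the 10-bit offset in `OFFSET_BITS` are hypothetical choices made only for illustration.

```python
from typing import Dict, Optional, Tuple

# Assumed fixed-priority order; the source only says that the component
# whose bus request is granted becomes the data source.
PRIORITY = ("operation_unit", "register_file", "memory")

# Assumed width of the low-order offset field (not given in the source).
OFFSET_BITS = 10


def select_source(requests: Dict[str, bool]) -> Optional[str]:
    """Grant the bus to the highest-priority requesting component."""
    for name in PRIORITY:
        if requests.get(name, False):
            return name
    return None  # no component is requesting the bus


def split_address(address: int, offset_bits: int = OFFSET_BITS) -> Tuple[int, int]:
    """Split an address into (segment number, offset).

    The high-order segment number selects the interleaving pattern;
    the low-order offset is what would be forwarded to the vector memory.
    """
    segment = address >> offset_bits
    offset = address & ((1 << offset_bits) - 1)
    return segment, offset


source = select_source({"register_file": True, "memory": True})
segment, offset = split_address(0x0C05)
```

With these assumed widths, address `0x0C05` falls in segment 3 with offset 5, so pattern 3 would be applied while offset 5 addresses the memory.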
Further, the digital signal processor also comprises a bypass selector 80, which is connected to the N-way parallel vector operation unit 10 through the N-way reconfigurable parallel data interleaving network 30. After an operation result of the N-way parallel vector operation unit 10 has been interleaved by the N-way reconfigurable parallel data interleaving network 30, the bypass selector 80 forwards it directly to the N-way parallel vector operation unit 10 as an operand of the next operation, without passing through the N-way parallel vector register file 20 and/or the N-way parallel vector memory 40.
Further, the digital signal processor also comprises a program memory 70, an instruction decoder 60 and an address generator 50. The program memory 70 stores the program instructions needed to run the digital signal processor and outputs them to the instruction decoder 60. The instruction decoder 60 decodes the program instructions sent by the program memory 70 and generates the control signals that control the digital signal processor, thereby controlling the N-way parallel vector operation unit 10, the N-way parallel vector register file 20, the N-way reconfigurable parallel data interleaving network 30 and the address generator 50. The address generator 50 receives the control signals sent by the instruction decoder 60, generates the access addresses and read/write control signals needed to access the N-way parallel vector memory 40 and sends them to that memory, and at the same time generates control signals from the generated addresses and sends them to the interleaving controller 302 in the N-way reconfigurable parallel data interleaving network 30.
Referring again to Fig. 1, the N groups of N-to-1 multiplexers that make up the interleaving path 301 can be reconfigured as required. The data bit width may be 8, 16, 32 or 64 bits, and N takes a different value for each width. With a total bus width of 512 bits, the bus carries 64 lanes of 8 bits, 32 lanes of 16 bits, 16 lanes of 32 bits or 8 lanes of 64 bits, so N is 64, 32, 16 or 8 respectively. For each value of N the N groups of N-to-1 multiplexers perform a different function: 64 groups of 64-to-1 selection (8-bit), 32 groups of 32-to-1 selection (16-bit), 16 groups of 16-to-1 selection (32-bit) or 8 groups of 8-to-1 selection (64-bit). All of these functions can be realized by sixty-four 8-bit 64-to-1 multiplexers that are merged and reconfigured through control signals, which is efficient and low in cost. The gating network of sixty-four 64-to-1 multiplexers can be implemented in barrel-shifter fashion, divided into 64 barrels, each of which selects one of the 64 groups of 8-bit input signals according to its gating control signal.
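The relationship between data width and N, and one plausible way of merging the sixty-four 8-bit multiplexers, can be sketched as below. The merging rule in `byte_selects` is an assumption made for illustration: the text states only that the wider modes are obtained from the 8-bit multiplexers through control signals, not how the select values are derived. All function names are hypothetical.

```python
from typing import List, Sequence

BUS_BITS = 512   # total bus width given in the text
BYTE_BITS = 8    # width handled by each basic multiplexer


def lane_count(width_bits: int) -> int:
    """Return N for a given data width: 64, 32, 16 or 8 lanes."""
    if width_bits not in (8, 16, 32, 64):
        raise ValueError("data width must be 8, 16, 32 or 64 bits")
    return BUS_BITS // width_bits


def byte_selects(lane_selects: Sequence[int], width_bits: int) -> List[int]:
    """Expand per-lane selects into selects for the 64 byte multiplexers.

    Assumed merging rule: the k byte multiplexers that form one wide lane
    all pick the same source lane, each taking its own byte position.
    """
    n = lane_count(width_bits)
    if len(lane_selects) != n:
        raise ValueError(f"expected {n} lane selects")
    k = width_bits // BYTE_BITS  # bytes per lane
    total_bytes = BUS_BITS // BYTE_BITS
    return [lane_selects[b // k] * k + b % k for b in range(total_bytes)]


def interleave_bytes(data: bytes, lane_selects: Sequence[int], width_bits: int) -> bytes:
    """Apply a lane-level interleaving pattern using only byte-level muxes."""
    if len(data) != BUS_BITS // BYTE_BITS:
        raise ValueError("data must be exactly 64 bytes (512 bits)")
    return bytes(data[s] for s in byte_selects(lane_selects, width_bits))


# Example: reverse the eight 64-bit lanes of a 512-bit word.
word = bytes(range(64))
swapped = interleave_bytes(word, list(range(7, -1, -1)), 64)
```

In this model the hardware cost stays at sixty-four byte-wide multiplexers for every mode; only the way the select values are generated changes with the data width.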
In addition, to raise the clock frequency, the 64 barrel shifters are implemented as a pipeline: the data path is divided into several stages (beats) so that the critical path within each stage is as short as possible. A short critical path means a small delay, which in turn allows a higher frequency.
To further save power, a bypass is provided in the data-selection multiplexers of each pipeline stage. When the earlier stages have already completed the entire interleaving task, the later pipeline stages need not operate, and the interleaved result is sent out through the bypass. Preventing unnecessary toggling of functional units in this way effectively saves power.
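The staged implementation with bypass can be illustrated with a simplified model. As an assumption, the sketch uses a logarithmic rotator (stage s rotates by 2^s lanes) and applies one uniform rotation to all lanes, whereas the network in the text allows a separate select per barrel. The stage count and the rule for when later stages are bypassed are likewise hypothetical.

```python
from typing import List, Tuple


def rotate_left(lanes: List[int], amount: int) -> List[int]:
    """Rotate the lane list left by `amount` positions."""
    amount %= len(lanes)
    return lanes[amount:] + lanes[:amount]


def pipelined_rotate(lanes: List[int], amount: int) -> Tuple[List[int], int]:
    """Rotate in log2(N) stages and report how many stages were active.

    Stage s rotates by 2**s when bit s of `amount` is set. Once no higher
    bit of `amount` remains, the rest of the pipeline is bypassed, which
    models later stages being kept idle to avoid needless toggling.
    """
    n = len(lanes)
    if n == 0 or n & (n - 1):
        raise ValueError("lane count must be a power of two")
    stages = n.bit_length() - 1
    amount %= n
    active = 0
    for stage in range(stages):
        if amount >> stage == 0:
            break  # work is complete: remaining stages are bypassed
        if (amount >> stage) & 1:
            lanes = rotate_left(lanes, 1 << stage)
        active += 1  # this stage was clocked, even if it only passed data on
    return lanes, active


data = list(range(64))
rotated, used = pipelined_rotate(data, 3)
```

In this model a rotation by 3 finishes after two of the six stages, so the remaining four stay idle, while a rotation by 32 needs all six.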
The specific embodiment described above further explains the purpose, technical solution and beneficial effects of the present invention. It should be understood that the above is merely a specific embodiment of the present invention and is not intended to limit it. Any modification, equivalent replacement or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.