Embodiment
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to specific embodiments and the accompanying drawings. Although this document may give examples of parameters with particular values, it should be understood that a parameter need not exactly equal the corresponding value; it may approximate that value within an acceptable error margin or design constraint.
The applicant has found that the FIR filter algorithm of Formula 1 can be expanded into the following result:
In the expansion of the FIR filter algorithm shown above, the filter coefficients are used according to the following rules:
(1) Within each dashed box, one filter coefficient is multiplied with the BS data to be filtered in that box, and these BS multiplications can be performed in parallel;
(2) The parallel products are then accumulated. As shown above, accumulation proceeds from right to left, and after K accumulations the BS results Y(0)~Y(BS-1) are obtained. Each group of BS results uses the filter coefficients H(K-1), H(K-2), H(K-3), ..., H(1), H(0) in turn.
The filter coefficient buffer therefore has two requirements: 1. each read returns one filter coefficient; 2. reads that are K reads apart return the same value. According to the result above, the number of filter coefficients is K and the number of data operated on in parallel is BS.
In the same expansion, the data to be filtered are used according to the following rules:
(1) For two horizontally adjacent dashed boxes, the data to be filtered that they use differ in only one element.
(2) Between two vertically adjacent groups of BS filtered results, BS data to be filtered are no longer used and BS new data to be filtered come into use. For example, when computing Y(BS)~Y(2BS-1), the data X(-K+1)~X(BS-K) are no longer used, the data X(BS-K+1)~X(BS-1) continue to be used, and the data X(BS)~X(2BS-1) are newly used.
(3) In the vector X of data to be filtered, elements with a negative index are replaced by zero; that is, X(-1), X(-2), ..., X(-K+1) in the expansion above are zero.
Based on these rules, a novel buffer for the data to be filtered can be designed that satisfies at least the following four requirements: 1. each read of the buffer returns BS data to be filtered; 2. the BS data returned by two consecutive reads differ in only one value; 3. after every K operations, that is, after BS filtered results have been produced, the buffer is updated as a whole: the BS data that were first put into use are discarded and BS new data to be filtered are loaded from the memory; 4. the buffer provides a corresponding zero-padding region in which the number of effective zeros is K-1.
On the basis of the above rules for the filter coefficients and the data to be filtered, the applicant proposes an FIR filter that improves the utilization of both, reduces the number of memory reads and writes, and increases the operation speed.
An exemplary embodiment of the present invention provides an FIR filter. Fig. 2 is a schematic structural diagram of the FIR filter according to an embodiment of the invention. As shown in Fig. 2, the FIR filter of this embodiment comprises a filter coefficient buffer 20, a to-be-filtered data buffer 30, a multiplier 40, an accumulation register 50, an accumulator 60, and a comparator 70. The peripheral components of the FIR filter further include a memory 10.
The filter coefficient buffer 20 pre-stores K filter coefficients; in the n-th operation cycle, after receiving a read enable signal, it provides one filter coefficient H(i).
The to-be-filtered data buffer 30 pre-stores N_x data to be filtered; in the n-th operation cycle, after receiving a read enable signal, it provides one group of data to be filtered. The group comprises BS data to be filtered: X(nBS-i), X(nBS-i+1), ..., X[(n+1)BS-i-1]. When the index into the vector X is less than 0, that datum is replaced by 0.
The multiplier 40 comprises BS parallel multiplication units. Each multiplication unit is connected to the filter coefficient buffer 20 and the to-be-filtered data buffer 30 and multiplies the filter coefficient H(i) by one datum of the group of data to be filtered, yielding one product.
The accumulator 60, whose control terminal is connected to the comparator, comprises BS parallel accumulation units. Each accumulation unit is connected to the corresponding multiplication unit and to the corresponding register unit of the accumulation register, and adds the current product of that multiplication unit to the intermediate data held in the register unit. On receiving the output valid signal from the comparator, the accumulator outputs the BS accumulation results as the n-th group of filtered results; otherwise, it writes the BS accumulation results back to the corresponding register units of the accumulation register as intermediate data.
The accumulation register 50 comprises BS independent register units. The input and output of each register unit are connected to the output and input, respectively, of the corresponding accumulation unit, and the register units hold the intermediate data of the accumulation.
The comparator 70 compares the operation sequence number passed from the multiplier 40 with the number of filter coefficients K. When the operation sequence number < K-1, it sends a read enable signal to the filter coefficient buffer and the to-be-filtered data buffer; when the operation sequence number = K-1, it sends the output valid signal to the accumulator. The operation sequence number may equally be passed to the comparator by the accumulator 60 or the accumulation register 50; the principle is the same, and this likewise falls within the scope of protection of the present invention. The operation sequence number is the number of product or accumulation operations already completed while computing a given group of BS results; it counts from 0 and has a maximum value of K-1. When the computation of one group of BS results finishes, the operation sequence number restarts from 0 for the next group.
It should be noted that n takes the values 0, 1, 2, 3, ..., ceil(N_x/BS)-1 in turn, where ceil denotes rounding up to the nearest integer; groups 0 through ceil(N_x/BS)-1 of filtered results together constitute the filtered results of the N_x data to be filtered.
In addition, the principle for selecting data to be filtered and filter coefficients in this embodiment is that different read enable signals correspond to different filter coefficients and to different groups of data to be filtered. In general, the filter coefficients and groups of data to be filtered may be selected in any order, provided that the K operations do not repeat one another. For the expansion of the FIR filter algorithm there are two customary orders of computation: from right to left, and from left to right. Preferably, the expansion is computed from right to left, that is, i takes the values K-1, K-2, ..., 1, 0 in turn (described in detail below).
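The data flow described so far can be checked with a short software model. The following Python sketch is illustrative only and is not the hardware of the embodiment: the function names `fir_direct` and `fir_blocked`, the sample values, and the zero padding past the end of the input are assumptions. It forms each group of BS results with K parallel multiply-accumulate steps, with i running from K-1 down to 0, and compares the outcome with the direct sum Y(m) = H(0)X(m) + H(1)X(m-1) + ... + H(K-1)X(m-K+1).

```python
def fir_direct(h, x):
    """Reference form: Y(m) = sum over i of H(i) * X(m - i), with X(negative) = 0."""
    return [sum(h[i] * x[m - i] for i in range(len(h)) if m - i >= 0)
            for m in range(len(x))]


def fir_blocked(h, x, bs):
    """Block form: each group of BS results takes K parallel multiply-accumulate steps."""
    k = len(h)

    def x_at(idx):
        # Indices below 0 are replaced by 0; padding past the end is an assumption.
        return x[idx] if 0 <= idx < len(x) else 0

    y = []
    num_groups = -(-len(x) // bs)          # ceil(N_x / BS)
    for n in range(num_groups):
        acc = [0] * bs                     # plays the role of the accumulation register
        for i in range(k - 1, -1, -1):     # i = K-1, K-2, ..., 1, 0
            # Group X(nBS-i), ..., X((n+1)BS-i-1) paired with coefficient H(i)
            group = [x_at(n * bs - i + j) for j in range(bs)]
            acc = [a + h[i] * d for a, d in zip(acc, group)]
        y.extend(acc)
    return y[:len(x)]                      # keep only the N_x valid results


h = [1, 2, 3]            # hypothetical coefficients H(0), H(1), H(2)
x = [1, 2, 3, 4, 5]      # hypothetical data to be filtered
y_blocked = fir_blocked(h, x, bs=4)
```

In this model the number of parallel lanes (`bs`) is independent of the number of coefficients (`len(h)`), which mirrors the independence of BS and K stated in the description.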
For a clearer understanding of this embodiment, the generation and output of its timing control signals are now described. As shown in Fig. 2, the output valid signal of the comparator indicates, on the one hand, that the current BS filtered results are complete, that is, that the current output of the accumulator 60 is valid; on the other hand, it indicates that the shift signal output to the to-be-filtered data buffer 30 is valid (described in detail below). In addition, the read enable signal output by the multiplier 40 to the filter coefficient buffer 20 indicates that the multiplier 40 currently needs to read the filter coefficient buffer 20, and the read enable signal output by the multiplier 40 to the to-be-filtered data buffer 30 indicates that the multiplier 40 currently needs to read the to-be-filtered data buffer 30. The read enable signal may also be issued to the two buffers by the comparator or the accumulator; that is, any one of the comparator, the multiplier, and the accumulator may send the read enable signal to the filter coefficient providing module and the to-be-filtered data providing module when the operation sequence number < K-1.
As this embodiment shows, unlike prior-art FIR filters, the present invention uses BS parallel multiplication units, so that BS multiplications are performed simultaneously and the BS products are added in parallel to the corresponding intermediate values in the register. This greatly improves operation efficiency and saves operation time. Moreover, all control signals are digital, which avoids the computational inaccuracy introduced by analog components such as delay units and improves operation precision.
This embodiment uses parallel multiplication units with matching adders and registers, so that BS operations proceed in parallel; each such parallel operation is one part of the computation that produces BS filtered results. This form of parallelism makes the number of arithmetic units independent of the number of filter coefficients: when the number of filter coefficients increases, the same parallel multiplication units and their matching adders and registers are reused, which gives a high degree of flexibility.
The FIR filter of this embodiment uses a filter coefficient buffer and a to-be-filtered data buffer. Other storage devices capable of providing the data to be filtered, such as registers, may also be used. To minimize reads of the memory and to improve operation speed and efficiency, the present invention combines the memory with a buffer for the filter coefficients: the filter coefficients are loaded from the memory into the buffer in order, and the buffer outputs one filter coefficient per read with a period equal to the number of filter coefficients K. The specific technical solution is as follows:
In this embodiment, the filter coefficient buffer 20 reads the filter coefficients from the memory 10 and places them in its buffer entity; each filter operation reads one filter coefficient H(i) in order. Fig. 3 is a schematic diagram of the filter coefficient buffer in the FIR filter according to an embodiment of the invention. As shown in Fig. 3, the filter coefficient buffer 20 comprises a buffer entity 201, a read logic unit 202, an initialization logic unit 203, and a shift logic unit 204, wherein:
The buffer entity 201 caches the filter coefficients used in the operation. Its size is related to the size (BS) of the arithmetic unit; here the number of cache cells is 2BS+1, addressed 0~2BS from top to bottom.
The read logic unit 202 handles read operations on the filter coefficient buffer and returns one filter coefficient per read, in order, with a period equal to the number of filter coefficients (K).
The initialization logic unit 203 initializes the buffer entity, that is, it assigns values fetched from the memory to the buffer entity 201.
As the structure of the filter coefficient buffer 20 shows, there is no necessary relationship between the number of arithmetic units BS and the number of filter coefficients K. For a given BS, if K changes, neither the number nor the size of the internal arithmetic units needs to change; as long as K lies within the storage capacity of the filter coefficient buffer, the needs of the operation are fully met. This high degree of flexibility gives the present invention good reconfigurability. For example, a filter coefficient buffer of size 2BS+1 can accommodate filtering operations with up to 2BS+1 filter coefficients. For ease of understanding, the description below denotes the maximum number of filter coefficients that the filter coefficient buffer can support as K'.
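The behaviour required of the filter coefficient buffer (one coefficient per read, repeating with period K, within a capacity of 2BS+1 cells) can be modelled in a few lines. The Python sketch below is a hypothetical software model rather than the circuit of Fig. 3: the class name `CoefficientBuffer`, its method names, and the choice of storing H(i) at address i are assumptions made for illustration.

```python
class CoefficientBuffer:
    """Software model of a coefficient buffer with 2*BS+1 cells (addresses 0..2BS)."""

    def __init__(self, bs):
        self.cells = [0] * (2 * bs + 1)   # buffer entity
        self.k = 0                        # actual number of filter coefficients
        self.reads = 0                    # number of reads since initialization

    def initialize(self, coefficients):
        """Initialization logic: copy K coefficients from memory into the buffer."""
        if len(coefficients) > len(self.cells):
            raise ValueError("K exceeds the capacity K' = 2*BS+1 of the buffer")
        self.k = len(coefficients)
        self.cells[:self.k] = coefficients   # assumed layout: H(i) at address i
        self.reads = 0

    def read(self):
        """Read logic: return H(K-1), H(K-2), ..., H(0), then repeat with period K."""
        index = self.k - 1 - (self.reads % self.k)
        self.reads += 1
        return self.cells[index]


coeff_buffer = CoefficientBuffer(bs=4)
coeff_buffer.initialize([10, 20, 30])            # hypothetical H(0), H(1), H(2)
sequence = [coeff_buffer.read() for _ in range(7)]
```

Changing K only changes the argument passed to `initialize`; the capacity and the read logic stay the same, which is the reconfigurability property described above.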
In this embodiment, the filter coefficient buffer is designed to meet the filter's requirements on the filter coefficients, and its function is to provide filter coefficients. It may be implemented in various ways. For example, it may be implemented with 2BS+1 registers, which requires no additional logic overhead, but the multiplier must then control, in each product operation, from which register the required coefficient is taken.
Further, to improve the utilization of the data to be filtered, the present invention also improves the component that provides them. In this embodiment, the to-be-filtered data buffer 30 reads groups of data to be filtered from the memory 10 and places them in its buffer entity; each filter operation reads one group of data to be filtered d_K, where d_K is a vector [d_K[0], d_K[1], ..., d_K[BS-1]]. Elements of d_K whose index into the vector X in the expansion of the FIR filter algorithm is less than 0, such as X[-1], are all replaced by 0. The size of this buffer is 4BS, where BS is the operation size of the multiplier, that is, the number of multiplications it can perform simultaneously.
Fig. 4 is a schematic diagram of the to-be-filtered data buffer in the FIR filter according to an embodiment of the invention. As shown in Fig. 4, the to-be-filtered data buffer comprises a buffer entity 310, a read logic unit 330, and an update logic unit 320, wherein:
The buffer entity 310 caches the data to be filtered used in the operation. Its size is related to the size (BS) required by the arithmetic unit; here the size is 4BS, addressed 0~4BS-1.
The read logic unit 330 reads data to be filtered from the buffer. Its input signals are the read enable signal, the operation sequence number, and the number of filter coefficients; it sends an effective offset address to the buffer entity 310, obtains BS data to be filtered, and outputs them.
The effective offset address is generated from the operation sequence number, the read enable signal, and the number of filter coefficients. The relationship is: when the read enable signal is valid, effective offset address = K' - K + operation sequence number. The operation sequence number is the read count sent by the arithmetic unit (multiplication unit) in the order in which it reads the buffer; it starts from 0 and increases by 1 with each read. In this embodiment the size of the filter coefficient buffer entity is 2BS+1, so effective offset address = 2BS+1-K + operation sequence number.
The update logic unit 320 updates the data to be filtered held in the buffer entity. It comprises an initialization logic unit 321 and a shift logic unit 322, wherein:
The initialization logic unit 321 prepares the buffer entity 310 before the arithmetic unit starts working. It initializes the buffer entity: the part at addresses 0~2BS-1 is initialized to 0, and the part at addresses 2BS~4BS-1 is initialized with values from the memory.
The shift logic unit 322 shifts the data to be filtered in the buffer entity as a whole; it starts working when the input shift signal is valid.
First assume that the maximum number of filter coefficients supported by the one-dimensional filter with which the to-be-filtered data buffer cooperates is K' = 2BS+1; the size of the buffer entity is then at least 4BS. Assume further that the arithmetic unit size is BS = 4 and the actual number of filter coefficients is K = 6.
Under these assumptions, the size of the buffer entity is >= 4BS = 16, K' = 9, and K = 6. The invention is first further described below taking a data buffer entity of size 4BS = 16 as an example.
At the start of use, this embodiment must first be initialized. The initialization logic unit 321 initializes addresses 0~7 of the data buffer to 0 and addresses 8~15 to specific values obtained from the memory. Each memory access fetches BS = 4 data to be filtered, so two accesses are needed. The buffer addressing and the state after initialization are shown in Fig. 5a.
After initialization, the arithmetic unit, that is, the multiplier shown in Fig. 2, needs to read this buffer. The read logic unit 330 then starts working: from the input operation sequence number, the read enable signal, the maximum supported number of filter coefficients, and the actual number of filter coefficients, it generates the effective offset address and thereby reads the data to be filtered at the correct position.
In this embodiment, producing each group of BS = 4 filtered results requires reading the buffer K = 6 times, so the operation sequence number is the cyclic sequence {0, 1, 2, 3, 4, 5}. When the first group of BS = 4 filtered results is produced, effective offset address = {3, 4, 5, 6, 7, 8}. The generation of the effective offset address is shown in Fig. 6. The data to be filtered returned by the first 6 reads of the buffer are therefore {0, 0, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 1}, {0, 0, 1, 2}, {0, 1, 2, 3}, and {1, 2, 3, 4}, respectively.
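The worked example above (BS = 4, K = 6, K' = 9, buffer size 16) can be reproduced with a minimal Python sketch. The names `memory`, `data_buffer`, and `read_group` are illustrative, and the memory contents 1, 2, 3, ... are an assumption chosen to agree with the returned groups listed above.

```python
BS = 4          # arithmetic unit size
K = 6           # actual number of filter coefficients
K_MAX = 9       # K' = 2*BS + 1, the largest supported number of coefficients

memory = list(range(1, 33))                       # assumed data to be filtered: 1, 2, 3, ...

# Initialization: addresses 0..2BS-1 are zero, addresses 2BS..4BS-1 come from memory.
data_buffer = [0] * (2 * BS) + memory[:2 * BS]


def read_group(buffer, sequence_number):
    """Read logic: effective offset address = K' - K + operation sequence number."""
    offset = K_MAX - K + sequence_number
    return offset, buffer[offset:offset + BS]     # BS consecutive data to be filtered


reads = [read_group(data_buffer, seq) for seq in range(K)]
offsets = [offset for offset, _ in reads]
groups = [group for _, group in reads]
```

Each group differs from the previous one by a single element, because consecutive reads slide the BS-wide window by one address.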
After the multiplier in Fig. 2 has operated K = 6 times, the first BS = 4 results are obtained, and the contents of the buffer entity 310 must then be updated with values from the memory. At this point the shift signal output to the to-be-filtered data buffer 30 in the FIR filter of Fig. 2 is valid, and the buffer is shifted as a whole by BS = 4. The data to be filtered in the buffer after the shift are shown in Fig. 5b.
BS filtered outputs have now been produced and the buffer has updated its data to be filtered accordingly; it then cooperates with the arithmetic unit to compute the next BS results, during which the buffer read and update mechanisms are unchanged.
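The whole-buffer update can be sketched in the same way. In the Python fragment below, the function name `shift_and_reload` and the memory values 9~12 loaded after the shift are assumptions for illustration; the model discards the BS oldest entries, moves the rest toward address 0, and appends BS new data from memory.

```python
def shift_and_reload(buffer, new_data, bs):
    """Shift logic: drop the BS oldest entries and append BS new data from memory."""
    if len(new_data) != bs:
        raise ValueError("exactly BS new data are loaded per shift")
    return buffer[bs:] + list(new_data)


BS, K, K_MAX = 4, 6, 9

# State after initialization in the 4BS example: 8 zeros, then data 1..8.
data_buffer = [0] * 8 + [1, 2, 3, 4, 5, 6, 7, 8]

# After the first K = 6 operations the shift signal is valid: shift by BS = 4.
data_buffer = shift_and_reload(data_buffer, [9, 10, 11, 12], BS)

# The read logic is unchanged: the offsets are again K' - K + sequence number.
second_pass = [data_buffer[K_MAX - K + seq:K_MAX - K + seq + BS] for seq in range(K)]
```

Under these assumptions the first group read in the second pass is {0, 1, 2, 3}, that is, X(-1)~X(2), which is the group that pairs with H(K-1) when computing Y(4)~Y(7).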
The above describes the case where the buffer size is 4BS; the buffer is not in fact limited to this size. Under the same premise K' = 2BS+1, the invention is described in detail below taking a buffer entity 310 of size 5BS as an example.
When the size of the buffer entity 310 is 5BS, assume likewise that the data to be filtered are distributed in the memory as shown in Fig. 5a and that the actual number of filter coefficients (less than or equal to 2BS) is K = 6; the buffer entity is then addressed 0~19.
Fig. 7a shows the data to be filtered in the buffer entity after initialization when the size of the to-be-filtered data buffer is 5BS. Compared with the 4BS buffer, BS = 4 more data to be filtered are cached after initialization. This does not affect the read logic unit 330 of the present invention, which likewise generates the effective offset address from the input operation sequence number, the read enable signal, and the filter coefficient count signal, and thereby reads the data to be filtered at the correct position.
Likewise, when the first group of BS = 4 filtered results is produced, effective offset address = {3, 4, 5, 6, 7, 8}. The generation of the effective offset address is shown in Fig. 6. The data to be filtered returned by the first 6 reads of the buffer are therefore {0, 0, 0, 0}, {0, 0, 0, 0}, {0, 0, 0, 1}, {0, 0, 1, 2}, {0, 1, 2, 3}, and {1, 2, 3, 4}, respectively.
After the multiplier in Fig. 2 has operated K = 6 times, the first BS = 4 results are obtained, and the contents of the buffer entity 310 must then be updated with values from the memory. At this point the shift signal output to the to-be-filtered data buffer 30 in Fig. 2 is valid, and the buffer is shifted as a whole by BS = 4. The data to be filtered in the buffer after the shift are shown in Fig. 7b.
BS filtered outputs have now been produced and the buffer has updated its data to be filtered accordingly; it then cooperates with the arithmetic unit to compute the next BS results, during which the buffer read and update mechanisms are unchanged.
As the structure of the to-be-filtered data buffer 30 shows, it takes full account of the data reuse in the algorithm. After a given number of data are loaded from the memory, they are used as fully as possible: the buffer is updated only after all operations that, according to the algorithm, use these data have been completed, and these data are never loaded again during the rest of the algorithm's execution. This improves the utilization of the data to be filtered and reduces the number of memory accesses, relieving the heavy demand on the processor's data-supply capability when it performs one-dimensional filtering, and in turn reduces the power consumption of the whole design.
In this embodiment, the function of the to-be-filtered data buffer may be simplified by transferring some of its tasks to the arithmetic unit (the multiplication unit or the accumulation unit). For example, the read logic of the data buffer may be simplified, with the generation of the effective address handed over to the arithmetic unit. The initialization logic and the update logic, however, are indispensable. In short, this buffer is designed around the filter's demand for data: the filter's requirements on the data can be regarded as a specification, and this to-be-filtered data buffer is one of the most feasible implementations of it. Any other sequential logic that implements this specification in whole or in part can be regarded as the to-be-filtered data buffer of the present invention.
In addition, for applications with different numbers of filter coefficients, the to-be-filtered data buffer of this embodiment pads zeros automatically as the algorithm requires. This greatly simplifies algorithm implementation for the programmer, who need not consider padding the appropriate number of zeros for different coefficient lengths.
Of course, other ways of providing the data to be filtered may also be used. For example, the to-be-filtered data buffer may be divided into BS sub-buffers, with the BS data to be filtered that correspond to each filter coefficient in the n-th operation cycle stored together in one of the sub-buffers. In each multiplication, the whole group of data to be filtered is read from a sub-buffer into the multiplier. This approach, however, consumes more storage and reads less efficiently.
The present invention is described in detail below using a concrete scenario as an example. In this embodiment the operation size of the multiplier is BS = 16, that is, the multiplier can multiply 16 pairs of data at a time. Correspondingly, the size of the filter coefficient buffer is 2*BS+1 = 33 data widths, the size of the to-be-filtered data buffer is 4*BS = 64 data widths, the multiplier has 16 parallel multiplication units, and the accumulator has 16 parallel accumulation units. The width of the accumulation register is BS = 16 data widths. The data type is not limited here; it may be a 64-bit, 32-bit, 16-bit, or 8-bit data type.
In fact, other algorithms in the field of digital signal processing, such as convolution and correlation, have the same operational characteristics as the FIR filter implementation: data are multiplied and the products are then accumulated to yield a result. The filter provided by the present invention, together with its coefficient buffer and data buffer, can equally be applied to implementing such algorithms.
Assume that the number of filter coefficients of the filter is K = 18 and that the coefficients are distributed in the memory as shown in Fig. 8a. Before the operation begins, the filter coefficient buffer 20 and the to-be-filtered data buffer 30 must be initialized. After initialization, the filter coefficient buffer 20 holds the 18 filter coefficients required for filtering, as shown in Fig. 8b.
After the buffers are initialized, the multiplier begins operating on the corresponding data to be filtered. To make full use of the data in the to-be-filtered data buffer, K = 18 operations are performed for each group of valid results, and each operation accesses the filter coefficient buffer 20 and the to-be-filtered data buffer 30 once.
Each access by the multiplier 40 to the filter coefficient buffer 20 returns one datum. Assume that the K = 18 accesses to the filter coefficient buffer 20 return k_17, k_16, ..., k_2, k_1, k_0, where the width of k_0 is the width of one datum. k_17, k_16, ..., k_2, k_1, k_0 correspond respectively to H(K-1), H(K-2), ..., H(2), H(1), H(0) in the expansion of the FIR filter algorithm, with K = 18.
Each access to the to-be-filtered data buffer 30 returns BS = 16 data to be filtered. The K = 18 accesses by the multiplier 40 to the to-be-filtered data buffer 30 return d_0, d_1, d_2, ..., d_17, which correspond respectively to the data to be filtered in the K = 18 boxes, from right to left, of the first part of the expansion of the FIR filter algorithm. The width of each of d_0~d_17 is therefore the width of BS = 16 data to be filtered; in the description below they are written as one-dimensional arrays, that is, d_0[0]~d_0[15] denotes d_0 and d_17[0]~d_17[15] denotes d_17. The structure of the to-be-filtered data buffer guarantees that the data it provides are those required by the filtering algorithm.
Specifically, assume that the distribution of the data to be filtered in the memory and in the to-be-filtered data buffer 30 is as shown in Fig. 8a and Fig. 8b, respectively. Then d_0[0]~d_0[15] = {0, 0, 0, ..., 0}, d_1[0]~d_1[15] = {0, 0, 0, ..., 0}, ..., d_17[0]~d_17[15] = {1, 2, 3, ..., 16}.
The comparator 70 records whether K = 18 multiplications have been performed so far. While the number of operations is less than K, the current result is written into the accumulation register 50 to await the next output of the multiplier 40, to which it is added by the accumulator 60. When the number of operations equals K, BS = 16 filtered outputs have been obtained; they are written into the memory 10, giving BS = 16 filtered results.
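The interplay of multiplier, accumulator, accumulation register, and comparator for one group of results can be sketched as a loop. The Python model below is an illustration under stated assumptions, not the circuit itself: the function name `run_group` is hypothetical, and a small case (K = 3, BS = 2) replaces K = 18, BS = 16 to keep the example short.

```python
def run_group(coefficients, groups):
    """Compute one group of BS results.

    coefficients: the K values in read order, i.e. H(K-1), ..., H(0).
    groups:       the K groups of BS data to be filtered, in read order.
    """
    k = len(coefficients)
    bs = len(groups[0])
    accumulation_register = [0] * bs          # intermediate data, initially 0
    for sequence_number in range(k):          # operation sequence number 0..K-1
        # Multiplier: BS parallel products of one coefficient with one group.
        products = [coefficients[sequence_number] * d for d in groups[sequence_number]]
        # Accumulator: add the products to the intermediate data.
        sums = [r + p for r, p in zip(accumulation_register, products)]
        # Comparator: the output is valid once K operations have been performed.
        output_valid = (sequence_number == k - 1)
        if output_valid:
            return sums                       # written to memory as filtered results
        accumulation_register = sums          # otherwise kept as intermediate data


# Hypothetical case: H(0), H(1), H(2) = 1, 2, 3 and X(0), X(1) = 1, 2.
results = run_group([3, 2, 1], [[0, 0], [0, 1], [1, 2]])
```

Here the groups are X(-2)~X(-1), X(-1)~X(0), and X(0)~X(1) with negative indices zeroed, so the two results are Y(0) = H(0)X(0) and Y(1) = H(0)X(1) + H(1)X(0).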
Figs. 10a to 10c depict the above process of obtaining BS results through K = 18 operations; they are in effect a hardware-level description of the K right-to-left accumulations in the expansion of the FIR filter algorithm.
Fig. 10a depicts the 1st multiply-accumulate. The value in the accumulation register 50 is 0 at this point; after accumulation, the products of the filter coefficient k_17 with each element of d_0 are placed in the accumulation register, completing the operations k_17·d_0[0], k_17·d_0[1], k_17·d_0[2], ..., k_17·d_0[15].
Fig. 10b depicts the 2nd multiply-accumulate. The multiplier 40 now outputs the products of k_16 with each element of d_1; the accumulator adds these to the previous accumulation results (the values currently in the accumulation register 50), obtaining k_17·d_0[0] + k_16·d_1[0], k_17·d_0[1] + k_16·d_1[1], k_17·d_0[2] + k_16·d_1[2], ..., k_17·d_0[15] + k_16·d_1[15], and places them in the accumulation register 50. The operations for the remaining coefficients k_15~k_1 with the data to be filtered d_2~d_16 are similar.
Fig. 10c depicts the 18th multiply-accumulate, in which the output of the multiplier 40 is again added to the contents of the accumulation register 50. After these 18 operations, BS = 16 results are obtained, equivalent to accumulating the following sequence of expressions:
k_17·d_0[0] + k_16·d_1[0] + … + k_2·d_15[0] + k_1·d_16[0] + k_0·d_17[0],
k_17·d_0[1] + k_16·d_1[1] + … + k_2·d_15[1] + k_1·d_16[1] + k_0·d_17[1],
k_17·d_0[2] + k_16·d_1[2] + … + k_2·d_15[2] + k_1·d_16[2] + k_0·d_17[2],
……,
k_17·d_0[15] + k_16·d_1[15] + … + k_2·d_15[15] + k_1·d_16[15] + k_0·d_17[15].
The above sequence gives the expressions for the filtered results Y(0)~Y(15).
If not all of the input data to be filtered have been filtered, the to-be-filtered data buffer 30 must be updated: each time a group of BS results has been computed, a shift signal is sent to the to-be-filtered data buffer, which shifts as a whole and fetches new values from the memory. The cooperative operation of the multiplier 40 and the accumulator 60 is then repeated until all of the data to be filtered have been processed.
It should be noted in particular that if the number of data to be filtered N_x is not divisible by BS, the overall operation process remains unchanged, but only N_m of the BS results produced by the last operation are valid filtered results, where N_m is the remainder of N_x divided by BS. When data are written back to the memory for the last time, only these N_m valid results are written back.
In each of the above embodiments K' = 2BS+1, where K' is the maximum number of filter coefficients that the filter served by this to-be-filtered data generator can support; the size of the buffer entity in the to-be-filtered data generator is therefore K'-1+2BS = 4BS. Those skilled in the art will appreciate that K' is not limited to 2BS+1. K' = BS+3 and K' = 3BS+2 are taken as examples below:
When K '=BS+3, embodiment is following: this moment, buffer sizes was at least 3BS+2.Still suppose simultaneously the arithmetic element size, i.e. BS=4, actual filter factor number is K=6, this moment K '=7, buffer size is at least 14.
The incipient stage of using; Present embodiment need carry out initialization earlier, and this moment, 201 work of initialization logic unit were 0 with the partially-initialized of counting buffer address 0~5; With the partially-initialized of address 6~13 is specific value; This value obtains from memory, can get BS=4 from memory at every turn and treat filtering data, need get altogether 2 times.Situation after buffer addressing and the initialization is shown in Figure 11 A.
After initialization finishes; Arithmetic unit promptly adopts the multiplier in shown in Figure 2 need read this buffer; Read logical block 330 and start working this moment; According to the computing sequence number of input, read enable signal and maximum filter factor number and the actual filter factor number signal supported, produce effective offset address, thereby read the filtering data of treating of tram.
In the present embodiment, treat that filtering data need read buffer memory K=6 time because every generations BS=4 is individual, thus the computing sequence number be a string 0,1,2,3,4, the cyclic code of 5}.When producing first group of BS=4 filtered data, effective offset address=1,2,3,4,5,6}.This effective offset address produces as shown in Figure 6.So the time preceding read for 6 times treat that the filtering data buffer returns treat filtering data be respectively 0,0,0,0}, 0,0,0,0}, 0,0,0,1}, 0,0,1,2}, 0,1,2,3}, 1,2,3,4}.
After the multipliers in Fig. 2 have operated K = 6 times, the first BS = 4 result values are obtained. The values in the buffer entity 310 must then be updated, i.e., new values are fetched from memory. At this point the shift signal output to the data-to-be-filtered buffer 30 in the FIR filter of Fig. 2 becomes valid, and the buffer is shifted as a whole by BS = 4. The data in the buffer after the shift are shown in Figure 11B.
At this point BS filtered output data have been produced and the buffer has been updated with the corresponding data to be filtered. The buffer then cooperates with the arithmetic unit to compute the next BS results; the buffer read and update mechanisms remain unchanged throughout.
When K' = 3BS+2, the embodiment is as follows. The buffer size is then at least 5BS+1. As before, assume an arithmetic unit size of BS = 4 and an actual number of filter coefficients K = 6; then K' = 14 and the buffer size is at least 21.
At the start of use, this embodiment must first be initialized. The initialization logic unit 201 operates at this point: it initializes buffer addresses 0 to 12 to 0 and initializes addresses 13 to 20 to specific values obtained from memory. Each memory access fetches BS = 4 data items to be filtered, so two accesses are needed in total. The buffer addressing and the state after initialization are shown in Figure 12A.
After initialization, the arithmetic unit, namely the multipliers shown in Fig. 2, needs to read this buffer. The read logic unit 330 then starts to work: based on the input operation sequence number, the read enable signal, the maximum supported number of filter coefficients and the actual number of filter coefficients, it generates an effective offset address so that the data to be filtered are read from the correct position.
In this embodiment, producing every BS = 4 filtered data items requires reading the buffer K = 6 times, so the operation sequence number is the cyclic code {0,1,2,3,4,5}. When the first group of BS = 4 filtered data items is produced, the effective offset addresses are {8,9,10,11,12,13}, generated as shown in Fig. 6. The first 6 reads of the data-to-be-filtered buffer therefore return {0,0,0,0}, {0,0,0,0}, {0,0,0,1}, {0,0,1,2}, {0,1,2,3} and {1,2,3,4}, respectively.
After the multipliers in Fig. 2 have operated K = 6 times, the first BS = 4 result values are obtained. The values in the buffer entity 310 must then be updated, i.e., new values are fetched from memory. At this point the shift signal output to the data-to-be-filtered buffer 30 in the FIR filter of Fig. 2 becomes valid, and the buffer is shifted as a whole by BS = 4. The data in the buffer after the shift are shown in Figure 12B.
At this point BS filtered output data have been produced and the buffer has been updated with the corresponding data to be filtered. The buffer then cooperates with the arithmetic unit to compute the next BS results; the buffer read and update mechanisms remain unchanged throughout.
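The complete read, accumulate, shift and update cycle described for these embodiments can be modelled in software and compared with a direct FIR computation. The Python sketch below is only a behavioural model under stated assumptions, not the hardware itself: it assumes zero initial history, an effective offset address of (K' - K) plus the operation sequence number, and the coefficient order H(K-1), ..., H(0) within each block. The names block_fir and direct_fir, and the sample and coefficient values, are hypothetical.

```python
def block_fir(x, h, bs, k_max):
    """Behavioural model of the buffered block FIR; returns len(x) outputs.

    Assumptions: zero initial history, read offset = (K' - K) + sequence
    number, and coefficient H(K-1) applied on the first read of each block.
    """
    k = len(h)
    n = len(x)
    # Pad the input with zeros so that every BS-wide fetch is complete.
    padded = list(x) + [0] * (3 * bs)
    # Initialization: K'-1 zeros, then two fetches of BS samples each.
    buf = [0] * (k_max - 1) + padded[:2 * bs]
    next_fetch = 2 * bs
    y = []
    for _ in range(0, n, bs):
        acc = [0] * bs
        for seq in range(k):                 # K buffer reads per block
            off = k_max - k + seq
            coeff = h[k - 1 - seq]
            for lane in range(bs):           # models BS parallel multipliers
                acc[lane] += coeff * buf[off + lane]
        y.extend(acc)
        # Shift the whole buffer by BS and append BS newly fetched samples.
        buf = buf[bs:] + padded[next_fetch:next_fetch + bs]
        next_fetch += bs
    return y[:n]


def direct_fir(x, h):
    """Reference: Y(i) = sum over m of H(m) * X(i - m), with X = 0 for i < m."""
    return [sum(h[m] * x[i - m] for m in range(len(h)) if i - m >= 0)
            for i in range(len(x))]


x = list(range(1, 11))        # ten samples with values 1..10
h = [1, 2, 3, 4, 5, 6]        # K = 6 illustrative coefficients
print(block_fir(x, h, bs=4, k_max=14) == direct_fir(x, h))
print(block_fir(x, h, bs=4, k_max=7) == direct_fir(x, h))
```

In this model the same K = 6 coefficients give identical outputs for K' = 14 and K' = 7, which illustrates that only the read offset depends on K'.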
It should be noted in particular that if the number of data items to be filtered, N_x, is not divisible by BS, the whole operation process remains unchanged, but among the BS results obtained in the last operation only N_m are valid filtered data, where N_m is the remainder of N_x divided by BS. When data are written back to memory for the last time, only these N_m valid results are written back.
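The number of valid results in the last operation can be illustrated by a small calculation. The Python sketch below is illustrative only, and the helper names are hypothetical. It returns the remainder of N_x divided by BS when N_x is not divisible by BS, and assumes that all BS results of the last operation are valid when N_x is divisible by BS.

```python
def block_count(n_x: int, bs: int) -> int:
    """Number of BS-wide operations needed to cover n_x data items."""
    return -(-n_x // bs)  # ceiling division


def last_block_valid_count(n_x: int, bs: int) -> int:
    """Number of valid results in the last operation.

    Assumption: when n_x is divisible by bs, all bs results are valid.
    """
    remainder = n_x % bs
    return remainder if remainder != 0 else bs


# Example with N_x = 10 and BS = 4: three operations, two valid results last.
print(block_count(10, 4), last_block_valid_count(10, 4))
```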
The filter described above can be applied in FPGA filter designs; alternatively, the behavior of this filter can be defined as a processor instruction and implemented inside an ASIC.
As can be seen from the above embodiments, the FIR filter of the present invention has the following beneficial effects:
1) High parallel-computation efficiency. The present invention uses BS parallel multiplication units and can perform BS multiplications simultaneously; the BS products are then added in parallel to the intermediate values of the corresponding registers in BS additions. This greatly improves operation efficiency and saves operation time. Moreover, all control signals are digital signals, which avoids the computational inaccuracy introduced by analog units such as delay units and improves operation precision;
2) Reconfigurability. The FIR filter of the present invention, which is based on multiply-accumulators, filters efficiently for any number of filter coefficients within a range of 2BS. Even when the number of filter coefficients changes, the hardware need not be modified as long as the number stays within the range of the buffer space, so good reconfigurability is achieved;
3) High utilization of the data to be filtered. Through analysis at the level of the whole algorithm, the present invention uses the structures of the data-to-be-filtered buffer 30 and the coefficient buffer 20, makes full use of the locality of the data to be filtered, and never loads the same data to be filtered twice. Each memory access yields BS results, which reduces the number of memory accesses, achieves the effect of "read once, compute many times", and in turn reduces the power consumption of the whole design.
The specific embodiments described above further explain the object, technical solution and beneficial effects of the present invention in detail. It should be understood that the above are merely specific embodiments of the present invention and are not intended to limit the present invention. Any modification, equivalent replacement, improvement and the like made within the spirit and principle of the present invention shall fall within the protection scope of the present invention.