Embodiment
To make the objectives, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below in conjunction with specific embodiments and with reference to the accompanying drawings. Although examples of parameters with particular values may be provided herein, it should be understood that a parameter need not exactly equal the corresponding value, but may approximate that value within an acceptable error margin or design constraint.
The applicant has found that the FIR filter algorithm of Formula 1 can be expanded into the following result:
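Formula 1 and its expansion are not reproduced in this text. As a reading aid only, and assuming that Formula 1 is the standard causal FIR sum (an assumption, though one consistent with the coefficient order H(K-1), ..., H(0) and the zero-padding rule described below), the expansion for the first BS outputs would take roughly the following form. It is written with H(0) leftmost, so that a right-to-left accumulation uses H(K-1) first; each column of terms sharing one coefficient plays the role of one dashed box:

$$
Y(n) = \sum_{i=0}^{K-1} H(i)\,X(n-i), \qquad X(m) = 0 \ \text{for}\ m < 0
$$

$$
\begin{aligned}
Y(0) &= H(0)X(0) + H(1)X(-1) + \dots + H(K-1)X(-K+1) \\
Y(1) &= H(0)X(1) + H(1)X(0) + \dots + H(K-1)X(-K+2) \\
&\;\;\vdots \\
Y(BS-1) &= H(0)X(BS-1) + H(1)X(BS-2) + \dots + H(K-1)X(BS-K)
\end{aligned}
$$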
In the expansion of the FIR filter algorithm shown above, the use of the filter coefficients follows these rules:
(1) Within the column covered by each dashed box, each multiplier operates on one filter coefficient and the BS data to be filtered in that dashed box, and these BS multiplications can be performed in parallel;
(2) The parallel results are then accumulated, proceeding from right to left as shown above; after K accumulations, the BS results Y(0)~Y(BS-1) are obtained. For every BS results produced, the filter coefficients H(K-1), H(K-2), H(K-3), ..., H(1), H(0) are used in turn.
The filter coefficient buffer therefore has two requirements: 1. each read returns one filter coefficient; 2. reads that are K reads apart return the same value. In the above expansion, the number of filter coefficients is K and the number of data operated on at a time is BS.
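The coefficient-usage rule above can be sketched in software as follows. This is only an illustrative model, assuming the standard FIR sum Y(n) = Σ H(i)·X(n-i); the function name `fir_block` and the treatment of indices beyond the end of the data as zero are assumptions made for the example, not part of the described hardware.

```python
def fir_block(h, x, n, bs):
    """Return Y(n*bs) .. Y(n*bs + bs - 1) using K accumulation steps.

    h  : filter coefficients H(0) .. H(K-1)
    x  : data to be filtered X(0) .. X(Nx-1)
    n  : index of the group of BS results
    bs : number of parallel lanes (BS)
    """
    k = len(h)
    acc = [0] * bs                        # one accumulator per parallel lane
    for i in range(k - 1, -1, -1):        # coefficient order H(K-1), ..., H(0)
        for lane in range(bs):            # these BS multiplications are independent
            idx = n * bs - i + lane       # data index X(nBS - i + lane)
            # Negative indices are zero (source rule); indices past the end
            # are also treated as zero (assumption for this sketch).
            sample = x[idx] if 0 <= idx < len(x) else 0
            acc[lane] += h[i] * sample
    return acc
```

Each pass of the outer loop uses exactly one coefficient and BS data, which is why one coefficient per read is all the coefficient buffer has to supply.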
In the expansion of the FIR filter algorithm shown above, the use of the data to be filtered follows these rules:
(1) For two horizontally adjacent dashed boxes, the data to be filtered that they use differ by only one element.
(2) Between two vertically adjacent computations of BS filtering results, BS data to be filtered are no longer used and BS new data to be filtered come into use. For example, when Y(BS)~Y(2BS-1) are calculated, the data X(-K+1)~X(BS-K) are no longer used, the data X(BS-K+1)~X(BS-1) continue to be used, and the data X(BS)~X(2BS-1) are newly used.
(3) Elements of the data vector X whose index is less than zero are replaced by zero; that is, X(-1), X(-2), ..., X(-K+1) in the above formula are zero.
Based on the above rules, a novel to-be-filtered data buffer can be designed that meets at least the following four requirements: 1. each read of the buffer returns BS data to be filtered; 2. for two adjacent reads of the buffer, the two returned sets of BS data differ by only one value; 3. each time K operations have been completed and BS filtering results produced, the buffer undergoes an overall update, which discards the BS data that came into use earliest and loads BS new data to be filtered from the memory; 4. the buffer provides a corresponding zero-padding region, in which the number of effective zeros is K-1.
Based on the above rules for the filter coefficients and the data to be filtered, the applicant proposes an FIR filter that improves the utilization of the filter coefficients and the data to be filtered, reduces the number of memory reads and writes, and increases the operation speed.
In an exemplary embodiment of the present invention, an FIR filter is proposed. Fig. 2 is a structural diagram of the FIR filter according to the embodiment of the present invention. As shown in Fig. 2, the FIR filter of this embodiment comprises: filter coefficient buffer 20, to-be-filtered data buffer 30, multiplier 40, accumulation register 50, accumulator 60, and comparator 70. The peripheral components of the FIR filter further include memory 10.
Filter coefficient buffer 20 prestores K filter coefficients; in the n-th execution cycle, on receiving a read enable signal, it provides one filter coefficient H(i).
To-be-filtered data buffer 30 prestores $N_x$ data to be filtered; in the n-th execution cycle, on receiving a read enable signal, it provides one group of data to be filtered comprising BS data: X(nBS-i), X(nBS-i+1), ..., X[(n+1)BS-i-1]. Where the index of an element of the data vector X is less than 0, that element is replaced with 0.
Multiplier 40 comprises BS parallel multiplication units; each multiplication unit is connected to both filter coefficient buffer 20 and to-be-filtered data buffer 30 and computes the product of the filter coefficient H(i) and one datum of the data group, yielding one product.
Accumulator 60, whose control terminal is connected to the comparator, comprises BS parallel accumulation units; each accumulation unit is connected to the corresponding multiplication unit and to the corresponding register unit in the accumulation register, and adds the current product from the multiplication unit to the intermediate data in the register unit. On receiving the output valid signal from the comparator, the accumulator outputs the BS accumulation results as the n-th group of filtering results; otherwise, it outputs the BS accumulation results as intermediate data to the corresponding register units in the accumulation register.
Accumulation register 50 comprises BS independent register units; the input and output of each register unit are connected to the output and input, respectively, of the corresponding accumulation unit, and the register units hold the intermediate data of the accumulation.
Comparator 70 compares the operation sequence number passed from multiplier 40 with the number of filter coefficients K. When the operation sequence number < K-1, it sends a read enable signal to the filter coefficient buffer and the to-be-filtered data buffer; when the operation sequence number = K-1, it sends an output valid signal to the accumulator. The operation sequence number may equally be passed to the comparator by accumulator 60 or accumulation register 50; the principle is the same and likewise falls within the protection scope of the present invention. Here the operation sequence number is the number of product operations or accumulation operations already completed during the computation of a given group of BS result data; it counts from 0 and has a maximum of K-1. Once the computation of the current group of BS result data is complete, the operation sequence number restarts from 0 for the computation of the next group of BS result data.
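The comparator's decision rule can be modelled in a few lines. The sketch below is a behavioural illustration only; the function names `comparator` and `next_op_seq` are hypothetical, and real hardware would realize this as combinational logic plus a counter rather than function calls.

```python
def comparator(op_seq, k):
    """Return (read_enable, output_valid) for one operation sequence number.

    op_seq counts completed operations within the current group, from 0 to K-1.
    """
    if not 0 <= op_seq <= k - 1:
        raise ValueError("operation sequence number must lie in 0..K-1")
    read_enable = op_seq < k - 1       # more reads are needed for this group
    output_valid = op_seq == k - 1     # the BS accumulated results are final
    return read_enable, output_valid


def next_op_seq(op_seq, k):
    """Advance the counter; it restarts from 0 once a group is complete."""
    return 0 if op_seq == k - 1 else op_seq + 1
```

For K=3 the signal pairs for sequence numbers 0, 1, 2 are (True, False), (True, False), (False, True).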
It should be noted that n takes the values 0, 1, 2, 3, ..., ceil($N_x$/BS)-1 in turn, where ceil denotes rounding up to the nearest integer; the groups of filtering results obtained for n = 0 to ceil($N_x$/BS)-1 together form the filtering result of the $N_x$ data to be filtered.
In addition, the principle for selecting data to be filtered and filter coefficients in this embodiment is that different read enable signals correspond to different filter coefficients and to different groups of data to be filtered. In general, the filter coefficients and data groups may be selected one by one in any order, provided that the K operations do not repeat one another. For the expansion of the FIR filter algorithm shown above, there are two customary orders of computation: successively from right to left, or successively from left to right. Preferably, the expansion is computed from right to left, with i taking the values K-1, K-2, ..., 1, 0 in turn (described in detail below).
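To make the selection rule concrete, the sketch below builds the data group X(nBS-i), ..., X[(n+1)BS-i-1] for a given execution cycle n and coefficient index i, and lists the preferred right-to-left order of i. The helper names `data_group` and `read_schedule` are hypothetical, and indices beyond the end of the data are treated as zero as an assumption of this example.

```python
def data_group(x, n, i, bs):
    """BS data paired with H(i) in execution cycle n: X(n*bs - i) .. X((n+1)*bs - i - 1)."""
    start = n * bs - i
    # Indices below zero are replaced by 0 (source rule); indices past the
    # end of x are also replaced by 0 (assumption for this sketch).
    return [x[m] if 0 <= m < len(x) else 0 for m in range(start, start + bs)]


def read_schedule(k):
    """Preferred order of the coefficient index i: K-1, K-2, ..., 1, 0."""
    return list(range(k - 1, -1, -1))
```

With BS=4, K=6 and data 1, 2, 3, ..., the first cycle yields the groups {0,0,0,0}, {0,0,0,0}, {0,0,0,1}, {0,0,1,2}, {0,1,2,3}, {1,2,3,4}, each differing from its neighbour by one element.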
For a clearer understanding of this embodiment, the generation and output of its timing control signals are now described. As shown in Fig. 2, the output valid signal of the comparator indicates, on the one hand, that the current BS filtering results are complete, i.e. that the current output of accumulator 60 is valid, and, on the other hand, that the shift signal currently output to to-be-filtered data buffer 30 is valid (described in detail below). In addition, the read enable signal output by multiplier 40 to filter coefficient buffer 20 indicates that multiplier 40 currently needs to read filter coefficient buffer 20, and the read enable signal output by multiplier 40 to to-be-filtered data buffer 30 indicates that multiplier 40 currently needs to read to-be-filtered data buffer 30. The read enable signals may also be issued to the filter coefficient buffer and the to-be-filtered data buffer by the comparator or the accumulator; that is, one of the comparator, the multiplier, and the accumulator is further configured to send the read enable signal to the filter coefficient providing module and the to-be-filtered data providing module when the operation sequence number < K-1.
As can be seen from this embodiment, unlike FIR filters of the prior art, the present invention uses BS parallel multiplication units to perform BS multiplications simultaneously, and at the same time adds the BS products in parallel to the corresponding intermediate values in the register in BS additions. This greatly improves operation efficiency and saves operation time. Moreover, all control signals are digital, which avoids the computational inaccuracy introduced by analog components such as delay units and improves operation precision.
With the parallel multiplication units and the associated adders and registers of this embodiment, BS operations can be performed in parallel, each parallel step being one part of the computation that produces BS filtering results. This form of parallelism makes the number of arithmetic units independent of the number of filter coefficients: when the number of filter coefficients increases, the parallel multiplication units and the associated adders and registers are simply reused, which gives a high degree of flexibility.
The FIR filter of this embodiment uses a filter coefficient buffer and a to-be-filtered data buffer. Other storage devices capable of providing the data to be filtered, such as registers, may also be used. To minimize memory reads and improve operation speed and efficiency, for the filter coefficients the present invention combines a memory with a buffer: the filter coefficients are stored from the memory into the buffer in order, and one filter coefficient is output at a time with a period of K, the number of filter coefficients. The specific technical solution is as follows:
In this embodiment, filter coefficient buffer 20 reads the filter coefficients from memory 10 and places them in the buffer entity; for each filter operation, one filter coefficient H(i) is read in order. Fig. 3 is a schematic diagram of the filter coefficient buffer in the FIR filter according to the embodiment of the present invention. As shown in Fig. 3, filter coefficient buffer 20 comprises a buffer entity 201, a read logic unit 202, an initialization logic unit 203, and a shift logic unit 204, wherein:
Buffer entity 201 caches the filter coefficients used in the computation. Its size is related to the size (BS) of the arithmetic unit; here the number of cache cells is 2BS+1, addressed 0~2BS from top to bottom.
Read logic unit 202 handles read operations on the filter coefficient buffer, returning one filter coefficient at a time, in order, with a period of K (the number of filter coefficients).
Initialization logic unit 203 initializes the buffer entity, assigning values to buffer entity 201 by accessing the memory.
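The behaviour of the units described above can be illustrated with a small software model. This is a sketch under stated assumptions: the class name `CoefficientBuffer`, its method names, and the layout that places H(i) at address i are hypothetical choices for the example; the text only fixes the capacity (2BS+1), the one-coefficient-per-read rule, and the period K.

```python
class CoefficientBuffer:
    """Hypothetical behavioural model of filter coefficient buffer 20."""

    def __init__(self, bs):
        self.capacity = 2 * bs + 1            # 2BS+1 cells, as in this embodiment
        self.cells = [0] * self.capacity
        self.k = 0                            # actual number of coefficients
        self.reads = 0                        # reads performed since initialization

    def initialize(self, coeffs):
        """Load H(0)..H(K-1) from memory; storing H(i) at address i is an assumption."""
        coeffs = list(coeffs)
        if not 1 <= len(coeffs) <= self.capacity:
            raise ValueError("K must be between 1 and 2BS+1")
        self.k = len(coeffs)
        self.cells = coeffs + [0] * (self.capacity - self.k)
        self.reads = 0

    def read(self):
        """Return one coefficient per call, in the order H(K-1), ..., H(0), repeating."""
        if self.k == 0:
            raise RuntimeError("buffer has not been initialized")
        i = self.k - 1 - (self.reads % self.k)
        self.reads += 1
        return self.cells[i]
```

Because the read index depends only on the read count modulo K, two reads that are K reads apart always return the same value, which is the second requirement placed on this buffer.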
As can be seen from the structure of filter coefficient buffer 20 described above, there is no necessary relation between the number of arithmetic units BS and the number of filter coefficients K. With BS fixed, if K changes there is no need to change the number or size of the internal arithmetic units; as long as K lies within the storage capacity of the filter coefficient buffer, the needs of the computation are fully met. This high degree of flexibility gives the present invention good reconfigurability. With a filter coefficient buffer of size 2BS+1, filtering operations with up to 2BS+1 filter coefficients can be accommodated. For ease of understanding, in the description below the maximum number of filter coefficients that the filter coefficient buffer can support is denoted K'.
In this embodiment, the filter coefficient buffer is designed to meet the filter's requirements on the filter coefficients during operation; its function is to provide filter coefficients, and it may be implemented in various ways. For example, it may be implemented with 2BS+1 registers, which requires no additional logic overhead, but the multiplier must then control, for each product operation, which register the required coefficient is taken from.
Further, to improve the utilization efficiency of the data to be filtered, the present invention also improves the component that provides the data to be filtered. In this embodiment, to-be-filtered data buffer 30 reads groups of data to be filtered from memory 10 and places them in the data buffer entity; for each filter operation, a data group $d_k$ is read, where $d_k$ is a vector [$d_k$[0], $d_k$[1], ..., $d_k$[BS-1]]. In the expansion of the FIR filter algorithm, elements of $d_k$ whose index in the vector X is less than 0, such as X[-1], are all replaced with 0. The size of this buffer is 4BS, where BS is the operation size of the multiplier, i.e. the number of multiplications that can be performed simultaneously.
Fig. 4 is a schematic diagram of the to-be-filtered data buffer in the FIR filter according to the embodiment of the present invention. As shown in Fig. 4, the to-be-filtered data buffer comprises a data buffer entity 310, an update logic unit 320, and a read logic unit 330, wherein:
Buffer entity 310 caches the data to be filtered used in the computation. Its size is related to the size (BS) required by the arithmetic unit; here the size is 4BS, addressed 0~4BS-1.
Read logic unit 330 reads the data to be filtered from the buffer. Its input signals are the read enable signal, the operation sequence number, and the number of filter coefficients; its outputs include the effective offset address sent to buffer entity 310, by which BS data to be filtered are obtained, and those data are then output.
The effective offset address is generated from the operation sequence number, the read enable signal, and the number of filter coefficients. The relation is: when the read enable signal is valid, effective offset address = K' - K + operation sequence number. The operation sequence number is the read count sent by the arithmetic unit (multiplication unit) in the order in which it reads the buffer; it starts at 0 and increases by 1 each time. In this embodiment, the size of the filter coefficient buffer entity is 2BS+1, so effective offset address = 2BS+1 - K + operation sequence number.
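The address relation can be written as a one-line function. The sketch below is illustrative only; the name `effective_offset` and the choice to return `None` when the read enable signal is inactive are assumptions of the example.

```python
def effective_offset(k_max, k, op_seq, read_enable=True):
    """Effective offset address = K' - K + operation sequence number.

    k_max  : maximum supported number of filter coefficients (K')
    k      : actual number of filter coefficients (K)
    op_seq : operation sequence number, 0 .. K-1
    """
    if not read_enable:
        return None                      # no address is produced without a read enable
    if not 1 <= k <= k_max:
        raise ValueError("K must satisfy 1 <= K <= K'")
    if not 0 <= op_seq < k:
        raise ValueError("operation sequence number must lie in 0..K-1")
    return k_max - k + op_seq
```

For BS=4 (so K'=2BS+1=9) and K=6, the six reads of one group use offsets 3, 4, 5, 6, 7, 8.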
Update logic unit 320 updates the data to be filtered held in the buffer entity; it comprises an initialization logic unit 321 and a shift logic unit 322, wherein:
Initialization logic unit 321 performs the preparatory operation on buffer entity 310 before the arithmetic unit starts working. It initializes the buffer entity: the part at addresses 0~2BS-1 is initialized to 0, and the part at addresses 2BS~4BS-1 is initialized with values from the memory.
Shift logic unit 322 shifts the data to be filtered in the buffer entity as a whole; it starts working when the input shift signal is valid.
First, suppose that the maximum number of filter coefficients supported by the one-dimensional filter with which the to-be-filtered data buffer cooperates is K'=2BS+1, so that the size of the data buffer entity is at least 4BS. Suppose further that the arithmetic unit size, i.e. the number of simultaneous operations, is BS=4 and that the actual number of filter coefficients is K=6.
Under these assumptions, K'=9, the size of the data buffer entity is ≥4BS=16, and K=6. The invention is further described below, first taking a data buffer entity of size 4BS=16 as an example.
At the start of use, initialization is required. Initialization logic unit 321 operates at this point: the part of the data buffer at addresses 0~7 is initialized to 0, and the part at addresses 8~15 is initialized to specific values obtained from the memory. BS=4 data to be filtered can be fetched from the memory at a time, so 2 fetches are needed in total. The buffer addressing and the state after initialization are shown in Fig. 5a.
After initialization, the multiplier of the arithmetic unit shown in Fig. 2 needs to read the buffer. Read logic unit 330 now starts working: from the input operation sequence number, the read enable signal, the maximum supported number of filter coefficients, and the actual number of filter coefficients, it generates the effective offset address and thereby reads the data to be filtered at the correct position.
In this embodiment, every BS=4 filtering results require K=6 reads of the buffer, so the operation sequence number cycles through {0,1,2,3,4,5}. While the first group of BS=4 filtering results is being produced, the effective offset address takes the values {3,4,5,6,7,8}; the generation of the effective offset address is shown in Fig. 6. The first 6 reads of the to-be-filtered data buffer therefore return {0,0,0,0}, {0,0,0,0}, {0,0,0,1}, {0,0,1,2}, {0,1,2,3}, and {1,2,3,4}, respectively.
After the multiplier in Fig. 2 has operated K=6 times, the first BS=4 results are obtained; the values in buffer entity 310 must then be updated with values fetched from the memory. At this point the shift signal output to to-be-filtered data buffer 30 in the FIR filter of Fig. 2 is valid, and the buffer is shifted as a whole by BS=4. The data in the buffer after the shift are shown in Fig. 5b.
BS filtering outputs have now been obtained and the buffer has updated its data accordingly, so that it can work with the arithmetic unit to compute the next BS result data; during this process, the buffer read and update mechanisms remain unchanged.
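The initialization, read, and shift steps of this walk-through can be reproduced with a small software model. It is a sketch only: the class name `DataBuffer`, its methods, and the zero fill used when the memory runs out are assumptions of the example, while the zero region of K'-1 cells, the offset K'-K+sequence number, and the shift by BS follow the description above.

```python
class DataBuffer:
    """Hypothetical behavioural model of to-be-filtered data buffer 30."""

    def __init__(self, bs, k_max, size=None):
        self.bs = bs
        self.k_max = k_max                       # K'
        # Default size K'-1+2BS; this equals 4BS when K' = 2BS+1.
        self.size = size if size is not None else k_max - 1 + 2 * bs
        self.cells = [0] * self.size
        self.next_index = 0                      # next memory index to load

    def _fetch(self, memory, count):
        chunk = list(memory[self.next_index:self.next_index + count])
        self.next_index += count
        return chunk + [0] * (count - len(chunk))   # zero fill past the end (assumption)

    def initialize(self, memory):
        """Zero the first K'-1 cells and fill the rest from memory."""
        self.next_index = 0
        zeros = self.k_max - 1
        self.cells = [0] * zeros + self._fetch(memory, self.size - zeros)

    def read(self, k, op_seq):
        """Return the BS data at the effective offset K' - K + op_seq."""
        offset = self.k_max - k + op_seq
        return self.cells[offset:offset + self.bs]

    def shift(self, memory):
        """Discard the BS oldest cells and append BS new data from memory."""
        self.cells = self.cells[self.bs:] + self._fetch(memory, self.bs)


memory = list(range(1, 33))          # data to be filtered: 1, 2, 3, ...
buf = DataBuffer(bs=4, k_max=9)      # 4BS = 16 cells
buf.initialize(memory)
first_reads = [buf.read(6, s) for s in range(6)]
```

Passing `size=20` models the 5BS variant; the reads are unchanged because the offsets depend only on K', K, and the sequence number.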
The embodiment above is the case in which the buffer size is 4BS; in fact the buffer is not limited to 4BS. Under the premise that K'=2BS+1, the invention is described in detail below taking a buffer entity 310 of size 5BS as an example.
When the size of buffer entity 310 is 5BS, the distribution of the data to be filtered in memory is again as shown in Fig. 5a, the actual number of filter coefficients (less than or equal to 2BS) is K=6, and the buffer entity is addressed 0~19.
Fig. 7a shows the data to be filtered in the buffer entity after initialization when the size of the to-be-filtered data buffer is 5BS. Compared with a buffer of size 4BS, BS=4 more data to be filtered are cached after initialization. This does not affect read logic unit 330 of the present invention, which likewise generates the effective offset address from the input operation sequence number, read enable signal, and number of filter coefficients, and thereby reads the data to be filtered at the correct position.
Likewise, while the first group of BS=4 filtering results is being produced, the effective offset address takes the values {3,4,5,6,7,8}; the generation of the effective offset address is shown in Fig. 6. The first 6 reads of the to-be-filtered data buffer therefore return {0,0,0,0}, {0,0,0,0}, {0,0,0,1}, {0,0,1,2}, {0,1,2,3}, and {1,2,3,4}, respectively.
After the multiplier in Fig. 2 has operated K=6 times, the first BS=4 results are obtained; the values in data buffer entity 310 must then be updated with values fetched from the memory. At this point the shift signal output to to-be-filtered data buffer 30 in Fig. 2 is valid, and the buffer is shifted as a whole by BS=4. The data in the buffer after the shift are shown in Fig. 7b.
BS filtering outputs have now been obtained and the buffer has updated its data accordingly, so that it can work with the arithmetic unit to compute the next BS result data; during this process, the buffer read and update mechanisms remain unchanged.
As can be seen from the structure of to-be-filtered data buffer 30 described above, the design takes full account of the data reuse inherent in the algorithm. After a given number of data have been loaded from the memory, they are used exhaustively: as the algorithm requires, the to-be-filtered data buffer is updated only after every operation that uses that portion of the data has been completed, and that portion is never loaded again for the rest of the algorithm. This improves the utilization of the data to be filtered, reduces the number of memory accesses, relieves the heavy demand on the processor's data-supply capability when it executes a one-dimensional filtering algorithm, and thereby reduces the power consumption of the whole design.
In this embodiment, the function of the to-be-filtered data buffer can be simplified by handing some of its tasks over to the arithmetic unit (multiplication unit or accumulation unit); for example, the read logic of the data buffer can be simplified by handing over this part, the generation of the effective address. The initialization logic and the update logic, however, are indispensable. In short, this buffer is designed to meet the data requirements of the filter of the present invention. The filter's requirements on the data can be regarded as a requirement specification, and this to-be-filtered data buffer is one possible implementation of it. Any other sequential logic that implements all or part of this specification can be regarded as the to-be-filtered data buffer of the present invention.
In addition, for applications with different numbers of filter coefficients, the to-be-filtered data buffer of this embodiment pads zeros automatically according to the needs of the algorithm. This greatly simplifies algorithm implementation for programmers, who need not consider performing the appropriate number of zero-padding operations for each coefficient length.
Of course, other ways of providing the data to be filtered may also be adopted. For example, the to-be-filtered data buffer may be divided into BS sub-buffers, with the BS data to be filtered corresponding to each filter coefficient in the n-th execution cycle all stored in one of the BS sub-buffers; for each multiplication, all the data to be filtered are read from a sub-buffer into the multiplier. This way of providing data, however, consumes more memory and has lower read efficiency.
The invention is described in detail below through a concrete scenario. In this embodiment, the operation size of the multiplier is BS=16, i.e. the multiplier can multiply 16 pairs of data at a time. Correspondingly, the size of the filter coefficient buffer is 2*BS+1=33 data widths, the size of the to-be-filtered data buffer is 4*BS=64 data widths, the multiplier has 16 parallel multiplication units, and the accumulator has 16 parallel accumulation units. The width of the accumulation register is BS=16 data widths. The data type is not restricted here and may be 64-bit, 32-bit, 16-bit, or 8-bit.
In fact, other algorithms in the digital signal field, such as convolution and correlation, have the same operating characteristic as the FIR filter: data are multiplied and then accumulated to produce a result. The filter provided by the present invention, together with its coefficient buffer and data buffer, can equally be applied to implementing such algorithms.
Suppose the number of filter coefficients of the filter is K=18 and that their distribution in memory is as shown in Fig. 8a. Before computation starts, filter coefficient buffer 20 and to-be-filtered data buffer 30 must be initialized. After initialization, filter coefficient buffer 20 holds the 18 filter coefficients required for filtering, as shown in Fig. 8b.
After the buffers have been initialized, the multiplier starts operating on the corresponding data to be filtered. To make full use of the data in the to-be-filtered data buffer, K=18 operations are performed for each set of valid results obtained, and each operation reads filter coefficient buffer 20 and to-be-filtered data buffer 30 once.
Each read of filter coefficient buffer 20 by multiplier 40 returns one datum. Suppose the K=18 reads of filter coefficient buffer 20 return $k_{17}$, $k_{16}$, ..., $k_2$, $k_1$, $k_0$, where the width of $k_0$ is the width of one datum. $k_{17}$, $k_{16}$, ..., $k_2$, $k_1$, $k_0$ correspond respectively to H(K-1), H(K-2), ..., H(2), H(1), H(0) in the expansion of the FIR filter algorithm, with K=18.
Each read of to-be-filtered data buffer 30 returns BS=16 data to be filtered. The K=18 reads of to-be-filtered data buffer 30 by multiplier 40 return $d_0$, $d_1$, $d_2$, ..., $d_{17}$, which correspond respectively to the data to be filtered in the K=18 boxes, from right to left, in the upper half of the expansion of the FIR filter algorithm. The width of each of $d_0$~$d_{17}$ is therefore the width of BS=16 data to be filtered. In the description below they are all written as one-dimensional arrays, i.e. $d_0$[0]~$d_0$[15] denotes $d_0$ and $d_{17}$[0]~$d_{17}$[15] denotes $d_{17}$. The structure of the to-be-filtered data buffer ensures that the data it provides are the data the filtering algorithm requires.
Specifically, suppose the distributions of the data to be filtered in the memory and in to-be-filtered data buffer 30 are as shown in 8a and 8b, respectively. Then $d_0$[0]~$d_0$[15] = {0,0,0,...,0}, $d_1$[0]~$d_1$[15] = {0,0,0,...,0}, ..., $d_{17}$[0]~$d_{17}$[15] = {1,2,3,...,16}.
Comparator 70 records whether K=18 multiplications have currently been performed. While the number of operations is below K, the current result is written to accumulation register 50 to await the next output of multiplier 40 and be added to it by accumulator 60. When the number of operations equals K, the BS=16 filtering outputs have been obtained and are written to memory 10, giving BS=16 filtering results.
Figures 10a to 10c illustrate the above process of obtaining BS results through K=18 operations; in effect, they describe a hardware implementation of the K right-to-left accumulations of the expanded FIR filter algorithm.
Figure 10a shows the 1st multiply-accumulate. The value in accumulation register 50 is 0 at this point; after the accumulation, the products of filter coefficient $k_{17}$ with each element of $d_0$ are placed in the accumulation register, completing the operations $k_{17}d_0[0]$, $k_{17}d_0[1]$, $k_{17}d_0[2]$, ..., $k_{17}d_0[15]$.
Figure 10b shows the 2nd multiply-accumulate. The output of multiplier 40 is now the product of $k_{16}$ with each element of $d_1$; the accumulator adds this to the previous accumulation result (the current value in accumulation register 50), obtaining $k_{17}d_0[0]+k_{16}d_1[0]$, $k_{17}d_0[1]+k_{16}d_1[1]$, $k_{17}d_0[2]+k_{16}d_1[2]$, ..., $k_{17}d_0[15]+k_{16}d_1[15]$, which are placed in accumulation register 50. The operations for the remaining coefficients $k_{15}$~$k_1$ and data $d_2$~$d_{16}$ are similar.
Figure 10c shows the 18th multiply-accumulate, in which the output of multiplier 40 is again added to the content of accumulation register 50. The 18 operations above yield BS=16 results, equivalent to accumulating the following sequence of expressions:
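$$
\begin{aligned}
& k_{17}d_0[0] + k_{16}d_1[0] + \dots + k_2 d_{15}[0] + k_1 d_{16}[0] + k_0 d_{17}[0], \\
& k_{17}d_0[1] + k_{16}d_1[1] + \dots + k_2 d_{15}[1] + k_1 d_{16}[1] + k_0 d_{17}[1], \\
& k_{17}d_0[2] + k_{16}d_1[2] + \dots + k_2 d_{15}[2] + k_1 d_{16}[2] + k_0 d_{17}[2], \\
& \qquad\vdots \\
& k_{17}d_0[15] + k_{16}d_1[15] + \dots + k_2 d_{15}[15] + k_1 d_{16}[15] + k_0 d_{17}[15].
\end{aligned}
$$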
The above sequence gives the expressions for the filtering results Y(0)~Y(15).
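The 18 multiply-accumulate steps of this scenario can be traced in software, recording the accumulation register after each step. The sketch below is illustrative: the coefficient values are hypothetical (the text gives none), the data are taken as 1, 2, ..., 16 to match the values of the last data group stated above, and the function name `mac_trace` is an assumption.

```python
def mac_trace(h, x, bs):
    """Return the accumulation-register contents after each of the K steps.

    Step j (0-based) multiplies coefficient h[K-1-j] with data group d_j,
    where d_j holds X(j-(K-1)) .. X(j-(K-1)+bs-1) and out-of-range data are 0.
    """
    k = len(h)
    register = [0] * bs                         # accumulation register starts at 0
    snapshots = []
    for j in range(k):
        coeff = h[k - 1 - j]                    # k_17 first, k_0 last
        start = j - (k - 1)
        d_j = [x[m] if 0 <= m < len(x) else 0 for m in range(start, start + bs)]
        register = [r + coeff * d for r, d in zip(register, d_j)]
        snapshots.append(list(register))
    return snapshots


K, BS = 18, 16
h = [j + 1 for j in range(K)]                   # hypothetical coefficients H(0)..H(17)
x = list(range(1, BS + 1))                      # X(0)=1, ..., X(15)=16
snaps = mac_trace(h, x, BS)
y = snaps[-1]                                   # Y(0) .. Y(15)
```

The first two snapshots are all zeros because the first two data groups fall entirely in the zero-padded region; the final snapshot equals the directly computed Y(0)~Y(15).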
If not all of the input data to be filtered have been filtered, to-be-filtered data buffer 30 must be updated: each time the computation for BS data is complete, a shift signal is sent to the to-be-filtered data buffer, which shifts as a whole and fetches new values from the memory. The cooperative operation of multiplier 40 and accumulator 60 is then repeated until all the data to be filtered have been processed.
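Putting the pieces together, the repeated compute-then-shift loop can be sketched end to end. This is a behavioural model under assumptions, not the hardware itself: the function name `fir_stream`, the use of a Python list as the buffer, the zero fill once the memory is exhausted, and the truncation of the output to the input length are all choices made for the example.

```python
import math


def fir_stream(h, x, bs, k_max):
    """Filter all of x in groups of BS results using a shifting data buffer."""
    k = len(h)
    if not 1 <= k <= k_max:
        raise ValueError("K must satisfy 1 <= K <= K'")

    def fetch(start, count):
        chunk = list(x[start:start + count])
        return chunk + [0] * (count - len(chunk))   # zero fill past the end (assumption)

    buf = [0] * (k_max - 1) + fetch(0, 2 * bs)      # K'-1 zeros, then 2BS data
    loaded = 2 * bs
    out = []
    for _ in range(math.ceil(len(x) / bs)):         # one pass per group of BS results
        acc = [0] * bs
        for op_seq in range(k):                     # K multiply-accumulate steps
            coeff = h[k - 1 - op_seq]               # H(K-1) first, H(0) last
            offset = k_max - k + op_seq             # effective offset address
            lanes = buf[offset:offset + bs]
            acc = [a + coeff * d for a, d in zip(acc, lanes)]
        out.extend(acc)
        buf = buf[bs:] + fetch(loaded, bs)          # shift by BS and reload
        loaded += bs
    return out[:len(x)]                             # keep only the valid results
```

Each datum is fetched from memory exactly once, which is the point of deferring the shift until all K operations on the current buffer contents are done.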
Note in particular that if the number of data to be filtered, $N_x$, is not divisible by BS, the overall operation is unchanged, but only $N_m$ of the BS results produced by the final operation are valid filtering results, where $N_m$ is the remainder of $N_x$ divided by BS. When the data are written back to memory for the last time, only these $N_m$ valid results are written back.
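The group count and the number of valid results in the last group follow from simple integer arithmetic, sketched below. The function name `group_plan` is hypothetical, and the handling of the exactly divisible case (all BS results valid) is an assumption, since the text only describes the non-divisible case.

```python
import math


def group_plan(n_x, bs):
    """Return (number of groups, valid results in the last group)."""
    if n_x <= 0 or bs <= 0:
        raise ValueError("n_x and bs must be positive")
    groups = math.ceil(n_x / bs)          # n runs over 0 .. ceil(Nx/BS)-1
    remainder = n_x % bs                  # N_m when Nx is not divisible by BS
    valid_in_last = remainder if remainder else bs
    return groups, valid_in_last
```

For example, 10 data with BS=4 need 3 groups, and only 2 results of the last group are written back.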
In each of the embodiments above, K'=2BS+1, where K' is the maximum number of filter coefficients supported by the filter to which the to-be-filtered data providing device is applied; the size of the buffer in the to-be-filtered data providing device is therefore K'-1+2BS=4BS. Those skilled in the art will appreciate that K' is not limited to 2BS+1. The cases K'=BS+3 and K'=3BS+2 are described below as examples:
When K'=BS+3, the embodiment is as follows. The buffer size is now at least 3BS+2. Again suppose the arithmetic unit size, i.e. the number of simultaneous operations, is BS=4 and the actual number of filter coefficients is K=6; then K'=7 and the buffer size is at least 14.
At the start of use, initialization is required. Initialization logic unit 321 operates at this point: the part of the data buffer at addresses 0~5 is initialized to 0, and the part at addresses 6~13 is initialized to specific values obtained from the memory. BS=4 data to be filtered can be fetched from the memory at a time, so 2 fetches are needed in total. The buffer addressing and the state after initialization are shown in Fig. 11A.
After initialization, the multiplier of the arithmetic unit shown in Fig. 2 needs to read the buffer. Read logic unit 330 now starts working: from the input operation sequence number, the read enable signal, the maximum supported number of filter coefficients, and the actual number of filter coefficients, it generates the effective offset address and thereby reads the data to be filtered at the correct position.
In this embodiment, every BS=4 filtering results require K=6 reads of the buffer, so the operation sequence number cycles through {0,1,2,3,4,5}. While the first group of BS=4 filtering results is being produced, the effective offset address takes the values {1,2,3,4,5,6}; the generation of the effective offset address is shown in Fig. 6. The first 6 reads of the to-be-filtered data buffer therefore return {0,0,0,0}, {0,0,0,0}, {0,0,0,1}, {0,0,1,2}, {0,1,2,3}, and {1,2,3,4}, respectively.
After the multiplier in Fig. 2 has operated K=6 times, the first BS=4 results are obtained; the values in buffer entity 310 must then be updated with values fetched from the memory. At this point the shift signal output to to-be-filtered data buffer 30 in the FIR filter of Fig. 2 is valid, and the buffer is shifted as a whole by BS=4. The data in the buffer after the shift are shown in Fig. 11B.
BS filtering outputs have now been obtained and the buffer has updated its data accordingly, so that it can work with the arithmetic unit to compute the next BS result data; during this process, the buffer read and update mechanisms remain unchanged.
When K'=3BS+2, the embodiment is as follows. The buffer size is now at least 5BS+1. Again suppose the arithmetic unit size, i.e. the number of simultaneous operations, is BS=4 and the actual number of filter coefficients is K=6; then K'=14 and the buffer size is at least 21.
At the start of use, initialization is required. Initialization logic unit 321 operates at this point: the part of the data buffer at addresses 0~12 is initialized to 0, and the part at addresses 13~20 is initialized to specific values obtained from the memory. BS=4 data to be filtered can be fetched from the memory at a time, so 2 fetches are needed in total. The buffer addressing and the state after initialization are shown in Fig. 12A.
After initialization, the multipliers of the arithmetic unit shown in Fig. 2 need to read this buffer. The read logic unit 330 then starts working: from the input operation sequence number, the read enable signal, and the signals giving the maximum supported number of filter coefficients and the actual number of filter coefficients, it generates an effective offset address, so that the data to be filtered are read from the correct position.
In this embodiment, producing each group of BS=4 filtering results requires reading the buffer K=6 times, so the operation sequence number is the cyclic sequence {0,1,2,3,4,5}. While the first group of BS=4 filtering results is being produced, the effective offset address = {8,9,10,11,12,13}. The generation of this effective offset address is shown in Fig. 6. The data to be filtered returned by the data buffer in these first 6 reads are therefore {0,0,0,0}, {0,0,0,0}, {0,0,0,1}, {0,0,1,2}, {0,1,2,3}, {1,2,3,4}, respectively.
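The read pattern for K'=14 can be checked with a short sketch. It rests on two assumptions inferred from the values listed in this embodiment rather than taken from Fig. 6: the effective offset address equals (K' - K) plus the operation sequence number, and one read returns BS consecutive buffer entries starting at that address. The names effective_offset and read_window are illustrative only, and the sample values 1..8 stand in for the data fetched from memory.

```python
BS = 4               # number of parallel multipliers
K = 6                # actual number of filter coefficients
K_MAX = 3 * BS + 2   # K' = 14, maximum supported number of coefficients

# Buffer after initialization: addresses 0..12 hold zeros, addresses 13..20
# hold the first 2*BS samples (illustrative values 1..8).
buffer = [0] * (K_MAX - 1) + list(range(1, 2 * BS + 1))

def effective_offset(seq):
    # Assumed mapping, inferred from the listed addresses {8, ..., 13}.
    return (K_MAX - K) + seq

def read_window(buf, offset):
    # Assumed read behaviour: BS consecutive entries starting at `offset`.
    return buf[offset:offset + BS]

offsets = [effective_offset(seq) for seq in range(K)]
windows = [read_window(buffer, off) for off in offsets]

for off, win in zip(offsets, windows):
    print(off, win)
```

Under these assumptions the six reads slide a BS-wide window one address at a time across the boundary between the zero-initialized part and the part loaded from memory.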
After the multipliers in Fig. 2 have operated K=6 times, the first BS=4 result values are obtained. The values in buffer entity 310 then need to be updated with values fetched from memory. At this point the shift signal output to the data buffer 30 in the FIR filter of Fig. 2 is asserted, and the whole buffer is shifted by BS=4. The data to be filtered in the buffer after the shift are shown in Fig. 12B.
At this point BS filtering output data have been obtained and the buffer has been updated with the corresponding data to be filtered, so that the arithmetic unit can compute the next BS result data. During this process the mechanisms for reading and updating the buffer remain unchanged.
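A minimal software sketch of the read, accumulate, and shift cycle for K'=14 follows. It is a behavioural model under stated assumptions, not the hardware itself: a read is assumed to return BS consecutive entries starting at offset (K' - K) + sequence number; the shift is assumed to move every entry BS addresses toward address 0 while the next BS samples from memory enter at the highest addresses; and the coefficients are consumed in the order H(K-1), ..., H(0). The function names fir_blocked and fir_direct are hypothetical, and the input is zero-padded so that memory always delivers whole blocks of BS samples.

```python
BS = 4               # number of parallel multipliers
K_MAX = 3 * BS + 2   # K' = 14, so the buffer holds 5*BS + 1 = 21 entries

def fir_direct(x, h):
    # Reference: Y(n) = sum_k H(k) * X(n - k), with X(n) = 0 for n < 0.
    return [sum(h[k] * x[n - k] for k in range(len(h)) if n - k >= 0)
            for n in range(len(x))]

def fir_blocked(x, h):
    k = len(h)
    assert k <= K_MAX, "coefficient count must fit the buffer"
    n_blocks = -(-len(x) // BS)                      # ceil(len(x) / BS)
    # Zero-padded memory image: two prefetched blocks plus one per shift.
    memory = list(x) + [0] * ((n_blocks + 2) * BS - len(x))
    # Initialization: K' - 1 zeros, then the first 2*BS samples.
    buf = [0] * (K_MAX - 1) + memory[:2 * BS]
    next_fetch = 2 * BS
    y = []
    for _ in range(n_blocks):
        acc = [0] * BS                               # BS accumulators
        for seq in range(k):                         # K reads per block
            off = (K_MAX - k) + seq                  # assumed offset rule
            window = buf[off:off + BS]
            coeff = h[k - 1 - seq]                   # H(K-1) first, H(0) last
            acc = [a + coeff * w for a, w in zip(acc, window)]
        y.extend(acc)
        # Shift by BS and load the next BS samples from memory.
        buf = buf[BS:] + memory[next_fetch:next_fetch + BS]
        next_fetch += BS
    return y[:len(x)]                                # keep only valid results

x = list(range(1, 11))
h = [1, 2, 3, 4, 5, 6]
print(fir_blocked(x, h) == fir_direct(x, h))
```

Because the offset depends only on K' - K, the same buffer layout serves any coefficient count up to K' in this model, which is the property the embodiment relies on.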
Note in particular that if the number of data to be filtered, N_x, is not divisible by BS, the whole operation process remains unchanged, but of the BS results obtained in the last operation only N_m are valid filtering results, where N_m is the remainder of N_x divided by BS. When data are written back to memory for the last time, only these N_m valid results are written back.
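The write-back rule for the last block can be expressed as a small helper. This is only an illustrative sketch; the function name valid_results_in_last_block is hypothetical. The case of a zero remainder is treated as a full block, since the rule above applies only when the number of data is not divisible by BS.

```python
def valid_results_in_last_block(n_x, bs):
    # N_m is the remainder of N_x divided by BS. A zero remainder means the
    # number of data is divisible by BS, so all BS results of the last block
    # are valid.
    n_m = n_x % bs
    return n_m if n_m != 0 else bs

# With N_x = 10 data and BS = 4, the last block has 10 % 4 = 2 valid results.
print(valid_results_in_last_block(10, 4))
```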
The above filter can be applied in FPGA filter designs; the behavior of this filter can also be defined as a processor instruction and implemented inside an ASIC.
As can be seen from the above embodiments, the FIR filter of the present invention has the following beneficial effects:
1) High efficiency of parallel computation. The present invention adopts BS parallel multiply units, which can carry out BS multiplications simultaneously, and the BS products are added to the corresponding intermediate register values in BS simultaneous additions. This greatly improves operation efficiency and saves operation time. In addition, all control signals are digital signals, which avoids the computational inaccuracy introduced by analog units such as delay cells and improves operational precision;
2) Reconfigurability. The FIR filter of the present invention is based on multiply-accumulators and achieves efficient filtering for any number of filter coefficients within the range of 2BS. Even when the number of filter coefficients changes, the corresponding hardware need not be changed as long as the number remains within the range of the buffer space, so good reconfigurability is achieved;
3) High utilization of the data to be filtered. By analyzing the algorithm as a whole, the present invention uses the structures of the data buffer 30 and the coefficient buffer 20 to take full advantage of the locality of the data to be filtered. Data to be filtered are never loaded repeatedly, and each memory access yields BS results, which reduces the number of memory accesses. The effect of "read once, compute many times" is thereby achieved, which in turn reduces the power consumption of the whole design.
The specific embodiments described above further explain the object, technical solutions, and beneficial effects of the present invention. It should be understood that the foregoing are only specific embodiments of the present invention and are not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.