CN104009733A

CN104009733A - Hardware Implementation Method of Sample Importance Resampling Particle Filter Based on FPGA

Info

Publication number: CN104009733A
Application number: CN201410211522.6A
Authority: CN
Inventors: 朱志宇; 吴将; 王彪; 李阳; 沈舒; 陈迅; 薛文涛; 黄巧亮; 戴晓强
Original assignee: Jiangsu University of Science and Technology
Current assignee: Changshu Intellectual Property Operation Center Co ltd
Priority date: 2014-05-19
Filing date: 2014-05-19
Publication date: 2014-08-27
Anticipated expiration: 2034-05-19
Also published as: CN104009733B

Abstract

The invention discloses an FPGA-based hardware implementation method for a sample importance resampling particle filter. The method is as follows: (1) a particle generation module receives input vectors, generates particles, and outputs them to the particle update module and the resampling module; (2) the particle update module updates the particles generated in step (1), that is, performs weight calculation and weight normalization, and outputs the result to the resampling module; (3) the resampling module performs resampling and state update on the particles updated in step (2) or the particles generated in step (1), and feeds the result back to the particle generation module of step (1); (4) an output generation module generates and outputs data from the particles updated in step (2) or the particles generated in step (1).

Description

Hardware Implementation Method of Sample Importance Resampling Particle Filter Based on FPGA

Technical field

The invention relates to an FPGA-based hardware implementation method for the particle filter algorithm. It adopts a module-level pipeline design method with a data flow structure, and belongs to the fields of nonlinear system filtering and electronic technology.

Background art

Particle filtering is a statistical filtering method based on the Monte Carlo method and recursive Bayesian estimation. It is applicable to any nonlinear stochastic system with non-Gaussian noise that can be represented by a state-space model, as in traditional Kalman filtering.

However, particle filtering suffers from particle degeneracy, loss of particle diversity, and computational complexity that grows in proportion to the number of particles. In addition, the algorithm is complex and computationally intensive, which makes its real-time performance poor and hinders practical application. Given the independence of the particles and the parallelism of the operations on them, hardware implementation is an effective way to improve real-time performance. At present, the vast majority of the particle filter literature concerns theory and algorithm simulation, and very little concerns hardware implementation. The application of particle filters in hardware systems is still at an early stage, yet hardware implementation is a key link in moving particle filters from theoretical and algorithmic research to practical use. Continued research on particle filter algorithms and the development of embedded microprocessor technology have made hardware implementation of the algorithm feasible.

Reconfigurable computing was first proposed by Professor Estrin of the University of California, Los Angeles, in 1962. It refers to computing on systems that integrate programmable hardware, where the function of the programmable hardware is defined by a set of physical control points that can change over time, so that the structure of the computing hardware itself can be changed (reconfigured). In the late 1970s, Suetlana P. et al. proposed the concept of the dynamically reconfigurable system and studied reconfiguring part of a system at run time, changing its configuration to improve performance and resource utilization. In the late 1990s, as FPGA technology matured, the FPGA became the mainstream hardware platform for reconfigurable computing, and FPGA-based hardware implementations appeared for many algorithms, such as the Kalman filter. Current FPGA devices support more advanced and flexible dynamic reconfiguration techniques.

Summary of the invention

To improve the computational efficiency and accuracy of hardware-implemented particle filter algorithms, the invention proposes an FPGA-based hardware implementation method for a sample importance resampling particle filter. Each module of the particle filter algorithm is designed on the FPGA, providing a novel approach to the efficient computation and hardware implementation of complex particle filter algorithms in engineering applications.

The invention is a hardware implementation method for an FPGA-based sample importance resampling particle filter (Samples Importance Resampling Particle Filter, SIRF). The particle filter comprises a particle generation module, a particle update module, a resampling module, and an output generation module, wherein:

(1) the particle generation module receives input vectors, generates particles, and outputs them to the particle update module and the resampling module respectively;

(2) the particle update module updates the particles generated in step (1), that is, performs weight calculation and weight normalization, and outputs the result to the resampling module;

(3) the resampling module performs resampling and state update on the particles updated in step (2) or the particles generated in step (1), and feeds the result back to the particle generation module of step (1);

(4) the output generation module generates and outputs data from the particles updated in step (2) or the particles generated in step (1).

All inputs and outputs of the particle generation module in step (1) are M-dimensional (M = 4) vectors, and the parameters of their buffer controllers are identical.

All input and output data of the resampling module are M-dimensional.

The particle update module of step (2) is divided into three processing modules: PU1, PU2, and PU3.

The PU1 processing module receives its input from the particle generation module and delivers its output, the M-dimensional temporary data t_PU1, to the PU2 processing module.

The PU2 module receives the M-dimensional temporary data t_PU1 from the PU1 processing module and the external observation input z(n), performs the weight calculation to form the output stream t_PU2, and at the same time generates the accumulated weight value sum.

The PU3 processing module receives the output stream t_PU2 and the accumulated weight value sum from the PU2 processing module and performs weight normalization. It then stores the normalized weight w in its output buffer and outputs it to the resampling module and the particle generation module.

The normalized output of the output generation module of step (4) is:

$\mu_x = \frac{1}{\mathrm{sum}} \sum_{m=1}^{M} x(m)\, t_{PU2}(m).$

Here μ_x is the output variable of the output generation module, sum is the sum of the weights, and t_PU2 is the output of the PU2 module.

The hardware design method proposed by the invention can be extended to the dynamic reconfiguration of different particle filters. For each particle filter, the operation of each processing module is defined first, then the data flow structure, and finally the buffer controllers and the global controller are designed.

The most important part is the data center, which handles the large volume of data transfers between modules. The entire filter uses a module-level pipeline design, which greatly simplifies the design flow. The module-level pipeline achieves synchronized execution through distributed controllers that control the data generation and transmission of each processing module.

The main benefit of the invention is that it gives a concrete procedure for designing particle filter hardware using the module-level pipeline design approach. In addition, loop fusion is used to remove the weight normalization step from the particle filter algorithm, yielding a parallel, distributed structure for the sampling, weight calculation, and resampling processes. This design method shortens the execution time of the filter and reduces the algorithmic complexity of the particle filter algorithm.
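As a rough software analogy of the loop fusion idea, the sketch below contrasts a version with separate weight, normalization, and estimate passes against a single fused pass that accumulates the weight sum alongside the weighted total and divides once at the end. The scalar particles, the Gaussian `likelihood` function, and its `sigma` parameter are illustrative assumptions and are not taken from the patent.

```python
import math

def likelihood(x, z, sigma=0.5):
    """Hypothetical unnormalized weight: Gaussian kernel around observation z."""
    return math.exp(-((z - x) ** 2) / (2.0 * sigma ** 2))

def estimate_unfused(particles, z):
    """Separate passes: raw weights, their sum, normalization, weighted mean."""
    t = [likelihood(x, z) for x in particles]          # raw weights
    total = sum(t)                                     # accumulate the weights
    w = [ti / total for ti in t]                       # normalize: one division per particle
    return sum(x * wi for x, wi in zip(particles, w))  # weighted mean

def estimate_fused(particles, z):
    """One fused pass: accumulate the sum and the weighted total, divide once."""
    total = 0.0
    acc = 0.0
    for x in particles:
        ti = likelihood(x, z)
        total += ti
        acc += x * ti
    return acc / total
```

Both functions return the same estimate up to rounding; the fused form replaces the per-particle divisions with a single division by the accumulated sum.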

Description of drawings

Figure 1 is the data flow diagram of the SIRF;

Figure 2 shows the data relationships between the modules of the SIRF algorithm;

Figure 3 is the data flow diagram of the SIRF as implemented on the FPGA.

Detailed description

The particle filter algorithm has two distinctive execution characteristics: (1) it can be expressed as a data flow graph whose nodes (modules) can execute concurrently. Although the modules differ in complexity, the data flow graph clearly shows the data dependencies between them; (2) each module in the data flow graph processes one set of data per cycle.

To implement particle filtering in hardware, the invention adopts the module-level pipeline design method and divides the particle filter into a particle generation module, a particle update module, a resampling module, and an output generation module. The modules execute in parallel, which significantly improves the running efficiency of the algorithm. To make full use of the buffer controllers, the processing modules are designed according to the following three requirements: (1) The processing modules are designed so that no control-signal dependencies (as opposed to data dependencies) remain between them. If the control signals of any two processing modules depend on each other, the dependency is expressed through temporal data. If a control-signal dependency is completely unavoidable, the control signal is treated as data and the control is realized through the buffer controller. (2) The size of each processing module is chosen so that data are generated and consumed at a fixed rate, and so that the amount of data generated equals the amount consumed. (3) There is only one global clock, and the clock signals of all processing modules are derived from it.

The invention uses distributed controllers and the module-level pipeline design method to design a sample importance resampling particle filter (SIRF), with two-dimensional bearings-only target tracking as the application. The unknown state to be estimated is the position and velocity of the tracked object in Cartesian coordinates, X_n = [x, V_x, y, V_y]^T, where x and y are the position coordinates of the target and V_x and V_y are the velocity components in the x and y directions. The particle filter consists of several processing modules, each of which handles various complex arithmetic operations and has a local controller that controls its operation. Distributed control handles the data dependencies between the modules efficiently.
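To make the state vector concrete, the following sketch propagates one particle [x, V_x, y, V_y] and computes the bearing an observer at the origin would measure. The patent does not specify the motion model, so the constant-velocity model with random acceleration, the time step `dt`, the `noise_std` parameter, and the observer position are all assumptions made for illustration.

```python
import math
import random

def propagate(state, dt=1.0, noise_std=0.0, rng=random):
    """Advance one particle [x, vx, y, vy] under an assumed constant-velocity model."""
    x, vx, y, vy = state
    ax = rng.gauss(0.0, noise_std)  # assumed random acceleration, x direction
    ay = rng.gauss(0.0, noise_std)  # assumed random acceleration, y direction
    return [
        x + vx * dt + 0.5 * ax * dt * dt,
        vx + ax * dt,
        y + vy * dt + 0.5 * ay * dt * dt,
        vy + ay * dt,
    ]

def bearing(state):
    """Bearing of the target seen from an observer assumed to sit at the origin."""
    x, _, y, _ = state
    return math.atan2(y, x)
```

Only the position components (x, y) enter the bearing, which matches the text's statement that PU1 receives just the (x, y) buffers from the PG module.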

The particle filter is divided into several processing modules, each with a local controller that controls its operation. The operation of each processing module is defined first, followed by the data flow structure. A buffer controller and a global controller are also designed. The global controller controls the data generation and transmission of each processing module; distributed control handles the data dependencies between the modules efficiently, and the module-level pipeline achieves synchronized execution through the distributed controllers. The entire filter uses a module-level pipeline design, which greatly simplifies the design flow.

The invention designs each processing module of the SIRF, including the particle generation (PG) module, the particle update (PU) module, the mean calculation/output generation (MC/OG) module, and the resampling (RS) module. Figure 1 shows the module data flow diagram of the SIRF.

(1) Module design

Particle generation (PG) module: the main function of this module is to generate particles. The PG processing module has four buffers connected to the input vectors and four buffers connected to the output vectors (x, V_x, y, V_y). The output of the PG module is used by the resampling (RS) module. In addition, two buffers connected to (x, y) feed the particle update module PU1. All inputs and outputs are M-dimensional vectors, and the parameters of their buffer controllers are identical. The particle generation module computes all of its outputs in parallel.

Particle update (PU) module: the main function of the particle update module is weight calculation and weight normalization. The arithmetic operations it performs are multiplication, classification, division, the inverse trigonometric function arctan(), and the exponential function exp(). The arctan() and exp() operators are implemented with the coordinate rotation digital computer (CORDIC) method. According to the dimensions of the operation units, the particle update computation is functionally divided into three processing modules: PU1, PU2, and PU3.

The PU1 processing module has two input buffers that receive (x, y) from the PG processing module and one buffer that delivers the output t_PU1 to the PU2 processing module. The PU1 processing module computes arctan(y/x) and generates the M-dimensional temporary data t_PU1. For the arctan() operation, a constant π/2 and a multiplexer are used to adjust the angle so that (-x, y) and (x, -y) can be distinguished. Since there is no data dependency between the PG and PU1 processing modules, the PU1 processing module computes its output as soon as its input buffers receive data.
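The arctan() computation in PU1 can be illustrated with a vectoring-mode CORDIC sketch. This is a floating-point software model, not the patent's circuit: the iteration count `N_ITER`, the table name `ATAN_TABLE`, and the exact form of the ±π/2 pre-rotation are assumptions, and real hardware would use fixed-point shifts and adds in place of the multiplications by powers of two.

```python
import math

# Elementary angles atan(2^-i), as a hardware lookup table would hold them.
N_ITER = 24
ATAN_TABLE = [math.atan(2.0 ** -i) for i in range(N_ITER)]

def cordic_arctan(x, y):
    """Angle of the vector (x, y) using vectoring-mode CORDIC iterations."""
    # Pre-rotate by +/- pi/2 when x < 0 so the vector lies in the right half-plane.
    # This stands in for the constant pi/2 and multiplexer mentioned in the text.
    if x < 0:
        if y >= 0:
            x, y, angle = y, -x, math.pi / 2
        else:
            x, y, angle = -y, x, -math.pi / 2
    else:
        angle = 0.0
    for i in range(N_ITER):
        step = 2.0 ** -i  # a right shift by i bits in fixed-point hardware
        if y >= 0:
            # Rotate clockwise by atan(2^-i) to drive y toward zero.
            x, y = x + y * step, y - x * step
            angle += ATAN_TABLE[i]
        else:
            # Rotate counter-clockwise by atan(2^-i).
            x, y = x - y * step, y + x * step
            angle -= ATAN_TABLE[i]
    return angle
```

With 24 iterations the residual angle error is bounded by atan(2^-23), roughly 1.2e-7 rad; fewer iterations trade accuracy for latency.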

The PU2 module has two input buffers, one for t_PU1 from the PU1 processing module and one for the external observation input z(n). The value of z(n) does not change during the n-th iteration. PU2 has two output buffers, which deliver t_PU2 and sum to the PU3 processing module. The PU2 module is mainly responsible for weight calculation, but it does not normalize the weights; instead it emits them as the output stream t_PU2. At the end of the weight calculation, the weights are accumulated to produce sum, which the PU3 processing module takes as an input for weight normalization. The PU3 processing module has two input buffers from the PU2 processing module (t_PU2 and sum). PU3 normalizes each unnormalized t_PU2 by sum and stores the normalized weight w in its output buffer, where it is used by the RS processing module and by the PG processing module that generates particles.
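The split between PU2 and PU3 can be sketched in a few lines. The patent states that exp() is used but does not give the likelihood, so the Gaussian bearing-error kernel and the `sigma` parameter below are assumptions; the function names `pu2_weights` and `pu3_normalize` are likewise hypothetical labels for the two stages.

```python
import math

def pu2_weights(t_pu1, z, sigma=0.1):
    """PU2 sketch: unnormalized weights t_PU2 and their accumulated total (sum)."""
    t_pu2 = []
    total = 0.0
    for theta in t_pu1:
        # Wrap the bearing error into [-pi, pi) before evaluating the kernel.
        err = (z - theta + math.pi) % (2.0 * math.pi) - math.pi
        t = math.exp(-(err * err) / (2.0 * sigma * sigma))  # assumed Gaussian likelihood
        t_pu2.append(t)
        total += t  # accumulated alongside the weight computation
    return t_pu2, total

def pu3_normalize(t_pu2, total):
    """PU3 sketch: divide every unnormalized weight by the accumulated sum."""
    return [t / total for t in t_pu2]
```

Because `total` is only complete after the last weight has been produced, the normalization stage cannot start before the weight stage has seen all M particles, which is the dependency the buffer offsets described later have to absorb.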

Resampling (RS) module: the RS module performs both the resampling process and the state update calculation, so no separate state update module is needed. The RS module has five input buffers: four buffers connected to (x, V_x, y, V_y) from the PG processing module, and one buffer for the normalized weights (w) from the PU3 processing module. The resampled output is stored in four output buffers. All input and output data of the RS module are M-dimensional.

The RS processing module replicates particles with larger weights and eliminates particles with smaller weights. It does so by reading each weight in turn and replicating the corresponding particle according to that weight. Because all the weights are normalized and sum to 1, the number of particles after resampling is unchanged. The whole resampling process needs at least M clock cycles, because in the worst case the weights read first may all be zero, so that valid particles are produced only after M cycles. The output of the RS processing module therefore becomes available only after M cycles, and the PG and OG processing modules must wait M cycles before reading valid data.
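The replicate-by-weight behaviour can be sketched as follows. The patent does not name a specific resampling scheme, so systematic resampling with a fixed comb offset `u0` is used here purely as an assumption; the loop reads one weight per step, mirroring the sequential weight reads described above.

```python
def resample(particles, weights, u0=0.5):
    """Replicate particles in proportion to their normalized weights.

    u0 in [0, 1) is the offset of the first comb tooth (systematic resampling,
    assumed here for illustration).
    """
    m = len(particles)
    out = []
    cumulative = 0.0
    k = 0  # number of particles emitted so far
    for particle, w in zip(particles, weights):  # one weight read per step
        cumulative += w
        # Emit a copy for every threshold (k + u0) / m covered by the running total.
        while k < m and (k + u0) / m <= cumulative:
            out.append(particle)
            k += 1
    # Guard against rounding leaving the running total marginally short:
    # pad with the last emitted particle (or the last input if none was emitted).
    while k < m:
        out.append(out[-1] if out else particles[-1])
        k += 1
    return out
```

For example, `resample(['a', 'b', 'c', 'd'], [0.0, 0.0, 0.0, 1.0])` emits nothing until the fourth weight has been read, which illustrates why downstream modules must wait M cycles for valid data.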

Output generation (MC/OG) module: to share module buffers and interconnections, this module is designed to use the data generated by the PG processing module together with the weights and sum computed by the PU2 processing module. The module normalizes its output by the value of sum:

$\mu_x = \frac{1}{\mathrm{sum}} \sum_{m=1}^{M} x(m)\, t_{PU2}(m) \qquad (1)$
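Equation (1) agrees with the usual weighted mean over normalized weights, which is why the output module can work from the PU2 outputs without waiting for PU3. Writing the normalized weight as w(m) = t_PU2(m)/sum, as PU3 computes it, the step is:

```latex
\mu_x = \sum_{m=1}^{M} x(m)\, w(m)
      = \sum_{m=1}^{M} x(m)\, \frac{t_{PU2}(m)}{\mathrm{sum}}
      = \frac{1}{\mathrm{sum}} \sum_{m=1}^{M} x(m)\, t_{PU2}(m),
\qquad \mathrm{sum} = \sum_{m=1}^{M} t_{PU2}(m).
```

As an illustrative check with made-up numbers, x = (1, 2, 3) and t_PU2 = (1, 3, 1) give sum = 5 and μ_x = (1 + 6 + 3)/5 = 2.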

Figure 2 shows the data relationships between the modules of the SIRF algorithm, that is, the data connections between the processing modules and the buffers. Resampling and state update are combined into a single module, and the output (estimate) calculation step takes its data directly from the sampling step.
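The module connections described in the text can be written down as a small dependency graph. The sketch below is only a software view of those connections, built from the buffer descriptions above rather than from Figure 2 itself; the dictionary name `DEPENDS_ON` and the helper `schedule` are hypothetical, and the RS to PG feedback is left out on the assumption that it carries data into the next iteration.

```python
from graphlib import TopologicalSorter

# Each module maps to the set of modules whose output it consumes in one iteration.
# The RS -> PG feedback is omitted (assumed to feed the next iteration).
DEPENDS_ON = {
    "PG": set(),
    "PU1": {"PG"},           # (x, y)
    "PU2": {"PU1"},          # t_PU1, plus the external observation z(n)
    "PU3": {"PU2"},          # t_PU2 and sum
    "RS": {"PG", "PU3"},     # (x, V_x, y, V_y) and normalized weights w
    "MC/OG": {"PG", "PU2"},  # particle data, t_PU2 and sum
}

def schedule(graph):
    """Return one valid firing order of the modules within an iteration."""
    return list(TopologicalSorter(graph).static_order())
```

Any order returned by `schedule(DEPENDS_ON)` places PG first and RS after PU3; modules with no path between them, such as RS and MC/OG, are the ones a module-level pipeline can run concurrently.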

(2) Controller design

The filter uses buffer controllers to realize its overall operation. The parameters that determine the controller structure and the overall implementation are L_maxi, L_i, nr_i, nw_i, M_i, C_i, P_i, F_i, and D_i. L_maxi is the logic delay between processing modules, and the actual L_i lies in the range 0 < L_i < L_maxi. nr_i is the offset between writing to the buffer and reading from it. nw_i is the offset between reading the previous buffer controller and writing the current buffer controller. C_i, P_i, and F_i are the data consumption rate, data production rate, and processing speed of processing module i, respectively. D_i is the delay factor of the data generated by processing module i. M_i is the data stream dimension of the data generation module. The parameters (M_i, nr_i, nw_i) are obtained from the functional description, and the parameters (L_i, C_i, P_i, F_i, D_i) are obtained from the implementation of the processing modules.

(3) FPGA implementation

Figure 3 shows the data flow diagram of the SIRF as implemented on the FPGA, including the connections between the processing modules and the buffers. Table 1 lists the main parameters of each processing module. The achievable clock speeds of the processing modules range from 206 MHz to 351 MHz. Because the CORDIC units limit the speed, and to simplify the controller design, 206 MHz is chosen as the global clock. The delay values can be obtained from the internal data flow. The table also gives the FPGA resources occupied by each module.

Table 1. Processing module information

Table 2 gives the data dependencies between the modules. Except for E3, E4, E6, E7, and E8, all links in the table use the default parameters. For links E3 and E4, sum is generated together with the last datum of t_PU2, so nw_3 = M+1 and nw_4 = 2; the timing diagram shows that the PU2 module generates sum after M cycles, using the M-th datum of t_PU2. For E6, nr_6 = M+60, where nr_1 + L_PU1 + nr_2 + nw_2 + L_PU2 + nr_4 + nw_4 + L_PU3 + nr_5 = 1+23+1+1+20+2+1+10+1 = 60. To wait for the RS module to produce its first datum, nw_7 = M and nw_8 = M for E7 and E8. Since there is no rate mismatch, all values of D are 1; the other links transmit M data (a data vector) only when E5 transmits one datum. The number of buffers in simultaneous use is about 5M, where M is the number of particles used by the filter. So that the read and write logic can be assigned different addresses, each buffer needs more than one memory cell per stored datum.
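The arithmetic behind nr_6 can be checked in a few lines. The nine values are copied in the order they are printed in the text; the names `PATH_TERMS` and `read_offset_e6`, and the particle count used in the usage note, are illustrative assumptions.

```python
# Offsets and latencies along the PG -> PU1 -> PU2 -> PU3 path,
# in the order printed in the text (nr_1, L_PU1, nr_2, ..., nr_5).
PATH_TERMS = (1, 23, 1, 1, 20, 2, 1, 10, 1)

def read_offset_e6(m, terms=PATH_TERMS):
    """nr_6 = M plus the accumulated offsets and latencies of the weight path."""
    return m + sum(terms)
```

With a hypothetical M = 1000 particles, `read_offset_e6(1000)` evaluates to 1060 cycles, that is, the M cycles needed to stream the particles plus the 60-cycle path latency.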

Table 2. Link information table (EIT) of the SIRF

The parameters of all buffer controllers of the SIRF are derived from Table 1 and Table 2 and are listed in Table 3. The table gives the start time, write start time, and read start time of each buffer controller.

Note several key synchronization points in the data flow structure: (1) the buffer controllers of E1 and E6 have the same start time and write start time; (2) because the RS module processes two data, the buffer controllers of E5 and E6 have the same read start time; (3) the buffer controllers of E7 and E8 are used at the same time; (4) the buffer controllers of E3 and E4 have the same start time.

Table 3. Buffer controller parameters of the SIRF

Claims

1. A hardware implementation method for an FPGA-based sample importance resampling particle filter, characterized in that the particle filter comprises a particle generation module, a particle update module, a resampling module, and an output generation module, wherein:

(1) the particle generation module receives input vectors, generates particles, and outputs them to the particle update module and the resampling module respectively;

(2) the particle update module updates the particles generated in step (1), that is, performs weight calculation and weight normalization, and outputs the result to the resampling module;

(3) the resampling module performs resampling and state update on the particles updated in step (2) or the particles generated in step (1), and feeds the result back to the particle generation module of step (1);

(4) the output generation module generates and outputs data from the particles updated in step (2) or the particles generated in step (1).

2. The hardware implementation method for an FPGA-based sample importance resampling particle filter according to claim 1, characterized in that all inputs and outputs of the particle generation module of step (1) are M-dimensional (M = 4) vectors, and the parameters of the buffer controllers are identical.

3. The hardware implementation method for an FPGA-based sample importance resampling particle filter according to claim 1, characterized in that all input and output data of the resampling module are M-dimensional (M = 4).

4. The hardware implementation method for an FPGA-based sample importance resampling particle filter according to claim 1, characterized in that the particle update module of step (2) is divided into three processing modules: PU1, PU2, and PU3;

the PU1 processing module receives its input from the particle generation module and delivers its output, the M-dimensional temporary data t_PU1, to the PU2 processing module;

the PU2 module receives the M-dimensional temporary data t_PU1 from the PU1 processing module and the external observation input z(n), performs the weight calculation to form the output stream t_PU2, and at the same time generates the accumulated weight value sum;

the PU3 processing module receives the output stream t_PU2 and the accumulated weight value sum from the PU2 processing module, performs weight normalization, then stores the normalized weight w in its output buffer and outputs it to the resampling module and the particle generation module.

5. The hardware implementation method for an FPGA-based sample importance resampling particle filter according to claim 1, characterized in that the normalized output of the output generation module of step (4) is:

$\mu_x = \frac{1}{\mathrm{sum}} \sum_{m=1}^{M} x(m)\, t_{PU2}(m)$

wherein μ_x is the output variable of the output generation module, sum is the sum of the weights, and t_PU2 is the output of the PU2 module.