CN105045763A - FPGA (Field Programmable Gate Array) and multi-core DSP (Digital Signal Processor) based PD (Pulse Doppler) radar signal processing system and parallel realization method therefor



Publication number
CN105045763A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201510411844.XA
Other languages
Chinese (zh)
Other versions
CN105045763B (en)
Inventor
王俊
吕栋
张玉玺
杨彬
尹晗
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hangzhou Leishi Technology Co ltd
Original Assignee
Beihang University
Application filed by Beihang University
Priority to CN201510411844.XA
Publication of CN105045763A
Application granted
Publication of CN105045763B
Legal status: Active


Abstract

A PD (pulse Doppler) radar signal processing system based on an FPGA (field programmable gate array) and a multi-core DSP (digital signal processor) comprises an FPGA core chip and its peripheral minimum system circuit, a DSP chip and its peripheral minimum system circuit, a gigabit network interface chip, a power supply chip and a level conversion chip. The FPGA core chip receives the radar direct-wave and echo signals acquired by a data acquisition chip; the signals are down-converted and stored in memory. Once a frame of data has been received, it is transferred to the DDR3 (double data rate 3) memory of the DSP chip through the SRIO (Serial RapidIO) interface between the FPGA core chip and the DSP chip. The DSP chip performs pulse compression, coherent integration and constant false alarm rate detection to obtain the target point information, and the target information is finally uploaded to a host computer through a network port. The parallel implementation method for the system comprises six steps. The hardware circuit is simple, an FPGA and multi-core DSP architecture is adopted for processing, and the parallel processing performance of the system is fully used.

Description

A PD radar signal processing system based on FPGA + multi-core DSP and a parallel implementation method thereof
Technical field
The present invention relates to a PD radar signal processing system based on FPGA + multi-core DSP and a parallel implementation method thereof. Built on an FPGA + multi-core DSP hardware platform, it implements radar signal processing on the multi-core DSP and belongs to the field of digital signal processing.
Background technology
A Doppler radar is a radar that uses the Doppler effect to extract and process target information. If the radar transmits a pulse-modulated radio-frequency signal, it is called a pulse Doppler radar, or PD radar for short. To obtain a signal with a large time-bandwidth product and to improve the velocity and range resolution, the radar usually transmits a linear frequency modulated (LFM) signal; an LFM-based PD radar therefore combines the characteristics of pulse Doppler processing and pulse compression. Coherent integration is also used during signal processing to improve the signal-to-noise ratio (SNR) for detection. In engineering practice, the Doppler filter bank is commonly realized by applying an FFT to the data of the same range gate. After the modulus is taken, the output is fed to a constant false alarm rate (CFAR) detector, which decides whether a target is present in a range gate according to whether the cell under test exceeds the threshold. The radar thus detects targets by improving the target SNR.
Radar signal detection processing is mainly divided into modules such as pulse compression, coherent integration and constant false alarm rate (CFAR) detection. Although these modules significantly improve target echo detection, they also increase the computational load, for example through FFTs with large numbers of points, which greatly raises the real-time computing requirements on the processor. In addition, new radar techniques and applications make radars ever more capable while placing higher demands on the radar signal processor.
With the rapid development of semiconductor and memory technology, very-high-speed integrated circuit (VHSIC) and very-large-scale integration (VLSI) technology have improved substantially. TI has released multi-core DSP chips with a new processor architecture and significantly higher computing performance, which makes fast implementation of various algorithms possible.
To meet the above demand for higher processor performance, the inventors designed a PD radar signal processing system based on FPGA + multi-core DSP. The system adopts an FPGA + multi-core DSP architecture. Besides the peripheral minimum system circuits needed for the FPGA and the DSP to operate, it also has two network interface chips. Radar signal processing is implemented by programming the FPGA and the multi-core DSP, and can meet the real-time requirements of complex radar signal processing.
Summary of the invention
1. Purpose: The purpose of the present invention is to provide a PD radar signal processing system based on FPGA + multi-core DSP and a parallel implementation method thereof, in which the PD radar signal processing system is realized through hardware description language, C language programming and multi-core DSP program design.
2. Technical solution: The purpose of the present invention is achieved through the following technical solution.
(1) The PD radar signal processing system based on FPGA + multi-core DSP of the present invention comprises an FPGA core chip and its peripheral minimum system circuit, a DSP chip and its peripheral minimum system circuit, a gigabit network interface chip, a power supply chip and a level conversion chip. The system architecture is shown in Fig. 1. The connections and signal flow are as follows: the FPGA core chip receives the radar direct-wave and echo signals acquired by the data acquisition chip, down-converts them and stores them in memory. After one frame of data has been received, the data are transferred to the DDR3 memory of the DSP chip through the SRIO interface between the FPGA core chip and the DSP chip. The DSP chip then performs pulse compression, coherent integration and constant false alarm rate (CFAR) detection to obtain the target point information, and finally uploads the target information to the host computer through the network port.
The FPGA core chip is the XC6VSX315T from the Xilinx Virtex-6 family. It is built on a 40 nm process with the third-generation Xilinx ASMBL architecture, has efficient 6-input LUT (look-up table) logic with dual registers, abundant I/O resources and a large amount of on-chip memory, and supports DDR3. Compared with the previous generation, power consumption is reduced by 50% and cost by 20%. In addition, the chip has strong signal processing capability and serial connectivity based on low-power 6.5 Gbps GTX transceivers, which guarantees high-speed serial transmission between the FPGA core chip and the DSP chip. After receiving the data sampled by the data acquisition chip, the FPGA core chip performs digital down-conversion and stores the data in memory; once one frame of data has been obtained, it is transferred to the DSP chip through SRIO.
The peripheral minimum system circuit of the FPGA core chip comprises a clock source and a program-loading FLASH, which assist the FPGA core chip in completing its processing functions. Because the program in the FPGA core chip is lost at power-off, the program code must be stored in a program-loading FLASH; at each power-on, the program in the FLASH is automatically loaded into the FPGA core chip so that it works normally. The clock source provides the system clock for the FPGA core chip: the required frequency is generated by a crystal oscillator and fed directly to the FPGA core chip.
The DSP chip is the TMS320C6678 multi-core processor released by TI. The chip uses a modified Harvard bus architecture: one 256-bit program bus, two 64-bit data buses and one 32-bit dedicated DMA bus. The processing unit uses a high-performance, advanced very long instruction word (VLIW) architecture and can execute eight 32-bit instructions in parallel per clock cycle. The chip integrates eight DSP cores running at up to 1.25 GHz, achieving 320 GMACS of fixed-point and 160 GFLOPS of floating-point performance on a single chip. Each core has 32 KB of L1P and 32 KB of L1D, both configurable as cache, plus 512 KB of local L2 (LL2) SRAM configurable as RAM or cache. There is also 4 MB of multi-core shared memory, usable as shared L2 SRAM or shared L3 SRAM, and a built-in DDR3 controller with 33-bit addressing, i.e. 8 GB of addressable space. The TMS320C6678 provides rich peripheral interfaces; in line with the task requirements, the Serial RapidIO, PCIe, HyperLink and DDR3 interfaces are the ones mainly used in signal processing. In the present invention, SRIO is used to receive the frame of radar pulse-train signals transferred from the FPGA core chip; the multi-core tasks are designed and allocated, and the multi-core parallel implementation of pulse compression, coherent integration and CFAR detection is scheduled. The target point information is finally obtained and uploaded to the host computer through the network port, improving the performance of the computation.
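The headline figures quoted for the DSP can be cross-checked with a few lines of arithmetic. The sketch below is only a sanity check: the per-core, per-cycle figures (32 fixed-point multiply-accumulates and 16 floating-point operations) are assumptions back-computed from the chip totals quoted in the text, and all names are illustrative.

```python
# Sanity check of the quoted TMS320C6678 figures.
# ASSUMPTION: 32 fixed-point MACs and 16 floating-point ops per core per cycle,
# chosen so that they reproduce the chip totals quoted in the description.
CORES = 8
CLOCK_GHZ = 1.25
MACS_PER_CYCLE = 32       # assumed per-core figure
FLOPS_PER_CYCLE = 16      # assumed per-core figure
DDR3_ADDRESS_BITS = 33


def peak_rate(cores: int, clock_ghz: float, ops_per_cycle: int) -> float:
    """Peak rate in giga-operations per second for the whole chip."""
    return cores * clock_ghz * ops_per_cycle


def addressable_gib(address_bits: int) -> float:
    """Size of a byte-addressed space with the given address width, in GiB."""
    return 2 ** address_bits / 2 ** 30


peak_gmacs = peak_rate(CORES, CLOCK_GHZ, MACS_PER_CYCLE)
peak_gflops = peak_rate(CORES, CLOCK_GHZ, FLOPS_PER_CYCLE)
ddr3_gib = addressable_gib(DDR3_ADDRESS_BITS)
```

With these assumptions the totals match the 320 GMACS, 160 GFLOPS and 8 GB figures given above.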
The peripheral minimum system circuit of the DSP chip comprises a clock source, a program-loading FLASH and external DDR3 memory, which assist the DSP chip in completing its processing functions. Because the program in the DSP chip is lost at power-off, the program code must be stored in a program-loading FLASH; at each power-on, the program in the FLASH is automatically loaded into the DSP chip so that it works normally. Because the DSP chip must buffer and process large amounts of data, its storage space has to be expanded externally: the DSP chip is fitted with four external DDR3 memory chips that store the raw data, intermediate buffered results and other data. The clock source provides the system clock for the DSP chip: the required frequency is generated by a crystal oscillator and fed directly to the DSP chip.
The gigabit network interface chip is the Marvell 88E1111 Ethernet physical-layer chip. Under the control of the EMAC module of the DSP chip, it exchanges raw information data with the host computer over gigabit Ethernet.
The power supply chip provides the voltages needed by the whole system. An isolated +5 V supply is fed into the system, and the power supply chip converts it to +3.3 V, +2.5 V, +1.8 V, +1.5 V, +1.2 V, +1.0 V, +0.75 V, CMGT_AVTT and CMGT_AVCC. These rails supply the FPGA core chip (+3.3 V, +2.5 V, +1.8 V, +1.0 V), the program-loading FLASH (+3.3 V, +1.8 V), the DSP chip (+3.3 V, +1.8 V, +1.0 V), the DDR3 modules (+1.5 V, +0.75 V), the gigabit network interface chip (+3.3 V, +1.2 V) and the clock source (+3.3 V). CMGT_AVTT and CMGT_AVCC supply +1.2 V and +1.0 V respectively to the high-speed interface of the FPGA core chip.
The level conversion chip is the SN74ALVC164245 released by TI. It supports level conversion from +2.5 V to +3.3 V and from +3.3 V to +5 V.
(2) The construction process of the PD radar signal processing system based on FPGA + multi-core DSP and its parallel implementation method is summarized as follows: the FPGA core chip receives the intermediate-frequency (IF) data from data acquisition, obtains baseband signal data through digital down-conversion, and buffers the data in on-chip RAM. After one frame of pulse-train data has been obtained, the data are transferred over the high-speed serial SRIO link between the FPGA and the DSP. The DSP chip stores the down-converted baseband frame in DDR3 and performs pulse compression on the frame using a multi-core parallel design. The pulse-compressed data are stored in a DDR3 buffer; a multi-core design then performs parallel coherent integration and CFAR detection to obtain the target point information. Finally, the target point information is sent to the host computer through the network port.
In summary, the parallel implementation method of the PD radar signal processing system based on FPGA + multi-core DSP of the present invention comprises the following steps:
Step 1: Perform digital down-conversion on the IF signal in the FPGA core chip
This step is performed by the digital down-conversion module in the FPGA core chip, which consists of a data acquisition module, modulo-2 decimation logic, delay-correction filters and a dual-port RAM module. The module uses a polyphase filter structure: the IF samples are decimated by two into even and odd streams and then delay-corrected to obtain baseband complex data. The data acquisition module takes the data sampled by the data acquisition chip as a single-ended input. The modulo-2 decimation logic splits the input data into I and Q streams; a flag bit is inverted on the rising edge of each clock, and the data are negated when the flag is 1. Delay correction is implemented with 12th-order FIR filters whose coefficients are generated in Matlab. After filtering, the upper 16 bits of the I and Q streams are concatenated into 32-bit baseband data.
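The even/odd decimation with alternating sign described in Step 1 can be illustrated with a small software model. This is a minimal sketch rather than the FPGA logic itself. It assumes that the IF is centred at a quarter of the sampling rate (the condition under which this structure yields baseband I/Q), that the sign flag toggles once per decimated output pair, and that I occupies the upper half of the 32-bit word. The 12th-order delay-correction FIR is omitted because its coefficients are not given. The function names are hypothetical.

```python
def fs4_split(samples):
    """Split real IF samples into decimated I (even) and Q (odd) streams.

    The sign flag toggles once per output pair, so every second I/Q pair is
    negated. ASSUMPTION: IF centred at fs/4; delay-correction FIR omitted.
    """
    i_out, q_out = [], []
    negate = False
    for k in range(0, len(samples) - 1, 2):
        sign = -1 if negate else 1
        i_out.append(sign * samples[k])        # even-indexed sample -> I
        q_out.append(sign * samples[k + 1])    # odd-indexed sample  -> Q
        negate = not negate
    return i_out, q_out


def pack_iq(i16, q16):
    """Concatenate two signed 16-bit values into one 32-bit word.

    ASSUMPTION: I goes in the upper half, Q in the lower half.
    """
    return ((i16 & 0xFFFF) << 16) | (q16 & 0xFFFF)


# A tone exactly at fs/4 (cos(pi*n/2)) becomes a constant (DC) I stream.
tone_i, tone_q = fs4_split([1, 0, -1, 0, 1, 0, -1, 0])
```

A tone at exactly fs/4 maps to a constant baseband value, which is the expected behaviour of a quarter-rate down-converter.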
Step 2: Buffer the data in the FPGA core chip and configure SRIO in preparation for transmission
The FPGA core chip and the DSP chip are interconnected by a x4 SRIO link with a line rate of 5 Gbps per lane; allowing for 8b/10b encoding, the effective bandwidth is up to 16 Gbps (2 GB/s). As shown in Fig. 3, the present invention uses the Serial RapidIO IP core provided by Xilinx and designs a local side and a remote side. The structure comprises local data processing, remote data processing and the IP core. Local data processing sends local data request packets and receives the response packets that the remote end sends to the local end. Remote data processing receives the data packets coming from the remote end. The main functions of the IP core are packing and unpacking, initialization and protocol implementation.
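The effective bandwidth of the link follows directly from the lane count, the line rate and the 8b/10b coding overhead. The short calculation below is only an illustration of that arithmetic; the function name is hypothetical, and protocol overhead above the physical layer (packet headers, acknowledgements) is deliberately ignored.

```python
LANES = 4
LINE_RATE_GBPS = 5.0      # per lane, before 8b/10b decoding


def srio_effective_gbps(lanes: int, line_rate_gbps: float) -> float:
    """Payload-carrying bit rate after 8b/10b coding (8 data bits per 10 line bits).

    Packet header and acknowledgement overhead is not modelled.
    """
    return lanes * line_rate_gbps * 8 / 10


effective_gbps = srio_effective_gbps(LANES, LINE_RATE_GBPS)
effective_gbytes_per_s = effective_gbps / 8
```

For the x4, 5 Gbps configuration this gives 16 Gbps, i.e. 2 GB/s, as an upper bound on the usable throughput.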
When the local end sends data to the remote end, the data are written into the transmit buffer, and a start signal is given to the transmit controller once writing is complete. According to the configured SRIO header information (packet type, packet size, number of packets, destination address, destination ID and so on), local data processing controls the request generation module to build packets from the transmit buffer. These packets are processed by the IP core and transmitted to the remote end. When the remote end has received the data packets and sends response packets to the local end, the IP core recovers the SRIO packets from the received serial bit stream and passes them to the local data response processing module. Remote data processing controls the remote data request processing module to write the packet data into the receive buffer and, once writing is complete, sends a completion signal to the module that needs the data, which can then read the data from the receive buffer.
Step 3: Configure the SRIO registers in the DSP chip, receive the data and store them in DDR3
The SRIO module of the DSP chip is shown in Fig. 4. In the SRIO module, the local device is the DSP chip and the remote device is the FPGA core chip. The SRIO module in the DSP chip consists mainly of a load/store module and the physical layer. Under the control of the CPU/EDMA, the load/store module sends VBUSM requests to the DDR3 memory and accepts VBUSM responses. Within the load/store module, the MMR command registers control the transmit buffer and the receive buffer, which are connected to the FIFOs of the physical layer.
In the DSP chip, SRIO is usually configured by calling CSL (Chip Support Library) functions, covering enabling, initialization, opening and establishing communication. The SRIO setup can be divided into four steps: address mapping; configuring the device ID, SRIO ports and interrupt vectors; configuring the registers, including transfer mode and rate; and waiting for the link to come up. Once the link is established, the DSP chip can receive and send SRIO packets. The DSP chip and the FPGA core chip must each know the other's destination ID and start address to transfer data correctly. DirectIO mode is selected for data transfer, so only the address mapping between the TX and RX sides is needed to carry out a transfer.
Step 4: Implement multi-core pulse compression in the DSP chip
This step is performed in the DSP chip and requires a multi-core task-parallel scheme for the pulse compression data processing. The radar signal processing flow is shown in Fig. 5. Pulse compression is computed pulse by pulse, whereas coherent integration and CFAR detection are computed per range slice of the pulse train; coherent integration and CFAR detection can start only after the whole pulse train has been pulse-compressed. The flow is therefore divided into two subtasks: first, pulse compression of all pulses; second, coherent integration and CFAR detection. Because there is little data dependence between pulses during pulse compression, and little data dependence within the coherent integration and CFAR detection computations, the multi-core implementation uses a master-slave model: one core is responsible for task scheduling and allocation, and the other cores execute the tasks in parallel.
The memory system configuration must be considered in the multi-core design. Signal processing involves data accesses to memory, and access performance directly affects algorithm efficiency. Memory access performance depends on where code and data are stored and on the access method. By configuring the size of each memory, the cache size and the data access method, the transfer rates under different conditions can be obtained. The analysis shows that external memory is usually accessed by EDMA, while internal memory is accessed directly by the core or by IDMA. Internal memory gives the core good access performance, with wider data and instruction buses, so critical data and variables are placed in LL2. Configuring L1D and L1P as cache improves instruction and data caching between the core and memory. In the memory system designed in the present invention, raw data are placed in DDR3; the LL2 of each core is configured as SRAM and holds the data used during pulse compression, coherent integration and CFAR detection; and L1D and L1P are configured as cache to improve core access efficiency. The memory design of the system is shown in Fig. 6.
In the radar signal processing flow, the FPGA core chip sends the data to DDR3 through the SRIO interface; the pulse train is then divided into 8 parts, and each core processes one part of the pulse data. In engineering practice, pulse compression is implemented in the frequency domain: the spectrum of the input signal is multiplied by the complex conjugate of the local reference signal spectrum, and an inverse Fourier transform yields the result. The main steps are input, FFT, complex multiplication, IFFT and output. In the input stage, the EDMA module transfers the pulse data from DDR3 to the LL2 of each core; a data processing buffer is then configured in the LL2 of each core and the data are processed there; in the output stage, the EDMA module transfers the data to a DDR3 buffer. The EDMA transfer mode must be configured for both input and output. The twiddle factors needed for the FFT and IFFT are computed at initialization and stored in the LL2 of each core, as is the conjugate spectrum of the reference signal needed for the complex multiplication; both are used directly during computation.
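The frequency-domain pulse compression and the eight-way split of the pulse train can be modelled in a few lines. The following is a minimal host-side sketch under stated assumptions, not the DSP implementation: it uses a plain recursive radix-2 FFT instead of an optimized library routine, the chirp parameters are arbitrary, and every function name is hypothetical. As in the text, the conjugate reference spectrum is computed once up front.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT; the length must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    if n % 2:
        raise ValueError("length must be a power of two")
    even, odd = fft(x[0::2], inverse), fft(x[1::2], inverse)
    sign = 1 if inverse else -1
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k], out[k + n // 2] = even[k] + t, even[k] - t
    return out


def ifft(x):
    """Inverse FFT with 1/N scaling."""
    return [v / len(x) for v in fft(x, inverse=True)]


def lfm_chirp(n, sweep=0.5):
    """Unit-amplitude baseband LFM pulse (illustrative parameters)."""
    return [cmath.exp(1j * cmath.pi * sweep * (i - n / 2) ** 2 / n) for i in range(n)]


def pulse_compress(echo, ref_spectrum_conj):
    """FFT -> multiply by conjugate reference spectrum -> IFFT."""
    spectrum = fft(echo)
    return ifft([s * r for s, r in zip(spectrum, ref_spectrum_conj)])


def split_pulses(num_pulses, num_cores=8):
    """Assign contiguous pulse index ranges (start, stop) to each core."""
    base, extra = divmod(num_pulses, num_cores)
    ranges, start = [], 0
    for core in range(num_cores):
        stop = start + base + (1 if core < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


# One-time initialisation: conjugate spectrum of the zero-padded reference.
NFFT, PULSE_LEN, DELAY = 64, 16, 10
chirp = lfm_chirp(PULSE_LEN)
ref_conj = [v.conjugate() for v in fft(chirp + [0j] * (NFFT - PULSE_LEN))]

# Echo: the chirp delayed by DELAY samples.
echo = [0j] * DELAY + chirp + [0j] * (NFFT - PULSE_LEN - DELAY)
compressed = pulse_compress(echo, ref_conj)
peak_index = max(range(NFFT), key=lambda i: abs(compressed[i]))
```

The compressed output peaks at the sample index equal to the echo delay, which is how the range of the target is read off after compression.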
The cores of a multi-core system are relatively independent and must communicate with one another to schedule and complete tasks. Each core must be assigned its subtask and execution order. Because pulse compression uses the master-slave model and each core processes its own data, no inter-core data sharing or transfer is needed, so inter-core interrupts are used for communication. Writing to IPCGRx generates an interrupt to core x; bits SRCS0 to SRCS27 set the interrupt source flags, and here SRCSx denotes the interrupt flag for core x. Writing bits SRCC0 to SRCC27 of IPCARx clears the corresponding flags.
The parallel structure must be considered when scheduling the flow, but the DDR3 bus becomes the bottleneck: the DDR3 data bus can be configured to at most 64 bits and there is only one bus, so accesses from the cores must be arbitrated and data cannot always be returned in time. The design therefore staggers the data input of the cores: once one core has finished its input, an inter-core interrupt triggers the input of the next core. On output, the output of the previous round and the input of the next round are combined into a single module that serves as the input stage of the next round, which avoids DDR3 bus contention. After core 7 has finished all of its data processing, the next round of pulse-train processing starts, and this continues until the pulse-train processing tasks assigned to each core are complete. The multi-core flow of pulse compression is shown in Fig. 7.
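The staggered input scheme, in which each core starts its DDR3 input only after being triggered by the previous core, can be mimicked on a host with threads. This is a rough sketch of the hand-off pattern only: `threading.Event` objects stand in for the IPCGRx inter-core interrupts, no real DMA takes place, and all names are hypothetical.

```python
import threading

NUM_CORES = 8


def run_staggered_input(num_cores=NUM_CORES):
    """Return the order in which the simulated cores performed their input phase."""
    go = [threading.Event() for _ in range(num_cores)]   # stand-ins for IPCGRx
    order = []
    lock = threading.Lock()

    def core(idx):
        go[idx].wait()                 # block until "interrupted" by the previous core
        with lock:
            order.append(idx)          # exclusive DDR3 -> LL2 input phase
        if idx + 1 < num_cores:
            go[idx + 1].set()          # trigger the next core's input
        # Local FFT / multiply / IFFT would run here, overlapping with
        # the input phases of the cores that follow.

    threads = [threading.Thread(target=core, args=(i,)) for i in range(num_cores)]
    for t in threads:
        t.start()
    go[0].set()                        # the master kicks off core 0
    for t in threads:
        t.join()
    return order


input_order = run_staggered_input()
```

Because each core is released only by its predecessor, the input phases never overlap on the shared bus while the compute phases are free to run concurrently.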
Step 5: Implement multi-core coherent integration and CFAR detection in the DSP chip
After pulse compression is complete, the data are in the DDR3 buffer of the DSP chip. The output of each compressed pulse is stored in DDR3 column by column, so that coherent integration and CFAR detection can process the data row by row. The output stage of pulse compression thus performs the row-column transposition of the two-dimensional data.
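The transposition performed by the pulse compression output stage amounts to choosing the write address of each sample. The sketch below illustrates the idea on a flat buffer standing in for the DDR3 region; the layout (one range slice per contiguous row) and the function names are assumptions for illustration.

```python
def column_write_offset(pulse_idx, range_idx, num_pulses):
    """Flat offset at which sample `range_idx` of pulse `pulse_idx` is written.

    Each pulse fills one column, so one range slice ends up contiguous.
    """
    return range_idx * num_pulses + pulse_idx


def store_by_column(pulses):
    """Write every compressed pulse as a column of the flat output buffer."""
    num_pulses, num_range = len(pulses), len(pulses[0])
    buf = [None] * (num_pulses * num_range)
    for p, pulse in enumerate(pulses):
        for r, value in enumerate(pulse):
            buf[column_write_offset(p, r, num_pulses)] = value
    return buf


def read_range_slice(buf, range_idx, num_pulses):
    """Read one range slice (all pulses at one range gate) as a contiguous row."""
    start = range_idx * num_pulses
    return buf[start:start + num_pulses]


# Two pulses with three range gates each.
pulses = [[10, 11, 12], [20, 21, 22]]
buf = store_by_column(pulses)
```

Reading a range slice is then a single contiguous transfer, which suits a block DMA from DDR3 into local memory.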
Coherent integration and CFAR detection are implemented in the frequency domain. Each core performs the data input of its range slices, coherent integration, complex modulus calculation and CFAR detection.
The range slices of the pulse train are transferred as data from DDR3 to LL2 by EDMA. In engineering practice, coherent integration is implemented with an FFT; its twiddle factors are likewise computed at initialization and stored in LL2, and the FFT is computed entirely within LL2. The complex data obtained after coherent integration must have their modulus taken to obtain the amplitude. The conventional square root of the sum of squares is time-consuming and requires care about data overflow, so the complex modulus is computed with a linear approximation:
|X| ≈ g(I, Q) = a·max{|I|, |Q|} + b·min{|I|, |Q|}
where a and b are weighting coefficients whose values depend on the required relative error. An equiripple approximation is used: suitable parameters a and b are chosen so that the error is below 0.8%, giving the following formula:
|X| ≈ max{T_L + (1/8)·T_S, (27/32)·T_L + (9/16)·T_S}, where T_L = max{|I|, |Q|} and T_S = min{|I|, |Q|}
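Coherent integration of one range slice followed by the approximate modulus can be sketched as follows. This is an illustrative model only: a direct DFT is used for brevity where the DSP would use an FFT, T_L and T_S are taken to be the larger and smaller of |I| and |Q|, and the function names are hypothetical. The helper `max_relative_error` measures the approximation error numerically on the unit circle rather than taking any quoted bound for granted.

```python
import cmath
import math


def doppler_dft(range_slice):
    """Coherent integration of one range slice across pulses (direct DFT)."""
    n = len(range_slice)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * m / n) for m, x in enumerate(range_slice))
        for k in range(n)
    ]


def approx_magnitude(z):
    """Two-segment linear approximation of |z| using the coefficients in the text."""
    big = max(abs(z.real), abs(z.imag))      # T_L
    small = min(abs(z.real), abs(z.imag))    # T_S
    return max(big + small / 8, 27 * big / 32 + 9 * small / 16)


def max_relative_error(steps=3600):
    """Worst relative error of approx_magnitude over a sweep of unit-modulus inputs."""
    worst = 0.0
    for i in range(steps):
        z = cmath.rect(1.0, 2 * math.pi * i / steps)
        worst = max(worst, abs(approx_magnitude(z) - 1.0))
    return worst


# A target whose phase advances by 3 Doppler bins over 16 pulses.
NUM_PULSES, DOPPLER_BIN = 16, 3
slice_ = [cmath.exp(2j * cmath.pi * DOPPLER_BIN * m / NUM_PULSES) for m in range(NUM_PULSES)]
amplitudes = [approx_magnitude(v) for v in doppler_dft(slice_)]
peak_bin = max(range(NUM_PULSES), key=lambda k: amplitudes[k])
```

The approximation needs only comparisons, shifts and additions in fixed point, which is why it is attractive compared with a square root on the DSP.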
CFAR detection uses conventional cell-averaging CFAR (CA-CFAR). In practice it is usually implemented with a sliding window. In each sliding-window computation, guard cells are placed on both sides of the cell under test to prevent interference from a target that spans several cells. To decide whether a cell contains a target, the reference cells on the left and right are averaged and multiplied by the threshold factor to obtain the detection threshold, which is then compared with the cell under test. In engineering practice CFAR detection usually consumes a large number of cycles. The TMS320C6678 supports software pipelining for parallel execution, so CFAR detection can be optimized by scheduling assembly instructions. The procedure is as follows: first write the C code and rewrite it as linear assembly; determine the iteration interval in cycles from the code; then draw the dependency graph, i.e. determine the functional unit used by each instruction (a TMS320C6678 core has 8 functional units and can issue 8 instructions in parallel per cycle). Finally, based on the dependency graph, assign registers to each instruction and schedule the instructions, taking into account the loop prolog, loop kernel and loop epilog of the pipeline as well as instruction latencies, register live ranges and so on, to obtain the parallel instruction schedule.
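The sliding-window CA-CFAR decision described above can be expressed compactly. The following is a straightforward reference sketch, not the software-pipelined DSP code: the window sizes and threshold factor are arbitrary illustrative values, and cells near the edges simply use whichever reference cells exist.

```python
def ca_cfar(amplitudes, num_ref=8, num_guard=2, threshold_factor=4.0):
    """Return the indices of cells that exceed the cell-averaging CFAR threshold.

    num_ref and num_guard are counted per side of the cell under test.
    ASSUMPTION: parameter values are illustrative, not taken from the source.
    """
    detections = []
    n = len(amplitudes)
    for i in range(n):
        left = amplitudes[max(0, i - num_guard - num_ref):max(0, i - num_guard)]
        right = amplitudes[i + num_guard + 1:i + num_guard + 1 + num_ref]
        reference = left + right
        if not reference:
            continue                      # no reference cells available
        threshold = threshold_factor * sum(reference) / len(reference)
        if amplitudes[i] > threshold:
            detections.append(i)
    return detections


# Flat noise floor with a single strong return in cell 30.
profile = [1.0] * 64
profile[30] = 50.0
hits = ca_cfar(profile)
```

A running-sum formulation avoids recomputing the averages for every cell and is the natural starting point for the pipelined version.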
The problem considering that DDR3 bus takies is needed when multinuclear realizes correlative accumulation and CFAR detects, so also need staggered, each core completes correlative accumulation and the CFAR detection of partial distance section separately afterwards, obtain an impact point information, the heart computing carried out after finally the impact point information of 8 cores being asked value obtains the information result of target.The multinuclear realization flow figure that correlative accumulation and CFAR detect as depicted in figure 8.
Step 6: Send the target information to the host computer through the network port
The target point information obtained from coherent integration and CFAR detection is transferred to the host computer through the network port of the C6678. The C6678 contains a network coprocessor (NETCP), used mainly for processing Ethernet packets. It consists of four parts: a PKTDMA controller that controls packet DMA transfers; a packet accelerator (PA) for packet identification and classification; a security accelerator (SA) for packet encryption and decryption; and the gigabit Ethernet switch subsystem. Together they carry out fast packet exchange between the DSP chip and external devices. The structure of the network coprocessor is shown in Fig. 9.
When the DSP chip sends a data packet to an external device, the data enter the network coprocessor through DMA and are encrypted in the SA. They then pass through the packet streaming switch into the PA, where the MAC header, IP header and UDP/TCP header are added according to the preset descriptors. The data then pass through the packet streaming switch into the gigabit Ethernet switch subsystem, where they are identified, packed and sent to the external device from the designated port.
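On the host side, the target report arrives as an ordinary network payload. The sketch below shows one way such a payload could be packed and parsed. The wire format (a 16-bit count followed by little-endian range bin, Doppler bin and amplitude per target) and the use of UDP are assumptions made purely for illustration, since the source does not define the message layout.

```python
import socket
import struct

# HYPOTHETICAL wire format: uint16 range bin, uint16 Doppler bin, float32 amplitude.
TARGET_FORMAT = "<HHf"
TARGET_SIZE = struct.calcsize(TARGET_FORMAT)


def pack_targets(targets):
    """Serialise a list of (range_bin, doppler_bin, amplitude) tuples."""
    payload = struct.pack("<H", len(targets))
    for range_bin, doppler_bin, amplitude in targets:
        payload += struct.pack(TARGET_FORMAT, range_bin, doppler_bin, amplitude)
    return payload


def unpack_targets(payload):
    """Parse a payload produced by pack_targets."""
    (count,) = struct.unpack_from("<H", payload, 0)
    return [
        struct.unpack_from(TARGET_FORMAT, payload, 2 + i * TARGET_SIZE)
        for i in range(count)
    ]


def send_targets(targets, host, port):
    """Send one target report as a single UDP datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(pack_targets(targets), (host, port))


report = pack_targets([(120, 5, 3.5)])
```

A fixed-size binary record keeps the datagram small and makes parsing on the host computer trivial.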
(3) Advantages and effects: The advantages of the PD radar signal processing system based on FPGA + multi-core DSP and its parallel implementation method are that the hardware circuit is simple and compact, and that the processing structure uses an FPGA + multi-core DSP architecture, with digital down-conversion in the FPGA core chip and pulse compression, coherent integration and CFAR detection in the DSP chip, making full use of the parallel processing performance of the system.
Brief description of the drawings
Fig. 1 System architecture.
Fig. 2 Digital down-conversion implemented in the FPGA.
Fig. 3 Structure of the SRIO module in the FPGA.
Fig. 4 Structure of the SRIO module in the DSP.
Fig. 5 Radar signal processing flow.
Fig. 6 Memory system design of the multi-core DSP.
Fig. 7 Flow of pulse compression on the multi-core DSP.
Fig. 8 Flow of coherent integration and CFAR detection on the multi-core DSP.
Fig. 9 Structure of the network coprocessor (NETCP).
Fig. 10 Flow of radar signal processing on the multi-core DSP.
The symbols in the figures are as follows:
Fig. 1: SRIO stands for Serial RapidIO, a high-speed serial bus; PCIe stands for Peripheral Component Interconnect Express, a high-speed bus and interface; EMIF stands for external memory interface; DDR3 stands for double data rate 3 SDRAM, third-generation double-data-rate synchronous DRAM.
Fig. 2: AD_data denotes the 14-bit input data; AD_data_I and AD_data_Q denote the I and Q streams obtained after decimation; AD_data_BB denotes the 32-bit data formed by concatenating the upper 16 bits of the filtered I and Q streams.
Fig. 4: FIFO stands for first in, first out, i.e. a first-in first-out queue; VBUSM refers to the master device on the VBUS; LSU is the name of a register group in the DSP; MMR stands for memory-mapped register, a kind of DSP storage.
Fig. 5: LFM refers to the linear frequency modulated signal; FFT refers to the fast Fourier transform; IFFT refers to the inverse fast Fourier transform.
Fig. 6: CORE0 and CORE7 denote core 0 and core 7; L1D/32KB CACHE and L1P/32KB CACHE denote the level-1 data and program memories configured as 32 KB caches; LL2/512KB SRAM denotes the local level-2 memory configured as 512 KB of RAM; CACHE indicates that a memory can be configured as cache; EDMA denotes enhanced direct memory access.
Fig. 7, Fig. 8: IPCGR0 to IPCGR7 denote generation of the inter-core interrupts for cores 0 to 7.
Fig. 9: PKTDMA refers to packet DMA transfer; SA refers to the security accelerator; PA refers to the packet accelerator; GbE Switch SubSystem refers to the gigabit Ethernet switch subsystem; SGMII refers to the serial gigabit media independent interface.
Fig. 10: 0x90000000 and similar values refer to logical addresses in the DSP; IPC0 to IPC7 refer to the interrupts of cores 0 to 7; CFAR refers to constant false alarm rate detection.
Embodiment
The PD radar signal processing system based on FPGA + DSP of the present invention and its multi-core implementation method are described in detail below, following the summary of the invention and with reference to the accompanying drawings:
The present invention realizes an FPGA + DSP based PD radar signal processing hardware system through hardware description language, C language programming and multi-core DSP programming.
(1) the invention provides a kind of PD Radar Signal Processing System based on FPGA+ multi-core DSP, comprise fpga core chip and peripheral minimum system circuit, dsp chip and peripheral minimum system circuit, gigabit networking interface chip, power supply chip and level transferring chip.Its system architecture as shown in Figure 1, position annexation between them and signal trend are: the straight ripple of radar that FPGA reception data acquisition chip collects and echoed signal, exist after carrying out down-converted in internal memory, transferred data to the DDR3 in dsp chip by the SRIO interface between fpga core chip and dsp chip after receiving the data of a frame, then dsp chip carries out pulse compression, correlative accumulation and constant false alarm rate (CFAR) detection, obtain the information of impact point, finally by network interface, target information is uploaded to host computer.
The FPGA core chip is the XC6VSX315T from the Xilinx Virtex-6 family. It is manufactured on a 40 nm process with the third-generation Xilinx ASMBL architecture, has efficient dual-register 6-input LUT (look-up table) logic, abundant I/O resources and a large amount of on-chip memory, and supports DDR3. Compared with the previous generation, power consumption is reduced by 50% and cost by 20%. The chip also has strong signal processing capability and serial connectivity based on low-power 6.5 Gbps GTX transceivers, which guarantees high-speed serial transmission between the FPGA and the DSP. After receiving the data sampled by the data acquisition chip, the FPGA core chip applies digital down-conversion and stores the result in memory; once one frame of data has been collected, it is transferred to the DSP chip over SRIO. Fig. 2 is a schematic diagram of the digital down-conversion implemented in the FPGA.
The FPGA minimum system peripheral circuit comprises a clock source and a program-loading FLASH, which assist the FPGA core chip in performing its processing functions. Because the program in the FPGA is lost at power-off, the program code must be stored in a program-loading FLASH; at each power-up, the program in the FLASH is automatically loaded into the FPGA core chip so that it operates normally. The clock source provides the system clock for the FPGA core chip: the crystal oscillator generates the required frequency and feeds it directly to the FPGA core chip.
The DSP chip is the TMS320C6678 multi-core processor from TI. It uses a modified Harvard bus structure: one 256-bit program bus, two 64-bit data buses and one 32-bit dedicated DMA bus. The processing unit uses a high-performance, advanced very long instruction word (VLIW) architecture and can execute eight 32-bit instructions in parallel per clock cycle. The chip is built from eight DSP cores running at up to 1.25 GHz and delivers 320 GMAC of fixed-point and 160 GFLOP of floating-point performance on a single chip. Each core has 32 KB of L1P and 32 KB of L1D, both configurable as cache, and 512 KB of local L2 (LL2) SRAM configurable as RAM or cache; there is also 4 MB of multi-core shared memory, which can be used as shared L2 SRAM or shared L3 SRAM. The built-in DDR3 controller addresses a 33-bit address space, i.e. 8 GB of storage. The TMS320C6678 provides a rich set of peripheral interfaces; depending on the task, Serial RapidIO, PCIe, HyperLink and DDR3 are the interfaces mainly used in signal processing. In the present invention, SRIO receives each frame of radar pulse-train data transferred from the FPGA; the multi-core tasks are designed and allocated, and the multi-core parallel implementation of pulse compression, coherent accumulation and CFAR detection is arranged; the target point information is finally obtained and uploaded to the host computer through the network port, improving the performance of the computation.
The DSP minimum system peripheral circuit comprises a clock source, a program-loading FLASH and external DDR3 memory, which assist the DSP chip in performing its processing functions. Because the program in the DSP chip is lost at power-off, the program code must be stored in a program-loading FLASH; at each power-up, the program in the FLASH is automatically loaded into the DSP chip so that it operates normally. Because the DSP chip has to buffer and process large amounts of data, its storage must be extended externally: four DDR3 memory devices are attached to the DSP chip and hold the raw data and the buffered results of intermediate processing. The clock source provides the system clock for the DSP core chip: the crystal oscillator generates the required frequency and feeds it directly to the DSP chip.
The Gigabit network interface chip is the Marvell 88E1111 Ethernet physical-layer chip. Under the control of the EMAC module of the DSP chip, it exchanges raw information data with the host computer over Gigabit Ethernet.
The power supply chip provides the voltages needed by the whole system. An isolated +5V supply is fed to the system from outside, and the power supply chips convert it into +3.3V, +2.5V, +1.8V, +1.5V, +1.2V, +1.0V, +0.75V, CMGT_AVTT and CMGT_AVCC. These are supplied to the FPGA core chip (+3.3V, +2.5V, +1.8V, +1.0V), the program-loading FLASH (+3.3V, +1.8V), the DSP core chip (+3.3V, +1.8V, +1.0V), the DDR3 modules (+1.5V, +0.75V), the Gigabit network interface chip (+3.3V, +1.2V) and the clock source (+3.3V); CMGT_AVTT and CMGT_AVCC supply +1.2V and +1.0V respectively to the FPGA high-speed interfaces.
The level conversion chip is the SN74ALVC164245 from TI. It supports level conversion from +2.5V to +3.3V and from +3.3V to +5V.
(2) The present invention is a PD radar signal processing system based on FPGA + multi-core DSP and its parallel implementation method. The processing flow is summarized as follows: the FPGA core chip receives the intermediate-frequency data produced by data acquisition, applies digital down-conversion to obtain baseband signal data, and buffers the data in on-chip RAM; after one frame of pulse-train data has been obtained, the data is transferred over the high-speed serial SRIO link between the FPGA core chip and the DSP chip; the DSP chip stores the down-converted baseband frame in DDR3, and a multi-core parallel design performs pulse compression on the frame; the pulse-compressed data is stored in a DDR3 buffer, and a multi-core design then performs parallel coherent accumulation and CFAR detection to obtain the target point information; finally, the target point information is sent to the host computer through the network port.
In summary, the specific steps of the parallel implementation method of the PD radar signal processing system based on FPGA + multi-core DSP are as follows:
Step 1: Perform digital down-conversion of the intermediate-frequency signal in the FPGA core chip
This step is performed by the digital down-conversion module in the FPGA core chip, which consists of data acquisition, modulo-2 decimation logic, a delay-correction filter and a dual-port RAM module. The module uses a polyphase filter structure: through 2x odd/even decimation followed by delay correction, the sampled intermediate-frequency data is down-converted to complex baseband data. The data acquisition module takes the data sampled by the data acquisition chip as a single-ended input. The modulo-2 decimation logic splits the input data into I and Q streams; a flag bit is inverted on each rising clock edge, and the data is negated when the flag is 1. The delay correction is implemented with 12th-order FIR filters whose coefficients are generated with Matlab; the upper 16 bits of the filtered I and Q streams are concatenated into 32-bit baseband data.
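The decimation and sign-inversion logic described above can be illustrated with a minimal Python sketch. It rests on assumptions that the text does not state: the intermediate frequency is taken to be one quarter of the sampling rate (the usual condition for this structure), the flag toggles once per output sample pair, and I is placed in the upper half of the 32-bit word. The delay-correction FIR is omitted because its coefficients are not given. The function names are hypothetical.

```python
def mod2_decimate(samples):
    """Split real IF samples into sign-alternated I (even) and Q (odd) streams."""
    i_out, q_out = [], []
    flag = 0  # inverted after every output pair; 1 means "negate the data"
    for k in range(0, len(samples) - 1, 2):
        sign = -1 if flag else 1
        i_out.append(sign * samples[k])      # even-indexed samples feed I
        q_out.append(sign * samples[k + 1])  # odd-indexed samples feed Q
        flag ^= 1
    return i_out, q_out


def pack_iq16(i_val, q_val):
    """Concatenate two signed 16-bit values into one 32-bit word (I high, Q low: assumed order)."""
    return ((i_val & 0xFFFF) << 16) | (q_val & 0xFFFF)


# A tone exactly at fs/4 is cos(pi*n/2) = 1, 0, -1, 0, ... and should map to DC.
tone = [1, 0, -1, 0] * 4
i_stream, q_stream = mod2_decimate(tone)
```

With this input the I stream becomes constant and the Q stream is zero, which is what a carrier at the assumed intermediate frequency should look like at baseband.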
Step 2: Buffer the data in the FPGA core chip and configure SRIO in preparation for data transfer
The FPGA core chip and the DSP chip are interconnected by x4 SRIO with a per-lane rate of 5 Gbps; allowing for 8b/10b encoding, the effective bandwidth is up to 2 GB/s (16 Gbps). The structure is shown in Fig. 3. The present invention uses the Serial RapidIO IP core provided by Xilinx and designs a local side and a remote side. The design comprises local data processing, remote data processing and the IP core. Local data processing is responsible for sending local data request packets and receiving the response packets that the remote side returns to the local side. Remote data processing is responsible for receiving data packets from the remote side. The main functions of the IP core are packing and unpacking, initialization and protocol implementation.
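As a quick check of the link budget, four lanes at 5 Gbps with 8b/10b encoding carry 16 Gbit/s of decoded data, i.e. 2 GB/s. The sketch below works this out and estimates the transfer time of one frame. The frame size is a hypothetical value chosen for illustration, and packet header and protocol overhead are ignored.

```python
def srio_payload_gbps(lanes=4, lane_rate_gbps=5, data_bits=8, line_bits=10):
    """Decoded data rate of the link in Gbit/s after 8b/10b line coding."""
    return lanes * lane_rate_gbps * data_bits / line_bits


def frame_transfer_time_us(frame_bytes, payload_gbps):
    """Idealised transfer time in microseconds (no header or protocol overhead)."""
    return frame_bytes * 8 / (payload_gbps * 1e3)


rate_gbps = srio_payload_gbps()      # Gbit/s
rate_gbytes = rate_gbps / 8          # GB/s
# Hypothetical 2 MB frame, used only to illustrate the calculation.
t_us = frame_transfer_time_us(2_000_000, rate_gbps)
```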
When the local side sends data to the remote side, the data is written to the transmit buffer, and a start signal is given to the transmit controller once writing is complete. Based on the configured SRIO header information (packet type, packet size, packet count, destination address, destination ID and so on), local data processing controls the request generation module to build packets from the transmit buffer. These packets are processed by the IP core and transmitted to the remote side. When the remote side has received a packet and sends a response packet back to the local side, the IP core decodes the received serial bit stream into SRIO packets and passes them to the local response processing module. Remote data processing controls the remote request processing module to write the packet data into the receive buffer and, once writing is complete, sends a completion signal to the module that needs the data, which can then read it from the receive buffer.
Step 3: Configure the SRIO registers in the DSP chip, receive the data and store it in DDR3
The SRIO module at the DSP end is shown in Fig. 4. From the point of view of this module, the local device is the DSP chip and the remote device is the FPGA core chip. The SRIO module in the DSP chip consists mainly of the load/store module and the physical layer. Under the control of the CPU/EDMA, the load/store module issues VBUSM requests to the DDR3 memory and receives VBUSM responses. Inside the load/store module, the MMR command registers control the transmit buffer and the receive buffer, which are connected to the FIFOs of the physical layer.
In the DSP chip, SRIO is usually configured by calling CSL (Chip Support Library) functions, which cover enabling, initialization, opening and link establishment. The SRIO setup can be divided into four steps: address mapping; configuring the device ID, the SRIO ports and the interrupt vector; configuring the registers, including transfer mode and rate; and waiting for the link to come up. Once the link is established, the DSP chip can receive and send SRIO packets. The DSP chip and the FPGA core chip must each know the other's destination ID and start address in order to transfer data correctly. DirectIO mode is selected for the data transfer, which requires only the address mapping between the TX and RX sides.
Step 4: Implement multi-core pulse compression in the DSP chip
This step is carried out in the DSP chip and requires a multi-core task-parallel scheme for the pulse compression processing. The radar signal processing flow is shown in Fig. 5. Pulse compression is computed pulse by pulse, whereas coherent accumulation and CFAR detection are computed per range slice (range bin) of the pulse train; coherent accumulation and CFAR detection can start only after the whole pulse train has been pulse-compressed. The flow is therefore divided into two subtasks: the first performs pulse compression of all pulses, and the second performs coherent accumulation and CFAR detection. Because the data of different pulses are largely independent during pulse compression, and the data of different range slices are largely independent during coherent accumulation and CFAR detection, the multi-core implementation uses a master-slave model: one core is responsible for scheduling and distributing tasks, and the other cores execute their tasks in parallel.
The configuration of the memory system must be considered in the multi-core design. Signal processing involves frequent memory accesses, and access performance directly affects the efficiency of the algorithm. Memory access performance depends on where code and data are stored and on the access method. By configuring the size of each memory, the cache size and the data access method, the transfer rates under different conditions can be obtained. The analysis shows that external memory is usually accessed by EDMA, while internal memory is accessed directly by the core or by IDMA; core access to internal memory performs well because of the wider data and instruction buses, so critical data and variables are placed in LL2; configuring L1D and L1P as cache speeds up instruction and data accesses between the core and memory. In the memory system designed for the present invention, the raw data is placed in DDR3, the LL2 of each core is configured as SRAM and holds the data used during pulse compression, coherent accumulation and CFAR detection, and L1D and L1P are configured as cache to improve the core's access efficiency. The memory design of the system is shown in Fig. 6.
In the radar signal processing flow, the FPGA core chip sends the data to DDR3 over the SRIO interface; the pulse train is then divided into 8 parts, and each core processes one part. In practice pulse compression is implemented in the frequency domain: the spectrum of the input signal is multiplied by the complex conjugate of the spectrum of the local reference signal, and the result is obtained by an inverse Fourier transform. The main steps are input, FFT, complex multiplication, IFFT and output. In the input step, the EDMA module transfers the pulse data from DDR3 to the LL2 of each core; a processing buffer is then configured in the LL2 of each core and the data is processed there; in the output step, the EDMA module transfers the data to the DDR3 buffer. The EDMA transfer mode must be configured for both input and output. The twiddle factors needed by the FFT and IFFT are computed at initialization and stored in the LL2 of each core; the conjugate spectrum of the reference signal needed for the complex multiplication is likewise stored in the LL2 of each core at initialization and is used directly during computation.
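The FFT, complex multiplication and IFFT steps can be sketched in a few lines of Python. This is an illustration of the frequency-domain technique only, not the DSP implementation: it uses a simple recursive radix-2 FFT, requires both inputs to be zero-padded to the same power-of-two length, and recomputes the reference spectrum on every call, whereas the text precomputes it at initialization. The function names and the chirp used in the example are assumptions.

```python
import cmath


def fft(x, inverse=False):
    """Recursive radix-2 FFT; len(x) must be a power of two. No 1/N scaling."""
    n = len(x)
    if n == 1:
        return list(x)
    sign = 1 if inverse else -1
    even = fft(x[0::2], inverse)
    odd = fft(x[1::2], inverse)
    out = [0j] * n
    for k in range(n // 2):
        tw = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]  # twiddle factor
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out


def pulse_compress(echo, reference):
    """Matched filtering: IFFT( FFT(echo) * conj(FFT(reference)) )."""
    n = len(echo)
    ref_spec_conj = [v.conjugate() for v in fft(reference)]
    spectrum = fft(echo)
    product = [a * b for a, b in zip(spectrum, ref_spec_conj)]
    return [v / n for v in fft(product, inverse=True)]


# Example: an 8-sample chirp, zero-padded to 32, echoed with a delay of 5 samples.
N = 32
ref = [cmath.exp(1j * cmath.pi * 0.1 * k * k) if k < 8 else 0j for k in range(N)]
echo = ref[-5:] + ref[:-5]  # circular shift by 5 samples
compressed = pulse_compress(echo, ref)
peak_index = max(range(N), key=lambda m: abs(compressed[m]))
```

The compressed output peaks at the delay of the echo, with a peak magnitude equal to the energy of the reference pulse.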
The cores in a multi-core system are relatively independent and must communicate with each other to schedule and complete tasks. Each core must be assigned its subtask and its order of execution. Because pulse compression is implemented in master-slave mode and each core processes its own data, no inter-core data sharing or transfer is needed, so inter-core interrupts are used for communication. Writing IPCGRx generates an interrupt to core x; its SRCS0 to SRCS27 bits set the interrupt source flags, and here SRCSx is used as the flag for an interrupt from core x. Writing the SRCC0 to SRCC27 bits of IPCARx clears the corresponding flags.
The parallel structure must be considered when arranging the flow. In this design the DDR3 bus becomes the bottleneck: the DDR3 data bus can be configured to at most 64 bits and there is only one bus, so when all cores access it at once they wait a long time and data cannot be returned promptly. The design therefore staggers the data input of the cores: once one core has finished its input, an inter-core interrupt triggers the input of the next core. On output, the output of the previous round and the input of the next round are combined into a single stage that serves as the input stage of the next round, which avoids contention for the DDR3 bus. After core 7 has finished all of its data processing, the next round of pulse processing begins, and this repeats until every core has completed the pulses assigned to it. The multi-core flow chart of pulse compression is shown in Fig. 7.
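The effect of staggering can be illustrated with a small timing model. The sketch below is not taken from the patent: it simply models each core's DDR3 input as a time window, starts each window when the previous one ends (standing in for the inter-core interrupt), and counts how many pairs of windows overlap on the single bus. The window length is a hypothetical number of cycles.

```python
def input_windows(n_cores, t_input, staggered=True):
    """Return (core, start, end) DDR3 input windows for one processing round."""
    windows = []
    for core in range(n_cores):
        # Staggered: core k starts when core k-1 has finished its input.
        start = core * t_input if staggered else 0
        windows.append((core, start, start + t_input))
    return windows


def count_bus_conflicts(windows):
    """Count pairs of windows that overlap in time on the shared bus."""
    conflicts = 0
    for a in range(len(windows)):
        for b in range(a + 1, len(windows)):
            _, s1, e1 = windows[a]
            _, s2, e2 = windows[b]
            if s1 < e2 and s2 < e1:
                conflicts += 1
    return conflicts


T_INPUT = 100  # hypothetical input duration per core, in cycles
staggered_conflicts = count_bus_conflicts(input_windows(8, T_INPUT))
simultaneous_conflicts = count_bus_conflicts(input_windows(8, T_INPUT, staggered=False))
```

In this model the staggered schedule has no overlapping bus accesses, while starting all eight cores at once makes every pair of cores contend.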
Step 5: Implement multi-core coherent accumulation and CFAR detection in the DSP chip
After pulse compression is complete, the data resides in the DDR3 buffer of the DSP chip. The output of each pulse compression is stored in DDR3 column-wise, so that coherent accumulation and CFAR detection can then process the data row by row. The output stage of pulse compression thus performs the row/column transposition of the two-dimensional data.
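A minimal sketch of this transposition on output, assuming a flat row-major buffer in which each row holds one range bin across all pulses. The buffer layout, the function names and the small test matrix are illustrative assumptions; on the DSP the strided write would be performed by the EDMA transfer rather than by a loop.

```python
def store_pulse_as_column(buffer, pulse_index, compressed, n_pulses):
    """Write one pulse's range profile so that each range bin's samples end up contiguous."""
    for rbin, value in enumerate(compressed):
        buffer[rbin * n_pulses + pulse_index] = value


def range_slice(buffer, rbin, n_pulses):
    """Read all pulses of one range bin as a single contiguous run."""
    return buffer[rbin * n_pulses:(rbin + 1) * n_pulses]


# Example: 3 pulses of 4 range bins each; sample value encodes (pulse, bin).
N_PULSES, N_BINS = 3, 4
pulses = [[10 * p + r for r in range(N_BINS)] for p in range(N_PULSES)]
ddr3_buffer = [None] * (N_PULSES * N_BINS)
for p, profile in enumerate(pulses):
    store_pulse_as_column(ddr3_buffer, p, profile, N_PULSES)
slice_for_bin_2 = range_slice(ddr3_buffer, 2, N_PULSES)
```

Because the transposition happens while writing, the later per-range-slice stages can fetch each slice with one contiguous transfer.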
Coherent accumulation and CFAR detection are implemented in the frequency domain; each core performs the data input of its range slices, coherent accumulation, complex modulus calculation and CFAR detection.
Each range slice of the pulse train is transferred from DDR3 to LL2 by EDMA. In practice coherent accumulation is implemented with an FFT; its twiddle factors are likewise computed at initialization and stored in LL2, and the FFT is computed entirely within LL2. The complex samples obtained after coherent accumulation require a modulus calculation to obtain the amplitude. The conventional square root of the sum of squares is time-consuming and must guard against data overflow, so the complex modulus is computed with a linear approximation:
|X| ≈ g(I, Q) = a·max{|I|, |Q|} + b·min{|I|, |Q|}
where a and b are weighting coefficients whose values depend on the required relative error. An equiripple approximation is used: a and b are chosen so that the error is below 0.8%, giving the following formula:
|X| ≈ max{T_L + (1/8)·T_S, (27/32)·T_L + (9/16)·T_S}, where T_L = max{|I|, |Q|} and T_S = min{|I|, |Q|}
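The approximation is easy to evaluate numerically. The sketch below implements the two-segment formula, reading TL as the larger and TS as the smaller of |I| and |Q| (an interpretation of the symbols), and sweeps the phase angle of a unit-magnitude sample so that the worst-case relative error of the coefficients can be measured directly. The function names and the number of sweep steps are arbitrary choices.

```python
import math


def approx_magnitude(i_val, q_val):
    """Two-segment linear approximation of sqrt(I^2 + Q^2) without a square root."""
    t_l = max(abs(i_val), abs(q_val))  # larger component
    t_s = min(abs(i_val), abs(q_val))  # smaller component
    return max(t_l + t_s / 8, 27 * t_l / 32 + 9 * t_s / 16)


def worst_relative_error(steps=10000):
    """Sweep the phase of a unit-magnitude sample over one quadrant."""
    worst = 0.0
    for k in range(steps + 1):
        theta = (math.pi / 2) * k / steps
        err = abs(approx_magnitude(math.cos(theta), math.sin(theta)) - 1.0)
        worst = max(worst, err)
    return worst


example = approx_magnitude(3, 4)   # exact magnitude is 5
measured_error = worst_relative_error()
```

All coefficients are sums of powers of two, so on fixed-point hardware the multiplications reduce to shifts and adds.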
CFAR detection uses the conventional cell-averaging CFAR (CA-CFAR) method, which in practical data processing is usually implemented as a sliding-window detector. For each window position, guard cells are placed on both sides of the cell under test to prevent a target that spans several cells from interfering with the estimate. To decide whether a cell contains a target, the left and right reference cells are averaged and multiplied by a threshold factor to obtain the detection threshold, which is then compared with the cell under test. In an engineering implementation CFAR detection usually consumes a large number of cycles; the TMS320C6678 supports software pipelining for parallel execution, so the CFAR detection can be optimized by scheduling assembly instructions. The procedure is as follows: first write the C code, then rewrite it as linear assembly and determine the iteration cycle count from the code; next draw the dependency graph, i.e. determine the functional unit used by each instruction (a single TMS320C6678 core has eight functional units and can issue eight instructions in parallel per cycle); finally, according to the dependency graph, assign the register file for each instruction and schedule the instructions, taking into account the loop prolog, loop kernel and loop epilog of the pipeline as well as instruction latencies and live ranges, to obtain the parallel instruction schedule.
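The sliding-window detector described above can be sketched as follows. This is a plain reference version of cell-averaging CFAR, not the software-pipelined assembly implementation; the number of reference cells, the number of guard cells and the threshold factor are hypothetical parameters, and cells too close to the edges to have a full window are simply skipped.

```python
def ca_cfar(magnitudes, n_ref=8, n_guard=2, threshold_factor=4.0):
    """Return the indices of cells that exceed threshold_factor * mean(reference cells)."""
    hits = []
    half = n_ref // 2  # reference cells on each side
    for cut in range(half + n_guard, len(magnitudes) - half - n_guard):
        # Reference cells sit outside the guard cells on both sides of the cell under test.
        left = magnitudes[cut - n_guard - half:cut - n_guard]
        right = magnitudes[cut + n_guard + 1:cut + n_guard + 1 + half]
        noise = (sum(left) + sum(right)) / (len(left) + len(right))
        if magnitudes[cut] > threshold_factor * noise:
            hits.append(cut)
    return hits


# Flat noise floor with one single-cell target and one target spread over two cells.
single = [1.0] * 40
single[20] = 50.0
spread = [1.0] * 40
spread[20] = spread[21] = 50.0
single_hits = ca_cfar(single)
spread_hits = ca_cfar(spread)
```

The second example shows the purpose of the guard cells: the neighbouring target cell falls inside the guard region and does not raise the threshold, so both cells are detected.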
DDR3 bus contention must also be considered when coherent accumulation and CFAR detection are implemented on multiple cores, so the cores' accesses are again staggered. Each core then completes coherent accumulation and CFAR detection for its share of the range slices and obtains its target point information; finally, a centroid calculation is performed on the target point information from the 8 cores to obtain the target information result. The multi-core flow chart of coherent accumulation and CFAR detection is shown in Fig. 8.
Step 6: Send the target information to the host computer through the network port
After the target point information has been obtained by coherent accumulation and CFAR detection, it is transferred to the host computer through the network port of the 6678. The 6678 contains a network coprocessor (NETCP), which is mainly used to process Ethernet packets. It consists of four parts: the PKTDMA controller, which controls packet DMA transfers; the packet accelerator (PA), which identifies and classifies packets; the security accelerator (SA), which encrypts and decrypts packets; and the Gigabit Ethernet switch subsystem. Together they carry out the fast exchange of packets between the DSP chip and external devices. The structure of the network coprocessor is shown in Fig. 9.
When the DSP chip sends a packet to an external device, the data enters the network coprocessor by DMA and is encrypted in the SA; it then passes through the packet streaming switch into the PA, where the MAC header, IP header and UDP/TCP header are added according to the preconfigured descriptor. The data then passes through the packet streaming switch into the Gigabit Ethernet switch subsystem, where it is identified, packed and sent to the external device from the designated port.
Finally, the flow chart of the multi-core implementation of pulse compression, coherent accumulation and CFAR detection in the DSP chip is shown in Fig. 10.
The main devices of the hardware circuit of the FPGA+DSP-based PD radar signal processing system and its multi-core implementation method are as follows:
Selection of the FPGA core chip:
The Xilinx Virtex-6 XC6VSX315T is selected.
Virtex-6 is a new-generation FPGA family from Xilinx. The family offers a well-balanced combination of flexibility, hard IP cores, transceiver capability and development tool support, and provides a complete solution for applications in communications, networking and digital processing.
The Virtex-6 XC6VSX315T is a member of the Virtex-6 family and has the following main features:
1) 49,200 slices;
2) 12 MMCM (Mixed-Mode Clock Manager) modules;
3) 25,344 Kbit of RAM;
4) 720 general-purpose I/O pins;
5) 24 GTX modules;
6) 2 PCIe interface modules.
In addition, Xilinx provides a powerful development platform (ISE), with which the developer completes the entire design.
Selection of the program-loading FLASH chip:
The Xilinx XCF16P is selected.
The XCF16P has a capacity of 16 Mbit, which supports power-up program loading for a variety of Xilinx FPGA core chips.
Selection of the DSP chip:
The TI TMS320C6678 is selected.
The TMS320C6678 uses a modified Harvard bus structure: one 256-bit program bus, two 64-bit data buses and one 32-bit dedicated DMA bus. Its main features are as follows:
1) The processing unit uses a high-performance, advanced very long instruction word (VLIW) architecture and can execute eight 32-bit instructions in parallel per clock cycle;
2) The TMS320C6678 is built from eight DSP cores running at up to 1.25 GHz; a single device delivers 320 GMAC of fixed-point and 160 GFLOP of floating-point performance;
3) The TMS320C6678 integrates large on-chip memories: each core has 32 KB of L1P and 32 KB of L1D configurable as cache and 512 KB of L2 configurable as RAM or cache; there is also 4 MB of multi-core shared memory, which can be used as shared L2 SRAM or shared L3 SRAM;
4) The TMS320C6678 provides a rich set of peripheral interfaces, of which this system mainly uses SRIO and DDR3. These interfaces serve the DSP chip's computation: SRIO is used for data communication between the DSP chip and the FPGA core chip, and DDR3 is used as external storage for the DSP.
In addition, Texas Instruments provides an integrated development environment for the DSP chip (CCS5), with which the developer completes all design and debugging.
Selection of the power supply chip:
The power supply chips are the LTM4616 and LTM4627 from Linear Technology.
The key characteristics of the LTM4616 are as follows:
1) input voltage range 2.7V to 5.5V;
2) output voltage range 0.6V to 5V;
3) overcurrent and overtemperature protection;
4) output overvoltage protection;
5) 15mm × 15mm × 2.82mm LGA package and 15mm × 15mm × 3.42mm BGA package.
The key characteristics of the LTM4627 are as follows:
1) input voltage range 4.5V to 20V;
2) output voltage range 0.6V to 5V;
3) differential remote sense amplifier for precise voltage regulation;
4) output overvoltage protection;
5) 15mm × 15mm × 4.32mm LGA package and 15mm × 15mm × 4.92mm BGA package.
Gigabit network interface chip:
The Gigabit network interface chip is the Marvell 88E1111. It fully supports the IEEE 802.3 protocol family, has a built-in 1.25 G serializer/deserializer suitable for Gigabit optical transmission, is manufactured in standard digital CMOS, and supports auto-negotiation and an ultra-low-power mode. It supports the Gigabit Media Independent Interface (GMII), the Reduced GMII (RGMII) and the Serial GMII (SGMII).
Level conversion chip:
The level conversion chip is the SN74ALVC164245 from TI. It belongs to TI's Widebus family, supports level conversion between +2.5V and +3.3V and between +3.3V and +5V, and is used for asynchronous communication between data buses.
System implementation results
The system is programmed in the VHDL hardware description language and fixed-point C, and the resulting modules are downloaded to the Xilinx Virtex-6 XC6VSX315T and the TMS320C6678. In the experiment, one frame of simulated pulse-train data is fed into the FPGA module as input and observed with ChipScope Pro (the logic analyzer bundled with Xilinx ISE) and on the PC.
The resources used in the FPGA core chip are as follows:
Table 1: System resource usage of the FPGA core chip
The pulse compression algorithm for 200 pulses is implemented on the multi-core DSP chip and compared with the single-core implementation; the measured clock cycle consumption is as follows:
Table 2: Clock cycles consumed by single-core and multi-core pulse compression
The coherent accumulation and CFAR detection algorithms for 200 pulses are implemented on the multi-core DSP chip and compared with the single-core implementation; the measured clock cycle consumption is as follows:
Table 3: Clock cycles consumed by single-core and multi-core coherent accumulation and CFAR detection
The PD radar signal processing system based on FPGA + multi-core DSP and its parallel implementation method implement PD radar signal detection using VHDL, fixed-point C and assembly code. Practical tests verified the multi-core parallel optimization and demonstrated the feasibility of the multi-core design. The system has the following features:
1) The hardware circuit is simple and compact, providing exploration and a basis for future system integration.
2) The processing architecture is FPGA + multi-core DSP: digital down-conversion is performed in the FPGA core chip, and pulse compression, coherent accumulation and CFAR detection are performed in the DSP chip, making full use of the system's parallel processing capability.
3) Tests show that the multi-core optimization of the radar signal detection flow improves performance.
It can be seen that the PD radar signal processing system based on FPGA + multi-core DSP has high application value, improves real-time performance in practical applications, and has good application prospects.

Claims (2)

1. A PD radar signal processing system based on an FPGA and a multi-core DSP, characterized in that: it comprises an FPGA core chip and its peripheral minimum system circuit, a DSP chip and its peripheral minimum system circuit, a Gigabit network interface chip, a power supply chip and a level conversion chip; the FPGA core chip receives the radar direct-wave and echo signals acquired by the data acquisition chip, down-converts them and stores them in memory; once one frame of data has been received, it is transferred to the DDR3 memory of the DSP chip over the SRIO interface between the FPGA core chip and the DSP chip; the DSP chip then performs pulse compression, coherent accumulation and constant false alarm rate detection to obtain the target point information; finally, the target information is uploaded to the host computer through the network port;
the FPGA core chip is the XC6VSX315T, which has efficient dual-register 6-input LUT logic, abundant I/O resources and a large amount of on-chip memory, and supports DDR3; in addition, the chip has strong signal processing capability and serial connectivity based on low-power 6.5 Gbps GTX transceivers, which guarantees high-speed serial transmission between the FPGA core chip and the DSP chip; after receiving the data sampled by the data acquisition chip, the FPGA core chip applies digital down-conversion and stores the result in memory, and once one frame of data has been collected, transfers it to the DSP chip over SRIO;
the peripheral minimum system circuit of the FPGA core chip comprises a clock source and a program-loading FLASH, which assist the FPGA core chip in performing its processing functions; because the program in the FPGA core chip is lost at power-off, the program code must be stored in a program-loading FLASH, and at each power-up the program in the FLASH is automatically loaded into the FPGA core chip so that it operates normally; the clock source provides the system clock for the FPGA core chip, the crystal oscillator generating the required frequency and feeding it directly to the FPGA core chip;
the DSP chip is the TMS320C6678 multi-core processor, which uses a modified Harvard bus structure: one 256-bit program bus, two 64-bit data buses and one 32-bit dedicated DMA bus; the processing unit uses a high-performance, advanced very long instruction word architecture and can execute eight 32-bit instructions in parallel per clock cycle; the chip is built from eight DSP cores running at up to 1.25 GHz and delivers 320 GMAC of fixed-point and 160 GFLOP of floating-point performance on a single chip; each core has 32 KB of L1P and 32 KB of L1D configurable as cache and 512 KB of LL2 SRAM configurable as RAM or cache, and there is also 4 MB of multi-core shared memory, which can be used as shared L2 SRAM or shared L3 SRAM; the built-in DDR3 controller addresses a 33-bit address space, i.e. 8 GB of storage; the TMS320C6678 provides a rich set of peripheral interfaces, of which, depending on the task, the Serial RapidIO, PCIe, HyperLink and DDR3 interfaces are mainly used in signal processing; SRIO receives each frame of radar pulse-train data transferred from the FPGA core chip, the multi-core tasks are designed and allocated, and the multi-core parallel implementation of pulse compression, coherent accumulation and CFAR detection is arranged; the target point information is finally obtained and uploaded to the host computer through the network port, improving the performance of the computation;
the peripheral minimum system circuit of the DSP chip comprises a clock source, a program-loading FLASH and external DDR3 memory, which assist the DSP chip in performing its processing functions; because the program in the DSP chip is lost at power-off, the program code must be stored in a program-loading FLASH, and at each power-up the program in the FLASH is automatically loaded into the DSP chip so that it operates normally; because the DSP chip has to buffer and process large amounts of data, its storage must be extended externally, and four DDR3 memory devices are attached to the DSP chip to hold the raw data and the buffered results of intermediate processing; the clock source provides the system clock for the DSP chip, the crystal oscillator generating the required frequency and feeding it directly to the DSP chip;
the Gigabit network interface chip is the 88E1111 Ethernet physical-layer chip, which, under the control of the EMAC module of the DSP chip, exchanges raw information data with the host computer over Gigabit Ethernet;
This power supply chip provides the voltage needed for whole system work, extraneous to the isolation voltage of system input+5V, by power supply chip ,+5V voltage is changed into+3.3V, + 2.5V, + 1.8V, + 1.5V, + 1.2V, + 1.0V, + 0.75V, CMGT_AVTT, CMGT_AVCC, be supplied to fpga core chip+3.3V respectively, + 2.5V, + 1.8V, + 1.0V, program loads FLASH+3.3V, + 1.8V, dsp chip+3.3V, + 1.8V, + 1.0V, DDR3 module+1.5V, + 0.75V, gigabit networking interface chip+3.3V, + 1.2V, clock providing source+3.3V, wherein CMGT_AVTT and CMGT_AVCC is respectively fpga core chip high speed interface and provides+1.2V and+1.0V voltage,
The level conversion chip is an SN74ALVC164245, which supports level conversion from +2.5 V to +3.3 V and from +3.3 V to +5 V.
2. A parallel implementation method for the PD radar signal processing system based on FPGA + multi-core DSP, the specific steps of which are as follows:
Step 1: perform digital down-conversion of the intermediate-frequency signal in the FPGA core chip
This step is performed by the digital down-conversion module in the FPGA core chip, which consists of data acquisition, modulo-2 decimation logic, delay-correction filters and a dual-port RAM module; the module uses a polyphase filter structure: the IF sampled data are decimated by two into even and odd streams and, after delay correction, are down-converted to baseband complex data; the data acquisition module takes the samples produced by the data acquisition chip as a single-ended input; the modulo-2 decimation logic splits the input data into I and Q streams, inverts a flag bit on each rising clock edge, and negates the data when the flag is 1; the delay-correction filtering is implemented with 12th-order FIR filters whose coefficients are generated in Matlab; after filtering, the upper 16 bits of the I and Q channels are concatenated into 32-bit baseband data;
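The even/odd split with sign alternation can be illustrated in software. The following is a minimal sketch, assuming the IF sits at a quarter of the sampling rate and that the sign flag toggles once per I/Q pair; the function names, the flag timing and the choice of putting I in the upper half of the 32-bit word are assumptions for illustration. The delay-correction FIR is omitted because its coefficients are not given in the text.

```python
def ddc_mod2(samples):
    """Split real IF samples (IF assumed at fs/4) into sign-alternated I/Q streams."""
    i_stream, q_stream = [], []
    negate = False                      # flag bit; toggled once per I/Q pair here
    for k in range(0, len(samples) - 1, 2):
        sign = -1 if negate else 1
        i_stream.append(sign * samples[k])        # even sample -> I branch
        q_stream.append(-sign * samples[k + 1])   # odd sample  -> Q branch
        negate = not negate
    return i_stream, q_stream


def high16(value, width):
    """Keep the upper 16 bits of a signed value that is `width` bits wide."""
    return value >> (width - 16)


def pack_iq(i16, q16):
    """Concatenate signed 16-bit I and Q into one 32-bit word (I in the upper half)."""
    return ((i16 & 0xFFFF) << 16) | (q16 & 0xFFFF)
```

The sign pattern corresponds to multiplying sample n by exp(-jπn/2), which takes only the values 1, -j, -1 and j, so no multipliers are needed for the mixing stage.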
Step 2: buffer the data in the FPGA core chip and configure SRIO in preparation for data transmission
The FPGA core chip and the DSP chip are interconnected by a x4 SRIO link with a per-lane rate of 5 Gbps; allowing for 8b/10b encoding, the effective bandwidth is up to 16 Gbps (4 × 5 Gbps × 8/10), i.e. 2 GB/s; the design uses the Serial RapidIO IP core provided by Xilinx and is organized into a local side and a remote side, comprising local data processing, remote data processing and the IP core; local data processing sends local data request packets and receives the response packets that the remote side returns to the local side; remote data processing receives the data packets coming from the remote side; the main functions of the IP core are packing and unpacking, initialization and protocol implementation;
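As a quick check of the link budget: four lanes at 5 Gbps with 8b/10b coding (8 payload bits per 10 line bits) carry 16 Gbit/s of payload, which is 2 GB/s. A minimal sketch of that arithmetic follows; the function name is illustrative, and protocol overhead from packet headers is not modeled.

```python
def srio_effective_gbps(lanes, lane_rate_gbps):
    """Payload rate in Gbit/s after 8b/10b line coding (headers not counted)."""
    return lanes * lane_rate_gbps * 8 / 10


effective = srio_effective_gbps(4, 5.0)   # x4 link, 5 Gbps per lane
effective_gbytes = effective / 8          # convert bits to bytes
```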
When the local side sends data to the remote side, the data are written into the transmit buffer and, once writing is complete, a start signal is given to the transmit controller; local data processing, according to the configured SRIO header information (packet type, packet size, number of packets, transmit address and destination ID), controls the request generation module to build packets from the transmit buffer; these packets are processed by the IP core and transferred to the remote side; when the remote side has received the packets and returns response packets to the local side, the IP core recovers the SRIO packets from the received serial bit stream and passes them to the local data response processing module; remote data processing controls the remote data request processing module to write the packet data into the receive buffer and, once writing is complete, sends a completion signal to the module that needs the data, which then reads the data from the receive buffer;
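The buffer-to-packet flow described above can be sketched in software. This is an illustration only, assuming a payload limit of 256 bytes per packet; the class `SrioRequest`, its field names and the `"NWRITE"` default are hypothetical stand-ins for the header information listed in the text, not the interface of the Xilinx IP core.

```python
from dataclasses import dataclass

MAX_PAYLOAD = 256  # bytes; assumed data payload limit per packet


@dataclass
class SrioRequest:
    packet_type: str   # e.g. "NWRITE" (illustrative)
    dest_id: int       # device ID of the remote endpoint
    address: int       # destination address of this payload
    payload: bytes


def build_requests(tx_buffer, base_address, dest_id, packet_type="NWRITE"):
    """Cut the transmit buffer into request packets of at most MAX_PAYLOAD bytes."""
    requests = []
    for offset in range(0, len(tx_buffer), MAX_PAYLOAD):
        chunk = tx_buffer[offset:offset + MAX_PAYLOAD]
        requests.append(SrioRequest(packet_type, dest_id, base_address + offset, chunk))
    return requests


def receive(requests, rx_buffer, base_address):
    """Remote side: write each payload into the receive buffer at its address."""
    for req in requests:
        start = req.address - base_address
        rx_buffer[start:start + len(req.payload)] = req.payload
```

Because every packet carries its own destination address, the receive side can place payloads without tracking packet order.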
Step 3: configure the SRIO registers in the DSP chip, receive the data and store them in DDR3
For the SRIO module on the DSP side, the local device is the DSP chip and the remote device is the FPGA core chip; the SRIO module in the DSP chip consists mainly of a load/store unit and the physical layer; under the control of the CPU/EDMA, the load/store unit issues VBUSM requests to the DDR3 memory and accepts the VBUSM responses; within the load/store unit, the MMR command registers control the transmit buffer and the receive buffer and connect to the FIFOs of the physical layer;
In the DSP chip, SRIO is usually configured by calling functions of the CSL and the on-chip support library, covering enabling, initialization, opening and establishing communication; the SRIO setup has four steps: address mapping; configuring the device ID, SRIO port and interrupt vectors; configuring the registers, including transfer mode and rate; and waiting for the link to come up; once the link is up, the DSP chip can receive and send SRIO packets; the DSP chip and the FPGA core chip must each know the other's destination ID and start address for data to be transferred correctly; DirectIO mode is selected for data transfer, which requires only the address mapping between the TX and RX sides;
Step 4: implement multi-core pulse compression in the DSP chip
This step is performed in the DSP chip and requires a multi-core task-parallel scheme for the pulse compression processing; pulse compression is computed pulse by pulse, whereas coherent integration and CFAR detection are computed per range slice of the pulse train and can start only after pulse compression of the whole pulse train is complete; the overall flow is therefore divided into two sub-tasks: first, pulse compression of all pulses, and second, coherent integration and CFAR detection; because the data of different pulses are only weakly coupled during pulse compression, and the coherent integration and CFAR detection computations are likewise only weakly coupled, the multi-core implementation uses a master-slave model in which one core schedules and distributes the tasks and the other cores process them in parallel;
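The two-phase structure, with a barrier between the per-pulse work and the per-range-slice work, can be sketched on a host machine. This is a minimal illustration using a thread pool in place of DSP cores; `run_two_phase`, `per_pulse` and `per_slice` are hypothetical names, and the calling thread plays the role of the master core.

```python
from concurrent.futures import ThreadPoolExecutor


def run_two_phase(pulses, per_pulse, per_slice, workers=8):
    """Phase 1 works pulse by pulse; phase 2 works range slice by range slice."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Phase 1: every pulse is independent. list() waits for all of them,
        # which acts as the barrier before phase 2 may start.
        compressed = list(pool.map(per_pulse, pulses))
        # Regroup: one range slice holds the same range bin from every pulse.
        slices = [list(column) for column in zip(*compressed)]
        # Phase 2: every range slice is independent.
        return list(pool.map(per_slice, slices))
```

For example, `run_two_phase([[1, 2, 3], [4, 5, 6]], lambda p: [2 * x for x in p], sum)` doubles each pulse and then sums across pulses per range bin.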
The multi-core design must take the configuration of the memory system into account: signal processing involves data accesses to memory, and access performance directly affects the efficiency of the algorithm; memory access performance depends on where code and data are stored and on the access method; by configuring the size of each memory, the cache size and the data access method, the transfer rates under different conditions are obtained; analysis shows that external memory is normally accessed by EDMA, while internal memory is accessed directly by the core or by IDMA; core access to internal memory performs well because of the wider data and instruction buses, so critical data and variables are placed in LL2; L1D and L1P are configured as cache to speed up instruction and data traffic between the core and memory; in the memory system designed here, the raw data are placed in DDR3, the LL2 of each core is configured as SRAM and holds the data used during pulse compression, coherent integration and CFAR detection, and L1D and L1P are configured as cache to improve the access efficiency of the core; in the radar signal processing flow, the FPGA core chip sends the data to DDR3 over the SRIO interface, the pulse train is then divided into 8 parts, and each core processes one part of the pulse data; in engineering practice pulse compression is implemented in the frequency domain, i.e. the spectrum of the input signal is multiplied by the complex conjugate of the spectrum of the local reference signal and the result is obtained by an inverse Fourier transform; the main steps are input, FFT, complex multiplication, IFFT and output; in the input stage the EDMA module transfers the pulse data from DDR3 to the LL2 of each core, a data processing buffer is then set up in the LL2 of each core and the data are processed, and in the output stage the EDMA module transfers the data to the buffer in DDR3; the transfer mode of the EDMA module must be configured for both input and output; the twiddle factors needed by the FFT and IFFT are computed at initialization and stored in the LL2 of each core, and the conjugated spectrum of the reference signal needed by the complex multiplication is likewise stored in the LL2 of each core at initialization, so that both are used directly during computation;
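The frequency-domain chain (FFT, multiplication by the conjugated reference spectrum, IFFT) with twiddle factors and the reference spectrum prepared once can be sketched as follows. This is a plain-Python illustration, not the DSP library routine; all function names are assumptions, and a simple recursive radix-2 FFT stands in for the optimized kernel.

```python
import cmath


def make_twiddles(n):
    """Twiddle factors W_n^k for k < n/2, computed once at initialization."""
    return [cmath.exp(-2j * cmath.pi * k / n) for k in range(n // 2)]


def fft(x, tw):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return list(x)
    even = fft(x[0::2], tw[0::2])   # every second twiddle serves the half-size FFT
    odd = fft(x[1::2], tw[0::2])
    half = n // 2
    out = [0j] * n
    for k in range(half):
        t = tw[k] * odd[k]
        out[k] = even[k] + t
        out[k + half] = even[k] - t
    return out


def ifft(spectrum, tw):
    """Inverse FFT through the conjugation identity, reusing the same twiddles."""
    n = len(spectrum)
    y = fft([v.conjugate() for v in spectrum], tw)
    return [v.conjugate() / n for v in y]


def reference_spectrum_conj(ref, n, tw):
    """Zero-pad the reference pulse to n points and return its conjugated spectrum."""
    padded = list(ref) + [0j] * (n - len(ref))
    return [v.conjugate() for v in fft(padded, tw)]


def pulse_compress(echo, ref_conj, tw):
    """FFT, multiply by the stored conjugate spectrum, IFFT."""
    spectrum = fft(echo, tw)
    product = [s * r for s, r in zip(spectrum, ref_conj)]
    return ifft(product, tw)
```

The output is the circular cross-correlation of the echo with the reference, so a copy of the reference pulse delayed by d samples produces its peak at index d.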
The cores of a multi-core system are relatively independent and must communicate with each other to schedule and complete the task; each core must be assigned its sub-tasks and their execution order; because pulse compression is implemented in master-slave mode and each core processes its own data, no inter-core data sharing or transfer is needed, so inter-core interrupts are used for communication: writing IPCGRx generates an interrupt to core x, its SRCS0 to SRCS27 bits set the interrupt source flags (SRCSx is used here as the interrupt flag of core x), and writing the SRCC0 to SRCC27 bits of IPCARx clears the corresponding flags;
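The set/clear behavior of the interrupt source flags can be illustrated with a toy register model. This is not a hardware driver: the bit positions (interrupt trigger at bit 0, source flags starting at bit 4) are assumptions made for the sketch, as are the class and method names.

```python
class IpcModel:
    """Toy model of per-core IPCGRx / IPCARx registers."""

    SRC_SHIFT = 4            # SRCS0 / SRCC0 assumed to sit at bit 4
    SRC_MASK = 0xFFFFFFF0    # 28 source bits, SRCS0..SRCS27

    def __init__(self, num_cores=8):
        self.srcs = [0] * num_cores        # latched source flags per core
        self.irq_count = [0] * num_cores   # interrupts raised towards each core

    def write_ipcgr(self, core, value):
        """Latch the source flags; bit 0 raises the interrupt to `core`."""
        self.srcs[core] |= value & self.SRC_MASK
        if value & 1:
            self.irq_count[core] += 1

    def write_ipcar(self, core, value):
        """Writing 1 to an SRCC bit clears the matching SRCS flag."""
        self.srcs[core] &= ~(value & self.SRC_MASK)

    def notify(self, dst, src):
        """Core `src` interrupts core `dst`, marking itself as the source."""
        self.write_ipcgr(dst, (1 << (self.SRC_SHIFT + src)) | 1)

    def sources(self, core):
        """List the source cores whose flags are still pending for `core`."""
        return [s for s in range(28) if (self.srcs[core] >> (self.SRC_SHIFT + s)) & 1]
```

In an interrupt handler built on this pattern, the receiving core would read the pending sources, act on them, and then acknowledge each one so that the next notification from the same source is visible.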
The parallel structure must be considered when arranging the flow; in this design the DDR3 bus becomes the bottleneck: the DDR3 data bus can be configured to at most 64 bits and there is only one bus, so if all cores access it at once each access takes a long time and data cannot be returned promptly; the data input of the cores is therefore staggered, and after one core has finished its input it uses an inter-core interrupt to trigger the input of the next core; on output, the output of the previous round and the input of the next round are combined into one block that serves as the input stage of the next round, which avoids contention for the DDR3 bus; after core 7 has finished processing all its data, the next round of pulse processing starts, until each core has completed its assigned pulse processing task;
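The staggered input, where each core triggers the next one once its own transfer has finished, can be sketched with threads and events standing in for cores and inter-core interrupts. The names `staggered_inputs`, `load` and `process` are illustrative assumptions; only the ordering logic is modeled, not EDMA or bus timing.

```python
import threading


def staggered_inputs(num_cores, load, process):
    """Serialize the input stage core by core; let processing overlap."""
    turn = [threading.Event() for _ in range(num_cores + 1)]
    order = []                       # order in which cores used the shared bus
    results = [None] * num_cores

    def core(k):
        turn[k].wait()               # wait for the trigger from the previous core
        data = load(k)               # exclusive use of the shared memory bus
        order.append(k)
        turn[k + 1].set()            # trigger the input of the next core
        results[k] = process(data)   # runs while later cores are still loading

    threads = [threading.Thread(target=core, args=(k,)) for k in range(num_cores)]
    for t in threads:
        t.start()
    turn[0].set()                    # the master starts the chain at core 0
    for t in threads:
        t.join()
    return order, results
```

Only one core is ever inside `load` at a time, while `process` calls from different cores are free to overlap.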
Step 5: implement multi-core coherent integration and CFAR detection in the DSP chip
After pulse compression is complete, the data reside in the DDR3 buffer of the DSP chip, where the output of each pulse compression has been stored column-wise, so that coherent integration and CFAR detection can process the data row-wise; the output stage of pulse compression thus performs the row/column transposition of the two-dimensional data; coherent integration and CFAR detection are implemented in the frequency domain, and each core performs the range-slice data input, coherent integration, complex magnitude computation and CFAR detection;
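The transposition performed by the output stage amounts to an address calculation. A minimal sketch follows, assuming a flat buffer laid out as range bins by pulses; the function names and the layout are assumptions for illustration.

```python
def store_column_wise(buffer, pulse_index, compressed, num_pulses):
    """Write one compressed pulse as a column of a (range bin x pulse) matrix."""
    for r, value in enumerate(compressed):
        buffer[r * num_pulses + pulse_index] = value


def range_slice(buffer, r, num_pulses):
    """Read one range slice (one row): the same range bin across all pulses."""
    return buffer[r * num_pulses:(r + 1) * num_pulses]
```

With this layout every range slice is contiguous in memory, so it can be fetched in a single block transfer before the Doppler FFT.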
The range slices of the pulse train are transferred as data from DDR3 to LL2 by EDMA; in engineering practice coherent integration is implemented with an FFT whose twiddle factors are likewise computed at initialization and kept in LL2, so that the FFT is computed entirely in LL2; the complex data obtained after coherent integration must be converted to amplitude by taking the magnitude; the conventional square root of the sum of squares is time-consuming and requires attention to data overflow, so a linear approximation is used for the complex magnitude:
$$|X| \approx g(I,Q) = a\,\max\{|I|,|Q|\} + b\,\min\{|I|,|Q|\}$$
where a and b are weighting coefficients whose values depend on the required relative error; using the equiripple approximation method and choosing suitable a and b so that the error is below 0.8%, the formula is as follows, with $T_L$ and $T_S$ denoting the larger and the smaller of $|I|$ and $|Q|$:
$$|X| \approx \max\left\{T_L + \tfrac{1}{8}T_S,\ \tfrac{27}{32}T_L + \tfrac{9}{16}T_S\right\}$$
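The approximation is cheap because all four coefficients are dyadic fractions, so a fixed-point version needs only shifts and adds. The sketch below implements the second formula in floating point and measures its worst-case relative error numerically over a sweep of phase angles, so that the error of a chosen coefficient set can be checked rather than assumed; the function names are illustrative.

```python
import math


def mag_approx(i, q):
    """Two-segment linear approximation of sqrt(i*i + q*q)."""
    t_l = max(abs(i), abs(q))     # larger of |I| and |Q|
    t_s = min(abs(i), abs(q))     # smaller of |I| and |Q|
    return max(t_l + t_s / 8, 27 * t_l / 32 + 9 * t_s / 16)


def worst_relative_error(steps=10000):
    """Sweep unit-magnitude inputs over one quadrant and return the largest error."""
    worst = 0.0
    for k in range(steps + 1):
        theta = (math.pi / 2) * k / steps
        i, q = math.cos(theta), math.sin(theta)
        worst = max(worst, abs(mag_approx(i, q) - 1.0))   # true magnitude is 1
    return worst
```

Because the approximation is homogeneous in (I, Q), sweeping the unit circle is enough to characterize the relative error for all input amplitudes.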
CFAR detection uses conventional cell-averaging CFAR (CA-CFAR) and, in practical data processing, is usually implemented as sliding-window detection; for each window position, guard cells are placed on both sides of the cell under test to prevent a target that spans several cells from corrupting the estimate; to decide whether a cell contains a target, the reference cells on the left and right are averaged and multiplied by a threshold factor to obtain the detection threshold, which is then compared with the cell under test to decide whether a target is present; in an engineering implementation CFAR detection usually consumes a large number of cycles; the TMS320C6678 supports software pipelining for parallel execution, so the CFAR detection is optimized by scheduling the instructions at assembly level; the procedure is as follows: first write the C code, then rewrite it as linear assembly and determine the iteration cycle count from the code; next draw the dependency graph, i.e. determine the functional unit used by each instruction (a single TMS320C6678 core has 8 functional units and can issue 8 instructions in parallel per cycle); finally, based on the dependency graph, assign the register file for each instruction and schedule the instructions, taking into account the loop prolog, loop kernel and loop epilog of the software pipeline as well as instruction latencies and register lifetimes, to obtain the final parallel instruction schedule;
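The sliding-window test itself is compact. Below is a minimal reference sketch of one-dimensional CA-CFAR, assuming the same number of reference cells and guard cells on each side; the function name, the parameter names and the decision to skip cells near the edges are assumptions, and the sketch makes no attempt at the instruction-level optimization described above.

```python
def ca_cfar(power, num_ref, num_guard, factor):
    """Return indices whose value exceeds factor * mean of the reference cells.

    num_ref and num_guard are counted per side of the cell under test.
    Cells too close to either edge for a full window are skipped.
    """
    hits = []
    half = num_ref + num_guard
    for cut in range(half, len(power) - half):
        left = power[cut - half:cut - num_guard]            # left reference cells
        right = power[cut + num_guard + 1:cut + half + 1]   # right reference cells
        noise = (sum(left) + sum(right)) / (2 * num_ref)
        if power[cut] > factor * noise:
            hits.append(cut)
    return hits
```

A production version would typically keep running sums for the two reference windows instead of recomputing them, which is the kind of loop that benefits from software pipelining.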
The DDR3 bus contention problem must also be considered when coherent integration and CFAR detection are implemented on multiple cores, so the accesses are again staggered; each core then performs coherent integration and CFAR detection on its own share of the range slices and obtains target point information; finally the target point information from the 8 cores is merged by a centroid computation to obtain the target information result;
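One common way to merge several detections of the same target is an amplitude-weighted centroid. The sketch below assumes that each detection is a (range bin, Doppler bin, amplitude) triple and that all detections passed in belong to one target; both the tuple layout and the weighting are assumptions, since the text does not spell out the computation.

```python
def centroid(points):
    """Amplitude-weighted centroid of a list of (range_bin, doppler_bin, amplitude)."""
    total = sum(a for _, _, a in points)
    range_c = sum(r * a for r, _, a in points) / total
    doppler_c = sum(d * a for _, d, a in points) / total
    return range_c, doppler_c
```

For example, two detections at range bins 10 and 12 with amplitudes 1 and 3 merge to a range estimate of 11.5, closer to the stronger return.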
Step 6: send the target information to the host computer through the network port
The target point information obtained from coherent integration and CFAR detection is transferred to the host computer through the network port of the TMS320C6678; the TMS320C6678 contains a network coprocessor, NETCP, for handling Ethernet packets; it consists of four parts: the PKTDMA controller, which controls packet DMA transfers; the packet accelerator (PA), which identifies and classifies packets; the security accelerator (SA), which encrypts and decrypts packets; and the gigabit Ethernet switch subsystem; together they provide fast packet exchange between the DSP chip and external devices;
When the DSP chip sends a data packet to an external device, the data enter the network coprocessor through DMA and are encrypted in the SA; they then pass through the packet streaming switch into the PA, where the MAC, IP and UDP/TCP headers are added according to preset descriptors; they then pass through the packet streaming switch into the gigabit Ethernet switch subsystem, where the data are identified and packed and sent to the external device from the designated port.
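On the host side, the target information arrives as the payload of ordinary UDP datagrams. The sketch below shows one possible payload layout and a sender; the record format (`!HHf`: range bin, Doppler bin, amplitude in network byte order) and all names are hypothetical, since the text does not define the report format.

```python
import socket
import struct

REPORT = struct.Struct("!HHf")   # hypothetical record: range bin, Doppler bin, amplitude


def pack_targets(targets):
    """Serialize a list of (range_bin, doppler_bin, amplitude) records."""
    return b"".join(REPORT.pack(r, d, a) for r, d, a in targets)


def unpack_targets(payload):
    """Parse a datagram payload back into a list of records."""
    return [REPORT.unpack_from(payload, off)
            for off in range(0, len(payload), REPORT.size)]


def send_targets(targets, host, port):
    """Send one datagram with all records; the OS adds the UDP, IP and MAC headers."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(pack_targets(targets), (host, port))
```

Using a fixed-size record in network byte order keeps the parser on the host independent of the endianness of the sender.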
CN201510411844.XA 2015-07-14 2015-07-14 PD radar signal processing system based on FPGA + multi-core DSP and parallel implementation method thereof Active CN105045763B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510411844.XA CN105045763B (en) 2015-07-14 2015-07-14 PD radar signal processing system based on FPGA + multi-core DSP and parallel implementation method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510411844.XA CN105045763B (en) 2015-07-14 2015-07-14 PD radar signal processing system based on FPGA + multi-core DSP and parallel implementation method thereof

Publications (2)

Publication Number Publication Date
CN105045763A CN105045763A (en) 2015-11-11
CN105045763B CN105045763B (en) 2018-07-13

Family

ID=54452321

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510411844.XA Active CN105045763B (en) 2015-07-14 2015-07-14 PD radar signal processing system based on FPGA + multi-core DSP and parallel implementation method thereof

Country Status (1)

Country Link
CN (1) CN105045763B (en)

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN201465562U (en) * 2009-06-29 2010-05-12 北京理工大学 Two-channel digital radiofrequency storage board
CN202563080U (en) * 2012-05-07 2012-11-28 中国人民解放军海军702厂 High-speed data acquisition device based on FPGA and DSP
CN103605309A (en) * 2013-11-25 2014-02-26 北京航空航天大学 Four-channel high-capacity waveform storage system and construction method thereof
CN203480022U (en) * 2013-05-16 2014-03-12 中国电子科技集团公司第二十七研究所 Super-high speed general radar signal processing board
KR101385439B1 (en) * 2013-04-03 2014-04-15 주식회사 이노와이어리스 Method for transferring between fpga and dsp connected with srio interface
CN103793355A (en) * 2014-01-08 2014-05-14 西安电子科技大学 General signal processing board card based on multi-core DSP (digital signal processor)
CN103885919A (en) * 2014-03-20 2014-06-25 北京航空航天大学 Multi-DSP and multi-FPGA parallel processing system and implement method

Cited By (83)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105279127B (en) * 2015-11-25 2017-12-08 哈尔滨工业大学 A kind of FPGA program downloading systems and method based on PCI or PCIe buses
CN105279127A (en) * 2015-11-25 2016-01-27 哈尔滨工业大学 FPGA program downloading system based on PCI or PCIe bus, and method
CN105516624A (en) * 2015-12-10 2016-04-20 合肥师范学院 Multi-core digital signal processor (DSP) based multi-channel image acquisition processing system
CN105548970A (en) * 2015-12-11 2016-05-04 无锡市雷华科技有限公司 Flying bird detection radar processor
CN106093895A (en) * 2016-06-03 2016-11-09 山东省科学院自动化研究所 A kind of method of estimation of pulse Doppler radar amplitude jitter
CN106484658A (en) * 2016-09-26 2017-03-08 西安电子科技大学 The device and method of 65536 points of pulse compressions is realized based on FPGA
CN106484658B (en) * 2016-09-26 2019-01-11 西安电子科技大学 The device and method of 65536 pulses compression is realized based on FPGA
CN106647496A (en) * 2016-12-23 2017-05-10 潘敏 Novel power supply control system
CN106803816A (en) * 2017-03-27 2017-06-06 南京大学 A kind of configurable self-adapting load balance system and method
CN106803816B (en) * 2017-03-27 2020-04-07 南京大学 Configurable self-adaptive load balancing system and method
CN109478144B (en) * 2017-07-05 2021-12-14 上海寒武纪信息科技有限公司 Data processing device and method
CN109478144A (en) * 2017-07-05 2019-03-15 上海寒武纪信息科技有限公司 A kind of data processing equipment and method
CN107154948A (en) * 2017-07-11 2017-09-12 北京航天发射技术研究所 A kind of multi-protocol data exchange method applied to car launcher information control system
CN107612862A (en) * 2017-09-11 2018-01-19 中国电子科技集团公司第四十研究所 A kind of LTE A Pro OFDM modulating devices and method
CN107612862B (en) * 2017-09-11 2020-06-09 中国电子科技集团公司第四十一研究所 OFDM modulation device and method for LTE-A Pro
CN107943744A (en) * 2017-10-25 2018-04-20 西南电子技术研究所(中国电子科技集团公司第十研究所) Synthesization communication system polycaryon processor reconfigurable architecture
CN108055244A (en) * 2017-11-27 2018-05-18 珠海市鸿瑞信息技术股份有限公司 A kind of dual processor system network security partition method based on SRIO interfacings
CN108055244B (en) * 2017-11-27 2020-09-08 珠海市鸿瑞信息技术股份有限公司 SRIO interface technology-based network security isolation method for dual-processing system
CN108089184A (en) * 2017-12-08 2018-05-29 中国船舶重工集团公司第七二四研究所 A kind of TWS radar targets spatial position grouping parallel tracking processing method
CN108153561A (en) * 2017-12-18 2018-06-12 北京遥测技术研究所 The Ethernet loading method and signal processing system of a kind of DSP and FPGA
CN108153561B (en) * 2017-12-18 2021-12-07 北京遥测技术研究所 Ethernet loading method and signal processing system for DSP and FPGA
CN108169746A (en) * 2017-12-21 2018-06-15 南京理工大学 Chirp Semi-active RADAR guidance header signal processing method
CN108169746B (en) * 2017-12-21 2021-09-21 南京理工大学 Linear frequency modulation pulse semi-active radar seeker signal processing method
CN108462620A (en) * 2018-02-11 2018-08-28 北京控制工程研究所 A kind of Gb SpaceWire bus systems
CN108462620B (en) * 2018-02-11 2020-10-20 北京控制工程研究所 Gilbert-level SpaceWire bus system
CN108763011A (en) * 2018-03-27 2018-11-06 中国电子产品可靠性与环境试验研究所((工业和信息化部电子第五研究所)(中国赛宝实验室)) SoC chip check figure detection method, device, system and storage medium
CN108710596A (en) * 2018-05-10 2018-10-26 中国人民解放军空军工程大学 It is a kind of to assist the desktop of processing card is super to calculate hardware platform based on DSP and FPGA more
CN108717187A (en) * 2018-05-23 2018-10-30 桂林电子科技大学 Based on multinuclear digital signal processor through-wall radar motion target tracking imaging method
CN109001688A (en) * 2018-05-28 2018-12-14 中国电子科技集团公司第二十九研究所 A kind of intermediate data storage method and device based on radar signal parallel processing
CN109001688B (en) * 2018-05-28 2022-08-02 中国电子科技集团公司第二十九研究所 Intermediate data storage method and device based on radar signal parallel processing
CN108828556A (en) * 2018-07-12 2018-11-16 北京大汉正源科技有限公司 laser radar control system
CN109239688A (en) * 2018-07-31 2019-01-18 电子科技大学 A kind of efficient Doppler filter group realized based on FPGA
CN108733617A (en) * 2018-09-20 2018-11-02 电信科学技术第五研究所有限公司 64 parallel-by-bits of Fibre channel scramble the FPGA implementation method of descrambler
CN109557510A (en) * 2018-11-30 2019-04-02 安徽四创电子股份有限公司 A kind of LFMCW Radar Signals processor
CN109581170A (en) * 2018-12-06 2019-04-05 贵州电网有限责任公司 A kind of high frequency response bandwidth Lightning Over-voltage on-line monitoring system
CN109782890A (en) * 2018-12-11 2019-05-21 广东高云半导体科技股份有限公司 A kind of electronic equipment and its low-power consumption FPGA device
CN109856599A (en) * 2018-12-24 2019-06-07 南京理工大学 A kind of array radar signal processing system and method based on DSP and server
CN111666106A (en) * 2019-03-07 2020-09-15 慧与发展有限责任合伙企业 Data offload acceleration from multiple remote chips
CN110309086A (en) * 2019-05-17 2019-10-08 全球能源互联网研究院有限公司 A kind of multichannel low speed mouth and single channel high speed port data interactive method
CN110275141A (en) * 2019-06-26 2019-09-24 西安电子科技大学 Radar signal processing circuit, encapsulation and implementation method based on sip technique
CN110441739A (en) * 2019-07-02 2019-11-12 中国航空工业集团公司雷华电子技术研究所 A method of improving radar SRIO transmission reliability
CN110489356A (en) * 2019-08-06 2019-11-22 上海商汤智能科技有限公司 Information processing method, device, electronic equipment and storage medium
CN110648273A (en) * 2019-09-27 2020-01-03 中国科学院长春光学精密机械与物理研究所 Real-time image processing apparatus
WO2021057302A1 (en) * 2019-09-29 2021-04-01 中兴通讯股份有限公司 Digital pre-distortion implementation method and system, readable storage medium, and dpd apparatus
CN110825687B (en) * 2019-10-25 2024-01-05 天津津航技术物理研究所 Dual-mode tracking method based on DSP multi-core architecture
CN110825687A (en) * 2019-10-25 2020-02-21 天津津航技术物理研究所 Dual-mode tracking method based on DSP (digital Signal processor) multi-core architecture
CN111190853A (en) * 2019-12-26 2020-05-22 南京理工大学 High-speed communication system between pieces based on EMIF and SRIO interface
CN111130586A (en) * 2019-12-27 2020-05-08 中科院计算技术研究所南京移动通信与计算创新研究院 Frequency conversion method and device
CN111208504A (en) * 2020-02-28 2020-05-29 成都汇蓉国科微系统技术有限公司 PD radar waveform configuration method and device based on DSP
CN111431596A (en) * 2020-03-24 2020-07-17 中星联华科技(北京)有限公司 Signal speed-up method and circuit
CN111431596B (en) * 2020-03-24 2021-04-02 中星联华科技(北京)有限公司 Signal speed-up method and circuit
CN111475362A (en) * 2020-04-20 2020-07-31 西安太乙电子有限公司 Multi-core isomorphic DSP (digital Signal processor) testing system and method
CN111626011B (en) * 2020-04-20 2023-07-07 芯创智(上海)微电子有限公司 FPGA comprehensive rapid iteration method and system based on configurable breakpoint restart
CN111475362B (en) * 2020-04-20 2024-03-01 西安太乙电子有限公司 Multi-core isomorphic DSP processor test system and method
CN111626011A (en) * 2020-04-20 2020-09-04 芯创智(北京)微电子有限公司 FPGA comprehensive rapid iteration method and system based on configurable endpoint restart
CN111983615A (en) * 2020-07-13 2020-11-24 惠州市德赛西威智能交通技术研究院有限公司 Distributed radar signal processing system and device
CN111858384A (en) * 2020-08-04 2020-10-30 上海无线电设备研究所 Efficient test method for constant false alarm detection software unit
CN111858384B (en) * 2020-08-04 2024-01-02 上海无线电设备研究所 Efficient test method for constant false alarm detection software unit
CN112068102A (en) * 2020-09-10 2020-12-11 成都汇蓉国科微系统技术有限公司 Radar signal processing computing power balance design and device
CN112068102B (en) * 2020-09-10 2023-08-25 成都汇蓉国科微系统技术有限公司 Radar signal processing calculation power balance design and device
CN112199320A (en) * 2020-09-28 2021-01-08 西南电子技术研究所(中国电子科技集团公司第十研究所) Multi-channel reconfigurable signal processing device
CN112504428A (en) * 2020-10-19 2021-03-16 威海北洋光电信息技术股份公司 Low-power-consumption embedded high-speed distributed optical fiber vibration sensing system and application thereof
CN112214325A (en) * 2020-10-20 2021-01-12 杭州电子科技大学 FPGA task dynamic arrangement method, device, chip and storage medium
CN112347721A (en) * 2020-10-29 2021-02-09 北京长焜科技有限公司 System for realizing data processing acceleration based on FPGA and acceleration method thereof
CN112347721B (en) * 2020-10-29 2023-05-26 北京长焜科技有限公司 System for realizing data processing acceleration based on FPGA and acceleration method thereof
CN112601234A (en) * 2020-11-20 2021-04-02 中电科仪器仪表(安徽)有限公司 Multi-core DSP-based multi-channel 5G signal demodulation device
CN114116554A (en) * 2021-11-05 2022-03-01 中国航空工业集团公司雷华电子技术研究所 Radar data forwarding architecture and method based on FPGA
CN114296650B (en) * 2021-12-27 2023-09-26 中国航天科工集团八五一一研究所 Intermediate frequency high-capacity acquisition data file processing method based on FPGA
CN114296650A (en) * 2021-12-27 2022-04-08 中国航天科工集团八五一一研究所 Intermediate-frequency large-capacity acquired data file processing method based on FPGA
CN114549275B (en) * 2022-02-10 2024-03-26 中国科学院上海技术物理研究所 Real-time digital image processing method based on double-quad-core DSP signal processing board
CN114549275A (en) * 2022-02-10 2022-05-27 中国科学院上海技术物理研究所 Real-time digital image processing method based on double-quad-core DSP signal processing board
CN114928575A (en) * 2022-06-02 2022-08-19 江苏新质信息科技有限公司 Multi-algorithm core data packet order-preserving method and device based on FPGA
CN114928575B (en) * 2022-06-02 2023-08-11 江苏新质信息科技有限公司 Multi-algorithm core data packet order preservation method and device based on FPGA
CN115017081A (en) * 2022-06-30 2022-09-06 重庆秦嵩科技有限公司 Multi-path SRIO interface clock resource sharing system based on domestic FPGA
CN115017081B (en) * 2022-06-30 2023-06-23 重庆秦嵩科技有限公司 Multipath SRIO interface clock resource sharing system based on domestic FPGA
CN116366084A (en) * 2022-09-07 2023-06-30 无锡国芯微电子系统有限公司 Multi-channel PDW high-density real-time processing method based on FPGA
CN116366084B (en) * 2022-09-07 2023-11-10 无锡国芯微电子系统有限公司 Multi-channel PDW high-density real-time processing method based on FPGA
CN115856824B (en) * 2023-01-10 2023-05-09 电子科技大学 Multi-phase parallel distance dimension CFAR realization method with low resource consumption
CN115856824A (en) * 2023-01-10 2023-03-28 电子科技大学 Multiphase parallel distance dimension CFAR implementation method with low resource consumption
CN116106851A (en) * 2023-04-04 2023-05-12 中国科学院空天信息创新研究院 Method and device for compressing raw data of synthetic aperture radar
CN116796674B (en) * 2023-08-24 2023-11-24 上海合见工业软件集团有限公司 Heterogeneous hardware simulation method and system
CN116796674A (en) * 2023-08-24 2023-09-22 上海合见工业软件集团有限公司 Heterogeneous hardware simulation method and system
CN116974235A (en) * 2023-09-22 2023-10-31 天津工业大学 Multichannel data acquisition card control system and method based on ZYNQ

Also Published As

Publication number Publication date
CN105045763B (en) 2018-07-13

Similar Documents

Publication Publication Date Title
CN105045763A (en) FPGA (Field Programmable Gate Array) and multi-core DSP (Digital Signal Processor) based PD (Pulse Doppler) radar signal processing system and parallel realization method therefor
CN102929158B (en) Multi-core multi-model parallel distributed type real-time simulation system
CN108051786A (en) Broadband target simulator verification platform and verification method
CN109388595A (en) High-bandwidth memory systems and logic dice
CN109946666A (en) Millimeter-wave radar signal processing system based on MPSoC
CN105468568B (en) Efficient coarseness restructurable computing system
CN109828941A (en) AXI2WB bus bridge implementation method, device, equipment and storage medium
CN207817702U (en) Data processing system for improving data processing speed
CN111736115B (en) MIMO millimeter wave radar high-speed transmission method based on improved SGDMA + PCIE
CN103310850A (en) Built-in self-test structure and method for on-chip network resource node storage device
CN106209693A (en) High Speed Data Collection Method based on network-on-chip
CN109521400A (en) Radar Signal Processing platform based on FPGA, DSP and ARM
CN104850516B (en) DDR frequency conversion design method and apparatus
CN104545902A (en) Four stage flow line digital signal processor and wireless on-chip system chip with same
CN104714907B (en) Design method for converting a PCI bus to ISA and APB buses
CN103092119B (en) FPGA-based bus state supervision method
CN109446740A (en) System-on-chip architecture performance simulation platform
CN109856599A (en) Array radar signal processing system and method based on DSP and server
Huang et al. FPGA-based IoT sensor HUB
CN105718349B (en) Across die interface monitoring or global observing prioritisation of messages
CN103166863A (en) Lumped type 8 X 8 low-latency and high-bandwidth crosspoint cache queued on-chip router
CN103455714B (en) Time consumption calculating method of FPGA (Field Programmable Gate Array)-based DPR SoC self-reconfiguration system and application thereof
CN105893036A (en) Compatible accelerator extension method for embedded system
CN201909847U (en) Double-channel digital signal acquisition device based on VXI (VME eXtensions for Instrumentation) interface
CN103336752A (en) Microcontroller real-time data transmission device and method

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 2022-09-27

Address after: Building 1, No. 768, Jianghong Road, Changhe Street, Binjiang District, Hangzhou City, Zhejiang Province, 310051

Patentee after: Hangzhou Leishi Technology Co.,Ltd.

Address before: No. 37 Xueyuan Road, Haidian District, Beijing, 100191

Patentee before: Beihang University