System and method for improving the processing speed of real-time signal pattern recognition in a DSP+FPGA framework
Technical field
The present invention relates to using the external memory interface (EMIF) of a DSP to transfer signals between the DSP and an FPGA, and to using the FPGA in place of the DSP to perform pattern classification, thereby improving the data processing speed of the whole pattern recognition system. The invention belongs to the field of electronic information.
Background art
As embedded technology is applied ever more widely, and as it inevitably trends toward greater intelligence, the demand for embedded pattern recognition keeps growing. Applications that require pattern recognition often collect large amounts of information that must be distilled, within a short time, into an accurate and concise description of the target. The bottleneck of embedded pattern recognition lies precisely in guaranteeing the speed of signal pattern recognition; for applications with large input volumes and strict real-time requirements on the results (such as video signals or network data streams), traditional techniques often cannot meet the required processing speed.
In current embedded systems, the typical DSP+FPGA architecture uses the FPGA, with its flexible programmability, for external interfacing and timing control, while the main signal processing computation is performed by the DSP, making full use of the DSP's arithmetic capability. Although this architecture exploits the FPGA's advantage in generating clock and timing signals, it leaves the FPGA's advantage in parallel computation unused: simple timing control occupies only a small fraction of the FPGA's resources, and the rest sit idle. The present scheme addresses exactly this point by placing a neural network classifier, which is particularly well suited to parallel computation, inside the FPGA. The FPGA can then take over the classification stage of pattern recognition from the DSP, increasing the parallelism of the whole signal processing flow, raising the processing speed of the system, and satisfying the requirements of real-time pattern recognition applications.
The DSP in this scheme is from the TMS320C6000 series of TI. The EMIF of this series supports seamless connection of various external devices, including SRAM, SDRAM, ROM, FIFO and external shared devices. The external memory space is divided into four independent storage spaces (CE spaces), controlled by four external chip-select (CE) lines and the corresponding CE space control registers.
Summary of the invention
The objective of the present invention is to overcome the deficiencies of the prior art by providing a system and method for improving the processing speed of real-time signal pattern recognition in a DSP+FPGA framework, raising the processing speed of this architecture by 30-50%.
The method exploits the parallel computing capability of the FPGA: the neural network classifier that would otherwise run on the DSP is placed inside the FPGA, sharing the DSP's load. The DSP and FPGA communicate over the EMIF bus in EDMA mode, so communication consumes no CPU time slices; the DSP runs multiple threads, so that it is not left idle while the FPGA performs classification, achieving parallel processing between the DSP and the FPGA.
To achieve the above objective, the design of the present invention is as follows:
The embedded real-time pattern recognition system based on the DSP+FPGA framework uses the DSP as the main processing chip and the FPGA as the co-processing chip. The memory comprises SDRAM and FLASH: the SDRAM serves as primary memory, providing working memory for the DSP, while the FLASH serves as auxiliary memory, exploiting its retention of data across power-off to store the DSP's boot information and program data together with the weight data of the neural network in the FPGA. The DSP, FPGA, SDRAM and FLASH are all connected to the EMIF bus of the DSP, allowing convenient data exchange among them. Besides this core, the system also includes peripheral modules such as signal acquisition, automatic control, output display and human-machine interaction, but these are unrelated to the core content of the invention and are not described in detail.
Of the four external memory spaces (CE spaces) of the DSP, CE0 is configured as a synchronous space and assigned to the primary memory device SDRAM, while CE1 and CE2 are configured as asynchronous spaces and assigned to the FLASH and the internal RAM of the FPGA, respectively. Besides connecting to the corresponding pins of the SDRAM and FLASH, the address lines, data lines and control lines of the DSP's EMIF must also connect to pins of the FPGA.
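On the TMS320C6000 series, each CE space occupies a fixed window in the DSP's address map, so from software the FPGA's on-chip RAM in CE2 is reached through an ordinary pointer into that window. The sketch below is purely illustrative, assuming the usual C6000 EMIF base addresses (CE0 at 0x80000000, CE1 at 0x90000000, CE2 at 0xA0000000) and uniform window sizes; the exact values must be taken from the concrete device's datasheet:

```c
#include <assert.h>
#include <stdint.h>

/* Assumed EMIF CE-space base addresses for the C6000 series
 * (verify against the concrete device's datasheet).           */
#define CE0_BASE 0x80000000u   /* synchronous space: SDRAM     */
#define CE1_BASE 0x90000000u   /* asynchronous space: FLASH    */
#define CE2_BASE 0xA0000000u   /* asynchronous space: FPGA RAM */
#define CE_SIZE  0x10000000u   /* assumed window size          */

/* Return the base of the CE space a physical address falls into,
 * or 0 if it lies outside the three configured spaces.          */
uint32_t ce_space_of(uint32_t addr)
{
    if (addr >= CE2_BASE && addr < CE2_BASE + CE_SIZE) return CE2_BASE;
    if (addr >= CE1_BASE && addr < CE1_BASE + CE_SIZE) return CE1_BASE;
    if (addr >= CE0_BASE && addr < CE0_BASE + CE_SIZE) return CE0_BASE;
    return 0;
}
```

On the DSP itself, an address in CE2 would be dereferenced through a volatile pointer so that the compiler does not cache EMIF reads.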
For a signal to be identified, the overall processing flow is as follows:
1. The DSP obtains the signal to be identified through its signal acquisition thread.
2. The DSP preprocesses the acquired signal to obtain the targets of interest in the signal; these targets are the subjects of pattern recognition.
3. The DSP performs feature extraction on each target of interest, packs the extracted features, and sends them to the FPGA over the EMIF bus in enhanced direct memory access (EDMA) mode for classification.
4. The FPGA receives the feature packet from the DSP in its on-chip RAM and feeds the corresponding features into the neural network classifier module inside the FPGA; the classification result from this module is passed back and temporarily stored in the on-chip RAM, and sent to the DSP over the EMIF bus when the DSP needs it.
5. The DSP polls an address of the RAM in the FPGA that records the number of classification results not yet retrieved by the DSP; if this count is greater than 0, the DSP reads one classification result from the FPGA and performs the corresponding subsequent processing and output control.
Based on the above inventive concept, the present invention adopts the following technical solutions:
A system for improving the processing speed of real-time signal pattern recognition in a DSP+FPGA framework, characterized in that the system architecture builds the real-time pattern recognition core from four chips: a DSP, an FPGA, SDRAM and FLASH. The DSP serves as the main processing chip, the FPGA as the co-processing chip, the SDRAM as primary memory providing working memory for the DSP, and the FLASH as auxiliary memory. The DSP, FPGA, SDRAM and FLASH are all connected to the EMIF bus of the DSP, allowing convenient data exchange among them.
The EMIF of the above DSP has multiple CE spaces, namely CE0 to CE3. One of them connects the DSP's primary memory device SDRAM, another connects the auxiliary memory device FLASH, and a third connects an external memory device emulated by the on-chip RAM of the FPGA. The data lines, address lines and read/write control lines of the DSP's EMIF, besides their usual connections to the SDRAM and FLASH, must also all connect, together with the chip-select line of the corresponding CE space, to pins of the FPGA.
A method for improving the processing speed of real-time signal pattern recognition in a DSP+FPGA framework uses the above system to perform signal processing, characterized in that the overall signal processing flow is:
1. Signal acquisition is performed by the DSP;
2. Signal preprocessing and feature extraction are performed by the DSP;
3. Neural network classification is performed by the FPGA;
4. Classification result processing is performed by the DSP.
To support this flow, the DSP uses multithreading to run four threads: a main thread, a signal acquisition thread, a signal processing thread and a result processing thread.
The main thread is the supervisor of the other three threads. Its flow is:
1. Complete DSP initialization;
2. Start the other three threads;
3. Enter the waiting state.
The signal acquisition thread performs the acquisition of the input signal. Its flow is:
1. Initialize the acquisition device;
2. Open the acquisition port;
3. Wait for signal input; if a signal arrives, go to step 4, otherwise keep waiting;
4. Put the acquired signal into a queue on the primary memory SDRAM, the input signal queue, then return to step 3.
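The input signal queue of step 4 is elsewhere described (Fig. 4) as a section of global SDRAM memory managed in software as a first-in first-out structure. A minimal C ring-buffer sketch of such a queue follows; the names and sizes (sigq_push, SIGQ_CAP and so on) are illustrative assumptions:

```c
#include <string.h>

/* Illustrative fixed-size FIFO for acquired signals, as would be
 * placed in global SDRAM memory; all names here are assumptions. */
#define SIGQ_CAP   8      /* queue slots                 */
#define SIG_LEN    16     /* samples per acquired signal */

typedef struct {
    short data[SIGQ_CAP][SIG_LEN];
    int   head, tail, count;
} sigq_t;

static sigq_t q;          /* one section of global memory on SDRAM */

/* Acquisition thread, step 4: enqueue one acquired signal.
 * Returns 0 on success, -1 if the queue is full.               */
int sigq_push(const short *sig)
{
    if (q.count == SIGQ_CAP) return -1;
    memcpy(q.data[q.tail], sig, sizeof q.data[0]);
    q.tail = (q.tail + 1) % SIGQ_CAP;
    q.count++;
    return 0;
}

/* Processing thread: dequeue the oldest signal.
 * Returns 0 on success, -1 if the queue is empty.              */
int sigq_pop(short *sig)
{
    if (q.count == 0) return -1;
    memcpy(sig, q.data[q.head], sizeof q.data[0]);
    q.head = (q.head + 1) % SIGQ_CAP;
    q.count--;
    return 0;
}
```

In a real system, push and pop would additionally be protected against concurrent access by the acquisition and processing threads, for example by a semaphore or by briefly disabling interrupts.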
The signal processing thread performs the preprocessing and feature extraction of the signal. Its flow is:
1. Check whether the input signal queue is empty; if empty, keep checking, otherwise go to step 2;
2. Read one group of input signals from the input signal queue;
3. Preprocess the input signal;
4. Detect the targets of interest in the input signal; these targets are the subjects of pattern recognition;
5. Check the number of targets of interest not yet processed; if the number is greater than 0, go to step 6, otherwise return to step 1;
6. Perform feature extraction on one unprocessed target of interest;
7. Generate a feature packet from the feature data extracted in step 6;
8. Trigger an enhanced direct memory access (EDMA) transfer between the DSP and the FPGA to pass the feature packet to the FPGA over the EMIF bus, then return to step 5.
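The feature packet of step 7 is not given a concrete layout in this description, so the following C sketch assumes one purely for illustration: a small header (target identifier and feature count) followed by fixed-point feature values, staged in DSP memory for the EDMA transfer of step 8:

```c
#include <stdint.h>
#include <string.h>

/* Illustrative layout of a feature packet as it might be staged in
 * DSP memory before the EDMA transfer; the field names and sizes
 * are assumptions, not part of the described system.             */
#define MAX_FEAT 32

typedef struct {
    uint32_t target_id;           /* which target of interest       */
    uint32_t n_feat;              /* number of features that follow */
    int16_t  feat[MAX_FEAT];      /* fixed-point feature values     */
} feat_pkt_t;

/* Signal processing thread, step 7: pack extracted features.
 * Returns 0 on success, -1 if too many features are supplied.    */
int make_feat_pkt(feat_pkt_t *p, uint32_t id,
                  const int16_t *f, uint32_t n)
{
    if (n > MAX_FEAT) return -1;
    p->target_id = id;
    p->n_feat = n;
    memcpy(p->feat, f, n * sizeof *f);
    return 0;
}
```

Step 8 would then program an EDMA channel with this buffer as source and an address in the FPGA's CE2 space as destination; because the EDMA controller moves the data itself, the transfer consumes no CPU time slices, which is what keeps the DSP free during the transfer.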
The result processing thread performs the processing of the classification results. Its flow is:
1. Read the value of a RAM register on the FPGA; this register records the number of classification results not yet processed;
2. Check whether the value read in step 1 is greater than 0; if so, go to step 3, otherwise return to step 1;
3. Change the value of the RAM register on the FPGA in step 1, decreasing it by 1;
4. Read one neural network classification result from the FPGA;
5. Perform the corresponding processing on the classification result;
6. Perform human-machine interaction and decision control, then return to step 1.
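The handshake in steps 1 to 4 can be modeled on a host as follows. The FPGA-side RAM register and result store are represented by plain variables; on the DSP they would be volatile reads and writes of addresses in the CE2 space. The function names are assumptions; fpga_publish stands in for the classifier module writing a finished result into on-chip RAM and incrementing the register:

```c
/* Host-side model of the result handshake. On real hardware the
 * variables below live in the FPGA's on-chip RAM and are accessed
 * by the DSP through the EMIF; here they are plain variables.    */
static int fpga_result_count = 0;   /* the RAM count register      */
static int fpga_results[8];         /* results in on-chip RAM      */
static int wr = 0, rd = 0;

/* FPGA side: classifier module publishes a finished result. */
void fpga_publish(int result)
{
    fpga_results[wr++ % 8] = result;
    fpga_result_count++;
}

/* DSP side, steps 1-4: if a result is pending, decrement the
 * register and read one result. Returns 1 and stores the result,
 * or 0 if none is pending.                                       */
int poll_result(int *out)
{
    if (fpga_result_count <= 0) return 0;   /* step 2: nothing yet */
    fpga_result_count--;                    /* step 3 */
    *out = fpga_results[rd++ % 8];          /* step 4 */
    return 1;
}
```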
In addition, in the whole pattern recognition flow, the FPGA takes over the work of the neural network classifier from the DSP. Its workflow is:
1. At system startup, the FPGA reads the weight data of the neural network from the FLASH over the EMIF bus; this work is performed by the weight initialization module inside the FPGA;
2. When the DSP triggers an EDMA transfer of feature packet data to the FPGA, the FPGA receives the data through its RAM module and RAM control module: the RAM module receives the data on the EMIF data lines, while the RAM control module receives the signals on the EMIF address and control lines and generates the RAM write address for the RAM module;
3. After the feature packet data sent by the DSP has been received, the neural network classifier module in the FPGA reads the feature packet data from the RAM module, performs neural network classification, and writes the result back into the RAM module; in this process the neural network classifier module uses the weights in the weight initialization module, while the RAM control module coordinates and controls the read/write state of the RAM and supplies the RAM read/write addresses;
4. When the DSP needs to read a classification result from the FPGA, the FPGA sends the data through its RAM module and RAM control module: the RAM module places the data on the EMIF data lines, while the RAM control module receives the signals on the EMIF address and control lines and generates the RAM read address for the RAM module.
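The description fixes the classifier's data path (weights from FLASH, features and results through on-chip RAM) but not the network topology. As a purely illustrative model of what the classifier module computes per feature packet, the sketch below assumes a single fully connected layer with a winner-take-all output; on the FPGA, each output neuron's multiply-accumulate chain can be evaluated in parallel, which is the advantage the scheme exploits:

```c
/* Host-side model of the neural network classifier module. The
 * topology (one fully connected layer, winner-take-all output)
 * and the weight values are assumptions for illustration only.  */
#define N_IN   4    /* features per packet */
#define N_OUT  3    /* classes             */

/* Weights as loaded from FLASH by the weight initialization module. */
static const int W[N_OUT][N_IN] = {
    { 2, -1,  0,  1},
    {-1,  3,  1,  0},
    { 0,  0, -2,  4},
};

/* Return the index of the class with the largest activation. */
int classify(const int feat[N_IN])
{
    int best = 0, best_sum = 0;
    for (int k = 0; k < N_OUT; k++) {
        int sum = 0;
        for (int i = 0; i < N_IN; i++)
            sum += W[k][i] * feat[i];   /* MAC chain, parallel on FPGA */
        if (k == 0 || sum > best_sum) { best_sum = sum; best = k; }
    }
    return best;
}
```

In hardware, the N_OUT multiply-accumulate chains run concurrently, so the classification latency is dominated by one dot product rather than all of them in sequence as on the DSP.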
Compared with existing related techniques, the present invention has the following advantages:
1. Existing DSP+FPGA schemes limit the use of the FPGA to input/output control, timing control, signal switching and the like, without making full use of the FPGA's advantage in parallel computation. The present invention improves on this by placing the neural network classifier, which is better suited to parallel computation, inside the FPGA, making the FPGA a co-processing chip in the true sense.
2. By using multithreading, the DSP cooperates well with the FPGA: while the FPGA performs classification, the DSP need not wait for the result but can do other work. Because the FPGA takes over the classifier work from the DSP, the DSP's signal processing cycle is shortened, and the parallel signal processing of the DSP and FPGA greatly increases the speed of pattern recognition.
Description of drawings
Fig. 1 is a schematic diagram of the system structure.
Fig. 2 is a schematic diagram of the signal processing flow.
Fig. 3 is the flow chart of the DSP main thread.
Fig. 4 is the flow chart of the DSP signal acquisition thread.
Fig. 5 is the flow chart of the DSP signal processing thread.
Fig. 6 is the flow chart of the DSP result processing thread.
Fig. 7 is a structural diagram of the internal modules of the FPGA.
Embodiment
The specific embodiment of the present invention is described in detail below with reference to the accompanying drawings.
Referring to Fig. 1, this system for improving the processing speed of real-time signal pattern recognition in a DSP+FPGA framework builds the real-time pattern recognition core from four hardware chips: a DSP, an FPGA, SDRAM and FLASH. The DSP serves as the main processing chip, the FPGA as the co-processing chip, the SDRAM as primary memory providing working memory for the DSP, and the FLASH as auxiliary memory. The DSP, FPGA, SDRAM and FLASH are all connected to the EMIF bus of the DSP, allowing convenient data exchange among them.
In the system structure of Fig. 1, the DSP chip chosen is TI's TMS320DM642, the FPGA chip is Altera's EP2C20, the SDRAM consists of four MT48LC16M16A2FG chips for a total capacity of 128 MB on a 64-bit data bus, and the FLASH is an AM29LV033C-WD (4 MB).
CE0 of the DSP connects to the CS pins of the four SDRAM chips, CE1 connects to the CS pin of the FLASH, and CE2 connects to the FPGA. In addition, the data, address and control lines of the EMIF connect to the corresponding data, address and control lines of the SDRAM and FLASH, and at the same time also connect to the FPGA.
Fig. 2 shows the whole signal processing flow. Through signal acquisition, preprocessing and feature extraction, the DSP obtains a feature packet to be classified and sends it to the FPGA over the EMIF bus in EDMA mode. The FPGA performs neural network classification on the feature data and temporarily stores the result. When the DSP needs a classification result for subsequent processing, it reads it from the FPGA, completes the corresponding subsequent processing, and produces the output. In this flow, signal acquisition is performed by the DSP's signal acquisition thread, preprocessing and feature extraction by the DSP's signal processing thread, neural network classification by the internal modules of the FPGA, and classification result processing by the DSP's result processing thread.
Fig. 3 is the flow chart of the DSP main thread. After completing DSP initialization, the main thread starts the other threads and itself enters the waiting state.
Fig. 4 is the flow chart of the DSP signal acquisition thread. After initializing the acquisition device and opening the acquisition port, the thread waits for signal input. As soon as a new signal arrives, the thread appends its data to the input signal queue, a section of global memory on the SDRAM implemented in software as a data structure with first-in first-out behavior.
Fig. 5 is the flow chart of the DSP signal processing thread. If the input signal queue is not empty, the thread reads one input signal from it, preprocesses the signal, and detects the targets of interest in it. If the signal contains targets of interest, the thread performs feature extraction on each of them, packs the extracted features, and sends them to the FPGA in EDMA mode.
Fig. 6 is the flow chart of the DSP result processing thread. The thread reads the value of a RAM register on the FPGA to determine whether the FPGA holds classification results not yet processed; if so, it reads one classification result from the FPGA, processes it, and finally performs output and decision control.
Fig. 7 is a structural diagram of the internal modules of the FPGA. The modules relevant to the present invention are the RAM module, RAM control module, weight initialization module and neural network classifier module. In the figure, solid lines between modules are data bus connections and dashed lines are control lines. The modules are connected as follows:
1. The RAM module connects externally, through FPGA pins, to the 64 data lines of the DSP's EMIF. The read/write control lines and address lines of the RAM connect to the RAM control module. Besides connecting to FPGA pins, the data lines of the RAM also connect to the neural network classifier module, supplying that module's input data and receiving its output data.
2. The RAM control module connects externally, through FPGA pins, to the address lines, read/write control lines and CE space chip-select line (CE2) of the DSP's EMIF. It also connects to the status signals of the neural network classifier module and to that module's RAM read/write signals. From the read/write states and RAM demands of the DSP and the neural network classifier module, the RAM control module generates the read/write control and address signals of the RAM, and simultaneously generates the control signals of the neural network classifier module.
3. The neural network classifier module connects to the data lines of the RAM module and reports its state and its RAM read/write demands to the RAM control module through internal signal lines; at the same time, it receives control signals from the RAM control module. During neural network classification, the neural network classifier module reads the weights required for the computation from the weight initialization module.
4. The weight initialization module connects externally, through FPGA pins, to the FLASH-related data lines, address lines and control lines of the DSP's EMIF. At system startup, the weight initialization module reads the weights of the neural network from the FLASH; when the neural network classifier module performs its computation, the weight initialization module supplies the weights. Note that some of the external FPGA pins of the weight initialization module physically coincide with those of the RAM module and RAM control module, but this causes no conflict: the external pins of the weight initialization module are active only at system startup and are not used afterwards, so they never conflict in time with the external FPGA pins of the RAM module and RAM control module.
This method for improving the processing speed of real-time signal pattern recognition in a DSP+FPGA framework uses the above system to perform signal processing, characterized in that the overall signal processing flow is:
1. Signal acquisition is performed by the DSP;
2. Signal preprocessing and feature extraction are performed by the DSP;
3. Neural network classification is performed by the FPGA;
4. Classification result processing is performed by the DSP.
To support this flow, the DSP uses multithreading to run four threads: a main thread, a signal acquisition thread, a signal processing thread and a result processing thread.
The main thread is the supervisor of the other three threads. Its flow is:
1. Complete DSP initialization;
2. Start the other three threads;
3. Enter the waiting state.
The signal acquisition thread performs the acquisition of the input signal. Its flow is:
1. Initialize the acquisition device;
2. Open the acquisition port;
3. Wait for signal input; if a signal arrives, go to step 4, otherwise keep waiting;
4. Put the acquired signal into a queue on the primary memory SDRAM, the input signal queue, then return to step 3.
The signal processing thread performs the preprocessing and feature extraction of the signal. Its flow is:
1. Check whether the input signal queue is empty; if empty, keep checking, otherwise go to step 2;
2. Read one group of input signals from the input signal queue;
3. Preprocess the input signal;
4. Detect the targets of interest in the input signal; these targets are the subjects of pattern recognition;
5. Check the number of targets of interest not yet processed; if the number is greater than 0, go to step 6, otherwise return to step 1;
6. Perform feature extraction on one unprocessed target of interest;
7. Generate a feature packet from the feature data extracted in step 6;
8. Trigger an enhanced direct memory access (EDMA) transfer between the DSP and the FPGA to pass the feature packet to the FPGA over the EMIF bus, then return to step 5.
The result processing thread performs the processing of the classification results. Its flow is:
1. Read the value of a RAM register on the FPGA; this register records the number of classification results not yet processed;
2. Check whether the value read in step 1 is greater than 0; if so, go to step 3, otherwise return to step 1;
3. Change the value of the RAM register on the FPGA in step 1, decreasing it by 1;
4. Read one neural network classification result from the FPGA;
5. Perform the corresponding processing on the classification result;
6. Perform human-machine interaction and decision control, then return to step 1.
In addition, in the whole pattern recognition flow, the FPGA takes over the work of the neural network classifier from the DSP. Its workflow is:
1. At system startup, the FPGA reads the weight data of the neural network from the FLASH over the EMIF bus; this work is performed by the weight initialization module inside the FPGA;
2. When the DSP triggers an EDMA transfer of feature packet data to the FPGA, the FPGA receives the data through its RAM module and RAM control module: the RAM module receives the data on the EMIF data lines, while the RAM control module receives the signals on the EMIF address and control lines and generates the RAM write address for the RAM module;
3. After the feature packet data sent by the DSP has been received, the neural network classifier module in the FPGA reads the feature packet data from the RAM module, performs neural network classification, and writes the result back into the RAM module; in this process the neural network classifier module uses the weights in the weight initialization module, while the RAM control module coordinates and controls the read/write state of the RAM and supplies the RAM read/write addresses;
4. When the DSP needs to read a classification result from the FPGA, the FPGA sends the data through its RAM module and RAM control module: the RAM module places the data on the EMIF data lines, while the RAM control module receives the signals on the EMIF address and control lines and generates the RAM read address for the RAM module.
Of course, the above is only a preferred embodiment of the present invention. It should be understood that those skilled in the art can make further improvements and modifications without departing from the principle of the invention, and such improvements and modifications should also be regarded as falling within the protection scope of the invention.