System and method for improving real-time pattern recognition processing speed in a DSP+FPGA framework
Technical field
The present invention relates to a solution that uses the external memory interface (EMIF) of a DSP to carry the signal transmission between the DSP and an FPGA, and that uses the FPGA in place of the DSP to perform pattern classification, thereby improving the data processing speed of the whole pattern recognition system. The invention belongs to the field of electronic information.
Background art
With the increasingly widespread application of embedded technology and its inevitable trend toward intelligence, the demand for embedded pattern recognition keeps growing. Situations that call for pattern recognition often yield large amounts of information that must be condensed and refined within a short time to obtain an accurate, concise description of the target. The bottleneck of embedded pattern recognition is precisely the difficulty of guaranteeing the speed of signal pattern recognition: for applications with large input volumes and strict real-time requirements on the results (such as video signals or network data streams), traditional techniques often cannot meet the required processing speed.
In current embedded systems based on the DSP+FPGA architecture, the FPGA, with its flexible programmability, handles the external interfaces and timing control, while the main signal processing computation is done by the DSP, making full use of the DSP's arithmetic capability. In this architecture, although the FPGA's advantage in generating clock signals is exploited, its advantage in parallel computation is not; simple timing control uses only a small fraction of the FPGA's resources and leaves the rest idle. The present scheme addresses exactly this point by placing inside the FPGA a neural network classifier, which is particularly well suited to parallel computation, so that the FPGA replaces the DSP in the classification stage of pattern recognition. This raises the parallelism of the whole signal processing flow, improves the processing speed of the system, and satisfies application scenarios that demand real-time pattern recognition.
The DSP in this scheme is from TI's TMS320C6000 series, whose EMIF interface supports glueless connection of various external devices, including SRAM, SDRAM, ROM, FIFO, and shared external devices. The external memory space is divided into four independent storage spaces (CE spaces), selected by four off-chip CE chip-select lines and controlled by the corresponding CE space control registers.
Summary of the invention
The object of the present invention is to remedy the deficiencies of the prior art by providing a system and method for improving real-time pattern recognition processing speed in a DSP+FPGA framework, which can raise the processing speed of this framework by 30 to 50 percent.
The method exploits the parallel computing capability of the FPGA by moving the neural network classifier, which would otherwise run on the DSP, into the FPGA, thereby sharing the DSP's load. The DSP and the FPGA communicate over the EMIF bus in EDMA mode, so the transfers consume no CPU time slices; the DSP runs multiple threads to ensure it is not idle while the FPGA performs classification, achieving parallel processing between the DSP and the FPGA.
To achieve the above object, the concept of the present invention is as follows:
An embedded real-time pattern recognition system based on the DSP+FPGA architecture takes the DSP as the main processing chip and the FPGA as the co-processing chip, with SDRAM and FLASH as storage. The SDRAM serves as main memory, providing run-time memory support to the DSP; the FLASH serves as auxiliary storage and, being non-volatile, holds the DSP's boot information, the program data, and the weight data of the neural network in the FPGA. The DSP, FPGA, SDRAM, and FLASH are all attached to the DSP's EMIF bus, which makes data exchange among them convenient. Besides this core, the system also includes peripheral modules such as signal acquisition, automatic control, output display, and human-machine interaction, but these are unrelated to the core content of the invention and are not described in detail.
Of the DSP's four external memory spaces (CE spaces), CE0 is configured as a synchronous space and assigned to the main memory SDRAM, while CE1 and CE2 are configured as asynchronous spaces and assigned to the FLASH and to the FPGA's internal RAM, respectively. The address lines, data lines, and control lines of the DSP's EMIF interface, besides connecting to the corresponding pins of the SDRAM and FLASH, must also be connected to pins of the FPGA.
For a signal to be identified, the overall processing flow is as follows:
1. The DSP acquires the signal to be identified through its signal acquisition thread.
2. The DSP preprocesses the acquired signal and obtains several targets of interest in the signal; these targets are the subjects of pattern recognition.
3. The DSP performs feature extraction on each target of interest, packs the extracted features, and sends the packet to the FPGA over the EMIF bus in enhanced direct memory access (EDMA) mode for classification.
4. The FPGA receives the feature packet from the DSP in its on-chip RAM, feeds the features into its neural network classifier module, writes the classification result back into the on-chip RAM for temporary storage, and sends the result to the DSP over the EMIF bus when the DSP requests it.
5. The DSP polls an address of the RAM in the FPGA; this address records the number of classification results not yet retrieved by the DSP. If the count is greater than 0, the DSP reads one classification result from the FPGA and performs the corresponding subsequent processing and output control.
According to the above inventive concept, the present invention adopts the following technical solution:
A system for improving real-time pattern recognition processing speed in a DSP+FPGA framework, characterized in that its architecture builds the real-time pattern recognition core out of four chips: a DSP, an FPGA, an SDRAM, and a FLASH. The DSP serves as the main processing chip, the FPGA as the co-processing chip, the SDRAM as main memory providing run-time memory support to the DSP, and the FLASH as auxiliary storage. The DSP, FPGA, SDRAM, and FLASH are all attached to the DSP's EMIF bus, which makes data exchange among them convenient.
The EMIF interface of said DSP has multiple CE spaces, namely CE0 through CE3. One of them connects the DSP's main memory SDRAM, another connects the auxiliary storage FLASH, and a third connects an external storage device emulated by said FPGA's on-chip RAM. The data lines, address lines, and read/write control lines of said DSP's EMIF interface, besides their conventional connections to said SDRAM and said FLASH, must also all be connected, together with the chip-select line of the corresponding CE space, to pins of said FPGA.
A method for improving real-time pattern recognition processing speed in a DSP+FPGA framework performs signal processing using the above system for improving real-time pattern recognition processing speed in a DSP+FPGA framework, characterized in that the whole signal processing flow is:
1. Signal acquisition is done by the DSP;
2. Signal preprocessing and feature extraction are done by the DSP;
3. Neural network classification is done by the FPGA;
4. Processing of the classification results is done by the DSP.
To support this flow, the DSP uses multithreading to implement four threads: a main thread, a signal acquisition thread, a signal processing thread, and a result processing thread.
The main thread is the top-level manager of the other three threads; its flow is:
1. Complete the DSP initialization;
2. Start the other three threads;
3. Enter the waiting state.
The signal acquisition thread collects the input signal; its flow is:
1. Initialize the acquisition device;
2. Open the acquisition port;
3. Wait for signal input; if there is input, go to step 4, otherwise keep waiting;
4. Put the acquired signal into a queue on the main memory SDRAM, the input signal queue, then go back to step 3.
The signal processing thread performs the preprocessing and feature extraction of the signal; its flow is:
1. Check whether the input signal queue is empty; if empty, keep checking, otherwise go to step 2;
2. Read one group of input signals from the input signal queue;
3. Preprocess the input signal;
4. Detect the targets of interest in the input signal; these targets are the subjects of pattern recognition;
5. Check the number of targets of interest not yet processed; if the number is greater than 0, go to step 6, otherwise go back to step 1;
6. Perform feature extraction on one unprocessed target of interest;
7. Generate a feature packet from the features extracted in step 6;
8. Trigger an enhanced direct memory access (EDMA) transfer between the DSP and the FPGA to pass the feature packet to the FPGA over the EMIF bus, then go back to step 5.
The result processing thread handles the classification results; its flow is:
1. Read the value of a RAM register on the FPGA; this register records the number of classification results not yet processed;
2. Check whether the value read in step 1 is greater than 0; if so, go to step 3, otherwise go back to step 1;
3. Change the value of the FPGA RAM register of step 1, decreasing it by 1;
4. Read one neural network classification result from the FPGA;
5. Process the classification result;
6. Perform human-machine interaction and decision control, then go back to step 1.
In addition, throughout the pattern recognition flow, the FPGA takes over the neural network classifier work from the DSP; its workflow is:
1. At system startup, said FPGA reads the weight data of the neural network from the FLASH over the EMIF bus; this work is done by the weight initialization module in the FPGA;
2. When said DSP triggers an EDMA transfer of feature packet data to the FPGA, the FPGA receives the data with its RAM and RAM control modules: the RAM module receives the data on the EMIF data lines, while the RAM control module receives the signals on the EMIF address and control lines and generates the RAM write address for the RAM module;
3. After the feature packet data sent by the DSP has been received, the neural network classifier module in the FPGA reads the feature packet data from the RAM module, performs the neural network classification, and sends the result back into the RAM module; in this process the classifier module uses the weights held in the weight initialization module, while the RAM control module coordinates and controls the read/write state of the RAM and provides the RAM read/write addresses;
4. When the DSP needs to read a classification result from the FPGA, the FPGA sends the data with its RAM and RAM control modules: the RAM module drives the data onto the EMIF data lines, while the RAM control module receives the signals on the EMIF address and control lines and generates the RAM read address for the RAM module.
Compared with the existing related technology, the present invention has the following advantages:
1. Existing DSP+FPGA schemes limit the FPGA to input/output control, timing control, signal switching, and the like, without making full use of the FPGA's advantage in parallel computation. The present invention improves on this by implementing in the FPGA a neural network classifier that is better suited to parallel computation, making the FPGA a co-processing chip in the true sense.
2. Multithreading on the DSP lets it cooperate well with the FPGA: while the FPGA classifies, the DSP need not wait for the result but can do other work. Because the FPGA takes over the classifier work from the DSP, the DSP's signal processing cycle is shortened, and the parallel signal processing of the DSP and the FPGA greatly raises the speed of pattern recognition.
Description of drawings
Fig. 1 is a schematic diagram of the system structure.
Fig. 2 is a schematic diagram of the signal processing flow.
Fig. 3 is the flow chart of the DSP main thread.
Fig. 4 is the flow chart of the DSP signal acquisition thread.
Fig. 5 is the flow chart of the DSP signal processing thread.
Fig. 6 is the flow chart of the DSP result processing thread.
Fig. 7 is the internal module structure diagram of the FPGA.
Embodiment
Specific embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Referring to Fig. 1, this system for improving real-time pattern recognition processing speed in a DSP+FPGA framework builds the real-time pattern recognition core out of four hardware chips: a DSP, an FPGA, an SDRAM, and a FLASH. The DSP serves as the main processing chip, the FPGA as the co-processing chip, the SDRAM as main memory providing run-time memory support to the DSP, and the FLASH as auxiliary storage. The DSP, FPGA, SDRAM, and FLASH are all attached to the DSP's EMIF bus, which makes data exchange among them convenient.
In the system structure of Fig. 1, the selected DSP chip is TI's TMS320DM642 and the FPGA chip is Altera's EP2C20. The SDRAM consists of four MT48LC16M16A2FG devices, for a total capacity of 128 MB on a 64-bit data bus; the FLASH is an AM29LV033C-WD (4 MB).
The DSP's CE0 line connects to the CS pins of the four SDRAM chips, CE1 to the CS pin of the FLASH, and CE2 to the FPGA. In addition, the data, address, and control lines of the EMIF connect to the corresponding data, address, and control lines of the SDRAM and the FLASH, and also run to the FPGA.
Fig. 2 shows the whole signal processing flow. Through signal acquisition, preprocessing, and feature extraction, the DSP obtains the feature packet on which pattern classification is to be performed, and sends it to the FPGA from the EMIF bus in EDMA mode. The FPGA performs neural network classification on the features and temporarily stores the result. When the DSP needs a classification result, it reads it from the FPGA, completes the corresponding subsequent processing, and produces the output. In this flow, signal acquisition is done by the DSP's signal acquisition thread, preprocessing and feature extraction by the DSP's signal processing thread, neural network classification by the FPGA's internal modules, and processing of the classification results by the DSP's result processing thread.
Fig. 3 is the flow chart of the DSP main thread: after completing the DSP initialization, the main thread starts the other threads and then enters a waiting state.
Fig. 4 is the flow chart of the DSP signal acquisition thread. After initializing the acquisition device and opening the acquisition port, the thread waits for signal input; as soon as a new signal arrives, its data content is appended to the input signal queue. This queue is a section of global memory on the SDRAM, implemented in software as a data structure with FIFO behaviour.
Fig. 5 is the flow chart of the DSP signal processing thread. If the input signal queue is not empty, the thread reads one input signal from it, preprocesses the signal, and detects the targets of interest. If the signal contains targets of interest, the thread performs feature extraction on each of them, packs the extracted features, and sends the packet to the FPGA in EDMA mode.
Fig. 6 is the flow chart of the DSP result processing thread. The thread reads the value of a RAM register on the FPGA to determine whether the FPGA holds classification results not yet processed; if so, it reads one classification result from the FPGA, processes it, and finally performs output and decision control.
Fig. 7 is the internal module structure diagram of the FPGA. The modules relevant to the present invention are the RAM module, the RAM control module, the weight initialization module, and the neural network classifier module. In the figure, solid lines between modules are data bus connections and dashed lines are control lines. The modules are connected as follows:
1. Externally, the RAM module connects through FPGA pins to the 64 data lines of the DSP's EMIF interface. The RAM's read/write control lines and address lines connect to the RAM control module. The RAM's data lines, besides going to the FPGA pins, also connect to the neural network classifier module, supplying that module's input data and receiving its output data.
2. Externally, the RAM control module connects through FPGA pins to the address lines, read/write control lines, and CE space select line (CE2) of the DSP's EMIF interface. It also connects to the status signals of the neural network classifier module and to that module's RAM read/write signals. From the read/write states and RAM demands of the DSP and the classifier module, the RAM control module generates the RAM's read/write control and address signals, and at the same time generates the control signals for the classifier module.
3. The neural network classifier module connects to the data lines of the RAM module, and presents its status and its RAM read/write demands to the RAM control module over internal signal lines; at the same time, it receives control signals from the RAM control module. During neural network classification, the classifier module reads the weights it needs from the weight initialization module.
4. Externally, the weight initialization module connects through FPGA pins to the FLASH-related data lines, address lines, and control lines of the DSP's EMIF interface. At system startup, the weight initialization module reads the neural network weights from the FLASH; when the classifier module computes, the weight initialization module supplies the weights. Note that some of the weight initialization module's external FPGA pins are physically the same as those of the RAM module and the RAM control module, but this causes no conflict: the weight initialization module's external pins are active only at system startup and are unused afterwards, so they never conflict in time with the external FPGA pins of the RAM module and the RAM control module.
This method for improving real-time pattern recognition processing speed in a DSP+FPGA framework performs signal processing using the above system, characterized in that the whole signal processing flow is:
1. Signal acquisition is done by the DSP;
2. Signal preprocessing and feature extraction are done by the DSP;
3. Neural network classification is done by the FPGA;
4. Processing of the classification results is done by the DSP.
To support this flow, the DSP uses multithreading to implement four threads: a main thread, a signal acquisition thread, a signal processing thread, and a result processing thread.
The main thread is the top-level manager of the other three threads; its flow is:
1. Complete the DSP initialization;
2. Start the other three threads;
3. Enter the waiting state.
The signal acquisition thread collects the input signal; its flow is:
1. Initialize the acquisition device;
2. Open the acquisition port;
3. Wait for signal input; if there is input, go to step 4, otherwise keep waiting;
4. Put the acquired signal into a queue on the main memory SDRAM, the input signal queue, then go back to step 3.
The signal processing thread performs the preprocessing and feature extraction of the signal; its flow is:
1. Check whether the input signal queue is empty; if empty, keep checking, otherwise go to step 2;
2. Read one group of input signals from the input signal queue;
3. Preprocess the input signal;
4. Detect the targets of interest in the input signal; these targets are the subjects of pattern recognition;
5. Check the number of targets of interest not yet processed; if the number is greater than 0, go to step 6, otherwise go back to step 1;
6. Perform feature extraction on one unprocessed target of interest;
7. Generate a feature packet from the features extracted in step 6;
8. Trigger an enhanced direct memory access (EDMA) transfer between the DSP and the FPGA to pass the feature packet to the FPGA over the EMIF bus, then go back to step 5.
The result processing thread handles the classification results; its flow is:
1. Read the value of a RAM register on the FPGA; this register records the number of classification results not yet processed;
2. Check whether the value read in step 1 is greater than 0; if so, go to step 3, otherwise go back to step 1;
3. Change the value of the FPGA RAM register of step 1, decreasing it by 1;
4. Read one neural network classification result from the FPGA;
5. Process the classification result;
6. Perform human-machine interaction and decision control, then go back to step 1.
In addition, throughout the pattern recognition flow, the FPGA takes over the neural network classifier work from the DSP; its workflow is:
1. At system startup, said FPGA reads the weight data of the neural network from the FLASH over the EMIF bus; this work is done by the weight initialization module in the FPGA;
2. When said DSP triggers an EDMA transfer of feature packet data to the FPGA, the FPGA receives the data with its RAM and RAM control modules: the RAM module receives the data on the EMIF data lines, while the RAM control module receives the signals on the EMIF address and control lines and generates the RAM write address for the RAM module;
3. After the feature packet data sent by the DSP has been received, the neural network classifier module in the FPGA reads the feature packet data from the RAM module, performs the neural network classification, and sends the result back into the RAM module; in this process the classifier module uses the weights held in the weight initialization module, while the RAM control module coordinates and controls the read/write state of the RAM and provides the RAM read/write addresses;
4. When the DSP needs to read a classification result from the FPGA, the FPGA sends the data with its RAM and RAM control modules: the RAM module drives the data onto the EMIF data lines, while the RAM control module receives the signals on the EMIF address and control lines and generates the RAM read address for the RAM module.
Of course, the above is only a preferred embodiment of the present invention. It should be pointed out that those skilled in the art can make further improvements and refinements without departing from the principle of the invention, and such improvements and refinements should also be regarded as falling within the scope of protection of the invention.