CN101163129A - Method of reducing message transmission overhead of parallel multi-digital signal processor

Publication number: CN101163129A
Authority: CN (China)
Application number: CNA200610113617XA
Other languages: Chinese (zh)
Other versions: CN100490435C (en)
Inventors: 李波, 葛宝珊, 姜宏旭
Current Assignee: Beihang University
Original Assignee: Beihang University
Application filed by: Beihang University
Priority to: CNB200610113617XA (patent CN100490435C)
Legal status: Granted; currently Expired - Fee Related
Prior art keywords: data, frame, digital signal, signal processor, message

Abstract

The invention discloses a method for reducing message-passing overhead between parallel digital signal processors (DSPs), in which the DSPs exchange data through LINK ports. Before sending data, the sending DSP first sends a control frame that states the length, type, and so on of the data to follow. The receiving DSP sets the corresponding receive address and receive length according to the type and length given in the received frame, and the sending DSP then sends the actual data frame. The data thus reaches its destination directly, with no double buffering, which saves both storage space and data copy time. Using DMA and the associated interrupt handlers, the invention achieves true zero-copy message communication, minimizes the load on the DSP core, greatly reduces communication overhead, and performs well in practice.

Description

A method for reducing message-passing overhead between parallel digital signal processors
Technical field
The present invention relates to a method for reducing the overhead of passing messages between multiple digital signal processors (abbreviated DSPs) in a loosely coupled parallel processing system based on a message-passing mechanism, and belongs to the field of signal processing and communication technology.
Background art
DSPs are widely used in signal processing, especially image signal processing. However, the processing capability demanded of signal-processing boards keeps rising. Real-time processing of high-frame-rate, large-format image sequences, for example, often demands an operation rate per second that a single DSP chip cannot currently deliver. Signal-processing boards therefore need a multi-DSP parallel processing structure to meet the system's computational requirements.
In loosely coupled parallel DSP systems based on message passing, efficient communication has long been a research focus. Existing communication protocols, however, do not take hardware and application characteristics into account, so their communication overhead is high. Communication overhead here mainly comprises: the network interface; communication latency, meaning the time from when the first bit of a message is sent onto the network until the last bit is received; and send/receive overhead, meaning the processor time spent sending and receiving a message, which is caused mainly by the communication protocol, the application interface, and the operating system. The effective ways to reduce communication overhead are to improve the network interface, reduce communication latency, and reduce send/receive overhead.
Fig. 1 shows the data transfer flow between two DSPs. Experiments show: (1) the average number of core clock cycles needed to copy one 32-bit word by memory copy is 2.19 between on-chip memories and 15.00 between on-chip and off-chip memory; (2) the average number of core clock cycles needed to transfer one word through the LINK port is 4.02 when both sender and receiver use on-chip memory and 15.01 when either side uses off-chip memory. The time spent copying message data is therefore large relative to the time spent transmitting it between DSPs and cannot be ignored, so eliminating unnecessary copies of message data does much to reduce communication overhead.
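The measurements above can be turned into a rough estimate of how much of the per-word cost a single extra copy represents. The sketch below is only illustrative: it assumes that transfer and copy cycles simply add up (no overlap), and the function and key names are hypothetical.

```python
# Core clock cycles per 32-bit word, taken from the measurements quoted above.
COPY_CYCLES = {"on_chip": 2.19, "on_off_chip": 15.00}
LINK_CYCLES = {"on_chip": 4.02, "on_off_chip": 15.01}

def cycles_per_word(memory: str, extra_copies: int) -> float:
    """LINK transfer cost plus any extra memory copies (assumed additive)."""
    return LINK_CYCLES[memory] + extra_copies * COPY_CYCLES[memory]

def copy_share(memory: str, extra_copies: int = 1) -> float:
    """Fraction of the per-word cost that is spent on copying."""
    total = cycles_per_word(memory, extra_copies)
    return extra_copies * COPY_CYCLES[memory] / total

for memory in ("on_chip", "on_off_chip"):
    print(memory, round(copy_share(memory), 3))
```

Under this additive assumption, one extra copy accounts for roughly a third of the per-word cost when everything is on-chip and about half when off-chip memory is involved.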
As high-speed networks develop, network transfer rates keep rising and are approaching the speed of memory copying, so the cost that expensive memory copies impose on high-speed communication and network bandwidth is increasingly evident. To reduce this cost, many researchers have studied zero-copy techniques. EMP (Ethernet Message Passing), for example, bypasses the operating system to achieve zero-copy network transmission. Others achieve zero-copy by remapping physical memory pages between kernel space and user buffers.
In embedded systems, where resources are limited, zero-copy mechanisms matter even more: they reduce the memory used for data transfer, the CPU processing time, and the power consumption. A zero-copy embedded TCP/IP protocol has been designed and implemented on the LyraNET research platform. This protocol eliminates data copy overhead: when data is sent to the network, user data is written directly to the NIC (Network Interface Controller). On reception, the received data is first copied directly into host memory and, once protocol processing is finished, is handed to the user buffer by memory remapping. In addition, the ZBUF sockets provided by VxWorks use zero-copy buffers to avoid time-consuming data copies. Like users of standard BSD sockets, users of ZBUF sockets can use two socket types: stream sockets (TCP) and datagram sockets (UDP).
However, existing zero-copy techniques typically use DMA, memory remapping, and similar means to achieve zero-copy of packets only between user buffers and kernel buffers; the rest of the communication protocol is essentially unchanged. Because they are not tightly coupled to specific hardware and applications, they are not zero-copy in the true sense, and there is still room for improvement. Moreover, in the prior art data is not always taken directly from its source memory and sent through the communication interface, nor is received data always placed directly in its destination memory; one copy is still needed in between. Removing this copy would further reduce communication overhead.
Summary of the invention
The object of the present invention is to provide a method for reducing message-passing overhead between parallel digital signal processors. The method fully exploits the hardware characteristics of the DSP to remove all redundant message copies, thereby achieving more efficient communication.
To achieve this object, the invention adopts the following technical solution:
A method for reducing message-passing overhead between parallel digital signal processors, in which the digital signal processors send and receive data through LINK ports, characterized in that:
Before sending data, the sending digital signal processor first sends a control frame indicating the type, length, and so on of the data to follow; the receiving digital signal processor sets the corresponding receive address according to the type of the received frame; the sending digital signal processor then sends the actual data frame.
Preferably, the data frame is separate from the control frame. The control frame has a 4-byte control field, a 4-byte length field, a 4-byte destination address field, a 4-byte block field, and a 1-byte checksum field; the data frame contains only pure data and a checksum field.
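The control frame layout described above (four 4-byte fields followed by a 1-byte checksum, 17 bytes in total) can be sketched as follows. The source does not specify byte order, the checksum algorithm, or the control codes, so the little-endian packing, the sum-modulo-256 checksum, and the sample values are all assumptions made for illustration.

```python
import struct

# control, length, destination address, block, checksum (little-endian assumed)
CONTROL_FRAME = struct.Struct("<IIIIB")

def checksum8(payload: bytes) -> int:
    """Assumed checksum: sum of all bytes modulo 256."""
    return sum(payload) & 0xFF

def pack_control_frame(control: int, length: int, dest_addr: int, block: int) -> bytes:
    """Build a 17-byte control frame with the checksum appended."""
    body = struct.pack("<IIII", control, length, dest_addr, block)
    return body + bytes([checksum8(body)])

def unpack_control_frame(frame: bytes):
    """Parse a control frame and verify its checksum."""
    control, length, dest_addr, block, check = CONTROL_FRAME.unpack(frame)
    if check != checksum8(frame[:-1]):
        raise ValueError("control frame checksum mismatch")
    return control, length, dest_addr, block

# Hypothetical values: control code 0x10 announcing 1024 bytes for address 0x80000.
frame = pack_control_frame(0x10, 1024, 0x80000, 0)
print(len(frame), unpack_control_frame(frame))
```

Keeping this small fixed-size header in its own frame is what lets the following data frame consist of payload only.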
Before sending data, the sender first sends a data-send request frame in the form of a control frame, informing the peer of the type and length of the data to be sent. On receiving the data-send request frame, the receiver sends a response frame and sets the corresponding receive address and receive length in the receive-DMA parameters. Once the receiver has acknowledged, the sender sends the data frame, and the receiver sends a response frame after receiving the complete data frame. If the response frame is a positive acknowledgement, the data was sent successfully; otherwise it is retransmitted.
To establish a connection, the sender sends a 4-byte connection request frame, and the connection is established once a positive acknowledgement is received. To tear down a connection, the sender sends a 4-byte teardown request frame, and the connection is removed once a positive acknowledgement is received.
Data is transferred between different digital signal processors by DMA.
When data is transferred by DMA, the TCB register of the corresponding transmit DMA is set correctly first, including address, length, and control information; the corresponding LINK port transmit control register is set next, including speed, bit width, and transmit enable; finally, the corresponding channel is set to the busy state.
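The three-step DMA start sequence above can be modelled in software to make the ordering explicit. This is a toy model, not a real register map: the class, field names, and parameter values are hypothetical, and the 256 KB limit is the per-transfer maximum stated in the description.

```python
from dataclasses import dataclass, field

MAX_DMA_BYTES = 256 * 1024  # per-transfer limit stated in the description

@dataclass
class LinkChannel:
    """Toy stand-in for one LINK-port transmit channel (hypothetical fields)."""
    tcb: dict = field(default_factory=dict)
    tx_control: dict = field(default_factory=dict)
    busy: bool = False
    log: list = field(default_factory=list)

def start_tx_dma(ch, address, length, control, speed, bit_width):
    """Program the channel in the order the text gives, then mark it busy."""
    if ch.busy:
        raise RuntimeError("channel already busy")
    if not 0 < length <= MAX_DMA_BYTES:
        raise ValueError("length does not fit in one DMA transfer")
    # Step 1: TCB register of the transmit DMA (address, length, control).
    ch.tcb = {"address": address, "length": length, "control": control}
    ch.log.append("tcb")
    # Step 2: LINK port transmit control register (speed, bit width, enable).
    ch.tx_control = {"speed": speed, "bit_width": bit_width, "tx_enable": True}
    ch.log.append("tx_control")
    # Step 3: mark the channel busy until the completion interrupt clears it.
    ch.busy = True
    ch.log.append("busy")

channel = LinkChannel()
start_tx_dma(channel, address=0x1000, length=4096, control=0, speed=1, bit_width=8)
print(channel.log)
```

In this model a completion interrupt handler would be the place to clear `busy`; that part is omitted here.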
By means of DMA and its interrupt handlers, the method provided by the invention achieves zero-copy message communication in the true sense, minimizes the load on the DSP core, greatly reduces communication overhead, and has performed well in practice.
Description of drawings
The present invention is further illustrated below in conjunction with the drawings and specific embodiments.
Fig. 1 is a schematic diagram of the data transfer flow between two DSPs.
Fig. 2 is a schematic diagram of the unified frame structure.
Fig. 3 is a schematic diagram of the control frame and data frame structures adopted.
Fig. 4 is a flow chart of establishing a connection between different DSPs.
Fig. 5 is a flow chart of sending data between different DSPs.
Fig. 6 is a flow chart of tearing down a connection between different DSPs.
Fig. 7 is the state machine diagram of the data sender.
Fig. 8 is the state machine diagram of the data receiver.
Embodiment
The invention addresses the following requirement in multi-DSP systems: the DSPs communicate through high-speed LINK ports, the communication latency between DSPs must be as low as possible, and the source and destination of a transfer may each be either fast on-chip memory or relatively slow off-chip memory. To meet this requirement, the method of the invention is based on zero-copy techniques.
Zero-copy techniques reduce the communication overhead of a conventional operating system and the number of copies between protocol layers, effectively improving communication performance. They mainly involve the following two aspects:
(1) Creating an efficient user-level communication interface that eliminates unnecessary copying in the system kernel.
(2) Direct data transfer from a router's input port to its output port: a received packet is buffered in memory only once and, after the necessary control processing in the buffer queue, is sent directly to the output port for transmission, achieving fast packet forwarding.
The invention targets embedded system design and does not use additional dedicated communication components, such as a network card, to handle communication. All compression processing and communication tasks are therefore performed by the DSP, whose main job is image compression. Consequently, protocol processing can only be done in interrupt handlers, and data is transferred by DMA to relieve the DSP core.
During communication, different kinds of data flow between the DSPs (such as image data and compressed data), and each kind has its own fixed receive address (some in fast on-chip memory, some in slower off-chip memory) for use by the program. With a double-buffer structure, the copy overhead for one class of data (such as image data) can be removed by passing pointers. Other types of data (such as compressed data), however, must be placed in off-chip memory, and the receiver does not know in advance which class of data will arrive, so copy overhead cannot be avoided for every data type and memory is wasted. The invention therefore does not use a double-buffer structure.
The method of the invention works as follows:
Before sending data, the sender first sends a control frame indicating the type, length, and so on of the data to follow. The receiver sets the corresponding receive address according to the type of the received frame, and the sender then sends the actual data. The data is thus transferred directly to its destination without double buffering, which saves both memory and data copy time and reduces system communication overhead.
To further reduce the message-passing overhead between DSPs, the invention takes the following measures when designing the frame format, starting from the frame formats of commonly used communication protocols:
(1) No flag field
Sending and receiving through the LINK port between DSPs can be designed as follows: the receiver always starts its receive DMA first and, after receiving a frame, immediately restarts the receive DMA in its DMA-completion interrupt handler; the sender sends a frame through its transmit DMA only when it needs to. When the sender is not sending data, the clock line is not enabled and the receiver cannot receive any data. The DMA transfer mechanism provided by the DSP fully guarantees the integrity of the data sent and received, so no flags are needed to mark the start and end of a frame.
(2) No address field
Because the channel is dedicated, an address field is unnecessary.
(3) Longer data segment
(4) Matching the data transfer mechanism provided by the DSP
To relieve the DSP core and free more time for computation, data is transferred in the background by DMA, and each DMA transfer carries at most 256 KB. To transfer 256 KB of data in one go, no other information can be added to that DMA transfer, which both challenges and constrains the frame format.
Taking the above into account, a unified frame format that meets these requirements is shown in Fig. 2. The control field indicates the frame type, which includes: connection request, connection request acknowledgement, connection request denial; image data, image data send acknowledgement, image data send denial; compressed bitstream data, compressed bitstream data send acknowledgement, compressed bitstream data send denial; file name data, file name data send acknowledgement, file name data send denial; close-connection request, close-connection request acknowledgement, close-connection request denial; and so on.
However, transmitting data in the unified frame format has several drawbacks: receive data addresses become non-contiguous, valuable on-chip memory is wasted, and redundant data copies increase communication overhead. The invention therefore defines two separate frame formats, the control frame and the data frame, shown in Fig. 3. Using control frames and data frames that are separate but work together saves valuable on-chip memory, achieves message zero-copy, reduces communication overhead, avoids non-contiguous receive addresses, and gives applications great flexibility.
When message data is actually transmitted, the data frame is made independent of the control frame: a control frame is sent first to indicate the type of the next frame (image data, compressed data, or file name data, each with a different receive address), its length, and so on, followed by a "pure" data frame or chain of data frames. From the control frame it has received, the receiver selects and sets the appropriate receive address so that the payload goes straight to its destination, which reduces unnecessary copy overhead and achieves zero-copy of messages.
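The receiver-side idea, choosing the destination from the control frame so the payload lands where it is needed, can be sketched as below. This is a software analogy only: the buffers stand in for fixed on-chip and off-chip receive regions, and the class, method names, and buffer sizes are hypothetical.

```python
class Receiver:
    """Toy receiver: the control frame selects where the next data frame lands."""

    def __init__(self):
        # One fixed destination buffer per data type (sizes are illustrative).
        self.buffers = {
            "image": bytearray(64),        # stands in for fast on-chip memory
            "compressed": bytearray(64),   # stands in for off-chip memory
            "file_name": bytearray(32),
        }
        self.target = None  # view the next data frame will be written into

    def on_control_frame(self, data_type: str, length: int) -> None:
        """Arm reception: point at the final destination for this data type."""
        buf = self.buffers[data_type]
        if length > len(buf):
            raise ValueError("announced length exceeds destination buffer")
        self.target = memoryview(buf)[:length]

    def on_data_frame(self, payload: bytes) -> None:
        """Write the payload straight into the armed destination."""
        if self.target is None or len(payload) != len(self.target):
            raise RuntimeError("data frame does not match the armed transfer")
        self.target[:] = payload  # lands in its final place, no staging buffer
        self.target = None

rx = Receiver()
rx.on_control_frame("compressed", 4)
rx.on_data_frame(b"\x01\x02\x03\x04")
print(bytes(rx.buffers["compressed"][:4]))
```

On the actual hardware the equivalent of `on_control_frame` would program the receive-DMA address and length, so no intermediate buffer is involved at all.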
A complete communication procedure using the method of the invention is described below. It comprises establishing a connection, sending data, and tearing down the connection. Because the channel in the invention is a dedicated full-duplex channel, a connection needs to be established only once; it then becomes a permanent connection that can carry data until the whole application ends, with no repeated teardown and re-establishment, which avoids unnecessary steps and saves communication time.
(1) Connection establishment flow
Before communicating, each LINK port should first establish a connection to confirm that the peer exists and is ready, and only then send real data; otherwise a timeout error is returned. To establish a connection, the sender sends a 4-word connection request frame, then starts a timer and waits for the peer's response. If the peer is ready, it sends a positive acknowledgement and the connection succeeds; otherwise it sends a negative acknowledgement or does not reply. If the timer expires with no response from the peer, the sender restarts the timer and waits again; after 3 timeouts with no reply, it concludes that the peer is not ready and the connection fails. If the peer replies negatively, the sender waits for a while, sends the connection request again, and repeats the procedure. The flow chart is shown in Fig. 4.
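The connection handshake with its three-timeout rule can be sketched as a small retry loop. The callback names are hypothetical, and the cap on repeated requests after negative replies is an added assumption, since the source does not say how often a refused request may be repeated; the back-off delay is also omitted.

```python
ACK, NAK, TIMEOUT = "ack", "nak", "timeout"

def connect(send_request, wait_response, max_timeouts=3, max_requests=5):
    """Return True once the peer acknowledges, False if it never becomes ready.

    send_request: sends one connection request frame.
    wait_response: blocks for one timer period and returns ACK, NAK or TIMEOUT.
    max_requests is an assumed safety cap, not part of the source description.
    """
    for _ in range(max_requests):
        send_request()
        timeouts = 0
        while timeouts < max_timeouts:
            reply = wait_response()
            if reply == ACK:
                return True          # peer is ready, connection established
            if reply == NAK:
                break                # peer refused: wait a while, then resend
            timeouts += 1            # timer expired: restart it and keep waiting
        else:
            return False             # three timeouts without any reply
    return False

# Scripted peer: one timeout, then a refusal, then an acknowledgement.
replies = iter([TIMEOUT, NAK, ACK])
sent = []
print(connect(lambda: sent.append("request"), lambda: next(replies)), len(sent))
```

The teardown flow described later follows the same pattern with a teardown request frame in place of the connection request.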
(2) Data sending flow
Before sending data, the sender first sends a data-send request frame that informs the peer of the type of data to be sent (file name, raw image data, or compressed data) and its length, so that the receiver can correctly set the corresponding receive address and receive length in its receive-DMA parameters. This ensures that, once the specified amount of data has been transferred, a DMA-transfer-complete interrupt is raised and handled accordingly. After receiving the data-send request, the receiver should send a response frame, which may be either a positive acknowledgement or a denial. Only after the peer has acknowledged does the sender send the real data frame, and the receiver sends a response frame after receiving the complete data frame. If that response frame is a positive acknowledgement, the data was sent successfully; otherwise it is retransmitted. The flow chart is shown in Fig. 5.
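The request, acknowledge, data, acknowledge sequence can be sketched from the sender's side as follows. The `ScriptedLink` class is a test double with hypothetical method names, and the limit on retransmissions is an assumption, since the source only says the data is retransmitted on a negative reply.

```python
class ScriptedLink:
    """Test double: records what was sent and replays canned responses."""

    def __init__(self, responses):
        self.responses = list(responses)
        self.sent = []

    def send_control(self, data_type, length):
        self.sent.append(("control", data_type, length))

    def send_data(self, payload):
        self.sent.append(("data", payload))

    def wait_response(self):
        return self.responses.pop(0)

def send_data(link, data_type, payload, max_attempts=3):
    """Announce the transfer, then send the data frame until it is acknowledged."""
    link.send_control(data_type, len(payload))   # data-send request frame
    if link.wait_response() != "ack":
        return False                             # receiver denied the request
    for _ in range(max_attempts):                # assumed retransmission cap
        link.send_data(payload)                  # pure data frame
        if link.wait_response() == "ack":
            return True
    return False

link = ScriptedLink(["ack", "nak", "ack"])
print(send_data(link, "image", b"abc"), [kind for kind, *_ in link.sent])
```

In the scripted run the first data frame is rejected and the second is accepted, so the control frame is sent once and the data frame twice.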
(3) Connection teardown flow
To tear down a connection, the sender sends a 4-word teardown request frame, then starts a timer and waits for the peer's response. If the peer agrees to the teardown, it sends a positive acknowledgement and the teardown succeeds; otherwise it sends a negative acknowledgement or does not reply. If the timer expires with no response from the peer, the sender restarts the timer and waits again; after 3 timeouts with no reply, it concludes that the peer is not responding and the teardown fails. If the peer replies negatively, the sender waits for a while, sends the teardown request again, and repeats the procedure. The flow chart is shown in Fig. 6.
The send and receive interrupt handlers and the protocol send/receive state machines, which are the key to implementing the above communication procedure, are described below.
The DSP's main task is image compression, and communication is handled by the DSP itself rather than by a dedicated communication component. To leave the DSP core more time for image compression, data communication and its protocol are handled mainly in interrupt handlers, and data is transferred by DMA so that transfer and compression proceed in parallel. An interrupt is raised only when a data transfer completes, and the interrupt handler then acts according to the protocol rules. The interrupt handlers are therefore the key to implementing the protocol. They divide into the transmit-DMA-complete interrupt handler and the receive-DMA-complete interrupt handler, which together implement and maintain the communication protocol state machines of the sender and the receiver; the main program or other interrupt handlers can trigger the corresponding events. To convey the main idea, Fig. 7 and Fig. 8 show part of the protocol state machines of the sender and the receiver, respectively, under normal conditions. The figures cover only connection establishment, image data transfer, and connection teardown; other cases, errors, and timeouts are not drawn.
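A sender-side protocol state machine of the kind described can be sketched as a transition table driven by events. The states and event names below are hypothetical, inferred from the connection, data, and teardown flows in the text rather than copied from Fig. 7, and only the normal (error-free) path is covered.

```python
from enum import Enum, auto

class State(Enum):
    IDLE = auto()
    WAIT_CONNECT_ACK = auto()
    CONNECTED = auto()
    WAIT_REQUEST_ACK = auto()
    WAIT_DATA_ACK = auto()
    WAIT_CLOSE_ACK = auto()

# (current state, event) -> next state; "ack" events would be raised from the
# receive-DMA-complete interrupt handler, "send_*" events from the main program.
TRANSITIONS = {
    (State.IDLE, "send_connect_request"): State.WAIT_CONNECT_ACK,
    (State.WAIT_CONNECT_ACK, "ack"): State.CONNECTED,
    (State.CONNECTED, "send_data_request"): State.WAIT_REQUEST_ACK,
    (State.WAIT_REQUEST_ACK, "ack"): State.WAIT_DATA_ACK,  # data frame DMA starts here
    (State.WAIT_DATA_ACK, "ack"): State.CONNECTED,
    (State.CONNECTED, "send_close_request"): State.WAIT_CLOSE_ACK,
    (State.WAIT_CLOSE_ACK, "ack"): State.IDLE,
}

def on_event(state: State, event: str) -> State:
    """Advance the state machine by one event, rejecting unexpected events."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(f"event {event!r} not handled in state {state.name}") from None

def run(events, state=State.IDLE) -> State:
    """Feed a sequence of events through the state machine."""
    for event in events:
        state = on_event(state, event)
    return state

session = ["send_connect_request", "ack", "send_data_request", "ack", "ack",
           "send_close_request", "ack"]
print(run(session).name)
```

Keeping the protocol in a table like this makes each interrupt handler a short lookup, which fits the goal of leaving the DSP core free for compression work.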
The method provided by the invention for reducing message-passing overhead between parallel digital signal processors has been described in detail above. For those of ordinary skill in the art, any obvious modification made to it without departing from the spirit of the invention will constitute an infringement of the patent right of the invention and will incur the corresponding legal liability.

Claims (6)

1. A method for reducing message-passing overhead between parallel digital signal processors, in which the digital signal processors send and receive data through LINK ports, characterized in that:
Before sending data, the sending digital signal processor first sends a control frame indicating the type, length, and so on of the data to follow; the receiving digital signal processor sets the corresponding receive address according to the type of the received frame; the sending digital signal processor then sends the actual data frame.
2. The method for reducing message-passing overhead between parallel digital signal processors according to claim 1, characterized in that:
The data frame is separate from the control frame;
The control frame has a 4-byte control field, a 4-byte length field, a 4-byte destination address field, a 4-byte block field, and a 1-byte checksum field;
The data frame contains only pure data and a checksum field.
3. The method for reducing message-passing overhead between parallel digital signal processors according to claim 1, characterized in that:
Before sending data, the sender first sends a data-send request frame in the form of a control frame, informing the peer of the type and length of the data to be sent;
On receiving the data-send request frame, the receiver sends a response frame and sets the corresponding receive address and receive length in the receive-DMA parameters;
Once the receiver has acknowledged, the sender sends the data frame, and the receiver sends a response frame after receiving the complete data frame;
If the response frame is a positive acknowledgement, the data was sent successfully; otherwise it is retransmitted.
4. The method for reducing message-passing overhead between parallel digital signal processors according to claim 1, characterized in that:
To establish a connection, the sender sends a 4-byte connection request frame, and the connection is established once a positive acknowledgement is received; to tear down a connection, the sender sends a 4-byte teardown request frame, and the connection is removed once a positive acknowledgement is received.
5. The method for reducing message-passing overhead between parallel digital signal processors according to claim 1, characterized in that:
Data is transferred between different digital signal processors by DMA.
6. The method for reducing message-passing overhead between parallel digital signal processors according to claim 5, characterized in that:
When data is transferred by DMA, the TCB register of the corresponding transmit DMA is set correctly first, including address, length, and control information; the corresponding LINK port transmit control register is set next, including speed, bit width, and transmit enable; finally, the corresponding channel is set to the busy state.
CNB200610113617XA 2006-10-09 2006-10-09 Method of reducing message transmission overhead of parallel multi-digital signal processor Expired - Fee Related CN100490435C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB200610113617XA CN100490435C (en) 2006-10-09 2006-10-09 Method of reducing message transmission overhead of parallel multi-digital signal processor

Publications (2)

Publication Number Publication Date
CN101163129A true CN101163129A (en) 2008-04-16
CN100490435C CN100490435C (en) 2009-05-20

Family

ID=39297957

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB200610113617XA Expired - Fee Related CN100490435C (en) 2006-10-09 2006-10-09 Method of reducing message transmission overhead of parallel multi-digital signal processor

Country Status (1)

Country Link
CN (1) CN100490435C (en)

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20090520

Termination date: 20201009