CN102567256A - Processor system, as well as multi-channel memory copying DMA accelerator and method thereof - Google Patents


Publication number
CN102567256A
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2011104255307A
Other languages
Chinese (zh)
Other versions
CN102567256B (en)
Inventor
苏文
苏孟豪
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Loongson Technology Corp Ltd
Original Assignee
Loongson Technology Corp Ltd
Application filed by Loongson Technology Corp Ltd
Priority to CN201110425530.7A
Publication of CN102567256A
Application granted
Publication of CN102567256B
Legal status: Active

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention provides a processor system, together with a multi-channel memory copy DMA (direct memory access) accelerator and a method thereof. The processor system comprises a multi-channel DMA accelerator connected between a processor core and a memory through a data bus. When the processor core issues the data read-write request of a memory copy command, the multi-channel DMA accelerator evaluates and decomposes the task information of that request. Then, according to the decomposed task information, the read-write frequencies and priorities of the read-write channels, and the values of the marker bits of the read-write channels, it controls a plurality of read-write channels to issue multiple read-write requests to the memory in parallel, thereby completing the data reads and writes. The processor system offers high bandwidth, low latency, a high degree of parallelism, reconfigurability, and platform independence.

Description

Processor system and multichannel memory copy DMA accelerator and method
Technical field
The present invention relates to computer hardware architecture and processor design, and in particular to a processor system based on processor-embedded multi-channel direct memory access (DMA) that supports asynchronous memory access and parallel memory reads and writes, and to a multi-channel memory copy DMA accelerator and method.
Background art
In existing computer systems, memory copy is an important operation that transfers data between different locations in memory. It is ubiquitous in operating systems and all kinds of application programs; related studies have found that memory copy operations can account for 20%-40% of the total time overhead of TCP/IP protocol processing. In the operating system, memory copy is implemented by standard system functions defined by the kernel, such as memcpy and bcopy. The operating system implements this group of functions differently for different computer architectures. In user programs, the C standard library (ANSI C) also provides an implementation of the memory copy function.
Fig. 1 shows the structure of an existing processor system that performs memory copy. It comprises a processor core (CPU) 1, a level-2 cache module 2, and a memory 3, where the processor core comprises a control module 11, an arithmetic unit 12, a register file 13, a decode and memory access unit 14, a level-1 cache module 15, and so on. At the microscopic level, a typical memory copy operation decomposes into a series of alternating read and write operations on memory. The processor core first issues a read to address A; once it completes, the core issues a write that stores the returned value V(A) to address B. It then issues a read to A+1 and writes the result V(A+1) to B+1, and repeats this process until the whole memory copy operation is complete.
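The alternating read and write sequence described above can be sketched as a simple loop. This is only an illustration and not code from the patent: the list used as memory and the function name `serial_copy` are assumptions made for the example.

```python
def serial_copy(memory, src, dst, length):
    """Copy `length` cells one at a time: read A+i, then write B+i."""
    ops = []  # log of the operations issued, in program order
    for i in range(length):
        value = memory[src + i]          # read request to address A+i
        ops.append(("read", src + i))
        memory[dst + i] = value          # write V(A+i) to address B+i
        ops.append(("write", dst + i))
    return ops


# Hypothetical 6-cell memory: copy cells 0..2 to cells 3..5.
memory = [10, 11, 12, 0, 0, 0]
ops = serial_copy(memory, src=0, dst=3, length=3)
print(memory)
print(ops[:2])
```

Every write waits for the read before it, and every read waits for the previous write, which is the serial behaviour the invention sets out to avoid.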
One existing memory copy acceleration method optimizes memory copy by instruction reordering. Based on the pipeline characteristics of a processor with a particular architecture, it rearranges the instructions of the memory copy function to obtain a continuous memory access stream, improving memory access efficiency and reducing latency. It embeds assembly code in the operating system kernel's memory copy function to replace the original generic C code and improve execution efficiency, and it reorders the assembly memory access instructions according to the characteristics of the architecture to reduce pipeline stalls during execution. For example, the memory copy function on the MIPS architecture arranges load and store instructions in groups of four.
Another existing memory copy acceleration method optimizes the synchronization between memory copy and memory access. It adds a hardware module that records and analyzes the addresses of memory copy operations and memory access operations, so that the processor is not blocked and instruction execution efficiency improves. It also provides optimized copy primitives in the operating system to synchronize the copy process with other memory accesses.
Existing memory copy acceleration methods have the following shortcomings:
(1) Low system efficiency. The prior art still requires the processor to execute the memory access and control instructions during a memory copy, so the processor cannot perform other operations during the whole copy. In essence it is a serial, synchronous memory copy controlled by the processor.
(2) Slow copy speed. Prior-art processors generally integrate one or two memory access units, and a later memory access instruction can execute only after the current one completes. Existing memory copy methods therefore access memory serially at the microscopic level and cannot carry out unrelated memory reads and writes at the same time, which makes copying slow.
(3) No general compatibility. These methods are closely tied to the processor structure and the instruction set, so memory copy programs optimized for different architectures are not compatible with one another.
Summary of the invention
The object of the present invention is to provide a processor system, and a multi-channel memory copy DMA accelerator and method thereof, that offer high bandwidth, low latency, a high degree of parallelism, reconfigurability, and platform independence.
To achieve this object, the invention provides a processor system comprising a processor core and a memory, and further comprising a multi-channel DMA accelerator connected between the processor core and the memory through a data bus;
The multi-channel DMA accelerator is used, when the processor core issues a memory copy command and thereby produces a data read-write request, to evaluate and decompose the task information of the data read-write request, and then, according to the decomposed task information, the read-write frequencies and priorities of the read-write channels, and the values of the marker bits of the read-write channels, to control a plurality of read-write channels to issue multiple read-write requests to the memory in parallel and complete the data reads and writes.
Preferably, the processor system further comprises a cache module connected between the memory and the multi-channel DMA accelerator, used to cache the data transferred between the memory and the multi-channel DMA accelerator.
Preferably, the multi-channel DMA accelerator comprises at least one DMA engine module and at least two interfaces;
The DMA engine module is used to evaluate and decompose the task information of the data read-write request and then, according to the decomposed task information, the read-write frequencies and priorities of the read-write channels, and the values of the marker bits of the read-write channels, to control a plurality of read-write channels to issue multiple read-write requests to the memory in parallel;
The at least two interfaces are at least one data interface and one control and communication interface;
The data interface is used to transfer the data to be read and written by the data read-write request of the memory copy command;
The control and communication interface is used to communicate with the processor core and the memory, to receive configuration data from the processor core, and to configure and store that data into the configuration registers of the read-write channels.
Preferably, the DMA engine module comprises a plurality of read-write channels with their corresponding marker bits, and two flow control units and a data buffer connected to the processor core and the memory;
The plurality of read-write channels comprise at least one read channel and one write channel;
The read channel is used, under the control of the flow control unit, to read data from the memory to the processor core;
The write channel is used, under the control of the flow control unit, to write the data sent by the processor core to the memory;
The flow control unit is used, according to the configuration data in the configuration registers and the values of the marker bits of the read-write channels, to tally each read-write channel's use of the data bus under the priorities set at initialization, to allocate the data bus to the read-write channel with the highest priority, to control the frequency and priority of every read-write channel, to direct different channels to start different data transfer tasks, and to exchange the working state of the read-write channels with the control module;
The data buffer is used to buffer the data of the read-write channels;
Each read-write channel comprises a configuration register used to receive and store the configuration data that the processor core sends for the channel's reads and writes;
Each marker bit marks the amount of data read or written by each read-write request of the read-write channel.
Preferably, the control module of the processor core comprises an initialization subunit and a configuration subunit, wherein:
The initialization subunit is used, when the processor core is initialized, to initialize the flow control unit, set its initial state, and start the data transfer tasks of the channels;
The configuration subunit is used to send configuration data for every read-write channel through the control and communication interface to that channel's configuration register, setting the value of the configuration register of every read-write channel.
To achieve the object of the invention, a multi-channel memory copy DMA accelerator is also provided. On receiving a copy data read-write request that the processor core issues to the memory, it evaluates and decomposes the task information of the request and then, according to the decomposed task information, the read-write frequencies and priorities of the read-write channels, and the values of the marker bits of the read-write channels, controls a plurality of read-write channels to issue multiple read-write requests to the memory in parallel, completing the data reads and writes.
To achieve the object of the invention, a memory copy acceleration method is further provided, comprising the following steps:
Step S101: the control module of the processor core sends a memory copy command to the multi-channel DMA accelerator;
Step S102: on receiving the memory copy data read-write request that the processor core issues to the memory, the multi-channel memory copy DMA accelerator evaluates and decomposes the task information of the request and then, according to the decomposed task information, the read-write frequencies and priorities of the read-write channels, and the values of the marker bits of the read-write channels, controls a plurality of read-write channels to issue multiple read-write requests to the memory in parallel; the data are read and written in parallel until all read-write operations are complete.
Preferably, in step S102, evaluating and decomposing the task information of the data read-write request comprises the following steps:
The multi-channel memory copy DMA accelerator makes a judgment based on the task information. When the total length of the data to copy is greater than the bus width of a single channel of the multi-channel DMA accelerator, the task information of the memory copy command is decomposed, and the multi-channel DMA accelerator issues multiple read-write requests to the memory through a plurality of channels according to the decomposed task information;
Otherwise, the multi-channel memory copy DMA accelerator selects one channel at random to issue the data read-write request to the memory.
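The decision between decomposing a copy and sending it on one randomly chosen channel can be sketched as follows. This is a minimal illustration under stated assumptions: the patent does not specify the chunk size or how chunks are assigned to channels, so the example assumes chunks of one bus width (in bytes) handed out round-robin, and `decompose_copy` is a hypothetical name.

```python
import random


def decompose_copy(src, dst, length, bus_width, num_channels, rng=random):
    """Return a list of (channel, src, dst, size) requests for one copy."""
    if length <= bus_width:
        # Small copy: a single request on a randomly selected channel.
        return [(rng.randrange(num_channels), src, dst, length)]
    requests = []
    offset = 0
    while offset < length:
        size = min(bus_width, length - offset)      # assumed chunk size
        channel = len(requests) % num_channels      # assumed round-robin
        requests.append((channel, src + offset, dst + offset, size))
        offset += size
    return requests


# 40 bytes over a 16-byte (128-bit) bus with 4 channels: three requests.
requests = decompose_copy(src=0x1000, dst=0x2000, length=40,
                          bus_width=16, num_channels=4)
for req in requests:
    print(req)
```

A real flow control unit would also weigh channel priority and marker bits when assigning requests; the round-robin rule here only keeps the sketch short.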
Preferably, the parallel data reading and writing comprises the following steps:
Step S1021: the cache module is connected between the memory and the multi-channel DMA accelerator and caches the data transferred between them;
After the cache module receives a data read-write request from the multi-channel DMA accelerator, it determines whether a copy of the accessed data already exists in the cache module;
Step S1022: if a copy of the data accessed by the data read-write request exists in the cache module, step S1023 is executed; otherwise step S1024 is executed;
Step S1023: the accessed data are read or written in the cache module according to the data read-write request. That is, the memory access hits the cache, and the cache returns the data required by a read operation or updates the cache block corresponding to a write operation;
Step S1024: the data accessed by the data read-write request cannot be found in the cache module. That is, the memory access misses the cache, which triggers a cache replacement: the data to be accessed are brought from external memory into the cache module, and the cache then returns the data required by a read operation or updates the cached copy of the data for a write operation.
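Steps S1021 to S1024 amount to a hit-or-miss lookup in front of memory. The sketch below models that path with a dictionary; it is an assumption-laden illustration (the class name `CacheModel` is hypothetical, and eviction and write-back are deliberately not modelled).

```python
class CacheModel:
    """Toy cache in front of a backing memory, tracking hits and misses."""

    def __init__(self, memory):
        self.memory = memory      # backing memory: address -> value
        self.lines = {}           # cached copies: address -> value
        self.hits = 0
        self.misses = 0

    def _ensure_cached(self, addr):
        if addr in self.lines:
            self.hits += 1        # S1023: a copy exists, cache hit
        else:
            self.misses += 1      # S1024: miss, bring the data in
            self.lines[addr] = self.memory.get(addr, 0)

    def read(self, addr):
        self._ensure_cached(addr)
        return self.lines[addr]   # return the data the read needs

    def write(self, addr, value):
        self._ensure_cached(addr)
        self.lines[addr] = value  # update the cached copy


cache = CacheModel({0: 7, 1: 9})
first = cache.read(0)     # miss: filled from memory
second = cache.read(0)    # hit: served from the cache
cache.write(1, 5)         # miss, then the cached copy is updated
```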
Preferably, step S102 further comprises the following steps:
Step S201: the processor core sets the configuration register of every channel of the multi-channel DMA accelerator and initializes the value of s-tag;
Step S202: each channel of the multi-channel DMA module starts its data transfer task according to the configuration register values set in step S201;
Step S203: in every clock cycle, each channel sets the value of p-tag according to its own working state and at the same time checks tag0 to tag7 of s-tag;
If the channel has completed the amount of data represented by a marker bit of s-tag that was set during the initialization in step S201, it temporarily stops working and sends an interrupt request to the processor core, and step S204 is then executed; otherwise the flow returns to step S202;
Step S204: the processor core checks the value of s-tag, queries the value of p-tag, and judges from both values whether the data transfer is complete. If so, the whole transfer task ends; otherwise the transfer of the data represented by the next marker bit begins.
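The control flow of steps S201 to S204 for a single channel can be simulated in a few lines. This is a sketch under assumptions, not the patented hardware: `run_channel` and its parameters are hypothetical, each marker bit is assumed to stand for one eighth of the task, and the CPU is assumed to resume the channel immediately after every interrupt.

```python
def run_channel(total_bytes, s_tag, request_bytes):
    """Simulate steps S201-S204 for one channel.

    Returns (interrupts, p_tag): the marker-bit thresholds (in bytes) at
    which the channel paused and interrupted the CPU, and the final p-tag.
    """
    unit = total_bytes // 8          # S201: data amount per marker bit
    done, p_tag, next_bit = 0, 0, 0
    interrupts = []
    while done < total_bytes:
        # S202: the channel completes one basic transfer request.
        done = min(total_bytes, done + request_bytes)
        # S203: update p-tag and check each s-tag bit whose threshold
        # has just been reached.
        while next_bit < 8 and done >= (next_bit + 1) * unit:
            p_tag |= 1 << next_bit
            if s_tag & (1 << next_bit):
                # Pause and interrupt the CPU; in S204 the CPU inspects
                # the tags and (assumed here) resumes the channel.
                interrupts.append((next_bit + 1) * unit)
            next_bit += 1
    return interrupts, p_tag


interrupts, p_tag = run_channel(total_bytes=80, s_tag=0b00100010,
                                request_bytes=10)
print(interrupts, bin(p_tag))
```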
Preferably, the cache module is a level-2 cache module.
The processor system and the multi-channel memory copy accelerator and method of the present invention have the following beneficial effects:
(1) High bandwidth and low latency: by adding a multi-channel direct memory access (DMA) accelerator to the processor and pipelining work across multiple channels, the invention achieves very high bandwidth and very low latency;
(2) High degree of parallelism: by giving every channel of the multi-channel DMA accelerator a group of marker bits that mark the channel's working state, the invention realizes a finer-grained mechanism for parallel operation between processor core computation and direct memory access (DMA) data transfer;
(3) Reconfigurability: through the control and communication interface, the invention lets the CPU reconfigure the data transfer process of the direct memory access (DMA) module in real time;
(4) Platform independence: the invention effectively avoids dependence on the software platform and has good portability.
Description of the drawings
Fig. 1 is a structural diagram of an existing processor system that performs memory copy;
Fig. 2 is a structural diagram of the processor system that performs memory copy according to an embodiment of the invention;
Fig. 3 is a structural diagram of the multi-channel DMA accelerator of Fig. 2;
Fig. 4 is a structural diagram of the control module of Fig. 2;
Fig. 5 is a diagram of the interaction among the processor core (CPU), the multi-channel DMA accelerator, and the cache module during one memory copy operation according to an embodiment of the invention;
Fig. 6 is a diagram of the working process of the parallel memory copy function memcpy() that uses the processor multi-channel memory copy method of the invention.
Detailed description of the embodiments
To make the object, technical solution, and advantages of the invention clearer, the processor system and its multi-channel memory copy DMA accelerator and method are described in further detail below with reference to the drawings and embodiments. It should be understood that the specific embodiments described here only illustrate the invention and do not limit it.
Embodiment one
As shown in Fig. 2, the processor system of this embodiment comprises a processor core 1, a memory 3, and a multi-channel DMA (direct memory access) accelerator 4 connected between the processor core 1 and the memory 3 through a data bus;
The multi-channel DMA accelerator 4 is used, when the processor core 1 issues the data read-write request of a memory copy command, to evaluate and decompose the task information of the data read-write request, and then, according to the decomposed task information, the read-write frequencies and priorities of the read-write channels, and the values of the marker bits of the read-write channels, to control a plurality of read-write channels to issue multiple read-write requests to the memory 3 in parallel and complete the data reads and writes.
Preferably, as one possible implementation, the processor system of this embodiment further comprises a cache module 2 connected between the memory 3 and the multi-channel DMA accelerator 4, used to cache the data transferred between the memory 3 and the multi-channel DMA accelerator 4.
As one possible implementation, as shown in Fig. 2, the processor core 1 comprises a control module 11, an arithmetic unit 12, a register file 13, a decode and memory access unit 14, a level-1 cache module 15, and so on. These are packaged together as one hardware module, the processor core 1, and the multi-channel DMA accelerator 4 is connected between the control module 11 of the processor core and the memory 3 through the data bus.
By integrating the multi-channel DMA accelerator 4, the processor system of this embodiment removes the processor core's involvement from the memory copy process. The integrated multi-channel DMA accelerator 4 uses the existing cache module 2 to minimize the latency of DMA memory reads and writes, reduces the impact on the memory subsystem while remaining architecture independent, and allows the arithmetic unit's computation to proceed in parallel with the asynchronous memory copy operations of the DMA engine.
As one possible implementation, the multi-channel DMA accelerator 4 of this embodiment is placed inside the processor system and connected between the control module 11 of the processor core 1 and the memory 3, opening a direct data path between the processor core 1 and the memory 3.
As one possible implementation, as shown in Fig. 3, the multi-channel DMA accelerator 4 of this embodiment comprises at least one DMA engine module 41 and two interfaces 42, 43;
The DMA engine module 41 is used to evaluate and decompose the task information of the data read-write request and, according to the decomposed task information, to issue multiple read-write requests to the memory 3 through a plurality of channels.
Preferably, as shown in Fig. 3, the DMA engine module 41 of this embodiment comprises a plurality of read-write channels with their corresponding marker bits 412, two flow control units 411, and a data buffer 413.
The plurality of read-write channels 412 comprise at least one read channel and one write channel.
The read channel is used, under the control of the flow control unit, to read data from the memory 3 to the processor core;
The write channel is used, under the control of the flow control unit, to write the data sent by the processor core to the memory 3.
The flow control unit 411 is used, according to the configuration data sent by the processor core and the values of the marker bits of the read-write channels, to tally each read-write channel's use of the data bus under the priorities set at initialization, to allocate the data bus to the read-write channel with the highest priority, to control the frequency and priority of every read-write channel, to direct different channels to start different data transfer tasks, and to exchange the working state of the read-write channels with the control module 11.
As one possible implementation, the flow control unit 411 uses a group of registers to record the number of memory access requests each DMA channel has issued so far (each time a channel issues a request, its register value is incremented by 1). When several channels issue memory access requests at the same time, the channel with the smallest register value receives the highest priority, following a fair scheduling principle.
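The fair scheduling rule above, where the channel with the fewest issued requests wins the bus, can be sketched with a list of counters. The function name `pick_channel` is hypothetical, and two details are assumptions not stated in the text: ties go to the lower channel index, and a counter is incremented when its channel is granted the bus.

```python
def pick_channel(request_counts, requesting):
    """Grant the bus to the requesting channel with the fewest past requests."""
    # Assumed tie-break: the lower channel index wins.
    winner = min(requesting, key=lambda ch: (request_counts[ch], ch))
    request_counts[winner] += 1   # the channel has issued one more request
    return winner


# Four channels; channels 0, 1 and 2 keep requesting at the same time.
counts = [0, 0, 0, 0]
order = [pick_channel(counts, [0, 1, 2]) for _ in range(5)]
print(order)    # [0, 1, 2, 0, 1]
print(counts)   # [2, 2, 1, 0]
```

Because the counters only grow for channels that were served, a channel that has been waiting always overtakes one that was served recently.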
The data buffer 413 is used to buffer the data of the read-write channels.
Each read-write channel 412 comprises a configuration register 4121 used to receive and store the configuration data that the processor core sends for the channel's reads and writes.
Each marker bit marks the amount of data read or written by each read-write request of the read-write channel.
As shown in Fig. 3, the at least two interfaces of the multi-channel DMA accelerator 4 in this embodiment are at least one data interface 43 and a control and communication interface 42.
The data interface 43 is used to transfer the data to be read and written by the data read-write request of the memory copy command;
The control and communication interface 42 is used to communicate with the processor core 1 and the memory 3, to receive configuration data from the processor core, and to configure and store that data into the configuration registers 4121 of the read-write channels.
Preferably, as one possible implementation, the interfaces are based on a 128-bit AXI (Advanced eXtensible Interface) bus, and memory locations are read and written over this data bus; the cache module 2 is a level-2 cache module.
Correspondingly, as one possible implementation, as shown in Fig. 4, the control module 11 of the processor core comprises an initialization subunit 111 and a configuration subunit 112, wherein:
The initialization subunit 111 is used, when the processor core is initialized, to initialize the flow control unit 411, set its initial state, and start the data transfer tasks of the channels.
The configuration subunit 112 is used to send configuration data for every read-write channel through the control and communication interface 42 to that channel's configuration register 4121, setting the value of the configuration register of every read-write channel.
Preferably, the configuration data comprise information such as the source address, the destination address, and the data segment length.
Through the control and communication interface 42, the processor core 1 can set the value of the configuration register 4121 of every read-write channel, so that the flow control unit can start different data transfer tasks according to the configuration data.
As the core component of this embodiment, the DMA engine module 41 reduces the number of instructions the processor core executes, supports memory copy operations with a continuous stride, cache coherence, and program locality, and increases the parallelism between computation and data transfer during memory copy, thereby achieving very high program efficiency with very low power and area overhead.
As one possible implementation, the DMA engine module 41 of this embodiment, as shown in Fig. 3, comprises three read channels and one write channel, each with its own independent configuration register and marker bits (tag). While one read-write channel is starting up, the other read-write channels can carry on with their data transfers; that is, the four read-write channels can process data in parallel. This effectively reduces the overhead of starting, switching, pausing, and restarting channels. In addition, because multiple channels work in parallel, the data transfer bandwidth increases greatly.
The configuration subunit 112 of the control module 11 of the processor core 1 sets the value of every channel's configuration register 4121 through the control and communication interface 42. Preferably, the value of the channel configuration register 4121 comprises the source address, the destination address, the data segment length, and so on. At the same time, the initialization subunit of the control module completes the initialization of the flow control unit and starts the data transfer tasks of the channels.
During data transfer, the flow control unit 411 feeds the marker bits (tag) of each read-write channel back to the control module 11 of the processor core through the control and communication interface 42, and the control module 11 uses them to identify the working state of each channel.
As one possible implementation, in this embodiment the data buffer works in first-in, first-out (FIFO) mode and holds 2K bytes of data. Data read back by a read channel are first held in the data buffer and then written back to memory by the write channel.
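The role of the FIFO data buffer between the read channels and the write channel can be illustrated with a bounded queue. This is a sketch only: `FifoBuffer` is a hypothetical name, the 2048-byte default mirrors the 2K-byte figure in the text, and the behaviour when the buffer is full or empty (refuse the push, return `None`) is an assumption.

```python
from collections import deque


class FifoBuffer:
    """Bounded first-in, first-out staging buffer for copied bytes."""

    def __init__(self, capacity=2048):
        self.capacity = capacity
        self.data = deque()

    def push(self, byte):
        """Read-channel side: stage one byte read back from memory."""
        if len(self.data) >= self.capacity:
            return False          # buffer full: the read channel must wait
        self.data.append(byte)
        return True

    def pop(self):
        """Write-channel side: take the oldest staged byte, or None."""
        return self.data.popleft() if self.data else None


# A tiny 2-byte buffer makes the full condition easy to see.
buf = FifoBuffer(capacity=2)
accepted = [buf.push(b) for b in (0x11, 0x22, 0x33)]
drained = [buf.pop(), buf.pop(), buf.pop()]
```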
The flow control unit 411 tallies each channel's use of the bus under the priorities set at initialization and allocates the bus to the channel with the highest priority, thereby controlling the frequency and priority of every channel.
To reduce control and interaction latency, in this embodiment the multi-channel DMA accelerator 4 provides a group of marker bits (tag) in every read-write channel, and the flow control unit controls the use of these marker bits.
Specifically, each group of marker bits (tag) comprises an 8-bit s-tag (tag0-tag7) and an 8-bit p-tag (tag0-tag7).
Here s-tag is the control mark and p-tag is the status mark.
The s-tag lets the CPU preset the working conditions of a DMA channel, so that the channel can be controlled after the DMA has started working. As one possible implementation, in this embodiment s-tag has 8 bits, bits 0-7, and the whole DMA transfer is divided into 8 segments (in increments of 1/8). For example, if bits 1 and 5 of s-tag are set to 1 in advance, the DMA is stalled when it has transferred 2/8 and 6/8 of the configured total data length. Concretely, if 80 bytes are to be transferred by DMA, then after bits 1 and 5 of s-tag are set, the DMA automatically enters the paused state when 20 bytes and 60 bytes have been transferred (this counting and comparison is implemented by the transfer statistics function, that is, the count register, in the flow control unit). After a pause, the CPU issues a command to resume the transfer manually.
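The 80-byte example can be checked with a short calculation that maps the set bits of s-tag to the byte counts at which the channel pauses. The function name `stall_offsets` is hypothetical, and integer division by 8 is an assumption for totals that are not a multiple of 8.

```python
def stall_offsets(s_tag, total_bytes):
    """Return the byte counts at which a channel pauses for a given s-tag."""
    unit = total_bytes // 8   # data amount represented by one marker bit
    # Bit k set means: pause once (k + 1)/8 of the task has been transferred.
    return [(bit + 1) * unit for bit in range(8) if s_tag & (1 << bit)]


s_tag = (1 << 1) | (1 << 5)        # bits 1 and 5 set, as in the example
print(stall_offsets(s_tag, 80))    # [20, 60]
```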
The p-tag reflects how much of the current DMA transfer is complete. Continuing the example, suppose the DMA transfers 80 bytes without interruption; then as the amount transferred reaches 10, 20, ..., 80 bytes, bits 0-7 of p-tag are set to 1 in turn. This mark mainly makes it convenient for the CPU to query the transfer status of the current DMA.
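The progress encoding of p-tag can be written as a small function from bytes transferred to an 8-bit value. `p_tag_value` is a hypothetical name, and the use of a greater-than-or-equal comparison at each 1/8 threshold is an assumption consistent with the threshold check the text describes for s-tag.

```python
def p_tag_value(bytes_done, total_bytes):
    """Return the 8-bit p-tag for a transfer that has moved bytes_done bytes."""
    unit = total_bytes // 8   # data amount represented by one marker bit
    value = 0
    for bit in range(8):
        # Bit k is set once (k + 1)/8 of the task has been completed.
        if bytes_done >= (bit + 1) * unit:
            value |= 1 << bit
    return value


for done in (0, 10, 25, 80):
    print(done, format(p_tag_value(done, 80), "08b"))
```

A CPU polling this value can read progress directly from the highest set bit without interrupting the channel.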
The s-tag and the p-tag do not interact during operation. For example, the s-tag may be left completely unset (uninterrupted transfer), and the p-tag still changes with the progress of the DMA transfer. Of the two, the s-tag is the more important: it is the setting that gives the DMA advance control without CPU intervention, whereas the p-tag is a convenience for interaction with the CPU.
The 8 bits of the s-tag and of the p-tag divide a channel's transfer task into 8 segments. After each basic transfer request completes, a check is made (by querying and comparing the counter register in the flow control unit) of whether the amount of data transferred so far is greater than or equal to a threshold selected by the s-tag (1/8, 2/8, ..., 8/8). The transfer stages represented by the p-tag are the same as those of the s-tag; its bits correspond to 1/8, 2/8, ..., 8/8, respectively.
The s-tag and the p-tag change according to the CPU's settings and the amount of data the DMA has transferred; the flow control unit does not itself control them. Reads and writes of the tag bits are simply issued by the CPU, accepted by the flow control unit, and answered with the corresponding state.
The amount of data that each tag bit represents is determined by the total amount of data in the task, so it may differ from task to task, but the bits always stand for 1/8, 2/8, ... of the total. For example:
For an 80-byte task, each tag bit represents 10 bytes.
For an 800-byte task, each tag bit represents 100 bytes.
As one implementation, when the initialization subunit performs initialization, every bit of the s-tag and the p-tag is set to 0. Starting from the LSB (least significant bit), the flow control unit sets each bit of the s-tag to 1. When the data transfer task represented by a tag bit is completed, that is, after the data transfer requests have read or written the corresponding amount of data through the read-write channel, the corresponding read-write channel pauses, and the read-write channel then sets the corresponding bit of the p-tag to 1.
In a specific example, each tag bit of the s-tag and the p-tag of read channel 1 represents 1/8 of the data transfer task.
After the flow control unit receives the data read-write request of a memory copy command, it evaluates the task information in the request, decomposes the request into multiple read-write requests according to a preset decomposition method, and then, according to the configuration data in the configuration registers, the read-write frequency and priority of each read-write channel, and the values of the tag bits, starts multiple read-write channels in parallel to read and write the data.
After read channel 1 has read 1/8 of the data task, it temporarily stops working and sends an interrupt request to the processor core (CPU).
After the control unit of the processor core (CPU) receives the interrupt request, it checks the values of the s-tag and the p-tag and thereby learns that read channel 1 has written the 1/8 data segment into the data buffer through the flow control unit.
An example of a memory copy of 1280 (160 × 8) bytes (from address A to address B), shown in Figure 5, is given below to further illustrate the multi-channel DMA accelerator 4 of the processor system of the embodiment of the invention.
First, the memory copy method is set up:
(11) One read channel r and one write channel w are set to work simultaneously to complete the copy (in the simple case, a channel that is already working can be regarded as having the highest priority in arbitration among multiple channels of the same type; reads and writes run in parallel and need no arbitration between them).
(12) The tag-bit strategy is set (only the s-tag is used here). The read channel pauses after every 160 bytes it completes; the CPU then starts the write channel, which writes the 160 bytes just read to the destination address, and at the same time resumes the read channel in parallel. Like the read channel, the write channel pauses after completing each 160-byte transfer. This cycle repeats 8 times until the memory copy of all 1280 bytes is complete.
Next, the DMA channels and tag bits are configured:
(21) The source address (A), destination address (B) and length (1280) of the read channel and the write channel are configured.
(22) At this point only the s-tag of read channel r is configured: its bits 0-7 are set to 1, and read channel r of the DMA is started.
After that, read channel r pauses for the first time:
(31) After read channel r has read 160 bytes (1/8), the condition of s-tag bit 0 is triggered (bit 0 corresponds to 1/8); read channel r pauses and sends an interrupt signal to notify the CPU.
(32) After receiving the interrupt, the CPU sets tag bits 0-7 of write channel w to 1 and starts write channel w (which begins writing the 160 bytes read earlier to the destination address). At the same time it resumes read channel r and clears bit 0 of the read channel's s-tag to 0 (if the bit were not cleared, interrupts would be raised repeatedly while the amount transferred is greater than 1/8 and less than 2/8).
Finally, the remaining process is carried out:
(41) When read channel r has completed the transfer of 320 bytes (2/8), it sends an interrupt again. The CPU then examines the p-tag of write channel w to check whether write channel w has finished writing 160 bytes (1/8). If it has, the CPU resumes read channel r and write channel w and clears the corresponding s-tag bit (for the same reason as in step (32)); otherwise it continues to poll the p-tag of write channel w.
(42) Step (41) is repeated until the copy of all 1280 bytes is complete.
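Steps (11) to (42) can be sketched as a small software simulation. This is only an illustrative model under stated assumptions, not the hardware of the embodiment: the function name pipelined_copy and its variables are hypothetical, and the write of one segment and the read of the next, which run in parallel in hardware, are executed one after the other here.

```python
SEG = 160            # bytes per tag bit (1/8 of the task)
TOTAL = SEG * 8      # 1280 bytes


def pipelined_copy(src):
    """Simulate the tag-driven copy of TOTAL bytes from src.

    Sequential model: in hardware, the write of segment k and the read
    of segment k + 1 proceed in parallel; here they simply run in turn.
    """
    if len(src) != TOTAL:
        raise ValueError("this sketch expects exactly 1280 bytes")
    dst = bytearray(TOTAL)
    r_s_tag = 0xFF       # step (22): all eight pause points armed on channel r
    w_p_tag = 0x00       # progress bits reported by write channel w
    interrupts = 0

    for k in range(8):
        # Read channel r finishes segment k, pauses, and interrupts the CPU.
        segment = src[k * SEG:(k + 1) * SEG]
        interrupts += 1

        # CPU interrupt handler, steps (32) and (41).
        if k > 0 and not w_p_tag & (1 << (k - 1)):
            raise RuntimeError("previous segment not yet written")
        r_s_tag &= ~(1 << k) & 0xFF      # clear the bit that just fired
        # Start write channel w on the segment just read; channel r resumes.
        dst[k * SEG:(k + 1) * SEG] = segment
        w_p_tag |= 1 << k                # w reports segment k as written

    return bytes(dst), interrupts, r_s_tag


data = bytes(i % 251 for i in range(TOTAL))
copied, n_interrupts, tag_left = pipelined_copy(data)
print(copied == data, n_interrupts, tag_left)
```

In this model the CPU is involved only in the eight interrupt handlers, where it clears one s-tag bit and checks one p-tag bit; all data movement is left to the channels.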
The whole process shows the key role of the s-tag. In this embodiment of the invention, the CPU interacts with the s-tag and the p-tag in the interrupt handler after each 1/8 (160 bytes) of the data has been transferred. The exact method of interaction depends on the specific settings.
Throughout the process, except for the first 1/8 handled by read channel r and the last 1/8 handled by the write channel, the read and write channels work in parallel. Suppose a serial copy of 1280 bytes takes 16 unit times (read 160, write 160, read 160, ..., write the last 160 bytes); the implementation based on the embodiment of the invention needs only 10 unit times (read 160, write 160 while reading the next 160, ..., write the last 160). In between, only the interrupt handlers (8 in total) need to change the value of the s-tag and query the p-tag, so the overhead is very small.
Embodiment two
Correspondingly, the embodiment of the invention provides a memory copy acceleration method, comprising the following steps:
Step S101: the control unit of the processor core sends a memory copy command to the multi-channel DMA accelerator.
Step S102: when the multi-channel DMA accelerator receives the data read-write request of the memory copy command sent by the processor core to the memory, it evaluates and decomposes the task information of the data read-write request according to that task information, and, according to the task information of the decomposed data read-write request, the read-write frequency and priority of the multiple read-write channels given therein, and the values of the tag bits of the read-write channels, controls the multiple read-write channels to issue multiple read-write requests to the memory in parallel and to read and write data in parallel until all read-write operations are complete.
Preferably, in step S102, evaluating and decomposing the task information of the data read-write request comprises the following steps:
The multi-channel DMA accelerator makes a judgment according to the task information. When the total length of the data to be copied is greater than the bus bit width of a single channel of the multi-channel DMA accelerator, the task information of the memory copy command is decomposed, and the multi-channel DMA accelerator issues multiple read-write requests to the memory through multiple channels according to the decomposed task information.
Otherwise, the multi-channel DMA accelerator selects one channel at random to send the data read-write request to the memory.
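The decision described above can be sketched as follows. The embodiment does not specify the preset decomposition method, so this sketch assumes a simple round-robin split into pieces of one channel width; the function name plan_requests, the width of 16 bytes and the channel count of 4 are illustrative assumptions only.

```python
import random


def plan_requests(total_len, channel_width, n_channels, rng=random):
    """Return a list of (channel, offset, length) read-write requests.

    Assumptions: channel_width is the per-channel bus width in bytes, and
    the preset decomposition simply deals width-sized pieces to the
    channels in round-robin order.
    """
    if total_len <= channel_width:
        # Small request: one randomly chosen channel handles it whole.
        return [(rng.randrange(n_channels), 0, total_len)]
    requests = []
    for i, offset in enumerate(range(0, total_len, channel_width)):
        length = min(channel_width, total_len - offset)
        requests.append((i % n_channels, offset, length))
    return requests


# A 40-byte copy over 4 channels that are 16 bytes wide.
print(plan_requests(40, 16, 4))  # [(0, 0, 16), (1, 16, 16), (2, 32, 8)]
```

A real decomposition would also take the read-write frequency and priority of each channel into account, which this sketch leaves out.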
Preferably, in step S102, the parallel data read-write comprises the following steps:
Step S1021: the cache module is connected between the memory and the multi-channel DMA accelerator and buffers the data transferred between them.
After the cache module receives a data read-write request from the multi-channel DMA accelerator, it determines whether a copy of the data accessed by the request already exists in the cache module.
Step S1022: if a copy of the accessed data exists in the cache module, step S1023 is executed; otherwise step S1024 is executed.
Step S1023: the accessed data is read or written in the cache module according to the data read-write request, that is, the memory access hits in the cache; the data required by a read operation is returned, or the cache block corresponding to a write operation is updated.
Step S1024: the accessed data cannot be obtained from a copy in the cache module, that is, the memory access misses in the cache. This triggers a cache replacement operation (Cache Evict), which brings the data to be accessed from external memory into the cache module; the data required by a read operation is then returned, or, for a write operation, the copy of the accessed data in the cache is updated.
The cache replacement operation (Cache Evict) is prior art and is therefore not described in detail in the embodiments of the present invention.
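The hit and miss paths of steps S1022 to S1024 can be sketched with a very small cache model. The embodiment does not specify the replacement policy; least-recently-used replacement with write-back is assumed here purely for illustration, and the class name CacheModel is hypothetical.

```python
from collections import OrderedDict


class CacheModel:
    """Tiny cache in front of a dict standing in for memory."""

    def __init__(self, memory, capacity):
        self.memory = memory            # backing store: address -> value
        self.capacity = capacity        # number of cached blocks
        self.blocks = OrderedDict()     # cached copies, oldest first
        self.hits = 0
        self.misses = 0

    def _fill(self, addr):
        """S1024: on a miss, evict if full and bring the block in."""
        self.misses += 1
        if len(self.blocks) >= self.capacity:
            old_addr, old_value = self.blocks.popitem(last=False)
            self.memory[old_addr] = old_value   # write the victim back
        self.blocks[addr] = self.memory.get(addr, 0)

    def access(self, addr, value=None):
        """Read addr (value is None) or write value to addr."""
        if addr in self.blocks:         # S1022 -> S1023: hit
            self.hits += 1
        else:                           # S1022 -> S1024: miss
            self._fill(addr)
        self.blocks.move_to_end(addr)   # mark as most recently used
        if value is not None:
            self.blocks[addr] = value   # update the cached copy
        return self.blocks[addr]


memory = {0: 10, 1: 11, 2: 12}
cache = CacheModel(memory, capacity=2)
cache.access(0)          # miss, block 0 is brought in
cache.access(0)          # hit
cache.access(1, 99)      # miss, then the cached copy is updated
cache.access(2)          # miss, block 0 is evicted
print(cache.hits, cache.misses)
```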
As one implementation, preferably, in the memory copy acceleration method of the embodiment of the invention, step S102 further comprises the following steps:
Step S201: the processor core (CPU) sets the configuration register of each channel of the multi-channel DMA accelerator and initializes the value of the s-tag.
Step S202: each channel of the multi-channel DMA module starts its data transfer task according to the values of the configuration registers set in step S201.
Step S203: in every clock cycle, each channel sets the value of its p-tag according to its own working state and at the same time checks tag0-tag7 of the s-tag.
If the channel has completed the amount of data represented by a tag bit of the s-tag initialized in step S201, it temporarily stops working and sends an interrupt request to the processor core (CPU), and step S204 is then executed; otherwise the method returns to step S202.
Step S204: the processor core (CPU) checks the value of the s-tag and queries the value of the p-tag, and judges from these values whether the data transfer is complete. If it is, the whole transfer task ends; otherwise, transfer of the data represented by the next tag bit begins.
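The loop formed by steps S202 to S204 can be sketched for a single channel as follows. This is a simplified model: the tags are checked after each basic transfer request rather than in every clock cycle, the CPU's handling in step S204 is reduced to clearing the triggered bits and resuming, and the function name run_channel and the parameter request_size are hypothetical.

```python
def run_channel(total_len, s_tag, request_size):
    """Run one channel to completion; return (interrupt byte counts, p_tag).

    Assumptions: the channel moves request_size bytes per basic request,
    and bit i of either tag stands for (i + 1)/8 of total_len.
    """
    if request_size <= 0:
        raise ValueError("request_size must be positive")
    done = 0
    p_tag = 0
    interrupts = []
    while done < total_len:
        # S202: carry out one basic transfer request.
        done = min(done + request_size, total_len)
        # S203: update the p-tag and test the armed s-tag bits.
        fired = 0
        for bit in range(8):
            if done * 8 >= total_len * (bit + 1):
                p_tag |= 1 << bit
                fired |= s_tag & (1 << bit)
        if fired:
            # The channel pauses and interrupts the CPU.
            interrupts.append(done)
            # S204: the CPU clears the handled bits and resumes the channel.
            s_tag &= ~fired
    return interrupts, p_tag


print(run_channel(80, 0b00100010, 10))  # ([20, 60], 255)
print(run_channel(80, 0b00100010, 8))   # ([24, 64], 255)
```

The second call shows the effect of the greater-than-or-equal comparison: when the basic request size does not divide a segment evenly, the pause happens at the first request boundary at or beyond the threshold.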
The processor system, the multi-channel memory copy accelerator and the method of the embodiment of the invention are illustrated below.
In a specific implementation example, the working process of a parallel memory copy function memcpy() that uses the processor system, the multi-channel memory copy accelerator and the method of the invention is shown in Figure 6.
In the function memcpy(src, dst, len), src is the source address of the data segment to be copied, dst is the destination address of the copy, and len is the length of the data segment.
load(src) denotes the read channel of the multi-channel DMA accelerator reading data from the source address.
store(dst) denotes the write channel of the multi-channel DMA accelerator writing data to the destination address.
As shown in Figure 6, based on the tag bits (tag) of the read-write channels, the read and write operations of the memory copy process in this embodiment can be pipelined: the read channel first reads the amount of data represented by tag0 from the source address and then sends an interrupt request to the CPU. If the CPU determines that this part of the data is ready, it starts the data transfer task of the write channel, which writes the data that has been read back to the destination address in memory. Meanwhile, the read channel continues by reading the amount of data represented by tag1, so the write channel works in parallel with the read channel. With this pipelining, the overheads of starting the write channel and of switching, pausing and restarting the read and write channels are all hidden, which reduces the load on the processor and also achieves high bandwidth utilization.
Finally, it should be noted that those skilled in the art can obviously make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these changes and modifications fall within the scope of the claims of the present invention and their technical equivalents, the present invention is intended to include them as well.

Claims (16)

1. A processor system, comprising a processor core and a memory, characterized in that it further comprises a multi-channel DMA accelerator connected between the processor core and the memory through a data bus;
the multi-channel DMA accelerator is configured to, when the processor core issues a memory copy command and thereby generates a data read-write request, evaluate and decompose the task information of the data read-write request according to that task information, and, according to the task information of the decomposed data read-write request, the read-write frequency and priority of the multiple read-write channels given therein, and the values of the tag bits of the read-write channels, control the multiple read-write channels to issue multiple read-write requests to the memory in parallel to complete the data reading and writing.
2. The processor system according to claim 1, characterized in that it further comprises a cache module connected between the memory and the multi-channel DMA accelerator, for buffering the data transferred between the memory and the multi-channel DMA accelerator.
3. The processor system according to claim 1, characterized in that the multi-channel DMA accelerator comprises at least one DMA engine module and at least two interfaces;
the DMA engine module is configured to evaluate and decompose the task information of the data read-write request according to that task information, and, according to the task information of the decomposed data read-write request, the read-write frequency and priority of the multiple read-write channels given therein, and the values of the tag bits of the read-write channels, control the multiple read-write channels to issue multiple read-write requests to the memory in parallel;
the at least two interfaces are at least one data interface and one control and communication interface;
the data interface is configured to transfer the data to be read or written by the data read-write request of the memory copy command;
the control and communication interface is configured to communicate with the processor core and the memory, to receive configuration data from the processor core, and to configure and store it in the configuration registers of the read-write channels according to the configuration data.
4. The processor system according to claim 3, characterized in that the interface is an interface based on a 128-bit bus, and read-write operations on memory units are completed through this data bus; the cache module is an L2 cache module.
5. The processor system according to claim 3, characterized in that the DMA engine module comprises multiple read-write channels and their corresponding tag bits, and a flow control unit and a data buffer that connect both the processor core and the memory;
the multiple read-write channels comprise at least one read channel and one write channel;
the read channel is configured to read data from the memory to the processor core under the control of the flow control unit;
the write channel is configured to write data sent by the processor core to the memory under the control of the flow control unit;
the flow control unit is configured to, according to the configuration data in the configuration registers, the values of the tag bits of the read-write channels and the initial values set at initialization, keep statistics on the use of the data bus by each read-write channel, assign the data bus to the read-write channel with the highest priority, control the frequency and priority of each read-write channel, control different channels to start different data transfer tasks, and exchange the working state of the read-write channels with the control unit of the processor core;
the data buffer is configured to buffer the data of the read-write channels;
each read-write channel comprises a configuration register, configured to receive and store the configuration data sent by the processor core for reading and writing by the read-write channel;
each tag bit marks the amount of data read or written by each read-write request of the read-write channel.
6. The processor system according to any one of claims 1 to 5, characterized in that the control unit of the processor core comprises an initialization subunit and a configuration subunit, wherein:
the initialization subunit is configured to initialize the flow control unit when the processor core performs initialization, set its initial state, and start the data transfer tasks of the channels;
the configuration subunit is configured to send configuration data for each read-write channel to the configuration register of that read-write channel through the control and communication interface, and to set the value of the configuration register corresponding to each read-write channel.
7. The processor system according to claim 6, characterized in that the configuration data comprises a source address, a destination address and a data segment length.
8. The processor system according to claim 5, characterized in that the multiple read-write channels comprise three read channels and one write channel; each channel has its own independent configuration register and tag bits; while one read-write channel is starting, the other read-write channels can carry out their data transfers, that is, the four read-write channels can process data in parallel.
9. A multi-channel memory copy DMA accelerator, characterized in that it is configured to, upon receiving a copy data read-write request sent by a processor core to a memory, evaluate and decompose the task information of the data read-write request according to that task information, and, according to the task information of the decomposed data read-write request, the read-write frequency and priority of the multiple read-write channels given therein, and the values of the tag bits of the read-write channels, control the multiple read-write channels to issue multiple read-write requests to the memory in parallel to complete the data reading and writing.
10. The multi-channel memory copy DMA accelerator according to claim 9, characterized in that the multi-channel DMA accelerator comprises at least one DMA engine module and at least two interfaces;
the DMA engine module is configured to evaluate and decompose the task information of the data read-write request according to that task information, and, according to the task information of the decomposed data read-write request, the read-write frequency and priority of the multiple read-write channels given therein, and the values of the tag bits of the read-write channels, control the multiple read-write channels to issue multiple read-write requests to the memory in parallel;
the at least two interfaces are at least one data interface and one control and communication interface;
the data interface is configured to transfer the data to be read or written by the data read-write request of the memory copy command;
the control and communication interface is configured to communicate with the processor core and the memory, to receive configuration data from the processor core, and to configure and store it in the configuration registers of the read-write channels according to the configuration data.
11. The multi-channel memory copy DMA accelerator according to claim 9 or 10, characterized in that the DMA engine module comprises multiple read-write channels and their corresponding tag bits, and a flow control unit and a data buffer that connect both the processor core and the memory;
the multiple read-write channels comprise at least one read channel and one write channel;
the read channel is configured to read data from the memory to the processor core under the control of the flow control unit;
the write channel is configured to write data sent by the processor core to the memory under the control of the flow control unit;
the flow control unit is configured to, according to the configuration data in the configuration registers, the values of the tag bits of the read-write channels and the initial values set at initialization, keep statistics on the use of the data bus by each read-write channel, assign the data bus to the read-write channel with the highest priority, control the frequency and priority of each read-write channel, control different channels to start different data transfer tasks, and exchange the working state of the read-write channels with the control unit of the processor;
the data buffer is configured to buffer the data of the read-write channels;
each read-write channel comprises a configuration register, configured to receive and store the configuration data sent by the processor core for reading and writing by the read-write channel;
each tag bit marks the amount of data read or written by each read-write request of the read-write channel.
12. A memory copy acceleration method, characterized in that it comprises the following steps:
Step S101: the control unit of the processor core sends a memory copy command to the multi-channel DMA accelerator;
Step S102: when the multi-channel memory copy DMA accelerator receives the memory copy data read-write request sent by the processor core to the memory, it evaluates and decomposes the task information of the data read-write request according to that task information, and, according to the task information of the decomposed data read-write request, the read-write frequency and priority of the multiple read-write channels given therein, and the values of the tag bits of the read-write channels, controls the multiple read-write channels to issue multiple read-write requests to the memory in parallel and to read and write data in parallel until all read-write operations are complete.
13. The memory copy acceleration method according to claim 12, characterized in that, in step S102, evaluating and decomposing the task information of the data read-write request comprises the following steps:
the multi-channel memory copy DMA accelerator makes a judgment according to the task information; when the total length of the data to be copied is greater than the bus bit width of a single channel of the multi-channel DMA accelerator, the task information of the memory copy command is decomposed, and the multi-channel DMA accelerator issues multiple read-write requests to the memory through multiple channels according to the decomposed task information;
otherwise, the multi-channel memory copy DMA accelerator selects one channel at random to send the data read-write request to the memory.
14. The memory copy acceleration method according to claim 12, characterized in that the parallel data read-write comprises the following steps:
Step S1021: the cache module is connected between the memory and the multi-channel DMA accelerator and buffers the data transferred between them;
after the cache module receives a data read-write request from the multi-channel DMA accelerator, it determines whether a copy of the data accessed by the request already exists in the cache module;
Step S1022: if a copy of the accessed data exists in the cache module, step S1023 is executed; otherwise step S1024 is executed;
Step S1023: the accessed data is read or written in the cache module according to the data read-write request, that is, the memory access hits in the cache; the data required by a read operation is returned, or the cache block corresponding to a write operation is updated;
Step S1024: the accessed data cannot be obtained from a copy in the cache module, that is, the memory access misses in the cache; this triggers a cache replacement operation, which brings the data to be accessed from external memory into the cache module; the data required by a read operation is then returned, or, for a write operation, the copy of the accessed data in the cache is updated.
15. The memory copy acceleration method according to claim 14, characterized in that step S102 further comprises the following steps:
Step S201: the processor core sets the configuration register of each channel of the multi-channel DMA accelerator and initializes the value of the s-tag;
Step S202: each channel of the multi-channel DMA module starts its data transfer task according to the values of the configuration registers set in step S201;
Step S203: in every clock cycle, each channel sets the value of its p-tag according to its own working state and at the same time checks tag0-tag7 of the s-tag;
if the channel has completed the amount of data represented by a tag bit of the s-tag initialized in step S201, it temporarily stops working and sends an interrupt request to the processor core, and step S204 is then executed; otherwise the method returns to step S202;
Step S204: the processor core checks the value of the s-tag and queries the value of the p-tag, and judges from these values whether the data transfer is complete; if it is, the whole transfer task ends; otherwise, transfer of the data represented by the next tag bit begins.
16. The memory copy acceleration method according to claim 14, characterized in that the cache module is an L2 cache module.
CN201110425530.7A 2011-12-16 2011-12-16 Processor system, as well as multi-channel memory copying DMA accelerator and method thereof Active CN102567256B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201110425530.7A CN102567256B (en) 2011-12-16 2011-12-16 Processor system, as well as multi-channel memory copying DMA accelerator and method thereof

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201110425530.7A CN102567256B (en) 2011-12-16 2011-12-16 Processor system, as well as multi-channel memory copying DMA accelerator and method thereof

Publications (2)

Publication Number Publication Date
CN102567256A true CN102567256A (en) 2012-07-11
CN102567256B CN102567256B (en) 2015-01-07

Family

ID=46412706

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201110425530.7A Active CN102567256B (en) 2011-12-16 2011-12-16 Processor system, as well as multi-channel memory copying DMA accelerator and method thereof

Country Status (1)

Country Link
CN (1) CN102567256B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101034384A (en) * 2007-04-26 2007-09-12 北京中星微电子有限公司 DMA controller and transmit method capable of simultaneously carrying out read-write operation
CN101807165A (en) * 2009-01-22 2010-08-18 台湾积体电路制造股份有限公司 System and method for fast cache-hit detection
CN102231142A (en) * 2011-07-21 2011-11-02 浙江大学 Multi-channel direct memory access (DMA) controller with arbitrator

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
刘煜峰,高雪莲: "一种多通道DMA控制器的IP核设计", 《2007年研究综述与技术论坛专刊》 *
曹宗凯等: "DMA在内存间数据拷贝中的应用及其性能分析的研究", 《电子器件》 *

Cited By (45)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102831091B (en) * 2012-07-31 2015-01-21 宁波成电泰克电子信息技术发展有限公司 Serial port-based ship radar echo data collecting method
CN102831091A (en) * 2012-07-31 2012-12-19 宁波成电泰克电子信息技术发展有限公司 Serial port-based ship radar echo data collecting method
CN104281537B (en) * 2013-07-05 2017-09-08 阿里巴巴集团控股有限公司 A kind of internal memory clone method and device
CN104281537A (en) * 2013-07-05 2015-01-14 阿里巴巴集团控股有限公司 Memory copying method and device
CN104461674A (en) * 2013-09-18 2015-03-25 联想(北京)有限公司 Data processing method and electronic device
CN104461674B (en) * 2013-09-18 2018-08-31 联想(北京)有限公司 A kind of data processing method and electronic equipment
CN104699641A (en) * 2015-03-20 2015-06-10 浪潮集团有限公司 EDMA (enhanced direct memory access) controller concurrent control method in multinuclear DSP (digital signal processor) system
CN107430628A (en) * 2015-04-03 2017-12-01 华为技术有限公司 Acceleration framework with immediate data transmission mechanism
CN107291629A (en) * 2016-04-12 2017-10-24 华为技术有限公司 A kind of method and apparatus for accessing internal memory
CN107291629B (en) * 2016-04-12 2020-12-25 华为技术有限公司 Method and device for accessing memory
CN107436855A (en) * 2016-05-25 2017-12-05 三星电子株式会社 QOS cognition IO management for the PCIE storage systems with reconfigurable multiport
CN107070593A (en) * 2017-02-09 2017-08-18 武汉米风通信技术有限公司 Relay based on multichannel communication reception system starts method
CN107070593B (en) * 2017-02-09 2020-08-28 武汉米风通信技术有限公司 Interrupt starting method based on multi-channel communication receiving system
CN106851706A (en) * 2017-02-23 2017-06-13 武汉米风通信技术有限公司 Register configuration method based on multichannel communication reception system
CN106851706B (en) * 2017-02-23 2020-05-15 成都米风感知科技有限公司 Register configuration method based on multichannel communication receiving system
CN109144906A (en) * 2017-06-15 2019-01-04 北京忆芯科技有限公司 Electronic equipment and its command dma processing method
CN109144906B (en) * 2017-06-15 2019-11-26 北京忆芯科技有限公司 Electronic equipment and its command dma processing method
CN107463829A (en) * 2017-09-27 2017-12-12 山东渔翁信息技术股份有限公司 The processing method of DMA request, system and relevant apparatus in a kind of cipher card
CN109408428A (en) * 2018-10-29 2019-03-01 京信通信系统(中国)有限公司 Control method, device and the physical layer accelerator card of direct memory access
CN109408428B (en) * 2018-10-29 2021-05-28 京信通信系统(中国)有限公司 Control method and device for direct memory access and physical layer accelerator card
CN111026687A (en) * 2019-10-30 2020-04-17 深圳震有科技股份有限公司 Method, system and computer equipment for matching data transmission read-write rate
CN111026687B (en) * 2019-10-30 2023-08-01 深圳震有科技股份有限公司 Method, system and computer equipment for data transmission read-write rate matching
CN112835827A (en) * 2019-11-25 2021-05-25 美光科技公司 Quality of service level for direct memory access engines in a memory subsystem
WO2021129304A1 (en) * 2019-12-23 2021-07-01 华为技术有限公司 Memory manager, processor memory subsystem, processor and electronic device
CN111338998B (en) * 2020-02-20 2021-07-02 深圳震有科技股份有限公司 FLASH access processing method and device based on AMP system
CN111338998A (en) * 2020-02-20 2020-06-26 深圳震有科技股份有限公司 FLASH access processing method and device based on AMP system
WO2021179218A1 (en) * 2020-03-11 2021-09-16 深圳市大疆创新科技有限公司 Direct memory access unit, processor, device, processing method, and storage medium
CN111460461A (en) * 2020-04-03 2020-07-28 全球能源互联网研究院有限公司 Trusted CPU system, read-write request and trusted checking method of DMA data
CN111460461B (en) * 2020-04-03 2023-06-06 全球能源互联网研究院有限公司 Trusted CPU system, read-write request and DMA data trusted checking method
CN113076189B (en) * 2020-04-17 2022-03-11 北京忆芯科技有限公司 Data processing system with multiple data paths and virtual electronic device constructed using multiple data paths
CN113076189A (en) * 2020-04-17 2021-07-06 北京忆芯科技有限公司 Data processing system with multiple data paths and virtual electronic device constructed using multiple data paths
CN111949600A (en) * 2020-09-25 2020-11-17 苏州浪潮智能科技有限公司 Method and device for applying thousand-gear market quotation based on programmable device
CN112486410A (en) * 2020-11-23 2021-03-12 华南师范大学 Method, system, device and storage medium for reading and writing persistent memory file
CN112486410B (en) * 2020-11-23 2024-03-26 华南师范大学 Method, system, device and storage medium for reading and writing persistent memory file
CN114691564A (en) * 2020-12-29 2022-07-01 新唐科技股份有限公司 Direct memory access device, data transmission method and electronic device
CN112783117A (en) * 2020-12-29 2021-05-11 浙江中控技术股份有限公司 Method and device for data isolation between security and conventional control applications
CN112749112B (en) * 2020-12-31 2021-12-24 无锡众星微系统技术有限公司 Hardware flow structure
CN112749112A (en) * 2020-12-31 2021-05-04 无锡众星微系统技术有限公司 Hardware flow structure
CN113190475B (en) * 2021-05-08 2022-08-02 中国电子科技集团公司第五十八研究所 Secondary cache controller structure
CN113190475A (en) * 2021-05-08 2021-07-30 中国电子科技集团公司第五十八研究所 Secondary cache controller structure
CN113254321A (en) * 2021-06-07 2021-08-13 恒为科技(上海)股份有限公司 Method and system for evaluating memory access performance of processor
CN113254321B (en) * 2021-06-07 2023-01-24 上海恒为智能科技有限公司 Method and system for evaluating memory access performance of processor
CN117094876A (en) * 2023-07-12 2023-11-21 荣耀终端有限公司 Data processing method, electronic device and readable storage medium
CN116860335A (en) * 2023-09-01 2023-10-10 北京大禹智芯科技有限公司 Method for realizing pipelining operation of direct memory access driving system
CN116860335B (en) * 2023-09-01 2023-11-17 北京大禹智芯科技有限公司 Method for realizing pipelining operation of direct memory access driving system

Also Published As

Publication number Publication date
CN102567256B (en) 2015-01-07

Similar Documents

Publication Publication Date Title
CN102567256B (en) Processor system, as well as multi-channel memory copying DMA accelerator and method thereof
US11880687B2 (en) System having a hybrid threading processor, a hybrid threading fabric having configurable computing elements, and a hybrid interconnection network
CN102231142B (en) Multi-channel direct memory access (DMA) controller with arbitrator
JP5764265B2 (en) Circuit devices, integrated circuit devices, program products and methods that utilize low-latency variable propagation networks for parallel processing of virtual threads across multiple hardware threads
JP4476267B2 (en) Processor and data transfer unit
US9465767B2 (en) Multi-processor, multi-domain, multi-protocol cache coherent speculation aware shared memory controller and interconnect
CN112119376A (en) Thread start and completion using job descriptor packets in a system with self-scheduling processor and hybrid thread fabric
CN112088356A (en) Thread start using job descriptor packets in self-scheduling processor
JP6197196B2 (en) Power efficient processor architecture
CN112088358A (en) Thread priority management in a multithreaded self-scheduling processor
CN112106030A (en) Thread state monitoring in a system having a multithreaded self-scheduling processor
CN112088357A (en) System call management in user-mode multi-threaded self-scheduling processor
CN1279472C (en) Cache memory system and digital signal processor structure
CN112106026A (en) Load access resizing by a multithreaded self-scheduling processor to manage network congestion
CN112106027A (en) Memory request size management in a multithreaded self-scheduling processor
CN112088359A (en) Multi-threaded self-scheduling processor
CN112088355A (en) Thread creation on local or remote computing elements by a multithreaded self-scheduling processor
CN101894013B (en) Instruction-level pipeline control method and system in a processor
CN104583944A (en) Fast deskew when exiting low-power partial-width high speed link state
US10659396B2 (en) Joining data within a reconfigurable fabric
CN103778070A (en) Parallel processing of multiple block coherence operations
EP3274860A1 (en) A method, apparatus and system for optimizing cache memory transaction handling in a processor
CN112136107A (en) Non-cached loads and stores in a system with a multithreaded self-scheduling processor
CN101498963A (en) Method for reducing CPU power consumption, CPU and digital chip
US20180212894A1 (en) Fork transfer of data between multiple agents within a reconfigurable fabric

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CP03 Change of name, title or address

Address after: 100095 Building 2, Longxin Industrial Park, Zhongguancun environmental protection technology demonstration park, Haidian District, Beijing

Patentee after: Loongson Zhongke Technology Co.,Ltd.

Address before: 100190 No. 10 South Road, Zhongguancun Academy of Sciences, Haidian District, Beijing

Patentee before: LOONGSON TECHNOLOGY Corp.,Ltd.
