CN117743207A - PRP linked list processing method, system, equipment and medium - Google Patents

PRP linked list processing method, system, equipment and medium

Info

Publication number
CN117743207A
Authority
CN
China
Prior art keywords
data
linked list
prp
buffer
task
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311622955.6A
Other languages
Chinese (zh)
Inventor
张晓琳 (Zhang Xiaolin)
荆晓龙 (Jing Xiaolong)
袁涛 (Yuan Tao)
王江 (Wang Jiang)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd
Original Assignee
Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd filed Critical Shandong Yunhai Guochuang Cloud Computing Equipment Industry Innovation Center Co Ltd

Landscapes

  • Memory System Of A Hierarchy Structure (AREA)

Abstract

The invention relates to the field of data processing and provides a PRP linked list processing method, system, equipment, and medium. The method comprises: storing a task in a pre-buffer as the data to be processed; allocating the data to be processed to the channel selected by arbitration, and writing the arbitration result into a start buffer; selecting an execution flow for the data to be processed according to its working mode, writing the execution result into the corresponding working buffer, and writing the data counts of all working buffers into an end buffer; and outputting the execution results from the working buffers in the order indicated by the data in the start buffer and the end buffer. By combining pipelining with reordering, the invention guarantees both processing performance and output data order. The number of instantiated channels is configurable, allowing a flexible trade-off between higher performance and smaller area; changing the channel count requires little rework, so flexibility is very high.

Description

PRP linked list processing method, system, equipment and medium
Technical Field
The present invention relates to the field of data processing, and in particular to a method, system, apparatus, and medium for processing a PRP linked list.
Background
NVMe (Non-Volatile Memory Express), the non-volatile memory host controller interface specification, is a communication specification for accessing non-volatile storage media attached over the PCIe bus. Combined with the PCIe protocol, NVMe exploits the parallelism of solid-state drives, raising NAND read/write speeds and enabling faster non-volatile storage.
The NVMe protocol organizes memory pages as a linked list and accesses each PRP (Physical Region Page) list through an entry address. In the existing PRP memory-allocation scheme, memory is fragmented into minimum granules, and the granules are then organized into a PRP linked list (PRP list). Alternatively, SGL management can map physical space of arbitrary size, but it is more complex.
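As an illustration of the PRP-list organization described above (a sketch based on common NVMe conventions, not on this patent: 4 KiB pages, 8-byte physical-address entries, and a chain pointer in the last slot of a full list page; all names are hypothetical):

```python
# Illustrative sketch: how an NVMe-style PRP list chains fixed-size pages.
PAGE_SIZE = 4096
ENTRIES_PER_PAGE = PAGE_SIZE // 8  # 512 eight-byte physical addresses per page

def build_prp_list(data_pages):
    """Split physical page addresses into chained PRP list pages.

    Each full list page holds ENTRIES_PER_PAGE - 1 data pointers; the final
    slot is reserved for a pointer to the next list page. Returns a list of
    list-pages (each a list of entries).
    """
    list_pages = []
    i = 0
    while i < len(data_pages):
        remaining = len(data_pages) - i
        if remaining <= ENTRIES_PER_PAGE:
            # Everything fits in one last page: no chain pointer needed.
            list_pages.append(list(data_pages[i:i + remaining]))
            i += remaining
        else:
            take = ENTRIES_PER_PAGE - 1
            page = list(data_pages[i:i + take])
            page.append("NEXT_LIST_PAGE")  # placeholder for the chain pointer
            list_pages.append(page)
            i += take
    return list_pages
```

The extend and fill working modes discussed later operate on structures of this shape.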
However, existing PRP linked-list management suffers from insufficient performance: a large number of serial operations occupies most of the pipeline time. If all data to be processed within a period belongs to the same task type, only one processing path can be taken, so only multi-stage pipelining is actually realized while the other channels sit idle and are wasted, greatly reducing task-processing efficiency.
Disclosure of Invention
In view of the above, the invention provides a PRP linked list processing method, system, device, and medium, aimed at the insufficient performance of current PRP linked-list processing modules. The pipeline is split according to the processing times of the different modes, and a reordering method is combined with the pipeline design, which improves parallelism, simplifies the hardware implementation, and offers strong engineering and practical value.
Based on the above objects, an aspect of the embodiments of the present invention provides a PRP linked list processing method, which specifically includes the following steps:
acquiring a PRP linked list task and storing it in a pre-buffer as the data to be processed;
detecting whether each of a plurality of preset channels is idle, performing polling arbitration among the detected idle channels, allocating the data to be processed to the channel that wins arbitration, and writing the arbitration result into a start buffer;
judging the working mode of the data to be processed, selecting an execution flow according to the working mode, writing the execution result into the corresponding working buffer, and writing the data counts of all working buffers into an end buffer;
and outputting the execution results from the working buffers in the order indicated by the data in the start buffer and the end buffer.
In some embodiments, the step of obtaining the PRP linked list task and storing the PRP linked list task in the pre-buffer as the data to be processed includes:
acquiring a PRP linked list task and parsing its storage location;
and acquiring the PRP linked list task information from the PRP linked list task queue pointed to by the storage location, and storing it in the pre-buffer as data to be processed, awaiting invocation.
In some embodiments, the step of determining the working mode of the data to be processed, selecting a flow for executing the data to be processed according to the working mode, writing the execution result into a corresponding working buffer, and writing the number of data in all the working buffers into an ending buffer includes:
judging the working mode of the data to be processed and grouping by that working mode;
selecting and executing the processing flow of the data to be processed according to its group;
periodically detecting the state of the task while the flow executes;
in response to the task being normal, selecting the creation flow and performing creation for the data to be processed;
in response to the task being issued abnormally, or an abnormality occurring during execution, selecting the exception flow to interrupt or report an event to the host;
in response to the memory index pool running short of index values, or another preset event occurring, entering the retry flow and reporting the retry requirement;
after the flow has executed, writing the execution result into the corresponding working buffer;
and writing the data counts of all working buffers into the end buffer.
In some embodiments, the step of determining the operation mode of the data to be processed includes:
in response to the working mode being "create empty linked list", creating only the PRP linked-list structure;
in response to the working mode being "create new linked list", applying for memory space and constructing a PRP linked list;
in response to the working mode being "extend linked list", obtaining the size of the existing PRP linked list and extending it, on the basis of the original list, by the space the task requires;
and in response to the working mode being "fill linked list", obtaining the index information of the existing PRP linked list and replacing invalid index values with newly constructed index values without changing the structure of the original list.
In some embodiments, each of the plurality of preset channels is capable of processing all working modes.
In some embodiments, the step of outputting the execution result in the working buffer according to the order corresponding to the data in the start buffer and the end buffer includes:
acquiring the input order of the data to be processed from the start buffer and the data counts from the end buffer, and reordering on this basis to generate the data output order;
selecting the execution result from the working buffer of the corresponding channel according to the data output order;
in response to the working buffer of that channel being empty, judging based on the data output order whether to wait;
and if so, waiting until the buffer is no longer empty.
In some embodiments, the method further comprises:
simulating according to the working time and the channel number corresponding to different working modes to obtain a simulation result;
and setting the channel number according to the simulation result.
The invention provides a PRP linked list processing system, which comprises:
the acquisition unit is configured to acquire a PRP linked list task, and store the PRP linked list task into the pre-buffer area to be used as data to be processed;
the channel unit is configured to detect whether a plurality of preset channels are idle, perform polling arbitration on the detected idle channels, distribute data to be processed to the corresponding channels which are arbitrated, and write arbitration results into the initial buffer;
the flow unit is configured to judge the working mode of the data to be processed, select the flow for executing the data to be processed according to the working mode, write the execution result into the corresponding working buffer memory, and write the number of the data in all the working buffer memories into the ending buffer memory;
and the output unit is configured to output the execution result in the working cache according to the sequence corresponding to the data in the starting cache and the ending cache.
The invention proposes a computer device comprising:
at least one processor; and a memory storing a computer program executable on the processor, the processor performing the steps of the above PRP linked list processing method when executing the program.
The present invention proposes a computer readable storage medium storing a computer program which, when executed by a processor, performs the steps of the PRP linked list processing method.
The invention has at least the following beneficial technical effects:
the invention provides a PRP linked list processing method, a system, equipment and a medium, wherein the method comprises the following steps: acquiring a PRP linked list task, and storing the PRP linked list task into a pre-buffer area to be used as data to be processed; detecting whether a plurality of preset channels are idle, carrying out polling arbitration on the detected idle channels, distributing data to be processed to the corresponding channels which are arbitrated, and writing arbitration results into an initial buffer; judging the working mode of the data to be processed, selecting a flow for executing the data to be processed according to the working mode, writing an execution result into a corresponding working buffer, and writing the number of the data in all the working buffers into an ending buffer; and outputting an execution result in the working buffer according to the sequence corresponding to the data in the starting buffer and the ending buffer. The invention is convenient for comprehensive treatment by dividing the fine granularity of normal treatment and abnormal treatment and retry treatment of the PRP linked list. By using a method combining pipelining and reordering, an optimal scheme is designed for PRP linked list processing, so that the processing performance can be ensured, and the output data sequence can be ensured. The number of the instantiation channels can be controlled, the performance is improved, the area is reduced, the flexible choice is realized, the workload caused by changing the number of the channels is small, and the flexibility is very high.
Drawings
To illustrate the embodiments of the invention or the prior-art technical solutions more clearly, the drawings needed for describing them are briefly introduced below. Obviously, the drawings described below show only some embodiments of the invention; a person skilled in the art could obtain other drawings from them without inventive effort.
FIG. 1 is a flow chart of a PRP linked list processing method provided by the invention;
FIG. 2 is a block diagram of a PRP linked list processing system according to the present invention;
FIG. 3 is a first prior-art diagram of a PRP linked list processing method;
FIG. 4 is a second prior-art diagram of a PRP linked list processing method;
FIG. 5 is a third prior-art diagram of a PRP linked list processing method;
FIG. 6 is a fourth prior-art diagram of a PRP linked list processing method;
FIG. 7 is a flowchart of a first optimization scheme of an embodiment of a PRP linked list processing method according to the present invention;
FIG. 8 is a flow chart illustrating a PRP linked list processing method according to an embodiment of the present invention;
FIG. 9 is a first timing diagram of an embodiment of a PRP linked list processing method according to the present invention;
FIG. 10 is a second timing diagram of a PRP linked list processing method according to an embodiment of the present invention;
FIG. 11 is a flow chart of a channel in an embodiment of a PRP linked list processing method according to the present invention;
FIG. 12 is a schematic diagram of a computer device according to an embodiment of the present invention;
fig. 13 is a schematic structural diagram of an embodiment of a computer readable storage medium according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the following embodiments of the present invention will be described in further detail with reference to the accompanying drawings.
It should be noted that in the embodiments of the present invention, the expressions "first" and "second" merely distinguish two entities or parameters with the same name; they are used for convenience of expression only and should not be construed as limiting the embodiments. Subsequent embodiments do not repeat this note.
The invention provides a PRP linked list processing method, referring to fig. 1, comprising,
S1: acquiring a PRP linked list task and storing it in a pre-buffer as the data to be processed;
S2: detecting whether each of a plurality of preset channels is idle, performing polling arbitration among the detected idle channels, allocating the data to be processed to the channel that wins arbitration, and writing the arbitration result into a start buffer;
S3: judging the working mode of the data to be processed, selecting an execution flow according to the working mode, writing the execution result into the corresponding working buffer, and writing the data counts of all working buffers into an end buffer;
S4: and outputting the execution results from the working buffers in the order indicated by the data in the start buffer and the end buffer.
Normal processing, exception processing, and retry processing of the PRP linked list are split at fine granularity, which facilitates comprehensive handling.
By combining pipelining with reordering, an optimized scheme is designed for PRP linked-list processing that guarantees both processing performance and output data order.
All channels of the final scheme are completely identical, and the number of instantiated channels is configurable: adding channels improves performance, removing channels reduces area. Changing the channel count requires little rework, and PPA (performance, power, area) can be optimized through repeated simulation and trial.
Assume that an in-order task incurs the following latency at each stage:
Prefetch (the task decoding unit in the known method): Latency_prefetch = T1;
Arbiter (the polling arbitration unit): Latency_arbiter = T2 (negligibly small);
Checker (the necessary task judgment and inspection): Latency_checker = T3;
Executor (executing the task to create the PRP linked list): Latency_executor = T4;
MUX (reordering and output): Latency_mux = T5 (negligibly small);
Finisher (giving the task-completion notification): Latency_finish = T6.
in the method provided by the prior patent, as shown in fig. 3, 4, 5 and 6, the distribution engine processes tasks with time consumption of latex=t3+t4+t6. In the patent, the task decoding unit and the distribution engine perform pipeline processing, so that the task processing time of the prior patent is roughly estimated to be Latencypre. When the multitasking is performed, its processing time is multiplied. For example, 4 tasks are processed for 4 x latex.
By comparison, the channel-based multi-stage pipeline scheme of the present method splits task execution into multiple pipeline stages, so the per-task processing time is bounded by the largest stage latency. If Latency_executor is the maximum, the single-channel task-processing time of the method is Latency_executor, far below Latency_pre. The multi-channel scheme further reduces multi-task processing time; for example, the time to process 4 tasks can be estimated at about one Latency_pre in total.
As the example above shows, the channel-allocation-based PRP linked-list processing scheme improves performance through multi-stage pipelining and reordering. With this scheme, the PRP linked-list processing time depends only on the longest path among the pipeline stages, maximizing processing efficiency; the performance optimization rate exceeds 50%.
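The latency comparison above can be made concrete with a toy model. The T values below are arbitrary illustration numbers, not from the patent; only T3, T4, and T6 matter for the serial case, since T2 and T5 are stated to be negligible.

```python
# Rough latency model of the analysis above (illustration values only).
T1, T3, T4, T6 = 10, 20, 100, 15  # cycle counts chosen for illustration

latency_pre = T3 + T4 + T6         # prior art: serial distribution engine
serial_4_tasks = 4 * latency_pre   # prior art, 4 tasks back to back

# Multi-stage pipeline: steady-state throughput limited by the slowest stage.
stage_latencies = [T1, T3, T4, T6]
bottleneck = max(stage_latencies)  # here Latency_executor = T4

# Single channel, pipelined: first task fills the pipe, then one task
# completes every `bottleneck` cycles.
pipelined_4_tasks = sum(stage_latencies) + 3 * bottleneck
```

Even on a single pipelined channel the 4-task time drops well below the serial case; multi-channel operation reduces it further, toward one Latency_pre in total as stated above.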
In the channel-allocation-based PRP linked-list processing scheme, the parallel processing channels are decoupled from the PRP linked-list mode. However tasks are issued, the arbiter module detects whether channels ch0 to ch3 are busy; if an idle channel exists, it polls among ch0 to ch3 and allocates the data to the channel that wins arbitration, while writing the arbitration result into origin_buf. Maximum processing parallelism can thus be achieved, and the channel-based allocation scheme greatly improves the scalability of parallel processing.
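The polling (round-robin) arbitration over idle channels can be sketched as follows; this is a behavioral model with hypothetical names, not the patent's hardware, assuming four channels ch0 to ch3 as in the text:

```python
# Minimal round-robin arbiter over idle channels.
class RoundRobinArbiter:
    def __init__(self, n_channels=4):
        self.n = n_channels
        self.last = self.n - 1  # index of the last granted channel

    def grant(self, idle):
        """Pick the next idle channel after the last grant; None if all busy."""
        for offset in range(1, self.n + 1):
            ch = (self.last + offset) % self.n
            if idle[ch]:
                self.last = ch
                return ch
        return None
```

Starting the search just past the last grant is what makes the arbitration fair: a channel that was just served goes to the back of the queue.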
In some embodiments, referring to fig. 1 and 5, the step of obtaining the PRP linked list task, and storing the PRP linked list task in the pre-buffer as the data to be processed includes:
acquiring a PRP linked list task and parsing its storage location;
and acquiring the PRP linked list task information from the PRP linked list task queue pointed to by the storage location, and storing it in the pre-buffer as data to be processed, awaiting invocation.
Referring to fig. 5, the method of this embodiment may also be implemented by a task decoding unit and a distribution engine: when a task is available, the task decoding unit immediately parses the storage location of the current task, obtains the current task information from the task queue pool, and passes it to the distribution engine.
The distribution engine obtains index values from the memory index pool according to the current task's requirements, builds the PRP linked list, stores it in the PRP-list memory pool, and gives a notification when the task completes. The task decoding unit and the distribution engine exchange data via valid/ready handshakes: after receiving a request to construct a PRP linked list, the distribution engine performs a series of serial operations until the completion notification, then fetches the next task's information from the task decoding unit through another valid/ready handshake. Task processing must preserve order; that is, the order in which completion notifications are given must match the order in which the corresponding tasks were issued.
The order-preserving function is realized by the distribution engine building linked lists serially, and the valid/ready handshake between the distribution engine and the task decoding unit improves hardware utilization to some extent.
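The valid/ready handshake mentioned above can be modeled minimally: data moves only in a cycle where both signals are high. This is a generic sketch of the handshake convention, not the patent's hardware, and the function name is hypothetical.

```python
# Behavioral model of one cycle of a valid/ready handshake.
def handshake_transfer(valid, ready, data):
    """Return (transferred, payload): the payload moves only when the
    producer asserts valid AND the consumer asserts ready."""
    if valid and ready:
        return True, data
    return False, None
```

Backpressure falls out naturally: a busy consumer simply holds ready low, and the producer must keep the same data valid until the transfer happens.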
After data arrives, it is prefetched by the prefetch module and stored in pre_buffer, awaiting calls from the downstream submodules.
In some embodiments, referring to fig. 1, 8 and 11, the step of determining the working mode of the data to be processed, selecting a flow for executing the data to be processed according to the working mode, writing the execution result into the corresponding working buffer, and writing the number of data in all the working buffers into the ending buffer includes:
judging the working mode of the data to be processed and grouping by that working mode;
selecting and executing the processing flow of the data to be processed according to its group;
periodically detecting the state of the task while the flow executes;
in response to the task being normal, selecting the creation flow and performing creation for the data to be processed;
in response to the task being issued abnormally, or an abnormality occurring during execution, selecting the exception flow to interrupt or report an event to the host;
in response to the memory index pool running short of index values, or another preset event occurring, entering the retry flow and reporting the retry requirement;
after the flow has executed, writing the execution result into the corresponding working buffer;
and writing the data counts of all working buffers into the end buffer.
As shown in FIG. 11, splitting is performed according to the PRP linked-list mode, with fine-grained splitting in the executor according to the task execution flow. Tasks of the 4 different modes can therefore, in theory, be processed in parallel with high execution efficiency, and the reordering method solves the order-preserving problem, greatly improving PRP linked-list processing efficiency.
In this embodiment, the distribution engine is split into a three-stage pipeline of processing modules (checker, executor, and finisher), with a buf between the stages for pipeline buffering.
Special conditions arising during task processing (error or retry) are treated as special working modes, and the executor is decomposed at finer granularity into retry, error, and normal branches according to the task's condition;
in order to normally realize the data order-preserving function, a reordering module is required to be added after the assembly line is added, so that the order of the inlet data and the outlet data is ensured to be consistent.
As shown in fig. 8, the executor is divided into normal, error, and retry execution branches, and every task is classified into one of these flows after the checker's judgment. normal is the ordinary creation flow for the corresponding PRP linked-list type; when a task is issued abnormally or an abnormality occurs during execution, the error flow reports to the host via an interrupt or event; when the memory index pool runs short of index values, or another such condition occurs, the retry flow is entered and the retry requirement is reported. After the flow finishes, the result is written into ch0_buffer.
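The checker-to-executor branch selection described above can be sketched as a dispatch function. The status fields, handler names, and result format are hypothetical illustrations, not from the patent:

```python
# Sketch of routing a checked task to the normal / error / retry branch.
def execute(task):
    """Return the branch taken and the action performed for one task."""
    if task.get("error"):
        # Task issued abnormally or failed during execution.
        return {"branch": "error", "action": "interrupt_host"}
    if task.get("index_pool_empty"):
        # Memory index pool exhausted: ask for a retry.
        return {"branch": "retry", "action": "report_retry"}
    # Ordinary creation flow for the task's PRP list mode.
    return {"branch": "normal", "action": f"create_{task['mode']}_list"}
```

Whichever branch runs, its result would then be written to the channel's buffer (ch0_buffer in the text) so the downstream MUX sees a uniform interface.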
In some embodiments, referring to fig. 1 and 11, the step of determining the operation mode of the data to be processed includes:
in response to the working mode being "create empty linked list", creating only the PRP linked-list structure;
in response to the working mode being "create new linked list", applying for memory space and constructing a PRP linked list;
in response to the working mode being "extend linked list", obtaining the size of the existing PRP linked list and extending it, on the basis of the original list, by the space the task requires;
and in response to the working mode being "fill linked list", obtaining the index information of the existing PRP linked list and replacing invalid index values with newly constructed index values without changing the structure of the original list.
As shown in fig. 11, tasks are grouped at working-mode granularity according to the type of PRP linked list to be created, realizing multi-channel parallel processing under the multi-stage pipeline.
In some embodiments, referring to fig. 1, each of the plurality of preset channels is used to process all working modes.
Since every channel can process every working mode, any task can proceed normally whichever channel is selected, which improves data-processing parallelism and working efficiency.
In some embodiments, referring to fig. 1, the step of outputting the execution result in the working buffer according to the order corresponding to the data in the start buffer and the end buffer includes:
acquiring the input order of the data to be processed from the start buffer and the data counts from the end buffer, and reordering on this basis to generate the data output order;
selecting the execution result from the working buffer of the corresponding channel according to the data output order;
in response to the working buffer of that channel being empty, judging based on the data output order whether to wait;
and if so, waiting until the buffer is no longer empty.
The origin_buf and result_buf information is input to the MUX module, which performs order-preserving output. Because origin_buf retains the original input order of the data, the MUX module can select which channel's result to output next according to that order; if the buffer currently due for output is empty, the MUX waits until it is non-empty before outputting data.
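The order-preserving MUX can be sketched as follows: origin_buf records, in issue order, which channel each task went to, and each channel has its own result queue. This is a simplified model with hypothetical names; real hardware would stall rather than raise, but this sketch has no concurrency to wait on.

```python
from collections import deque

# Sketch of the order-preserving MUX over per-channel result buffers.
def mux_output(origin_buf, result_bufs):
    """Drain results in the original issue order recorded in origin_buf."""
    out = []
    for ch in origin_buf:                  # channels in original input order
        if not result_bufs[ch]:
            # Hardware would wait here until the buffer is non-empty.
            raise RuntimeError(f"would wait: channel {ch} buffer empty")
        out.append(result_bufs[ch].popleft())
    return out
```

Because the output loop follows origin_buf rather than completion order, a fast channel's results are held back until every earlier-issued task has been emitted, which is exactly the order-preserving guarantee.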
In some embodiments, referring to fig. 1, the method further comprises:
simulating according to the working time and the channel number corresponding to different working modes to obtain a simulation result;
and setting the channel number according to the simulation result.
To achieve PPA optimization, the scheme in practice requires simulation over the working times of the different processing modes and the channel count before deciding how many channels to instantiate, and this process takes a long time. In practical applications, the simulation can be computed in advance using methods such as system-level modeling, striving for a one-shot design that needs only minor later modification.
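A minimal version of the channel-count simulation suggested above can be sketched with a greedy scheduler; the task mix and latencies are made-up illustration values, and the scheduling policy is an assumption rather than the patent's model:

```python
import heapq

# Toy simulation of total processing time versus channel count.
def simulate(task_latencies, n_channels):
    """Greedy schedule: each task goes to the earliest-free channel.
    Returns the makespan (time at which the last task finishes)."""
    free_at = [0] * n_channels         # min-heap of per-channel free times
    heapq.heapify(free_at)
    for t in task_latencies:
        start = heapq.heappop(free_at)
        heapq.heappush(free_at, start + t)
    return max(free_at)
```

Sweeping n_channels over such a model (and weighing the result against area cost) is one way to pre-compute the trade-off before committing to an instantiation count.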
The invention provides a PRP linked list processing system, please refer to FIG. 2, comprising:
an obtaining unit 100 configured to obtain a PRP linked list task, and store the PRP linked list task in a pre-buffer area as data to be processed;
the channel unit 200 is configured to detect whether a plurality of preset channels are idle, perform polling arbitration on the detected idle channels, allocate data to be processed to the arbitrated corresponding channels, and write an arbitration result into the initial buffer;
the flow unit 300 is configured to judge the working mode of the data to be processed, select the flow of executing the data to be processed according to the working mode, write the execution result into the corresponding working buffer memory, and write the number of the data in all the working buffer memories into the ending buffer memory;
and an output unit 400 configured to output the execution result in the working buffer according to the order corresponding to the data in the start buffer and the end buffer.
This implementation not only effectively reduces PRP linked-list processing time under the NVMe protocol and improves parallelism; because all channels of the final scheme are completely identical, PPA optimization can also be achieved by controlling the number of instantiated channels and trading off performance against area.
In some embodiments, referring to figs. 7, 9 and 10, the present invention's channel-allocation-based PRP linked-list processing is an optimization of a mode-allocation-based PRP linked-list processing scheme.
A PRP linked list processing scheme based on pattern allocation is shown in fig. 7.
As can be seen from fig. 7, the process is as follows:
(1) After a task arrives, the prefetch module first prefetches it, parses the storage location of the current task, obtains the current task information from the task queue, and stores it in pre_buffer to await calls from the downstream submodules;
(2) The mode_jitter module takes the task information out of pre_buffer, judges its working mode, places it into the corresponding buffer among i_buffer_0 to i_buffer_3, and writes into origin_buf which buffer the data was sent to;
(3) If data exists in i_buffer_0, the subsequent checker_0 judges the task information, decides the task execution branch, and writes the information required by the subsequent flow into the corresponding buffer.
(4) The executor is divided into normal, error, and retry execution branches, and any task is classified into one of these execution flows after the checker's judgment. When any execution flow finishes, the execution result is committed and written into o_buffer.
(5) The counts of the data cached in o_buffer_0 to o_buffer_3 are written into result_buf.
(6) The origin_buf and result_buf information is input into the MUX module, which performs order-preserving output.
However, analysis shows that this scheme still has a drawback: its data-processing parallelism actually depends on the PRP linked list modes of the incoming tasks.
As shown in fig. 9, parallelism is highest when tasks arrive in mode 0/1/2/3 order or with the modes interleaved.
As shown in fig. 10, if the data to be processed all belong to the same mode (e.g. mode 0) for a period of time, only one processing path can be used; in effect only multi-stage pipelining is achieved while the other channels sit idle and are wasted. In this scenario, the task-processing efficiency of the improvement is greatly compromised.
From the above analysis, the bottleneck of this scheme is that the PRP linked list mode is dictated by the task, so each path in fig. 7 should be modified so that every path can process tasks of all PRP linked list modes. That is, "split by working mode" is changed to "split by channel", which yields the PRP linked list processing based on channel allocation designed by the invention.
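The "split by channel" dispatch can be sketched as polling (round-robin) arbitration over the currently idle channels, with every channel able to serve any mode. This is an illustrative model only: the `Channel` class, the fixed service time, and the idle test are assumptions, not the patent's RTL. The point it shows is that, unlike mode allocation, a burst of same-mode tasks (the fig. 10 case) still spreads across all channels.

```python
class Channel:
    def __init__(self):
        self.busy_until = 0           # cycle at which the channel goes idle

def rr_arbitrate(channels, last_grant, now):
    """Round-robin pick among idle channels, starting after the last grant."""
    n = len(channels)
    for i in range(1, n + 1):
        c = (last_grant + i) % n
        if channels[c].busy_until <= now:
            return c
    return None                       # all channels busy: the task must wait

channels = [Channel() for _ in range(4)]
last = -1
grants = []
for now, mode in enumerate([0, 0, 0, 0, 0]):  # same-mode burst, one task/cycle
    c = rr_arbitrate(channels, last, now)
    if c is not None:
        channels[c].busy_until = now + 4      # assumed service time
        last = c
        grants.append(c)

print(grants)  # → [0, 1, 2, 3, 0]: the mode-0 burst is spread over all channels
```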
According to another aspect of the present invention, as shown in fig. 12, based on the same inventive concept, an embodiment of the present invention further provides a computer device 30, where the computer device 30 includes a processor 310 and a memory 320, the memory 320 storing a computer program 321 executable on the processor, and the processor 310 performing the steps of the above method when running the program.
Based on the same inventive concept, according to yet another aspect of the present invention, as shown in fig. 13, an embodiment of the present invention further provides a computer-readable storage medium 40, the computer-readable storage medium 40 storing a computer program 410 that, when executed by a processor, performs the above method.
Embodiments of the invention may also include corresponding computer devices. The computer device includes a memory, at least one processor, and a computer program stored on the memory and executable on the processor, the processor executing any one of the methods described above when the program is executed.
The memory is used as a non-volatile computer readable storage medium for storing non-volatile software programs, non-volatile computer executable programs and modules, such as program instructions/modules in the embodiments of the present application. The processor performs the various functional applications of the device and data processing, i.e., implements the methods described above, by running non-volatile software programs, instructions, and modules stored in memory.
The memory may include a memory program area and a memory data area, wherein the memory program area may store an operating system, at least one application program required for a function; the storage data area may store data created according to the use of the device, etc. In addition, the memory may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In an embodiment, the memory optionally includes memory remotely located relative to the processor, the remote memory being connectable to the local module through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Finally, it should be noted that, as will be appreciated by those skilled in the art, all or part of the procedures in implementing the methods of the embodiments described above may be implemented by a computer program to instruct related hardware, and the program may be stored in a computer readable storage medium, where the program may include the procedures of the embodiments of the methods described above when executed. The storage medium of the program may be a magnetic disk, an optical disk, a read-only memory (ROM), a random-access memory (RAM), or the like. The computer program embodiments described above may achieve the same or similar effects as any of the method embodiments described above.
Those of skill would further appreciate that the various illustrative logical blocks, modules, circuits, and algorithm steps described in connection with the disclosure herein may be implemented as electronic hardware, computer software, or combinations of both. To clearly illustrate this interchangeability of hardware and software, various illustrative components, blocks, modules, circuits, and steps have been described above generally in terms of their functionality. Whether such functionality is implemented as software or hardware depends upon the particular application and design constraints imposed on the overall system. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present disclosure.
The foregoing is an exemplary embodiment of the present disclosure, but it should be noted that various changes and modifications can be made herein without departing from the scope of the disclosure as defined by the appended claims. The functions, steps and/or actions of the method claims of the disclosed embodiments described herein need not be performed in any particular order. The serial numbers of the foregoing embodiments of the present invention are for description only and do not imply any ranking of the embodiments. Furthermore, although elements of the disclosed embodiments may be described or claimed in the singular, the plural is contemplated unless limitation to the singular is explicitly stated.
It should be understood that as used herein, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly supports the exception. It should also be understood that "and/or" as used herein is meant to include any and all possible combinations of one or more of the associated listed items.
Those of ordinary skill in the art will appreciate that the above discussion of any embodiment is merely exemplary and is not intended to imply that the scope of the disclosure of embodiments of the invention, including the claims, is limited to these examples; combinations of features within the above embodiments or across different embodiments are also possible within the idea of the embodiments of the invention, and many other variations of the different aspects of the embodiments described above exist that are not provided in detail for the sake of brevity. Therefore, any omission, modification, equivalent replacement, or improvement made to the embodiments should be included in the protection scope of the embodiments of the present invention.

Claims (10)

1. A PRP linked list processing method, comprising:
acquiring a PRP linked list task, and storing the PRP linked list task into a pre-buffer as data to be processed; detecting whether each of a plurality of preset channels is idle, performing polling arbitration among the detected idle channels, distributing the data to be processed to the channel granted by the arbitration, and writing the arbitration result into a starting buffer;
judging the working mode of the data to be processed, selecting a flow for executing the data to be processed according to the working mode, writing an execution result into a corresponding working buffer, and writing the number of the data in all the working buffers into an ending buffer;
and outputting an execution result in the working buffer according to the sequence corresponding to the data in the starting buffer and the ending buffer.
2. The PRP linked list processing method of claim 1, wherein the step of obtaining the PRP linked list task, storing the PRP linked list task in the pre-buffer, as the data to be processed, includes:
acquiring a PRP linked list task and analyzing the storage position of the PRP linked list task;
and acquiring PRP linked list task information from the PRP linked list task queue pointed by the storage position, and storing the PRP linked list task information into a pre-buffer area to serve as data to be processed to wait for calling.
3. The PRP linked list processing method of claim 1, wherein the step of determining the working mode of the data to be processed, selecting the flow of executing the data to be processed according to the working mode, writing the execution result into the corresponding working buffer, and writing the number of data in all the working buffers into the ending buffer includes:
judging the working mode of the data to be processed, and grouping according to the working mode of the data to be processed;
selecting the flow for executing the data to be processed according to the grouping;
periodically detecting the state of a task during flow execution;
in response to the task being normal, selecting the normal flow and creating the linked list for the data to be processed;
in response to the task being issued abnormally or an abnormality occurring during task execution, selecting the abnormal flow to interrupt or report an event to the host;
in response to a shortage of index values in the memory index pool or another preset event, entering the retry flow and reporting the retry requirement;
after the flow is executed, writing the execution result into the corresponding working buffer;
and writing the number of data items in all the working buffers into the ending buffer.
4. The PRP linked list processing method of claim 3, wherein the step of determining the operation mode of the data to be processed includes:
in response to the working mode being empty-linked-list creation, creating only the PRP linked list structure;
in response to the working mode being new-linked-list creation, applying for memory space and constructing a PRP linked list;
in response to the working mode being linked-list expansion, acquiring the space size of the existing PRP linked list and expanding it, on the basis of the original PRP linked list, by the space size required by the task;
and in response to the working mode being linked-list filling, acquiring the index information of the existing PRP linked list and replacing invalid index values with newly constructed index values without changing the structure of the original PRP linked list.
5. The PRP linked list processing method of claim 1, wherein each of the plurality of preset channels is configured to process all operation modes.
6. The PRP linked list processing method of claim 1, wherein the step of outputting the execution result in the working buffer according to the order corresponding to the data in the start buffer and the end buffer includes:
acquiring the input order of the data to be processed from the starting buffer and the number of data items in the ending buffer, and reordering based on these to generate a data output order;
selecting the execution result in the working buffer corresponding to a channel according to the data output order;
in response to the working buffer corresponding to the channel being empty, judging based on the data output order whether to wait;
and in response to a determination to wait, waiting until the working buffer is not empty.
7. The PRP linked list processing method of claim 1, further comprising:
simulating according to the working time and the channel number corresponding to different working modes to obtain a simulation result;
and setting the channel number according to the simulation result.
8. A PRP linked list processing system, comprising:
the acquisition unit is configured to acquire a PRP linked list task, and store the PRP linked list task into the pre-buffer area to be used as data to be processed;
the channel unit is configured to detect whether each of a plurality of preset channels is idle, perform polling arbitration among the detected idle channels, distribute the data to be processed to the channel granted by the arbitration, and write the arbitration result into the starting buffer;
the flow unit is configured to judge the working mode of the data to be processed, select the flow for executing the data to be processed according to the working mode, write the execution result into the corresponding working buffer, and write the number of data items in all the working buffers into the ending buffer;
and the output unit is configured to output the execution result in the working buffer according to the order corresponding to the data in the starting buffer and the ending buffer.
9. A computer device, comprising:
at least one processor; and a memory storing a computer program executable on the processor, wherein the processor performs the steps of the PRP linked list processing method of any one of claims 1 to 7 when the program is executed.
10. A computer readable storage medium storing a computer program, wherein the computer program, when executed by a processor, performs the steps of the PRP linked list processing method according to any one of claims 1 to 7.
CN202311622955.6A 2023-11-29 2023-11-29 PRP linked list processing method, system, equipment and medium Pending CN117743207A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311622955.6A CN117743207A (en) 2023-11-29 2023-11-29 PRP linked list processing method, system, equipment and medium

Publications (1)

Publication Number Publication Date
CN117743207A true CN117743207A (en) 2024-03-22

Family

ID=90255358


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination