WO2016041150A1 - Parallel access method and system - Google Patents

Parallel access method and system (并行访问方法及系统)

Info

Publication number
WO2016041150A1
WO2016041150A1 (PCT/CN2014/086638)
Authority
WO
WIPO (PCT)
Prior art keywords
scheduling
cache
queue
component
write access
Prior art date
Application number
PCT/CN2014/086638
Other languages
English (en)
French (fr)
Inventor
He Guizhou (何贵洲)
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd.
Priority to CN201480022122.9A (granted as CN105637475B)
Priority to PCT/CN2014/086638 (WO2016041150A1)
Publication of WO2016041150A1

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs

Definitions

  • the embodiments of the present invention relate to computer technologies, and in particular, to a parallel access method and system.
  • Embodiments of the present invention provide a parallel access method and system, which improve the utilization of the core resources of a multi-core processor and the data processing capability of a single core.
  • an embodiment of the present invention provides a parallel access system, which is applicable to a multi-component concurrent processing scenario, and the system includes:
  • each of the multiple components, configured to initiate a write access operation; and a scheduling component, configured to receive, through the access interface corresponding to each component, the write access operation initiated by the component, where the components are in one-to-one correspondence with the access interfaces and the multiple access interfaces are set in parallel; and, according to a preset scheduling mode, to schedule the write access operation to the high-speed module for processing, where the high-speed module is a shared resource of the multiple components.
  • the system further includes: a memory for storing cache queues, where each cache queue is configured to store the write access operations and each access interface corresponds to one cache queue;
  • the access interface is further configured to detect whether its cache queue is full; if it determines that the cache queue is full, it performs a back pressure operation on the component, where the back pressure operation is used to indicate that the component should wait a preset period before performing the write access operation; otherwise, after the write access operation has been written completely, the write access operation is stored into the cache queue.
  • the preset scheduling manner is priority scheduling, where the scheduling component is specifically configured to:
  • the scheduling component preferentially schedules, according to the priority order of the cache queues, the write access operations in the higher-priority cache queue to the high-speed module for processing; only after the write access operations in the higher-priority cache queue have all been scheduled does it start scheduling the next-priority cache queue, and every scheduling pass starts from the highest-priority cache queue.
  • the preset scheduling mode is polling weight scheduling, and the scheduling component is specifically configured to:
  • the scheduling component schedules the cache queues sequentially in a fair scheduling manner according to the weight of each cache queue, where the weight is the length of the corresponding cache queue;
  • for each cache queue, each time the scheduling component schedules one write access operation out to the high-speed module for processing, it decrements the weight of that cache queue by one, and it stops scheduling the cache queue once the weight reaches zero;
  • when the scheduling component determines that the write access operations of all cache queues have all been scheduled out or the weights of all cache queues have reached zero, it restores the weight of each cache queue and starts the next round of scheduling.
  • the preset scheduling mode is a mixture of priority scheduling and polling weight scheduling: some of the cache queues are configured as priority cache queues and the remaining cache queues are configured as polling weight cache queues, and the scheduling component is specifically configured to:
  • for each priority cache queue, the scheduling component preferentially schedules, according to the priority order of the priority cache queues, the write access operations in the higher-priority cache queue to the high-speed module for processing; only after the write access operations in the higher-priority cache queue have all been scheduled does it start scheduling the next-priority cache queue, and every scheduling pass starts from the highest-priority cache queue;
  • for each polling weight cache queue, the scheduling component schedules the polling weight cache queues sequentially in a fair scheduling manner according to their weights, where the weight is the length of the corresponding cache queue; each time the scheduling component schedules one write access operation out to the high-speed module for processing, it decrements the weight of that polling weight cache queue by one, and it stops scheduling the queue once the weight reaches zero; when the scheduling component determines that the write access operations of all polling weight cache queues have all been scheduled out or the weights of all polling weight cache queues have reached zero, it restores the weight of each polling weight cache queue and starts the next round of scheduling.
  • the memory is further configured to store an order-preserving queue, where the preset scheduling mode is order-preserving scheduling, and the scheduling component is specifically configured to:
  • the scheduling component schedules the write access operations to the high-speed module according to the order in which the write access operations were written into the cache queues, where the write order of the write access operations in all the cache queues is stored in the order-preserving queue, and the length of the order-preserving queue is greater than or equal to the sum of the lengths of all cache queues.
  • the embodiment of the present invention provides a parallel access method, which is applicable to a multi-component concurrent processing scenario, and the method includes:
  • for each of the multiple components, the scheduling component receives, through the access interface corresponding to the component, the write access operation initiated by the component, where the components are in one-to-one correspondence with the access interfaces and the multiple access interfaces are set in parallel;
  • the scheduling component schedules the write access operation received by each of the access interfaces to the high-speed module according to a preset scheduling manner, where the high-speed module is a shared resource of the multiple components.
  • before the scheduling component schedules the write access operations received by the access interfaces to the high-speed module for processing according to the preset scheduling mode, the method further includes:
  • the access interface detects whether its corresponding cache queue in the memory is full; if it determines that the cache queue is full, it performs a back pressure operation on the component, where the back pressure operation is used to indicate that the component should wait a preset period before performing the write access operation; otherwise, after the write access operation has been written completely, the write access operation is stored into the cache queue, where the access interfaces are in one-to-one correspondence with the cache queues.
  • the preset scheduling mode is priority scheduling, and the scheduling, by the scheduling component according to the preset scheduling mode, of the write access operations received by the access interfaces to the high-speed module includes:
  • the scheduling component preferentially schedules, according to the priority order of the cache queues, the write access operations in the higher-priority cache queue to the high-speed module for processing; only after the write access operations in the higher-priority cache queue have all been scheduled does it start scheduling the next-priority cache queue, and every scheduling pass starts from the highest-priority cache queue.
  • the preset scheduling mode is polling weight scheduling, and the scheduling, by the scheduling component according to the preset scheduling mode, of the write access operations received by the access interfaces to the high-speed module includes:
  • the scheduling component schedules the cache queues sequentially in a fair scheduling manner according to the weight of each cache queue, where the weight is the length of the corresponding cache queue;
  • for each cache queue, each time the scheduling component schedules one write access operation out to the high-speed module for processing, it decrements the weight of that cache queue by one, and it stops scheduling the cache queue once the weight reaches zero;
  • when the scheduling component determines that the write access operations of all cache queues have all been scheduled out or the weights of all cache queues have reached zero, it restores the weight of each cache queue and starts the next round of scheduling.
  • the preset scheduling mode is a mixture of priority scheduling and polling weight scheduling: some of the cache queues are configured as priority cache queues and the remaining cache queues are configured as polling weight cache queues, and the scheduling, by the scheduling component according to the preset scheduling mode, of the write access operations received by the access interfaces to the high-speed module includes:
  • for each priority cache queue, the scheduling component preferentially schedules, according to the priority order of the priority cache queues, the write access operations in the higher-priority cache queue to the high-speed module for processing; only after the write access operations in the higher-priority cache queue have all been scheduled does it start scheduling the next-priority cache queue, and every scheduling pass starts from the highest-priority cache queue;
  • for each polling weight cache queue, the scheduling component schedules the polling weight cache queues sequentially in a fair scheduling manner according to their weights, where the weight is the length of the corresponding cache queue; each time the scheduling component schedules one write access operation out to the high-speed module for processing, it decrements the weight of that polling weight cache queue by one, and it stops scheduling the queue once the weight reaches zero; when the scheduling component determines that the write access operations of all polling weight cache queues have all been scheduled out or the weights of all polling weight cache queues have reached zero, it restores the weight of each polling weight cache queue and starts the next round of scheduling.
  • the preset scheduling mode is order-preserving scheduling, and the scheduling, by the scheduling component according to the preset scheduling mode, of the write access operations received by the access interfaces to the high-speed module includes:
  • the scheduling component schedules the write access operations to the high-speed module according to the order in which the write access operations were written into the cache queues, where the write order of the write access operations in all the cache queues is stored in an order-preserving queue whose length is greater than or equal to the sum of the lengths of all cache queues, the order-preserving queue being stored in the memory.
  • in the embodiments of the present invention, the access interfaces are in one-to-one correspondence with the components, and a component can perform other operations as soon as it has handed its write access operation to its corresponding access interface, so it does not need to wait for the other components' accesses to the shared resource to finish before performing its own write access operation on the shared resource; this improves the utilization of the core resources of the multi-core processor.
  • the parallel access method also effectively avoids the time wasted by multiple components contending for a lock, thereby improving the data processing capability of a single core and, in turn, the processing efficiency of the multi-core processor.
  • moreover, software implementations based on this parallel access method are concise and efficient to code.
  • FIG. 1 is a schematic diagram of a scenario in which multiple cores send packets;
  • FIG. 2 is a schematic structural diagram of Embodiment 1 of a parallel access system according to the present invention;
  • FIG. 3 is a schematic flowchart of Embodiment 1 of a parallel access method according to the present invention;
  • FIG. 4 is a diagram showing an example of an access interface of the present invention;
  • FIG. 5 is a schematic flowchart of Embodiment 2 of a parallel access method according to the present invention;
  • FIG. 6 is a diagram showing an example of a circular queue;
  • FIG. 7 is a diagram showing an example of a cache queue of the present invention;
  • FIG. 8 is a diagram showing an example of an order-preserving queue of the present invention.
  • For ease of understanding, FIG. 1 shows a schematic diagram of a scenario in which multiple cores send packets.
  • Traffic management (TM) is the packet scheduling component; all packets to be sent by the cores are scheduled out through the TM. P0, P1, and P2 are high-speed ports, and the peer device is a local area network switch (LSW).
  • In the scenario of FIG. 1, for a core to send a packet, it must write the packet descriptor into the sending interface of the traffic management component; that sending interface thus becomes a critical resource shared by all cores.
  • A packet descriptor is usually 16 bytes (Byte, B for short) or even longer, while atomic operations can only be completed within 4 B.
  • Without a mutually exclusive access mechanism, 16 B write operations from multiple cores to the traffic management sending interface would inevitably corrupt the written data.
  • The prior art uses lock operations to avoid this problem, but a lock operation leaves a large number of cores in a wait state, degrading multi-core processor performance.
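  • As a point of contrast, the following is a minimal C sketch of the lock-based prior art just described: every core must take a global spinlock before writing a 16 B descriptor to the shared sending interface, so all other senders spin idle. The names tx_lock, tx_interface, and send_descriptor_locked are illustrative assumptions, not from the patent.

```c
/* A minimal sketch of the lock-based prior art: every core takes a global
 * spinlock before writing a 16 B descriptor to the shared sending interface,
 * so while one core writes, all other senders spin idle. */
#include <stdatomic.h>
#include <string.h>

static atomic_flag tx_lock = ATOMIC_FLAG_INIT;
static volatile unsigned char tx_interface[16]; /* stand-in for the TM sending interface */

void send_descriptor_locked(const unsigned char desc[16])
{
    while (atomic_flag_test_and_set_explicit(&tx_lock, memory_order_acquire))
        ;                                   /* the core is stalled here */
    memcpy((void *)tx_interface, desc, 16); /* 16 B write, beyond the 4 B atomic limit */
    atomic_flag_clear_explicit(&tx_lock, memory_order_release);
}
```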
  • To preserve the data processing capability of multi-core processors, the embodiments of the present invention provide a parallel access method and system.
  • It should be noted that the parallel access method and system provided by the embodiments of the present invention are applicable to all scenarios requiring concurrent processing by multiple execution components, including but not limited to multi-core concurrency within a chip; they can also be used for concurrency among software processes or threads.
  • FIG. 2 is a schematic structural diagram of Embodiment 1 of a parallel access system according to the present invention.
  • The embodiment of the present invention provides a parallel access system, which is applicable to a multi-component concurrent processing scenario.
  • The system may be a device or system that includes multiple execution components, such as a multi-core processor; such devices are not enumerated here. As shown in FIG. 2, four components are taken as an example.
  • The parallel access system includes: component 21, component 22, component 23, component 24, access interface I1, access interface I2, access interface I3, access interface I4, high-speed module 26, and scheduling component 25. The arrows in FIG. 2 indicate the direction of data flow.
  • Each of the multiple components, that is, components 21 to 24, is configured to initiate write access operations. The scheduling component 25 is configured to receive, through the access interface corresponding to each component, the write access operation initiated by the component, where the components are in one-to-one correspondence with the access interfaces and the multiple access interfaces are set in parallel; and, according to a preset scheduling mode, to schedule the write access operations to the high-speed module 26 for processing, the high-speed module 26 being a shared resource of the multiple components.
  • the parallel access system of the embodiment of the present invention can be used to implement the technical solution of the method embodiment shown in FIG. 3, and the implementation principle and technical effects are similar, and details are not described herein again.
  • On this basis, the system may further include: a memory 27, configured to store cache queues, where the cache queues are used to store the write access operations and each access interface corresponds to one cache queue.
  • The access interface can also be used to detect whether its cache queue is full. If it determines that the cache queue is full, it performs a back pressure operation on the component, where the back pressure operation is used to indicate that the component should wait a preset period before performing the write access operation; otherwise, after the write access operation has been written completely, the write access operation is stored into the cache queue.
  • The cache queues may each correspond to a section of storage space of the memory 27; in this embodiment there is a single memory. Optionally, each access interface may instead correspond to a separate memory, which is not limited by the present invention.
  • In one implementation, the preset scheduling mode is priority scheduling, and the scheduling component 25 may be specifically configured to: preferentially schedule, according to the priority order of the cache queues, the write access operations in the higher-priority cache queue to the high-speed module 26 for processing, and start scheduling the next-priority cache queue only after the write access operations in the higher-priority cache queue have all been scheduled, every scheduling pass starting from the highest-priority cache queue.
  • In another implementation, the preset scheduling mode is polling weight scheduling, and the scheduling component 25 may be specifically configured to: schedule the cache queues sequentially in a fair scheduling manner according to the weight of each cache queue, the weight being the length of the corresponding cache queue; for each cache queue, each time one write access operation is scheduled out to the high-speed module 26 for processing, decrement the weight of that cache queue by one and stop scheduling the cache queue once the weight reaches zero; and, when the scheduling component 25 determines that the write access operations of all cache queues have all been scheduled out or the weights of all cache queues have reached zero, restore the weight of each cache queue and start the next round of scheduling.
  • In yet another implementation, the preset scheduling mode is a mixture of priority scheduling and polling weight scheduling: some of the cache queues are configured as priority cache queues and the remaining cache queues are configured as polling weight cache queues. The scheduling component 25 may be specifically configured to: for the priority cache queues, preferentially schedule, according to their priority order, the write access operations in the higher-priority cache queue to the high-speed module 26 for processing, starting the next-priority cache queue only after the write access operations in the higher-priority cache queue have all been scheduled, with every scheduling pass starting from the highest-priority cache queue; and, for the polling weight cache queues, schedule them sequentially in a fair scheduling manner according to their weights, the weight being the length of the corresponding cache queue, decrementing a queue's weight by one for each write access operation scheduled out to the high-speed module 26 and stopping scheduling of that queue once its weight reaches zero; when the scheduling component 25 determines that the write access operations of all polling weight cache queues have all been scheduled out or the weights of all polling weight cache queues have reached zero, it restores the weight of each polling weight cache queue and starts the next round of scheduling.
  • In a further implementation, the memory 27 can also be used to store an order-preserving queue, and the preset scheduling mode is order-preserving scheduling; the scheduling component 25 may be specifically configured to: schedule the write access operations to the high-speed module 26 for processing according to the order in which the write access operations were written into the cache queues, where the write order of the write access operations in all the cache queues is stored in the order-preserving queue, and the length of the order-preserving queue is greater than or equal to the sum of the lengths of all cache queues.
  • In the embodiments of the present invention, when the write capability of the components exceeds the processing capability of the high-speed module, or the components' write capability jitters, some buffering must be provided to avoid congestion at the entry point; a corresponding cache queue is therefore set up for each access interface, ensuring that writes are non-blocking and that the high-speed module is kept continuously fed.
  • FIG. 3 is a schematic flowchart of Embodiment 1 of a parallel access method according to the present invention.
  • The embodiment of the present invention provides a parallel access method, which may be performed by a parallel access system; the system may be a device or system that includes multiple execution components, such as a multi-core processor, and such devices are not enumerated here.
  • As shown in FIG. 3, the parallel access method includes:
  • S301. For each of the multiple components, the scheduling component receives, through the access interface corresponding to the component, the write access operation initiated by the component, where the components are in one-to-one correspondence with the access interfaces and the multiple access interfaces are set in parallel.
  • The multiple components may be, for example, multiple cores, multiple accelerators, or multiple threads. These processing resources need to execute concurrently at high speed, avoiding sharing resources among them by means of spinlock operations. Each core or accelerator corresponds to one set of read/write access interfaces, and concurrency is achieved through this one-to-one access.
  • A high-speed module can be understood by those skilled in the art as a module with high-speed processing capability; a chip system usually contains multiple high-speed modules, for example, a memory management module, a dispatch center module, and a packet output module.
  • In a specific implementation, as shown in FIG. 4, the access interfaces may correspond to a segment of register space inside the chip, with each component corresponding to one access interface, addr x in FIG. 4, where x takes the values 0, 1, 2, ..., N, N being the number of access interfaces minus one; n, whose value is a positive integer power of 4, identifies the access entry of each access interface.
  • Depending on the interface width required by the high-speed module, the width of an access interface can be defined as 4 bytes, 8 bytes, 16 bytes, 32 bytes, and so on; for the traffic management shown in FIG. 1, a 16-byte interface suffices.
  • Specifically, when a component performs a write access operation on its corresponding access interface, the write is usually performed in units of 4 B, though it may also be performed in units of 8 B, 16 B, or 32 B; when the access interface detects that the last unit has been written, the write access operation is complete.
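  • The following hedged C sketch illustrates a component writing a 16 B descriptor to its own memory-mapped access interface in 4 B units, as just described; the base address ACCESS_IF_BASE, the 16 B per-interface stride, and the function name are assumptions of this sketch, not values from the patent.

```c
/* Hypothetical sketch: a component writes a 16 B descriptor to its own
 * memory-mapped access interface in 4 B units. ACCESS_IF_BASE and the
 * 16 B per-interface stride are assumptions, not values from the patent. */
#include <stdint.h>

#define ACCESS_IF_BASE   0x40000000u /* assumed base of the register space (addr 0) */
#define ACCESS_IF_STRIDE 16u         /* one 16-byte-wide interface per component */

void write_descriptor(unsigned component_id, const uint32_t desc[4])
{
    volatile uint32_t *iface = (volatile uint32_t *)
        (uintptr_t)(ACCESS_IF_BASE + component_id * ACCESS_IF_STRIDE);

    for (int i = 0; i < 4; i++)
        iface[i] = desc[i];          /* written in 4 B units, no lock taken */
    /* on the write of the last 4 B unit, the access interface treats the
     * write access operation as complete */
}
```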
  • S302. The scheduling component schedules the write access operation to the high-speed module for processing according to a preset scheduling mode, the high-speed module being a shared resource of the multiple components.
  • The preset scheduling mode may include order-preserving scheduling and out-of-order scheduling, where out-of-order scheduling includes but is not limited to priority scheduling and polling weight scheduling.
  • The preset scheduling mode ensures that the write access operations performed by the multiple components through their respective access interfaces are transmitted to the high-speed module so that the high-speed module can process them.
  • In the embodiments of the present invention, the access interfaces are in one-to-one correspondence with the components, and a component can perform other operations as soon as it has handed its write access operation to its corresponding access interface, so it does not need to wait for the other components' accesses to the shared resource to finish before performing its own write access operation on the shared resource; this improves the utilization of the core resources of the multi-core processor.
  • The parallel access method also effectively avoids the time wasted by multiple components contending for a lock, thereby improving the data processing capability of a single core and, in turn, the processing efficiency of the multi-core processor.
  • Moreover, software implementations based on this parallel access method are concise and efficient to code.
  • FIG. 5 is a schematic flowchart of Embodiment 2 of a parallel access method according to the present invention. This embodiment builds on the embodiment shown in FIG. 3. As shown in FIG. 5, the method may include:
  • S501. For each of the multiple components, the scheduling component receives, through the access interface corresponding to the component, the write access operation initiated by the component, where the components are in one-to-one correspondence with the access interfaces and the multiple access interfaces are set in parallel.
  • This step is the same as S301 and is not described here again.
  • S502. The access interface detects whether its corresponding cache queue in the memory is full.
  • The access interfaces are in one-to-one correspondence with the cache queues; if the access interface determines that the cache queue is full, S503 is performed, and otherwise S504 is performed.
  • A cache queue may correspond to a section of memory space inside the chip.
  • Optionally, the cache queue takes the form of a circular queue, as shown in FIG. 6; each cache queue is provided with a head pointer and a tail pointer.
  • The access interface determines that the cache queue is full as follows: each time one CMD (command description), that is, one write access operation, is enqueued, the tail pointer is incremented by one; if the head and tail pointers coincide, the cache queue is full.
  • The head pointer is used by the scheduling component for access: each time one CMD is scheduled out, the head pointer is incremented by one; if the head pointer and the tail pointer coincide, all CMDs have been scheduled.
  • Usually, an indication flag is set inside the chip system to indicate whether the cache queue is full.
  • When a write access operation is enqueued and the head and tail pointers coincide, the indication flag is set to "1"; when the scheduling component takes a write access operation out of the cache queue, the indication flag is set to "0".
  • When the access interface detects that a write access operation is being written, it queries this indication flag; if the flag is already "1", back pressure is applied to the component corresponding to the access interface.
  • It should be added that the head and tail pointers are taken modulo the queue length after each move, which is what makes the queue circular; otherwise the pointers would move beyond the bounds of the queue.
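  • The following C sketch models the circular cache queue just described: the tail pointer advances on enqueue, the head pointer on dequeue, both taken modulo the queue length, and a full flag plays the role of the indication flag that disambiguates the head == tail case. QUEUE_LEN and the structure layout are assumptions of this sketch.

```c
/* A sketch of the circular cache queue: the tail pointer advances on
 * enqueue, the head pointer on dequeue, both taken modulo the queue
 * length; a full flag disambiguates the head == tail case. */
#include <stdbool.h>

#define QUEUE_LEN 8                 /* assumed queue length */

typedef struct {
    unsigned long cmd[QUEUE_LEN];   /* CMDs, i.e. write access operations */
    unsigned head;                  /* next CMD for the scheduling component */
    unsigned tail;                  /* next free slot for the access interface */
    bool full;                      /* the indication flag: "1" when full */
} cache_queue;

bool enqueue(cache_queue *q, unsigned long cmd)
{
    if (q->full)
        return false;                        /* caller applies back pressure */
    q->cmd[q->tail] = cmd;
    q->tail = (q->tail + 1) % QUEUE_LEN;     /* modulo keeps the queue circular */
    q->full = (q->head == q->tail);          /* pointers coincide after enqueue: full */
    return true;
}

bool dequeue(cache_queue *q, unsigned long *cmd)
{
    if (!q->full && q->head == q->tail)
        return false;                        /* pointers coincide, not full: empty */
    *cmd = q->cmd[q->head];
    q->head = (q->head + 1) % QUEUE_LEN;
    q->full = false;                         /* a CMD was taken out */
    return true;
}
```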
  • S503. The access interface performs a back pressure operation on the component, where the back pressure operation is used to indicate that the component should wait a preset period before performing the write access operation again.
  • Specifically, if the access interface determines that its corresponding cache queue is full, it applies back pressure to the component; the component must wait a preset period (for example, 1 to N clock cycles) before writing again, and the length of the preset period is preconfigured as required.
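  • Continuing the cache_queue sketch above, the following shows the back-pressure behaviour on the component side; wait_cycles() stands in for waiting the preset period of 1 to N clock cycles and is an assumption of this sketch.

```c
/* Back pressure on the component side, reusing cache_queue/enqueue above.
 * wait_cycles() stands in for waiting the preset period and is assumed. */
void wait_cycles(unsigned n);       /* platform-specific delay, assumed */

void component_write(cache_queue *q, unsigned long cmd, unsigned preset_period)
{
    while (!enqueue(q, cmd))        /* queue full: back pressure is applied */
        wait_cycles(preset_period); /* wait the preset period, then retry */
    /* once the CMD is stored, the component is free to do other work; it
     * does not wait for the high-speed module to process the operation */
}
```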
  • S504. After the write access operation has been written completely, the access interface stores the write access operation into the cache queue.
  • S505. The scheduling component schedules the write access operation to the high-speed module for processing according to the preset scheduling mode, the high-speed module being a shared resource of the multiple components.
  • This step is the same as S302 and is not described here again.
  • In the embodiments of the present invention, when the write capability of the components exceeds the processing capability of the high-speed module, or the components' write capability jitters, some buffering must be provided to avoid congestion at the entry point; a corresponding cache queue is therefore set up for each access interface, ensuring that writes are non-blocking and that the high-speed module is kept continuously fed.
  • The following describes in detail, through several specific implementations, how the scheduling component schedules the write access operations received by the access interfaces to the high-speed module for processing according to the preset scheduling mode.
  • In one specific implementation, the foregoing preset scheduling mode is priority scheduling.
  • The scheduling, by the scheduling component according to the preset scheduling mode, of the write access operations received by the access interfaces to the high-speed module may include: the scheduling component preferentially schedules, according to the priority order of the cache queues, the write access operations in the higher-priority cache queue to the high-speed module for processing; only after the write access operations in the higher-priority cache queue have all been scheduled does it start scheduling the next-priority cache queue, and every scheduling pass starts from the highest-priority cache queue.
  • In this implementation, each cache queue is preconfigured with a priority. The priorities range from 1 to M, arranged from low to high, where M is the number of access interfaces, which generally matches the number of cores or threads. The scheduling component executes scheduling according to the priority order of the cache queues.
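  • A minimal C sketch of this priority scheduling, reusing the cache_queue and dequeue from the circular-queue sketch above; queues[] is assumed to be ordered from highest to lowest priority, and each call dispatches one CMD while restarting its scan from the highest-priority queue.

```c
/* Strict priority scheduling over the cache queues above. queues[] is
 * ordered from highest to lowest priority; each call dispatches one CMD
 * and the scan always restarts from the highest-priority queue. */
bool schedule_priority(cache_queue *queues[], unsigned nqueues,
                       void (*high_speed_module)(unsigned long))
{
    unsigned long cmd;
    for (unsigned i = 0; i < nqueues; i++) {  /* highest priority first */
        if (dequeue(queues[i], &cmd)) {
            high_speed_module(cmd);           /* hand one CMD to the module */
            return true;                      /* next call rescans from queue 0 */
        }
    }
    return false;                             /* all cache queues are empty */
}
```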
  • In another specific implementation, the foregoing preset scheduling mode is polling weight scheduling.
  • The scheduling, by the scheduling component according to the preset scheduling mode, of the write access operations received by the access interfaces to the high-speed module may include: the scheduling component schedules the cache queues sequentially in a fair scheduling manner according to the weight of each cache queue, the weight being the length of the corresponding cache queue; for each cache queue, each time the scheduling component schedules one write access operation out to the high-speed module for processing, it decrements the weight of that cache queue by one and stops scheduling the cache queue once the weight reaches zero; when the scheduling component determines that the write access operations of all cache queues have all been scheduled out or the weights of all cache queues have reached zero, it restores the weight of each cache queue and starts the next round of scheduling.
  • In this implementation, each cache queue is preconfigured with a weight, which avoids the situation under priority scheduling where the lowest-priority cache queue is never scheduled because CMDs are always present in a higher-priority cache queue.
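  • A minimal C sketch of one round of this polling weight scheduling, again reusing the circular-queue sketch above; restoring each weight to the queue length and treating a drained queue as weight-exhausted are simplifying assumptions of this sketch.

```c
/* One round of polling weight scheduling over the cache queues above:
 * each weight starts at the queue length, is decremented per dispatched
 * CMD, and all weights are restored when the round ends. */
void schedule_round(cache_queue *queues[], unsigned weight[], unsigned nqueues,
                    void (*high_speed_module)(unsigned long))
{
    unsigned long cmd;
    for (unsigned i = 0; i < nqueues; i++)
        weight[i] = QUEUE_LEN;               /* restore weights: a round begins */

    for (;;) {
        bool dispatched = false;
        for (unsigned i = 0; i < nqueues; i++) {  /* fair, in-order polling */
            if (weight[i] == 0)
                continue;                    /* weight exhausted: skip the queue */
            if (dequeue(queues[i], &cmd)) {
                high_speed_module(cmd);
                weight[i]--;                 /* decrement that queue's weight */
                dispatched = true;
            } else {
                weight[i] = 0;               /* queue drained: stop scheduling it */
            }
        }
        if (!dispatched)
            break;                           /* all scheduled out or weights zero */
    }
}
```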
  • In yet another specific implementation, the foregoing preset scheduling mode is a mixture of priority scheduling and polling weight scheduling: some of the cache queues are configured as priority cache queues, and the remaining cache queues are configured as polling weight cache queues.
  • The scheduling, by the scheduling component according to the preset scheduling mode, of the write access operations received by the access interfaces to the high-speed module may include: for the priority cache queues, the scheduling component preferentially schedules, according to their priority order, the write access operations in the higher-priority cache queue to the high-speed module for processing, starting the next-priority cache queue only after the write access operations in the higher-priority cache queue have all been scheduled, with every scheduling pass starting from the highest-priority cache queue; for the polling weight cache queues, the scheduling component schedules them sequentially in a fair scheduling manner according to their weights; for each polling weight cache queue, each time the scheduling component schedules one write access operation out to the high-speed module for processing, it decrements the weight of that polling weight cache queue by one and stops scheduling the queue once the weight reaches zero; when the scheduling component determines that the write access operations of all polling weight cache queues have all been scheduled out or the weights of all polling weight cache queues have reached zero, it restores the weight of each polling weight cache queue and starts the next round of scheduling.
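  • The following short C sketch composes the two schedulers above into the mixed mode; serving the priority cache queues before the polling weight cache queues is an assumption of this sketch, since the embodiment describes how each group is scheduled but not the interleaving between the groups.

```c
/* One plausible composition of the two schedulers above into the mixed
 * mode; draining the priority queues before running a polling weight
 * round is an assumption of this sketch. */
void schedule_mixed(cache_queue *prio[], unsigned nprio,
                    cache_queue *wrr[], unsigned weight[], unsigned nwrr,
                    void (*high_speed_module)(unsigned long))
{
    while (schedule_priority(prio, nprio, high_speed_module))
        ;                                    /* serve the priority queues first */
    schedule_round(wrr, weight, nwrr, high_speed_module);
}
```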
  • Scheduling based on the above three implementations is out-of-order scheduling, which means that the order in which write access operations are scheduled out of the cache queues differs from the order in which they were written in. Order-preserving scheduling is described next: the scheduling component must dispatch write access operations to the high-speed module in the order in which they were written into the cache queues, rather than relying on the write order within any single cache queue.
  • In a further specific implementation, the foregoing preset scheduling mode is order-preserving scheduling.
  • The scheduling, by the scheduling component according to the preset scheduling mode, of the write access operations received by the access interfaces to the high-speed module may include: the scheduling component schedules the write access operations to the high-speed module for processing according to the order in which the write access operations were written into the cache queues, where the write order of the write access operations in all the cache queues is stored in an order-preserving queue whose length is greater than or equal to the sum of the lengths of all cache queues, the order-preserving queue being stored in the memory.
  • The principle of order-preserving scheduling is shown in FIG. 7 and FIG. 8, where queue numbers QA, QB, QC, and QD identify different cache queues (four cache queues, corresponding to four cores, are used for illustration); C0, C1, and C2 denote the order in which write access operations (CMDs) are enqueued within a single cache queue, and ①②③④ and so on denote the order in which the components (multiple cores) input CMDs.
  • When a component enqueues a CMD into a cache queue, that queue's number is appended to the order-preserving queue, so the sequence of queue numbers in the order-preserving queue is exactly the order in which the components enqueued their CMDs.
  • The scheduling component accesses the corresponding cache queue according to the queue-number order in the order-preserving queue, takes the CMD from the address pointed to by that cache queue's head pointer, and sends it to the high-speed module.
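  • A minimal C sketch of this order-preserving scheduling, reusing the circular-queue sketch above: record_enqueue() is the hypothetical hook run whenever a CMD enters a cache queue, and the order queue of queue numbers is sized for four cache queues, matching the FIG. 7 and FIG. 8 example.

```c
/* Order-preserving scheduling: the order queue records, per enqueued CMD,
 * which cache queue it entered; replaying the queue numbers in order
 * reproduces the global write order across all cache queues. Sized for
 * four cache queues, matching the FIG. 7 / FIG. 8 example. */
#define NUM_QUEUES 4
#define ORDER_LEN (NUM_QUEUES * QUEUE_LEN)  /* >= sum of all cache queue lengths */

typedef struct {
    unsigned qnum[ORDER_LEN];               /* queue numbers, in global write order */
    unsigned head, tail, count;
} order_queue;

void record_enqueue(order_queue *oq, unsigned queue_number)
{
    oq->qnum[oq->tail] = queue_number;      /* run whenever a CMD is enqueued */
    oq->tail = (oq->tail + 1) % ORDER_LEN;
    oq->count++;
}

bool schedule_in_order(order_queue *oq, cache_queue *queues[],
                       void (*high_speed_module)(unsigned long))
{
    unsigned long cmd;
    if (oq->count == 0)
        return false;                       /* no CMD recorded: nothing pending */
    unsigned qn = oq->qnum[oq->head];       /* next queue in global write order */
    oq->head = (oq->head + 1) % ORDER_LEN;
    oq->count--;
    if (dequeue(queues[qn], &cmd))          /* CMD at that queue's head pointer */
        high_speed_module(cmd);
    return true;
}
```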
  • A person of ordinary skill in the art will understand that all or part of the steps of the foregoing method embodiments may be implemented by hardware related to program instructions. The foregoing program may be stored in a computer-readable storage medium.
  • When executed, the program performs the steps of the foregoing method embodiments; the foregoing storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)
  • Memory System Of A Hierarchy Structure (AREA)

Abstract

Embodiments of the present invention provide a parallel access method and system applicable to multi-component concurrent processing scenarios. The system includes: each of the multiple components, configured to perform write access operations on a high-speed module through its corresponding access interface, where the components are in one-to-one correspondence with the access interfaces, the multiple access interfaces are set in parallel, and the high-speed module is a shared resource of the multiple components; and a scheduling component, configured to schedule the write access operations received by the access interfaces to the high-speed module for processing according to a preset scheduling mode. By setting multiple access interfaces in parallel and placing them in one-to-one correspondence with the components, a component can perform other operations as soon as it has handed its write access operation to its corresponding access interface, so it does not need to wait; this improves the utilization of the core resources of a multi-core processor, effectively avoids the time wasted by multiple components contending for a lock, and thereby improves the data processing capability of a single core and, in turn, the processing efficiency of the multi-core processor.

Description

Parallel access method and system

Technical Field

Embodiments of the present invention relate to computer technologies, and in particular, to a parallel access method and system.

Background

With the advance of technological innovation, processor applications have penetrated every level of modern society. In the single-core era, because there was only one core, all resources inside the processor, including the various interfaces and the internal accelerators, waited on that one core's operations; whenever the core was not operating on a resource, the resource sat idle.

After multi-core processors were introduced, all resources inside the processor became shared among the cores. Typically, a resource with modest processing requirements is assigned only one fixed core for access; alternatively, a lock operation (lock) is used to lock the resource, and only after some core has finished operating on the resource and released it (unlock) can the next waiting core acquire the resource. For access to the high-speed modules inside a multi-core processor, however, still using lock operations would leave a large number of cores in a wait state, wasting cores; in addition, operating the lock resource, including the locking and unlocking themselves, wastes considerable time, which reduces the data processing capability of a single core.
Summary

Embodiments of the present invention provide a parallel access method and system, to improve the utilization of the core resources of a multi-core processor and the data processing capability of a single core.

In one aspect, an embodiment of the present invention provides a parallel access system, applicable to a multi-component concurrent processing scenario, the system including:

each of the multiple components, configured to initiate write access operations;

a scheduling component, configured to receive, through the access interface corresponding to each component, the write access operation initiated by the component, where the components are in one-to-one correspondence with the access interfaces and the multiple access interfaces are set in parallel; and, according to a preset scheduling mode, to schedule the write access operations to the high-speed module for processing, the high-speed module being a shared resource of the multiple components.
In a first possible implementation of the first aspect, the system further includes:

a memory, configured to store cache queues, where the cache queues are used to store the write access operations, and each access interface corresponds to one cache queue;

the access interface, further configured to detect whether its cache queue is full and, if it determines that the cache queue is full, to perform a back pressure operation on the component, where the back pressure operation is used to indicate that the component should wait a preset period before performing the write access operation; otherwise, after the write access operation has been written completely, to store the write access operation into the cache queue.
With reference to the first possible implementation of the first aspect, in a second possible implementation of the first aspect, the preset scheduling mode is priority scheduling, and the scheduling component is specifically configured to:

preferentially schedule, according to the priority order of the cache queues, the write access operations in the higher-priority cache queue to the high-speed module for processing, and start scheduling the next-priority cache queue only after the write access operations in the higher-priority cache queue have all been scheduled, every scheduling pass starting from the highest-priority cache queue.
With reference to the second possible implementation of the first aspect, in a third possible implementation of the first aspect, the preset scheduling mode is polling weight scheduling, and the scheduling component is specifically configured to:

schedule the cache queues sequentially in a fair scheduling manner according to the weight of each cache queue, where the weight is the length of the corresponding cache queue;

for each cache queue, each time one write access operation is scheduled out to the high-speed module for processing, decrement the weight of that cache queue by one, and stop scheduling the cache queue once the weight reaches zero; and

when the scheduling component determines that the write access operations of all cache queues have all been scheduled out or the weights of all cache queues have reached zero, restore the weight of each cache queue and start the next round of scheduling.
With reference to the second possible implementation of the first aspect, in a fourth possible implementation of the first aspect, the preset scheduling mode is a mixture of priority scheduling and polling weight scheduling, some of the cache queues are configured as priority cache queues, the remaining cache queues are configured as polling weight cache queues, and the scheduling component is specifically configured to:

for the priority cache queues, preferentially schedule, according to their priority order, the write access operations in the higher-priority cache queue to the high-speed module for processing, and start scheduling the next-priority cache queue only after the write access operations in the higher-priority cache queue have all been scheduled, every scheduling pass starting from the highest-priority cache queue; and

for the polling weight cache queues, schedule the polling weight cache queues sequentially in a fair scheduling manner according to their weights, where the weight is the length of the corresponding cache queue; for each polling weight cache queue, each time one write access operation is scheduled out to the high-speed module for processing, decrement the weight of that polling weight cache queue by one, and stop scheduling the queue once the weight reaches zero; when the scheduling component determines that the write access operations of all polling weight cache queues have all been scheduled out or the weights of all polling weight cache queues have reached zero, restore the weight of each polling weight cache queue and start the next round of scheduling.
With reference to the second possible implementation of the first aspect, in a fifth possible implementation of the first aspect, the memory is further configured to store an order-preserving queue, the preset scheduling mode is order-preserving scheduling, and the scheduling component is specifically configured to:

schedule the write access operations to the high-speed module for processing according to the order in which the write access operations were written into the cache queues, where the write order of the write access operations in all the cache queues is stored in the order-preserving queue, and the length of the order-preserving queue is greater than or equal to the sum of the lengths of all cache queues.
In a second aspect, an embodiment of the present invention provides a parallel access method, applicable to a multi-component concurrent processing scenario, the method including:

for each of the multiple components, receiving, by a scheduling component through the access interface corresponding to the component, the write access operation initiated by the component, where the components are in one-to-one correspondence with the access interfaces and the multiple access interfaces are set in parallel; and

scheduling, by the scheduling component according to a preset scheduling mode, the write access operations received by the access interfaces to the high-speed module for processing, the high-speed module being a shared resource of the multiple components.
In a first possible implementation of the second aspect, before the scheduling component schedules the write access operations received by the access interfaces to the high-speed module for processing according to the preset scheduling mode, the method further includes:

detecting, by the access interface, whether its corresponding cache queue in the memory is full and, if it determines that the cache queue is full, performing a back pressure operation on the component, where the back pressure operation is used to indicate that the component should wait a preset period before performing the write access operation; otherwise, after the write access operation has been written completely, storing the write access operation into the cache queue, where the access interfaces are in one-to-one correspondence with the cache queues.
With reference to the first possible implementation of the second aspect, in a second possible implementation of the second aspect, the preset scheduling mode is priority scheduling, and the scheduling, by the scheduling component according to the preset scheduling mode, of the write access operations received by the access interfaces to the high-speed module for processing includes:

preferentially scheduling, by the scheduling component according to the priority order of the cache queues, the write access operations in the higher-priority cache queue to the high-speed module for processing, and starting to schedule the next-priority cache queue only after the write access operations in the higher-priority cache queue have all been scheduled, every scheduling pass starting from the highest-priority cache queue.
With reference to the second possible implementation of the second aspect, in a third possible implementation of the second aspect, the preset scheduling mode is polling weight scheduling, and the scheduling, by the scheduling component according to the preset scheduling mode, of the write access operations received by the access interfaces to the high-speed module for processing includes:

scheduling, by the scheduling component, the cache queues sequentially in a fair scheduling manner according to the weight of each cache queue, where the weight is the length of the corresponding cache queue;

for each cache queue, each time one write access operation is scheduled out to the high-speed module for processing, decrementing the weight of that cache queue by one, and stopping scheduling of the cache queue once the weight reaches zero; and

when the scheduling component determines that the write access operations of all cache queues have all been scheduled out or the weights of all cache queues have reached zero, restoring the weight of each cache queue and starting the next round of scheduling.
With reference to the second possible implementation of the second aspect, in a fourth possible implementation of the second aspect, the preset scheduling mode is a mixture of priority scheduling and polling weight scheduling, some of the cache queues are configured as priority cache queues, the remaining cache queues are configured as polling weight cache queues, and the scheduling, by the scheduling component according to the preset scheduling mode, of the write access operations received by the access interfaces to the high-speed module for processing includes:

for the priority cache queues, preferentially scheduling, according to their priority order, the write access operations in the higher-priority cache queue to the high-speed module for processing, and starting to schedule the next-priority cache queue only after the write access operations in the higher-priority cache queue have all been scheduled, every scheduling pass starting from the highest-priority cache queue; and

for the polling weight cache queues, scheduling the polling weight cache queues sequentially in a fair scheduling manner according to their weights, where the weight is the length of the corresponding cache queue; for each polling weight cache queue, each time one write access operation is scheduled out to the high-speed module for processing, decrementing the weight of that polling weight cache queue by one, and stopping scheduling of the queue once the weight reaches zero; when the scheduling component determines that the write access operations of all polling weight cache queues have all been scheduled out or the weights of all polling weight cache queues have reached zero, restoring the weight of each polling weight cache queue and starting the next round of scheduling.
With reference to the second possible implementation of the second aspect, in a fifth possible implementation of the second aspect, the preset scheduling mode is order-preserving scheduling, and the scheduling, by the scheduling component according to the preset scheduling mode, of the write access operations received by the access interfaces to the high-speed module for processing includes:

scheduling, by the scheduling component, the write access operations to the high-speed module for processing according to the order in which the write access operations were written into the cache queues, where the write order of the write access operations in all the cache queues is stored in an order-preserving queue, the length of the order-preserving queue is greater than or equal to the sum of the lengths of all cache queues, and the order-preserving queue is stored in the memory.
In the embodiments of the present invention, multiple access interfaces are set in parallel and placed in one-to-one correspondence with the components, so a component can perform other operations as soon as it has handed its write access operation to its corresponding access interface; it therefore does not need to wait for the other components' accesses to the shared resource to finish before performing its own write access operation on the shared resource, which improves the utilization of the core resources of a multi-core processor. In addition, the parallel access method effectively avoids the time wasted by multiple components contending for a lock, thereby improving the data processing capability of a single core and, in turn, the processing efficiency of the multi-core processor. Moreover, software implementations based on this parallel access method are concise and efficient to code.
Brief Description of the Drawings

To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings needed for describing the embodiments or the prior art are briefly introduced below. Evidently, the accompanying drawings in the following description show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from these accompanying drawings without creative effort.
FIG. 1 is a schematic diagram of a scenario in which multiple cores send packets;

FIG. 2 is a schematic structural diagram of Embodiment 1 of a parallel access system according to the present invention;

FIG. 3 is a schematic flowchart of Embodiment 1 of a parallel access method according to the present invention;

FIG. 4 is a diagram showing an example of an access interface of the present invention;

FIG. 5 is a schematic flowchart of Embodiment 2 of a parallel access method according to the present invention;

FIG. 6 is a diagram showing an example of a circular queue;

FIG. 7 is a diagram showing an example of a cache queue of the present invention;

FIG. 8 is a diagram showing an example of an order-preserving queue of the present invention.
Detailed Description of Embodiments

The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings in the embodiments of the present invention. Evidently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort shall fall within the protection scope of the present invention.
For ease of understanding, FIG. 1 shows a schematic diagram of a scenario in which multiple cores send packets. Traffic management (TM) is the packet scheduling component; all packets to be sent by the cores are scheduled out through the TM. P0, P1, and P2 are high-speed ports, and the peer device is a local area network switch (LSW).

In the scenario shown in FIG. 1, for a core to send a packet it must write the packet descriptor into the sending interface of the traffic management component, so that sending interface becomes a critical resource shared by all cores. A packet descriptor is usually 16 bytes (Byte, B for short) or even longer, while atomic operations can only be completed within 4 B; without a mutually exclusive access mechanism, 16 B write operations from multiple cores to the traffic management sending interface would inevitably corrupt the written data.

The prior art uses lock operations to avoid this problem, but lock operations leave a large number of cores in a wait state, degrading multi-core processor performance. To preserve the data processing capability of multi-core processors, the embodiments of the present invention provide a parallel access method and system.

It should be noted that the parallel access method and system provided by the embodiments of the present invention are applicable to all scenarios requiring concurrent processing by multiple execution components, including but not limited to multi-core concurrency within a chip; they can also be used for concurrency among software processes or threads.
FIG. 2 is a schematic structural diagram of Embodiment 1 of a parallel access system according to the present invention. This embodiment of the present invention provides a parallel access system applicable to a multi-component concurrent processing scenario; the system may be a device or system that includes multiple execution components, such as a multi-core processor, and such devices are not enumerated here. As shown in FIG. 2, four components are taken as an example; the parallel access system includes: component 21, component 22, component 23, component 24, access interface I1, access interface I2, access interface I3, access interface I4, high-speed module 26, and scheduling component 25, where the arrows in FIG. 2 indicate the direction of data flow.

Each of the multiple components, that is, component 21, component 22, component 23, and component 24, is configured to initiate write access operations. The scheduling component 25 is configured to receive, through the access interface corresponding to each component, the write access operation initiated by the component, where the components are in one-to-one correspondence with the access interfaces and the multiple access interfaces are set in parallel; and, according to a preset scheduling mode, to schedule the write access operations to the high-speed module 26 for processing, the high-speed module 26 being a shared resource of the multiple components.

The parallel access system of this embodiment of the present invention can be used to execute the technical solution of the method embodiment shown in FIG. 3; its implementation principle and technical effects are similar and are not described here again.

On this basis, the system may further include: a memory 27, configured to store cache queues, where the cache queues are used to store the write access operations and each access interface corresponds to one cache queue. The access interface can also be used to detect whether the cache queue is full; if it determines that the cache queue is full, it performs a back pressure operation on the component, where the back pressure operation is used to indicate that the component should wait a preset period before performing the write access operation; otherwise, after the write access operation has been written completely, the write access operation is stored into the cache queue. The cache queues may each correspond to a section of storage space of the memory 27. In this embodiment there is a single memory; optionally, each access interface may instead correspond to a separate memory, which is not limited by the present invention.
In one implementation, the preset scheduling mode is priority scheduling, and the scheduling component 25 may be specifically configured to: preferentially schedule, according to the priority order of the cache queues, the write access operations in the higher-priority cache queue to the high-speed module 26 for processing, and start scheduling the next-priority cache queue only after the write access operations in the higher-priority cache queue have all been scheduled, every scheduling pass starting from the highest-priority cache queue.

In another implementation, the preset scheduling mode is polling weight scheduling, and the scheduling component 25 may be specifically configured to: schedule the cache queues sequentially in a fair scheduling manner according to the weight of each cache queue, the weight being the length of the corresponding cache queue; for each cache queue, each time one write access operation is scheduled out to the high-speed module 26 for processing, decrement the weight of that cache queue by one, and stop scheduling the cache queue once the weight reaches zero; and, when the scheduling component 25 determines that the write access operations of all cache queues have all been scheduled out or the weights of all cache queues have reached zero, restore the weight of each cache queue and start the next round of scheduling.

In yet another implementation, the preset scheduling mode is a mixture of priority scheduling and polling weight scheduling: some of the cache queues are configured as priority cache queues and the remaining cache queues are configured as polling weight cache queues. The scheduling component 25 may be specifically configured to: for the priority cache queues, preferentially schedule, according to their priority order, the write access operations in the higher-priority cache queue to the high-speed module 26 for processing, starting the next-priority cache queue only after the write access operations in the higher-priority cache queue have all been scheduled, with every scheduling pass starting from the highest-priority cache queue; and, for the polling weight cache queues, schedule them sequentially in a fair scheduling manner according to their weights, the weight being the length of the corresponding cache queue, decrementing a queue's weight by one for each write access operation scheduled out to the high-speed module 26 and stopping scheduling of that queue once its weight reaches zero; when the scheduling component 25 determines that the write access operations of all polling weight cache queues have all been scheduled out or the weights of all polling weight cache queues have reached zero, it restores the weight of each polling weight cache queue and starts the next round of scheduling.

In a further implementation, the memory 27 can also be used to store an order-preserving queue, and the preset scheduling mode is order-preserving scheduling. The scheduling component 25 may be specifically configured to: schedule the write access operations to the high-speed module 26 for processing according to the order in which the write access operations were written into the cache queues, where the write order of the write access operations in all the cache queues is stored in the order-preserving queue, and the length of the order-preserving queue is greater than or equal to the sum of the lengths of all cache queues.

In the embodiments of the present invention, when the write capability of the components exceeds the processing capability of the high-speed module, or the components' write capability jitters, some buffering must be provided to avoid congestion at the entry point; a corresponding cache queue is therefore set up for each access interface, ensuring that writes are non-blocking and that the high-speed module is kept continuously fed.
FIG. 3 is a schematic flowchart of Embodiment 1 of a parallel access method according to the present invention. This embodiment of the present invention provides a parallel access method, which may be performed by a parallel access system; the system may be a device or system that includes multiple execution components, such as a multi-core processor, and such devices are not enumerated here. As shown in FIG. 3, the parallel access method includes:

S301. For each of the multiple components, the scheduling component receives, through the access interface corresponding to the component, the write access operation initiated by the component, where the components are in one-to-one correspondence with the access interfaces and the multiple access interfaces are set in parallel.

The multiple components may be, for example, multiple cores, multiple accelerators, or multiple threads; these processing resources need to execute concurrently at high speed, avoiding sharing resources among them by means of spinlock operations. Each core or accelerator corresponds to one set of read/write access interfaces, and concurrency is achieved through this one-to-one access. A high-speed module can be understood by those skilled in the art as a module with high-speed processing capability; a chip system usually contains multiple high-speed modules, for example, a memory management module, a dispatch center module, and a packet output module.

In a specific implementation, as shown in FIG. 4, the access interfaces may correspond to a segment of register space inside the chip, with each component corresponding to one access interface, addr x in FIG. 4, where x takes the values 0, 1, 2, ..., N, N being the number of access interfaces minus one; n, whose value is a positive integer power of 4, identifies the access entry of each access interface. Depending on the interface width required by the high-speed module, the width of an access interface can be defined as 4 bytes, 8 bytes, 16 bytes, 32 bytes, and so on; for the traffic management shown in FIG. 1, a 16-byte interface suffices.

Specifically, when a component performs a write access operation on its corresponding access interface, the write is usually performed in units of 4 B, though it may also be performed in units of 8 B, 16 B, or 32 B; when the access interface detects that the last unit has been written, the write access operation is complete.

S302. The scheduling component schedules the write access operation to the high-speed module for processing according to a preset scheduling mode, the high-speed module being a shared resource of the multiple components.

The preset scheduling mode may include order-preserving scheduling and out-of-order scheduling, where out-of-order scheduling includes but is not limited to priority scheduling and polling weight scheduling. The preset scheduling mode ensures that the write access operations performed by the multiple components through their respective access interfaces are transmitted to the high-speed module so that the high-speed module can process them.

In the embodiments of the present invention, multiple access interfaces are set in parallel and placed in one-to-one correspondence with the components, so a component can perform other operations as soon as it has handed its write access operation to its corresponding access interface; it therefore does not need to wait for the other components' accesses to the shared resource to finish before performing its own write access operation on the shared resource, which improves the utilization of the core resources of a multi-core processor. In addition, the parallel access method effectively avoids the time wasted by multiple components contending for a lock, thereby improving the data processing capability of a single core and, in turn, the processing efficiency of the multi-core processor. Moreover, software implementations based on this parallel access method are concise and efficient to code.
FIG. 5 is a schematic flowchart of Embodiment 2 of a parallel access method according to the present invention. This embodiment builds on the embodiment shown in FIG. 3. As shown in FIG. 5, the method may include:

S501. For each of the multiple components, the scheduling component receives, through the access interface corresponding to the component, the write access operation initiated by the component, where the components are in one-to-one correspondence with the access interfaces and the multiple access interfaces are set in parallel.

This step is the same as S301 and is not described here again.

S502. The access interface detects whether its corresponding cache queue in the memory is full.

The access interfaces are in one-to-one correspondence with the cache queues. If the access interface determines that the cache queue is full, S503 is performed; otherwise, S504 is performed.

A cache queue may correspond to a section of memory space inside the chip. Optionally, the cache queue takes the form of a circular queue, as shown in FIG. 6; each cache queue is provided with a head pointer and a tail pointer. The access interface determines that the cache queue is full as follows: each time one CMD (command description), that is, one write access operation, is enqueued, the tail pointer is incremented by one; if the head and tail pointers coincide, the cache queue is full. The head pointer is used by the scheduling component for access: each time one CMD is scheduled out, the head pointer is incremented by one; if the head pointer and the tail pointer coincide, all CMDs have been scheduled.

Usually, an indication flag is set inside the chip system to indicate whether the cache queue is full. When a write access operation is enqueued and the head and tail pointers coincide, the indication flag is set to "1"; when the scheduling component takes a write access operation out of the cache queue, the indication flag is set to "0". When the access interface detects that a write access operation is being written, it queries this indication flag; if the flag is already "1", back pressure is applied to the component corresponding to the access interface.

It should be added that the head and tail pointers are taken modulo the queue length after each move, which is what makes the queue circular; otherwise the pointers would move beyond the bounds of the queue.

S503. The access interface performs a back pressure operation on the component, where the back pressure operation is used to indicate that the component should wait a preset period before performing the write access operation again.

Specifically, if the access interface determines that its corresponding cache queue is full, it applies back pressure to the component; the component must wait a preset period (for example, 1 to N clock cycles) before writing again, and the length of the preset period is preconfigured as required.

S504. After the write access operation has been written completely, the access interface stores the write access operation into the cache queue.

S505. The scheduling component schedules the write access operation to the high-speed module for processing according to the preset scheduling mode, the high-speed module being a shared resource of the multiple components.

This step is the same as S302 and is not described here again.

In the embodiments of the present invention, when the write capability of the components exceeds the processing capability of the high-speed module, or the components' write capability jitters, some buffering must be provided to avoid congestion at the entry point; a corresponding cache queue is therefore set up for each access interface, ensuring that writes are non-blocking and that the high-speed module is kept continuously fed.
The following describes in detail, through several specific implementations, how the scheduling component schedules the write access operations received by the access interfaces to the high-speed module for processing according to the preset scheduling mode.

In one specific implementation, the foregoing preset scheduling mode is priority scheduling.

The scheduling, by the scheduling component according to the preset scheduling mode, of the write access operations received by the access interfaces to the high-speed module for processing may include: the scheduling component preferentially schedules, according to the priority order of the cache queues, the write access operations in the higher-priority cache queue to the high-speed module for processing, and starts scheduling the next-priority cache queue only after the write access operations in the higher-priority cache queue have all been scheduled, every scheduling pass starting from the highest-priority cache queue.

In this implementation, each cache queue is preconfigured with a priority. The priorities range from 1 to M, arranged from low to high, where M is the number of access interfaces, which generally matches the number of cores or threads. The scheduling component executes scheduling according to the priority order of the cache queues.

In another specific implementation, the foregoing preset scheduling mode is polling weight scheduling.

The scheduling, by the scheduling component according to the preset scheduling mode, of the write access operations received by the access interfaces to the high-speed module for processing may include: the scheduling component schedules the cache queues sequentially in a fair scheduling manner according to the weight of each cache queue, the weight being the length of the corresponding cache queue; for each cache queue, each time the scheduling component schedules one write access operation out to the high-speed module for processing, it decrements the weight of that cache queue by one and stops scheduling the cache queue once the weight reaches zero; when the scheduling component determines that the write access operations of all cache queues have all been scheduled out or the weights of all cache queues have reached zero, it restores the weight of each cache queue and starts the next round of scheduling.

In this implementation, each cache queue is preconfigured with a weight, which avoids the situation under priority scheduling where the lowest-priority cache queue is never scheduled because CMDs are always present in a higher-priority cache queue.

In yet another specific implementation, the foregoing preset scheduling mode is a mixture of priority scheduling and polling weight scheduling: some of the cache queues are configured as priority cache queues, and the remaining cache queues are configured as polling weight cache queues.

The scheduling, by the scheduling component according to the preset scheduling mode, of the write access operations received by the access interfaces to the high-speed module for processing may include: for the priority cache queues, the scheduling component preferentially schedules, according to their priority order, the write access operations in the higher-priority cache queue to the high-speed module for processing, starting the next-priority cache queue only after the write access operations in the higher-priority cache queue have all been scheduled, with every scheduling pass starting from the highest-priority cache queue; for the polling weight cache queues, the scheduling component schedules them sequentially in a fair scheduling manner according to their weights; for each polling weight cache queue, each time the scheduling component schedules one write access operation out to the high-speed module for processing, it decrements the weight of that polling weight cache queue by one and stops scheduling the queue once the weight reaches zero; when the scheduling component determines that the write access operations of all polling weight cache queues have all been scheduled out or the weights of all polling weight cache queues have reached zero, it restores the weight of each polling weight cache queue and starts the next round of scheduling.
Scheduling based on the above three implementations is out-of-order scheduling. Out-of-order scheduling means that the order in which write access operations are scheduled out of the cache queues differs from the order in which they were written in. Order-preserving scheduling is described next: the scheduling component must dispatch write access operations to the high-speed module in the order in which they were written into the cache queues, rather than relying on the write order within any single cache queue.

In a further specific implementation, the foregoing preset scheduling mode is order-preserving scheduling.

The scheduling, by the scheduling component according to the preset scheduling mode, of the write access operations received by the access interfaces to the high-speed module for processing may include: the scheduling component schedules the write access operations to the high-speed module for processing according to the order in which the write access operations were written into the cache queues, where the write order of the write access operations in all the cache queues is stored in an order-preserving queue whose length is greater than or equal to the sum of the lengths of all cache queues, the order-preserving queue being stored in the memory.

The principle of order-preserving scheduling is shown in FIG. 7 and FIG. 8, where queue numbers QA, QB, QC, and QD identify different cache queues (four cache queues, corresponding to four cores, are used for illustration); C0, C1, and C2 denote the order in which write access operations (CMDs) are enqueued within a single cache queue, and ①②③④ and so on denote the order in which the components (multiple cores) input CMDs.

When a component enqueues a CMD into a cache queue, that queue's number is appended to the order-preserving queue, so the sequence of queue numbers in the order-preserving queue is exactly the order in which the execution components enqueued their CMDs.

The scheduling component accesses the corresponding cache queue according to the queue-number order in the order-preserving queue, takes the CMD from the address pointed to by that cache queue's head pointer, and sends it to the high-speed module.
A person of ordinary skill in the art will understand that all or part of the steps of the foregoing method embodiments may be implemented by hardware related to program instructions. The foregoing program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the foregoing method embodiments; the foregoing storage medium includes any medium that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc.

Finally, it should be noted that the foregoing embodiments are merely intended to describe the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that modifications may still be made to the technical solutions described in the foregoing embodiments, or equivalent replacements may be made to some or all of their technical features, and such modifications or replacements do not make the essence of the corresponding technical solutions depart from the scope of the technical solutions of the embodiments of the present invention.

Claims (12)

  1. A parallel access system, applicable to a multi-component concurrent processing scenario, wherein the system comprises:
    each of the multiple components, configured to initiate write access operations;
    a scheduling component, configured to receive, through the access interface corresponding to each component, the write access operation initiated by the component, wherein the components are in one-to-one correspondence with the access interfaces and the multiple access interfaces are set in parallel; and, according to a preset scheduling mode, to schedule the write access operations to the high-speed module for processing, wherein the high-speed module is a shared resource of the multiple components.
  2. The system according to claim 1, wherein the system further comprises:
    a memory, configured to store cache queues, wherein the cache queues are used to store the write access operations, and each access interface corresponds to one cache queue;
    the access interface, further configured to detect whether its cache queue is full and, if it determines that the cache queue is full, to perform a back pressure operation on the component, wherein the back pressure operation is used to indicate that the component should wait a preset period before performing the write access operation; otherwise, after the write access operation has been written completely, to store the write access operation into the cache queue.
  3. The system according to claim 2, wherein the preset scheduling mode is priority scheduling, and the scheduling component is specifically configured to:
    preferentially schedule, according to the priority order of the cache queues, the write access operations in the higher-priority cache queue to the high-speed module for processing, and start scheduling the next-priority cache queue only after the write access operations in the higher-priority cache queue have all been scheduled, every scheduling pass starting from the highest-priority cache queue.
  4. The system according to claim 2, wherein the preset scheduling mode is polling weight scheduling, and the scheduling component is specifically configured to:
    schedule the cache queues sequentially in a fair scheduling manner according to the weight of each cache queue, wherein the weight is the length of the corresponding cache queue;
    for each cache queue, each time one write access operation is scheduled out to the high-speed module for processing, decrement the weight of that cache queue by one, and stop scheduling the cache queue once the weight reaches zero; and
    when the scheduling component determines that the write access operations of all cache queues have all been scheduled out or the weights of all cache queues have reached zero, restore the weight of each cache queue and start the next round of scheduling.
  5. The system according to claim 2, wherein the preset scheduling mode is a mixture of priority scheduling and polling weight scheduling, some of the cache queues are configured as priority cache queues, the remaining cache queues are configured as polling weight cache queues, and the scheduling component is specifically configured to:
    for the priority cache queues, preferentially schedule, according to their priority order, the write access operations in the higher-priority cache queue to the high-speed module for processing, and start scheduling the next-priority cache queue only after the write access operations in the higher-priority cache queue have all been scheduled, every scheduling pass starting from the highest-priority cache queue; and
    for the polling weight cache queues, schedule the polling weight cache queues sequentially in a fair scheduling manner according to their weights, wherein the weight is the length of the corresponding cache queue; for each polling weight cache queue, each time one write access operation is scheduled out to the high-speed module for processing, decrement the weight of that polling weight cache queue by one, and stop scheduling the queue once the weight reaches zero; when the scheduling component determines that the write access operations of all polling weight cache queues have all been scheduled out or the weights of all polling weight cache queues have reached zero, restore the weight of each polling weight cache queue and start the next round of scheduling.
  6. The system according to claim 2, wherein the memory is further configured to store an order-preserving queue, the preset scheduling mode is order-preserving scheduling, and the scheduling component is specifically configured to:
    schedule the write access operations to the high-speed module for processing according to the order in which the write access operations were written into the cache queues, wherein the write order of the write access operations in all the cache queues is stored in the order-preserving queue, and the length of the order-preserving queue is greater than or equal to the sum of the lengths of all cache queues.
  7. A parallel access method, applicable to a multi-component concurrent processing scenario, wherein the method comprises:
    for each of the multiple components, receiving, by a scheduling component through the access interface corresponding to the component, the write access operation initiated by the component, wherein the components are in one-to-one correspondence with the access interfaces and the multiple access interfaces are set in parallel; and
    scheduling, by the scheduling component according to a preset scheduling mode, the write access operations received by the access interfaces to the high-speed module for processing, wherein the high-speed module is a shared resource of the multiple components.
  8. The method according to claim 7, wherein before the scheduling, by the scheduling component according to the preset scheduling mode, of the write access operations received by the access interfaces to the high-speed module for processing, the method further comprises:
    detecting, by the access interface, whether its corresponding cache queue in a memory is full and, if it determines that the cache queue is full, performing a back pressure operation on the component, wherein the back pressure operation is used to indicate that the component should wait a preset period before performing the write access operation; otherwise, after the write access operation has been written completely, storing the write access operation into the cache queue, wherein the access interfaces are in one-to-one correspondence with the cache queues.
  9. The method according to claim 8, wherein the preset scheduling mode is priority scheduling, and the scheduling, by the scheduling component according to the preset scheduling mode, of the write access operations received by the access interfaces to the high-speed module for processing comprises:
    preferentially scheduling, by the scheduling component according to the priority order of the cache queues, the write access operations in the higher-priority cache queue to the high-speed module for processing, and starting to schedule the next-priority cache queue only after the write access operations in the higher-priority cache queue have all been scheduled, every scheduling pass starting from the highest-priority cache queue.
  10. The method according to claim 8, wherein the preset scheduling mode is polling weight scheduling, and the scheduling, by the scheduling component according to the preset scheduling mode, of the write access operations received by the access interfaces to the high-speed module for processing comprises:
    scheduling, by the scheduling component, the cache queues sequentially in a fair scheduling manner according to the weight of each cache queue, wherein the weight is the length of the corresponding cache queue;
    for each cache queue, each time one write access operation is scheduled out to the high-speed module for processing, decrementing the weight of that cache queue by one, and stopping scheduling of the cache queue once the weight reaches zero; and
    when the scheduling component determines that the write access operations of all cache queues have all been scheduled out or the weights of all cache queues have reached zero, restoring the weight of each cache queue and starting the next round of scheduling.
  11. The method according to claim 8, wherein the preset scheduling mode is a mixture of priority scheduling and polling weight scheduling, some of the cache queues are configured as priority cache queues, the remaining cache queues are configured as polling weight cache queues, and the scheduling, by the scheduling component according to the preset scheduling mode, of the write access operations received by the access interfaces to the high-speed module for processing comprises:
    for the priority cache queues, preferentially scheduling, according to their priority order, the write access operations in the higher-priority cache queue to the high-speed module for processing, and starting to schedule the next-priority cache queue only after the write access operations in the higher-priority cache queue have all been scheduled, every scheduling pass starting from the highest-priority cache queue; and
    for the polling weight cache queues, scheduling the polling weight cache queues sequentially in a fair scheduling manner according to their weights, wherein the weight is the length of the corresponding cache queue; for each polling weight cache queue, each time one write access operation is scheduled out to the high-speed module for processing, decrementing the weight of that polling weight cache queue by one, and stopping scheduling of the queue once the weight reaches zero; when the scheduling component determines that the write access operations of all polling weight cache queues have all been scheduled out or the weights of all polling weight cache queues have reached zero, restoring the weight of each polling weight cache queue and starting the next round of scheduling.
  12. The method according to claim 8, wherein the preset scheduling mode is order-preserving scheduling, and the scheduling, by the scheduling component according to the preset scheduling mode, of the write access operations received by the access interfaces to the high-speed module for processing comprises:
    scheduling, by the scheduling component, the write access operations to the high-speed module for processing according to the order in which the write access operations were written into the cache queues, wherein the write order of the write access operations in all the cache queues is stored in an order-preserving queue, the length of the order-preserving queue is greater than or equal to the sum of the lengths of all cache queues, and the order-preserving queue is stored in the memory.
PCT/CN2014/086638 2014-09-16 2014-09-16 Parallel access method and system WO2016041150A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201480022122.9A CN105637475B (zh) 2014-09-16 2014-09-16 并行访问方法及系统
PCT/CN2014/086638 WO2016041150A1 (zh) 2014-09-16 2014-09-16 并行访问方法及系统

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2014/086638 WO2016041150A1 (zh) 2014-09-16 2014-09-16 并行访问方法及系统

Publications (1)

Publication Number Publication Date
WO2016041150A1 true WO2016041150A1 (zh) 2016-03-24

Family

ID=55532439

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/086638 WO2016041150A1 (zh) 2014-09-16 2014-09-16 并行访问方法及系统

Country Status (2)

Country Link
CN (1) CN105637475B (zh)
WO (1) WO2016041150A1 (zh)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113227984B (zh) * 2018-12-22 2023-12-15 Huawei Technologies Co., Ltd. Processing chip, method, and related device
CN113495669B (zh) * 2020-03-19 2023-07-18 Huawei Technologies Co., Ltd. Decompression apparatus, accelerator, and method for a decompression apparatus

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5832272A (en) * 1993-03-15 1998-11-03 University Of Westminister Apparatus and method for parallel computation
CN1960334A (zh) * 2006-09-12 2007-05-09 Huawei Technologies Co., Ltd. Queue scheduling method and device
CN101610552A (zh) * 2009-08-04 2009-12-23 Hangzhou H3C Technologies Co., Ltd. Scheduling method and device for shared resources
CN102609312A (zh) * 2012-01-10 2012-07-25 Suzhou Institute for Advanced Study, University of Science and Technology of China Shortest-job-first memory request scheduling method based on fairness considerations

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1237767C (zh) * 2004-07-09 2006-01-18 Tsinghua University Scheduling control method and device for shared resource access
KR100784385B1 (ko) * 2005-08-10 2007-12-11 Samsung Electronics Co., Ltd. System and method for arbitrating access requests to a shared resource
WO2007132424A2 (en) * 2006-05-17 2007-11-22 Nxp B.V. Multi-processing system and a method of executing a plurality of data processing tasks
CN101276294B (zh) * 2008-05-16 2010-10-13 Hangzhou H3C Technologies Co., Ltd. Parallel processing method and processing device for heterogeneous data

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5832272A (en) * 1993-03-15 1998-11-03 University Of Westminister Apparatus and method for parallel computation
CN1960334A (zh) * 2006-09-12 2007-05-09 Huawei Technologies Co., Ltd. Queue scheduling method and device
CN101610552A (zh) * 2009-08-04 2009-12-23 Hangzhou H3C Technologies Co., Ltd. Scheduling method and device for shared resources
CN102609312A (zh) * 2012-01-10 2012-07-25 Suzhou Institute for Advanced Study, University of Science and Technology of China Shortest-job-first memory request scheduling method based on fairness considerations

Also Published As

Publication number Publication date
CN105637475A (zh) 2016-06-01
CN105637475B (zh) 2019-08-20

Similar Documents

Publication Publication Date Title
US8566828B2 (en) Accelerator for multi-processing system and method
US7802255B2 (en) Thread execution scheduler for multi-processing system and method
CA2536037A1 (en) Fast and memory protected asynchronous message scheme in a multi-process and multi-thread environment
US7376952B2 (en) Optimizing critical section microblocks by controlling thread execution
KR101951072B1 (ko) 코어 간 통신 장치 및 방법
US8966488B2 (en) Synchronising groups of threads with dedicated hardware logic
US8713573B2 (en) Synchronization scheduling apparatus and method in real-time multi-core system
WO2012027959A1 (zh) Multiprocessor system and synchronization engine apparatus thereof
WO2018018611A1 (zh) Task processing method and network interface card
US20090100200A1 (en) Channel-less multithreaded DMA controller
WO2014099267A1 (en) Parallel processing using multi-core processor
CN102591843B (zh) Inter-core communication method for a multi-core processor
US10331500B2 (en) Managing fairness for lock and unlock operations using operation prioritization
US8640135B2 (en) Schedule virtual interface by requesting locken tokens differently from a virtual interface context depending on the location of a scheduling element
US20110179199A1 (en) Support for non-locking parallel reception of packets belonging to the same reception fifo
US10545890B2 (en) Information processing device, information processing method, and program
WO2024040750A1 (zh) Access control method for a scalar processing unit, and scalar processing unit
WO2017133439A1 (zh) Data management method and apparatus, and computer storage medium
WO2023193441A1 (zh) Multi-core circuit, data exchange method, electronic device, and storage medium
WO2016177081A1 (zh) Notification message processing method and apparatus
WO2016041150A1 (zh) Parallel access method and system
WO2021147877A1 (zh) Data exchange system for a static distributed computing architecture and method thereof
US11785087B1 (en) Remote direct memory access operations with integrated data arrival indication
WO2017201693A1 (zh) Scheduling method and apparatus for memory access instructions, and computer system
US9384047B2 (en) Event-driven computation

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 14901893

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 14901893

Country of ref document: EP

Kind code of ref document: A1