WO2011127649A1 - Method and device for processing common data structure - Google Patents

Method and device for processing common data structure

Info

Publication number
WO2011127649A1
Authority
WO
WIPO (PCT)
Prior art keywords
data structure
memory
core
sub
common data
Prior art date
Application number
PCT/CN2010/071736
Other languages
French (fr)
Chinese (zh)
Inventor
胡睿
钱俊
Original Assignee
Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Huawei Technologies Co., Ltd. (华为技术有限公司)
Priority to CN201080003755.7A (published as CN102362256B)
Priority to PCT/CN2010/071736
Publication of WO2011127649A1

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5005Allocation of resources, e.g. of the central processing unit [CPU] to service a request
    • G06F9/5027Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals
    • G06F9/505Allocation of resources, e.g. of the central processing unit [CPU] to service a request the resource being a machine, e.g. CPUs, Servers, Terminals considering the load

Definitions

  • The present invention relates to multi-core processor technology, and more particularly to a method and an apparatus for processing a common data structure. Background Art
  • A traditional CPU has a single core inside, i.e., it is a single-core CPU.
  • When common data structures are processed on a single-core CPU, although multiple processes or threads may run concurrently, only one thread's instructions execute at any given moment, so protecting common data structures such as global variables and peripherals is relatively simple.
  • To further increase the processing speed of common data structures, multi-core CPUs have emerged.
  • Multi-core CPUs, i.e., CPUs containing multiple cores, emerged as semiconductor technology developed and CPU integration density and clock frequency kept rising. Multi-core CPUs have evolved from the first dual-core parts to the dozens of cores available today, and some multi-core processors under development even have thousands of cores. A first existing method still processes a complex common data structure on a single core: when threads are divided, any complex common data structure is placed on one core for processing. When the processing of a particular data structure consumes a large amount of CPU time, this method cannot keep the load on the cores balanced, so some cores carry heavy processing tasks while others sit idle, and the advantage of multi-core parallel computing cannot be fully exploited.
  • A second method processes a complex common data structure with multiple cores, but the parts of the data structure processed by different cores are completely unrelated. For example, performing some function may require one core to first process a ternary tree and another core to then process a binary tree, with no direct association between the two trees. Although this method spreads the common data structure across multiple cores, the division of computing resources between the cores is almost entirely constrained by the composition of the workload, i.e., of the common data structure. It still leads to an uneven load between cores and cannot fully exploit the advantages of a multi-core processor.
  • A third method spreads operations on a strongly coupled common data structure across different cores but applies mutual-exclusion protection (such as a spin lock) when the data structure is accessed. With this method, when multiple cores access the common data structure at the same time, many cores may end up waiting, which wastes CPU resources. Summary of the Invention
  • Embodiments of the present invention provide a method and apparatus for processing a common data structure to implement parallel processing of a common data structure on a multi-core processor.
  • Embodiments of the present invention provide a method for processing a common data structure, including:
  • distributing each sub-data structure in the common data structure to a corresponding core on the multi-core processor through a common part in the common data structure;
  • the common portion includes a data range of the respective sub-data structures;
  • separately processing, by the corresponding cores on the multi-core processor, the distributed sub-data structures.
  • the embodiment of the invention further provides an apparatus for processing a common data structure, including:
  • a distribution module, configured to distribute each sub-data structure in the common data structure to a corresponding core on a multi-core processor through a common part in the common data structure, where the common part includes the data range of each sub-data structure;
  • a processing module located on a corresponding core of the multi-core processor, for separately processing the distributed sub-data structure.
  • The technical solution provided by the foregoing embodiments distributes each sub-data structure in the common data structure to a corresponding core on the multi-core processor for separate processing, which avoids load imbalance between cores and enables the cores of the multi-core processor to process the common data structure in parallel, fully exploiting the advantages of the multi-core processor, improving its processing efficiency, and thereby reducing the situations in which many cores wait and avoiding waste of CPU resources.
  • FIG. 4 is a schematic diagram of the division of the numbering space of the sub-data structures in the method according to an embodiment of the present invention;
  • FIG. 5 is a schematic diagram of the full binary tree used in the method;
  • FIG. 6 is a schematic diagram of obtaining a core number in the method;
  • FIG. 7 is a schematic diagram of adjusting the boundaries of the sub-data structures in the method;
  • FIG. 8 is a schematic diagram of one distribution in the method;
  • FIG. 9 is a schematic diagram of another distribution in the method;
  • FIG. 10 is a schematic flowchart of the use of a spin lock according to the present invention.
  • FIG. 11 is a schematic structural diagram of an apparatus for processing a common data structure according to an embodiment of the present invention
  • FIG. 12 is a schematic structural diagram of an underlying processing module 113 in an apparatus for processing a common data structure according to an embodiment of the present invention.
  • FIG. 1 is a flowchart of a method for processing a common data structure according to an embodiment of the present invention. As shown in Figure 1, the method includes:
  • Step 11 Distribute each sub-data structure in the common data structure to a corresponding core on the multi-core processor by using a common part in the common data structure; the common part includes a data range of the respective sub-data structures;
  • Step 12: The corresponding cores on the multi-core processor separately process the distributed sub-data structures, for example, performing lookups in a sub-data structure.
  • The technical solution provided by the foregoing embodiment distributes each sub-data structure in the common data structure to a corresponding core on the multi-core processor for separate processing, which avoids load imbalance between cores and enables the cores of the multi-core processor to process the common data structure in parallel, fully exploiting the advantages of the multi-core processor and improving its processing efficiency.
  • In step 11 above, the distribution may be performed by one core, by multiple cores, or by processing modules on one or more cores, so that each sub-data structure is distributed to a different core; that is, different cores process different parts of the common data structure.
  • A common data structure is a complete data structure that multiple cores process in parallel during the same time period.
  • The common data structure may be, as shown in FIG. 2, the same binary tree that multiple cores all have to process in their work.
  • In order to efficiently distribute the processing of the common data structure to multiple cores, the common data structure is abstracted in a general way. As shown in FIG. 3, the common data structure is divided, from top to bottom, into three layers: the common part, the sub-data structures, and the memory management part.
  • the common part can be divided into several units, each unit corresponding to a unique sub-data structure.
  • Each sub-data structure can be accessed through the information in its corresponding unit.
  • From the common part it is also possible to determine the computing resources that the sub-data structure corresponding to each unit needs to consume, for example how many cores are required or what processing speed is required, and to allocate computing resources, that is, cores, to each sub-data structure.
  • There are two methods for allocating core resources: a static allocation method and a dynamic allocation method. In the static allocation method, core resources are, for example, allocated evenly according to the number of sub-data structures, or allocated with weights according to the distribution characteristics of the core resources consumed by each sub-data structure.
  • The dynamic allocation method can dynamically allocate core resources according to the actual load of each core.
  • Assume that the number of cores in the multi-core processor is 2^k (this simplifies the design of the common data structure and the computation of index data; if the number of cores is not a power of 2, it can be padded up to a power of 2), where k is a positive integer. As shown in FIG. 4, 2^k - 1 boundaries are designated in the numbering space of the sub-data structures (not including the two endpoints of the numbering space); the 2^k - 1 boundaries are denoted b_i (1 <= i <= 2^k - 1), where b_i is a variable whose value is the number of the corresponding sub-data structure.
  • The i-th core (0 <= i < 2^k) processes the sub-data structures whose numbers fall within the interval [b_i, b_(i+1)).
  • As shown in FIG. 5, the nodes of a full binary tree (which can be organized without pointers, using position indices directly) store the boundaries of the sub-data structures processed by the cores. The number of the sub-data structure to be processed is compared against the full binary tree and the comparison path is recorded, which yields the core number.
  • As shown in FIG. 6, if the sub-data structure number corresponding to a newly added entry is 70, the comparison path is 101 (as indicated by the arrows); that is, the number of the resulting core is 101 in binary, or 5 in decimal, and the sub-data structure numbered 70 is processed on core 5.
  • That is, core 5 is assigned to the sub-data structure numbered 70.
  • Each of the 2^k cores corresponds to a queue of messages to be processed. When the data in the queue corresponding to a core exceeds a watermark T, as shown in FIG. 7, the number of sub-data structures that this core may process is reduced by 2; that is, the upper and lower boundaries each shrink inward by 1: b_i = b_i + 1, b_(i+1) = b_(i+1) - 1.
  • The method for allocating core resources is not limited to the above; other methods for balancing the load between cores may also be used to allocate core resources, which are not repeated here.
  • The sub-data structure is then handed to the assigned core; that is, a message is sent to the assigned core to inform it which sub-data structure it is to process. Step 12 is then performed.
  • the memory management part includes memory resource parameters of the common data structure, which can provide a basis for management of memory resources such as allocation.
  • Step 13 Allocate memory resources for the core in step 12 according to the memory management part in the common data structure. Specifically, it may include:
  • For example, when a core processing a sub-data structure has no available memory resources, a memory block can be requested according to the memory management part.
  • Memory is allocated to each core according to the memory management part so that each core has space for processing its sub-data structure, and the memory released by each core is managed.
  • Before the cores of the multi-core processor separately process the distributed sub-data structures, memory resources are requested according to the memory management part.
  • According to the memory management part, the memory is divided into blocks, memory is allocated to the core that requests memory resources, and that core thereby obtains memory blocks available for its own use.
  • The memory blocks requested by a core can further be divided into pages, that is, memory paging, which ensures that the memory of different cores is independent of each other.
  • For any core, when it builds a data structure that it manages itself, it requests memory resources from the memory pages in its own memory blocks. When memory pages are requested inside a core, no mutual exclusion, that is, no lock protection, is needed.
  • A new memory block is requested, that is, memory is dynamically allocated, only when none of the memory blocks the core has already requested has free space.
  • When a core requests and releases memory, the amount of memory in use in each memory block is recorded, and all memory blocks of that core are sorted by the amount of memory in use.
  • When requesting memory, memory is preferentially requested from the block of the core with the highest usage rate; when releasing memory, the data in the memory block with the lowest usage rate is, where possible, moved into the memory block with the highest usage rate that is not yet full. When it is found that a memory block of the core no longer contains any used memory, the block is released into the free-memory-block resource pool and is again managed according to the memory management part so that other cores can request it. This makes full use of the memory blocks and improves the utilization of memory resources, and the memory resources requested by the core processing a sub-data structure are protected by a lock.
  • At this point, the processing of the entire common data structure, as shown in FIG. 8, can be divided into three parts: distribution, separate processing, and underlying common processing.
  • The underlying common processing allocates memory resources to the cores that process the sub-data structures.
  • When the distribution in step 11 is done hierarchically, as shown in FIG. 9, the distribution processing includes two levels: distribution 1 and distribution 2.
  • Hierarchical distribution means redistributing groups of sub-data structures that cannot yet be handled by individual cores, until every sub-data structure is processed by a single core.
  • Taking FIG. 9 as an example, distribution 1 divides the sub-data structures into two parts: one part contains only one sub-data structure, so after distribution 1 it can already be processed separately; the other part contains multiple sub-data structures, so distribution 1 alone cannot achieve separate processing, and distribution must be performed again, that is, distribution 2, to achieve separate processing.
  • the above lock protection can be a spin lock.
  • the flow used by the spin lock is shown in Figure 10.
  • The memory resources are locked: when multiple cores access the common data structure, each core must acquire the spin lock before accessing it. If acquisition succeeds, the core operates on the shared resource and then unlocks the memory resource. If the spin lock is held by another core when a core tries to acquire it, the thread on the requesting core does not sleep but keeps checking whether the lock has been released. While a spin lock is spinning, the CPU does no useful work and is in fact wasting CPU time, so spin locks are suitable for scenarios in which the access time to the common data structure is short; more precisely, spin locks are suitable for scenarios in which the access time to the shared resource is shorter than, or comparable to, the task-switching time. Therefore, in this embodiment, only the memory resources requested by the cores in step 13 are protected by the lock.
  • FIG. 11 is a schematic structural diagram of an apparatus for processing a common data structure according to an embodiment of the present invention.
  • the device includes: a distribution module 111 and a processing module 112.
  • the distribution module 111 is configured to distribute each sub-data structure in the common data structure to a corresponding core on the multi-core processor through a common part in the common data structure; the common part includes a data range of the respective sub-data structure .
  • the processing module 112 is located on a corresponding core on the multi-core processor for separately processing the distributed sub-data structures.
  • The apparatus for processing a common data structure may further include an underlying processing module 113, configured to allocate, according to the memory management part in the common data structure, memory resources to the corresponding cores on the multi-core processor for processing the sub-data structures.
  • the underlying processing module 113 may be specifically configured to allocate a memory resource to a corresponding core on the multi-core processor according to a memory management portion in the common data structure in a case where a memory space is reserved.
  • The apparatus for processing a common data structure may further include a static allocation module 114, configured to allocate corresponding cores evenly to the sub-data structures in the common data structure according to the number of sub-data structures in the common data structure and the core resources on the multi-core processor, or to allocate corresponding cores with weights according to the distribution characteristics of the resources consumed by each sub-data structure in the common data structure.
  • the distribution module 111 may be specifically configured to distribute each sub-data structure in the common data structure to a corresponding core allocated by the static allocation module by a common part in a common data structure.
  • The apparatus for processing a common data structure may further include a dynamic allocation module 115, configured to compare the number of a sub-data structure to be processed with a full binary tree to obtain a comparison path; the nodes of the full binary tree store the numbers of the sub-data structures located at specified boundaries in the numbering space of the sub-data structures in the common data structure, and the number of boundaries is one less than the number of cores on the multi-core processor.
  • the distribution module 111 may be specifically configured to distribute each sub-data structure in the common data structure to a core numbered as a value of the comparison path by a common part in a common data structure.
  • The underlying processing module 113 may include a resource allocation sub-module 116 and a lock protection sub-module 117, where the resource allocation sub-module 116 is configured to, when a core processing a sub-data structure requests memory resources, allocate memory resources to the corresponding core on the multi-core processor according to the memory management part in the common data structure.
  • The lock protection sub-module 117 is configured to apply lock protection to the memory resources requested by the core processing a sub-data structure.
  • Alternatively, as shown in FIG. 12, the underlying processing module 113 may include a recording sub-module 121 and a management sub-module 122, where the recording sub-module 121 is located on each core and is used to record the amount of memory in use in each memory block of that core and to sort all memory blocks in the core by the amount of memory in use.
  • The management sub-module 122 is located on each core and is used to request memory from the memory blocks in descending order of usage rate; when memory is released, the data in the memory block with the lowest usage rate is moved into the memory block with the highest usage rate that is not yet full.
  • In the above embodiments, the apparatus for processing a common data structure distributes, through the distribution module, each sub-data structure in the common data structure to a corresponding core on the multi-core processor for separate processing, which avoids load imbalance between cores, enables the cores of the multi-core processor to process the common data structure in parallel, fully exploits the advantages of the multi-core processor, and improves its processing efficiency.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a method and device for processing a common data structure. The method includes: each sub-data-structure of a common data structure is distributed through a common part of said common data structure to the corresponding core of a multi-core processor, wherein said common part contains the data range of said each sub-data-structure; and corresponding cores of said multi-core processor individually process the distributed sub-data-structures. Through the distribution of each sub-data-structure of the common data structure to the corresponding core of the multi-core processor for individual processing, an inter-core load imbalance is avoided, and the cores of the multi-core processor are enabled to concurrently process the common data structure, thus taking full advantage of the multi-core processor and improving processing efficiency of the multi-core processor.

Description

Method and Device for Processing a Common Data Structure

Technical Field

The present invention relates to multi-core processor technology, and more particularly to a method and an apparatus for processing a common data structure.

Background Art

A traditional CPU has a single core inside, i.e., it is a single-core CPU. When common data structures are processed on a single-core CPU, although multiple processes or threads may run concurrently, only one thread's instructions execute at any given moment, and protecting common data structures such as global variables and peripherals is relatively simple. To further increase the processing speed of common data structures, multi-core CPUs have emerged.

Multi-core CPUs, i.e., CPUs containing multiple cores, emerged as semiconductor technology developed and CPU integration density and clock frequency kept rising. Multi-core CPUs have evolved from the first dual-core parts to the dozens of cores available today, and some multi-core processors under development even have thousands of cores. A first existing method still processes a complex common data structure on a single core: when threads are divided, any complex common data structure is placed on one core for processing. When the processing of a particular data structure consumes a large amount of CPU time, this method cannot keep the load on the cores balanced, so some cores carry heavy processing tasks while others sit idle, and the advantage of multi-core parallel computing cannot be fully exploited.

A second method processes a complex common data structure with multiple cores, but the parts of the data structure processed by different cores are completely unrelated. For example, performing some function may require one core to first process a ternary tree and another core to then process a binary tree, with no direct association between the two trees. Although this method spreads the common data structure across multiple cores, the division of computing resources between the cores is almost entirely constrained by the composition of the workload, i.e., of the common data structure. It still leads to an uneven load between cores and cannot fully exploit the advantages of a multi-core processor.

A third method spreads operations on a strongly coupled common data structure across different cores but applies mutual-exclusion protection (such as a spin lock) when the data structure is accessed. With this method, when multiple cores access the common data structure at the same time, many cores may end up waiting, which wastes CPU resources.

Summary of the Invention
Embodiments of the present invention provide a method and an apparatus for processing a common data structure, so as to achieve parallel processing of a common data structure on a multi-core processor.

An embodiment of the present invention provides a method for processing a common data structure, including:

distributing each sub-data structure in the common data structure to a corresponding core on a multi-core processor through a common part in the common data structure, where the common part includes the data range of each sub-data structure; and

separately processing, by the corresponding cores on the multi-core processor, the distributed sub-data structures.

An embodiment of the present invention further provides an apparatus for processing a common data structure, including:

a distribution module, configured to distribute each sub-data structure in the common data structure to a corresponding core on a multi-core processor through a common part in the common data structure, where the common part includes the data range of each sub-data structure; and

a processing module, located on the corresponding core of the multi-core processor and configured to separately process the distributed sub-data structures.

The technical solutions provided by the foregoing embodiments distribute each sub-data structure in the common data structure to a corresponding core on the multi-core processor for separate processing, which avoids load imbalance between cores, enables the cores of the multi-core processor to process the common data structure in parallel, fully exploits the advantages of the multi-core processor, improves its processing efficiency, and thereby reduces the situations in which many cores wait and avoids waste of CPU resources.

Brief Description of the Drawings
In order to describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.

FIG. 1 is a flowchart of a method for processing a common data structure according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a common data structure;
FIG. 3 is a schematic diagram of the layered abstraction of the common data structure;
FIG. 4 is a schematic diagram of the division of the numbering space of the sub-data structures;
FIG. 5 is a schematic diagram of the full binary tree used in the method;
FIG. 6 is a schematic diagram of obtaining a core number in the method;
FIG. 7 is a schematic diagram of adjusting the boundaries of the sub-data structures;
FIG. 8 is a schematic diagram of one distribution in the method;
FIG. 9 is a schematic diagram of another distribution in the method;
FIG. 10 is a schematic flowchart of the use of a spin lock according to the present invention;
FIG. 11 is a schematic structural diagram of an apparatus for processing a common data structure according to an embodiment of the present invention;
FIG. 12 is a schematic structural diagram of the underlying processing module 113 in an apparatus for processing a common data structure according to an embodiment of the present invention.

Detailed Description of the Embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings of the embodiments. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
FIG. 1 is a flowchart of a method for processing a common data structure according to an embodiment of the present invention. As shown in FIG. 1, the method includes:

Step 11: Distribute each sub-data structure in the common data structure to a corresponding core on the multi-core processor through the common part in the common data structure, where the common part includes the data range of each sub-data structure.

Step 12: The corresponding cores on the multi-core processor separately process the distributed sub-data structures, for example, performing lookups in a sub-data structure.

The technical solution provided by the foregoing embodiment distributes each sub-data structure in the common data structure to a corresponding core on the multi-core processor for separate processing, which avoids load imbalance between cores, enables the cores of the multi-core processor to process the common data structure in parallel, fully exploits the advantages of the multi-core processor, and improves its processing efficiency.
In step 11 above, the distribution may be performed by one core, by multiple cores, or by processing modules on one or more cores, so that the sub-data structures are distributed to different cores; that is, different cores process different parts of the common data structure.

A common data structure is a complete data structure that multiple cores process in parallel during the same time period. The common data structure may be, as shown in FIG. 2, the same binary tree that multiple cores all have to process in their work.

In order to efficiently distribute the processing of the common data structure to multiple cores, the common data structure is abstracted in a general way. As shown in FIG. 3, the common data structure is divided, from top to bottom, into three layers: the common part, the sub-data structures, and the memory management part.

There are multiple sub-data structures, and the common part can be divided into several units, each unit corresponding to exactly one sub-data structure. Each sub-data structure can be accessed through the information in its corresponding unit. From the common part it is also possible to determine the computing resources that the sub-data structure corresponding to each unit needs to consume, for example how many cores or what processing speed is required, and to allocate computing resources, that is, cores, to each sub-data structure.
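Purely as an editorial illustration, the three-layer abstraction just described might be modelled roughly as follows in C; every type and field name here (common_unit, mem_mgmt_part, and so on) is hypothetical and is not taken from the patent itself.

```c
#include <stddef.h>
#include <stdint.h>

/* One sub-data structure, e.g. a subtree of the shared binary tree
   (hypothetical representation). */
typedef struct sub_data_structure {
    uint32_t id;            /* number of this sub-data structure */
    void    *root;          /* root of the subtree / payload     */
} sub_data_structure;

/* One unit of the common part: it records the data range covered by a
   sub-data structure and how to reach it. */
typedef struct common_unit {
    uint32_t range_lo;      /* lowest key handled by this unit      */
    uint32_t range_hi;      /* highest key handled by this unit     */
    uint32_t est_cost;      /* estimated computing resources needed */
    sub_data_structure *sub;
} common_unit;

/* Memory management part: parameters used later when memory blocks
   are handed out to the cores. */
typedef struct mem_mgmt_part {
    size_t block_size;
    size_t page_size;
    size_t reserved_bytes;
} mem_mgmt_part;

/* The common data structure seen from top to bottom:
   common part -> sub-data structures -> memory management part. */
typedef struct common_data_structure {
    common_unit  *units;
    size_t        unit_count;
    mem_mgmt_part mem;
} common_data_structure;
```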
There are two methods for allocating core resources: a static allocation method and a dynamic allocation method. In the static allocation method, core resources are, for example, allocated evenly according to the number of sub-data structures, or allocated with weights according to the distribution characteristics of the core resources consumed by each sub-data structure.
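A minimal sketch of the static allocation just described is given below; it assumes each sub-data structure carries a cost estimate, and the proportional rule used here is only one plausible reading of the "weighted allocation" mentioned above.

```c
#include <stddef.h>
#include <stdint.h>

/* Statically map each sub-data structure (unit) to a core.  With
   weights == NULL the units are spread evenly over the cores;
   otherwise a unit's place among the cores follows the cumulative
   estimated cost, so costly units get a larger share of the cores. */
static void static_assign(uint32_t *core_of_unit, const uint32_t *weights,
                          size_t unit_count, uint32_t core_count)
{
    uint64_t total = 0, acc = 0;

    for (size_t i = 0; i < unit_count; i++)
        total += weights ? weights[i] : 1;
    if (total == 0)
        total = 1;                      /* avoid dividing by zero */

    for (size_t i = 0; i < unit_count; i++) {
        core_of_unit[i] = (uint32_t)((acc * core_count) / total);
        acc += weights ? weights[i] : 1;
    }
}
```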
The dynamic allocation method can dynamically allocate core resources according to the actual load of each core.
Assume that the number of cores in the multi-core processor is 2^k (this simplifies the design of the common data structure and the computation of index data; if the number of cores is not a power of 2, it can be padded up to a power of 2), where k is a positive integer. As shown in FIG. 4, 2^k - 1 boundaries are designated in the numbering space of the sub-data structures (not including the two endpoints of the numbering space); the 2^k - 1 boundaries are denoted b_i (1 <= i <= 2^k - 1), where b_i is a variable whose value is the number of the corresponding sub-data structure. The i-th core (0 <= i < 2^k) processes the sub-data structures whose numbers fall within the interval [b_i, b_(i+1)). As shown in FIG. 5, the nodes of a full binary tree (which can be organized without pointers, using position indices directly) store the boundaries of the sub-data structures processed by the cores. The number of the sub-data structure to be processed is compared against the full binary tree and the comparison path is recorded, which yields the core number. As shown in FIG. 6, if the sub-data structure number corresponding to a newly added entry is 70, the comparison path is 101 (as indicated by the arrows); that is, the number of the resulting core is 101 in binary, or 5 in decimal, and the sub-data structure numbered 70 is processed on core 5. That is, core 5 is assigned to the sub-data structure numbered 70.
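The boundary comparison of FIG. 5 and FIG. 6 can be sketched as follows, assuming the boundary tree is stored as a position-indexed array; the concrete boundary values below are invented, but chosen so that the example of sub-data structure 70 reproduces the comparison path 101 and core 5.

```c
#include <stdint.h>
#include <stdio.h>

/* boundaries[] holds the 2^k - 1 boundary values arranged as a full
   binary tree in level order: no pointers are needed, the children of
   node i sit at positions 2*i + 1 and 2*i + 2. */
static uint32_t core_for(uint32_t sub_id, const uint32_t *boundaries, int k)
{
    uint32_t node = 0, path = 0;

    for (int level = 0; level < k; level++) {
        uint32_t right = (sub_id >= boundaries[node]); /* go right if id >= boundary */
        path = (path << 1) | right;                    /* record the comparison path */
        node = 2 * node + 1 + right;
    }
    return path;   /* the path, read as a binary number, is the core number */
}

int main(void)
{
    /* k = 3, i.e. 8 cores and 7 boundaries; the values are invented
       for this example but chosen so that sub-data structure 70
       follows the path 1, 0, 1 as in the patent's FIG. 6. */
    const uint32_t boundaries[7] = { 64, 32, 80, 16, 48, 68, 112 };

    printf("core = %u\n", core_for(70, boundaries, 3)); /* prints: core = 5 */
    return 0;
}
```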
Each of the 2^k cores corresponds to a queue of messages to be processed. When the data in the queue corresponding to a core exceeds a watermark T, as shown in FIG. 7, the number of sub-data structures that this core may process is reduced by 2; that is, the upper and lower boundaries each shrink inward by 1: b_i = b_i + 1, b_(i+1) = b_(i+1) - 1.
The method for allocating core resources is not limited to the above; other methods for balancing the load between cores may also be used, which are not repeated here. The sub-data structure is then handed to the assigned core; that is, a message is sent to the assigned core to inform it which sub-data structure it is to process. Step 12 is then performed. The memory management part includes the memory resource parameters of the common data structure and can provide a basis for managing, for example allocating, memory resources.
Step 13: Allocate memory resources to the cores in step 12 according to the memory management part in the common data structure. Specifically, this may include:

when a core processing a sub-data structure requests memory resources, allocating memory resources to the corresponding core on the multi-core processor according to the memory management part in the common data structure;
For example, when a core processing a sub-data structure has no available memory resources, a memory block can be requested according to the memory management part. Memory is allocated to each core according to the memory management part so that each core has space for processing its sub-data structure, and the memory released by each core is managed. When memory is allocated, a portion of memory can be reserved according to how heavily the memory is expected to be occupied: if memory is heavily occupied, more memory is reserved; if it is lightly occupied, less is reserved; and if memory resources are plentiful, as much memory as possible is reserved. In other words, space is traded for time: by reserving more memory space, the time spent on memory management is reduced, so that, compared with the case where all memory is shared and the processing module on each core must use lock protection when requesting or releasing memory, the time consumed by memory management is reduced and degradation of CPU data-processing performance is avoided.
Before the cores of the multi-core processor separately process the distributed sub-data structures, memory resources are requested according to the memory management part. According to the memory management part, the memory is divided into blocks and allocated to the cores that request memory resources, so that each core obtains memory blocks available for its own use. The memory blocks requested by a core can further be divided into pages, i.e., memory paging, which ensures that the memory of different cores is independent. For any core, when it builds a data structure that it manages itself, it requests memory resources from the memory pages in its own memory blocks. When memory pages are requested inside a core to process its sub-data structure, no mutual exclusion, i.e., no lock protection, is needed.
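A rough sketch of the per-core memory handling just described, assuming fixed-size blocks divided into pages; because a core only inspects its own blocks on this path, no lock is taken, and a new block is requested from the shared pool only when every existing block is full. The structures, sizes, and the pool call are illustrative assumptions.

```c
#include <stddef.h>
#include <stdlib.h>

#define PAGES_PER_BLOCK 64
#define PAGE_SIZE       4096

/* One memory block owned by a single core, divided into pages. */
typedef struct mem_block {
    unsigned char    *base;                  /* PAGES_PER_BLOCK * PAGE_SIZE bytes */
    unsigned char     used[PAGES_PER_BLOCK]; /* 1 if the page has been handed out */
    size_t            used_pages;
    struct mem_block *next;
} mem_block;

/* Per-core state: the blocks this core has requested so far. */
typedef struct core_mem {
    mem_block *blocks;
} core_mem;

/* Request a new block from the shared memory management part.  In a
   real system this is the step that would be protected by a lock. */
static mem_block *request_block_from_pool(void)
{
    mem_block *b = calloc(1, sizeof(*b));
    if (b && !(b->base = malloc((size_t)PAGES_PER_BLOCK * PAGE_SIZE))) {
        free(b);
        b = NULL;
    }
    return b;
}

/* Allocate one page for this core.  Only the core's own blocks are
   inspected, so no mutual exclusion is needed on this path.  A new
   block is requested only when every existing block is full. */
static void *core_alloc_page(core_mem *cm)
{
    for (mem_block *b = cm->blocks; b; b = b->next)
        for (size_t i = 0; i < PAGES_PER_BLOCK; i++)
            if (!b->used[i]) {
                b->used[i] = 1;
                b->used_pages++;
                return b->base + i * PAGE_SIZE;
            }

    mem_block *nb = request_block_from_pool();   /* dynamic allocation */
    if (!nb)
        return NULL;
    nb->used[0] = 1;
    nb->used_pages = 1;
    nb->next = cm->blocks;
    cm->blocks = nb;
    return nb->base;
}
```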
When allocating memory, it is also possible, for any core, to request a new memory block, i.e., to allocate memory dynamically, only when none of the memory blocks the core has already requested has free space. Moreover, when a core requests and releases memory, the amount of memory in use in each memory block is recorded, and all memory blocks of that core are sorted by the amount of memory in use. When requesting memory, memory is preferentially requested from the block of the core with the highest usage rate; when releasing memory, an attempt is made to move the data in the memory block with the lowest usage rate into the memory block with the highest usage rate that is not yet full. When it is found that a memory block of the core no longer contains any used memory, the block is released into the free-memory-block resource pool and is again managed according to the memory management part so that other cores can request it. This makes full use of the memory blocks and improves the utilization of memory resources, and the memory resources requested by the core processing a sub-data structure are protected by a lock.
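A separate, self-contained sketch of the usage-ordered policy at block granularity; the counters and the hand-over of empty blocks to the shared pool follow the description above, while the concrete layout is an assumption.

```c
#include <stddef.h>

#define BLOCKS 8    /* blocks this core currently owns (example size) */
#define PAGES  64   /* pages per block                                */

/* pages_in_use[i] counts the pages handed out from block i of this core. */
static size_t pages_in_use[BLOCKS];

static void on_page_allocated(int block) { pages_in_use[block]++; }

/* Allocation policy: prefer the block with the highest usage that is
   not yet full, so lightly used blocks can drain and be given back. */
static int pick_block_for_alloc(void)
{
    int best = -1;
    for (int i = 0; i < BLOCKS; i++)
        if (pages_in_use[i] < PAGES &&
            (best < 0 || pages_in_use[i] > pages_in_use[best]))
            best = i;
    return best;   /* -1 means every block is full: request a new one */
}

/* Hypothetical, lock-protected hand-over of a now-empty block to the
   shared free-block resource pool so that other cores can request it. */
static void return_block_to_pool(int block) { (void)block; }

/* Called whenever one page of `block` is released. */
static void on_page_released(int block)
{
    if (pages_in_use[block] > 0 && --pages_in_use[block] == 0)
        return_block_to_pool(block);
}
```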
At this point, the processing of the entire common data structure, as shown in FIG. 8, can be divided into three parts: distribution, separate processing, and underlying common processing. The underlying common processing allocates memory resources to the cores that process the sub-data structures. When the distribution in step 11 is done hierarchically, as shown in FIG. 9, the distribution processing includes two levels: distribution 1 and distribution 2. Hierarchical distribution means redistributing groups of sub-data structures that cannot yet be handled by individual cores, until every sub-data structure is processed by a single core. Taking FIG. 9 as an example, distribution 1 divides the sub-data structures into two parts: one part contains only one sub-data structure, so after distribution 1 it can already be processed separately; the other part contains multiple sub-data structures, so distribution 1 alone cannot achieve separate processing, and distribution must be performed again, i.e., distribution 2, to achieve separate processing.
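One possible reading of the two-level distribution is a recursive split that stops once a group holds a single sub-data structure; the recursion and the message-style notification below are illustrative assumptions rather than the patent's exact procedure.

```c
#include <stdio.h>

/* Hypothetical stand-in for sending a message that tells core `core`
   to process the sub-data structure numbered `sub_id`. */
static void notify_core(int core, int sub_id)
{
    printf("core %d <- sub-data structure %d\n", core, sub_id);
}

/* Distribute subs[lo..hi) over the cores starting at first_core: each
   call is one distribution level (distribution 1, distribution 2, ...)
   and recursion stops when a group holds a single sub-data structure,
   which can then be processed separately by one core. */
static void distribute(const int *subs, int lo, int hi, int first_core)
{
    if (hi - lo <= 0)
        return;
    if (hi - lo == 1) {               /* one sub-data structure: done */
        notify_core(first_core, subs[lo]);
        return;
    }
    int mid = lo + (hi - lo) / 2;     /* the split made at this level */
    distribute(subs, lo, mid, first_core);
    distribute(subs, mid, hi, first_core + (mid - lo));
}

int main(void)
{
    int subs[] = { 12, 47, 70, 93 };  /* made-up sub-data structure ids */
    distribute(subs, 0, 4, 0);        /* spread over cores 0..3         */
    return 0;
}
```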
The above lock protection may be a spin lock. The flow of using the spin lock is shown in FIG. 10: the memory resources are locked, and when multiple cores access the common data structure, each core must acquire the spin lock before accessing it. If acquisition succeeds, the core operates on the shared resource and then unlocks the memory resource. If the spin lock is held by another core when a core tries to acquire it, the thread on the requesting core does not sleep but keeps checking whether the lock has been released. While a spin lock is spinning, the CPU does no useful work and is in fact wasting CPU time, so spin locks are suitable for scenarios in which the access time to the common data structure is short; more precisely, spin locks are suitable for scenarios in which the access time to the shared resource is shorter than, or comparable to, the task-switching time. Therefore, in this embodiment, only the memory resources requested by the cores in step 13 are protected by the lock.
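The spin-lock flow of FIG. 10 can be sketched with C11 atomics; in line with the text, only the shared memory-resource step is guarded, and the lock and pool-access names are placeholders.

```c
#include <stdatomic.h>

/* A minimal spin lock: the acquiring thread never sleeps, it keeps
   testing the flag until the holder releases it. */
typedef struct { atomic_flag f; } spinlock_t;

static void spin_lock(spinlock_t *l)
{
    while (atomic_flag_test_and_set_explicit(&l->f, memory_order_acquire))
        ;   /* busy-wait: acceptable only for very short critical sections */
}

static void spin_unlock(spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->f, memory_order_release);
}

static spinlock_t pool_lock = { ATOMIC_FLAG_INIT };

/* Example: only the shared free-block pool (the memory resources of
   step 13) is protected; processing a core's own sub-data structure
   takes no lock at all. */
void *grab_block_from_shared_pool(void *(*pop)(void))
{
    spin_lock(&pool_lock);
    void *block = pop();      /* operate on the shared resource */
    spin_unlock(&pool_lock);
    return block;
}
```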
FIG. 11 is a schematic structural diagram of an apparatus for processing a common data structure according to an embodiment of the present invention. As shown in FIG. 11, the apparatus includes a distribution module 111 and a processing module 112. The distribution module 111 is configured to distribute each sub-data structure in the common data structure to a corresponding core on the multi-core processor through the common part in the common data structure, where the common part includes the data range of each sub-data structure. The processing module 112 is located on the corresponding core of the multi-core processor and is configured to separately process the distributed sub-data structures.
The apparatus for processing a common data structure provided by this embodiment of the present invention may further include an underlying processing module 113, configured to allocate, according to the memory management part in the common data structure, memory resources to the corresponding cores on the multi-core processor for processing the sub-data structures. The underlying processing module 113 may specifically be configured to allocate memory resources to the corresponding cores on the multi-core processor according to the memory management part in the common data structure in the case where memory space is reserved.
The apparatus for processing a common data structure provided by this embodiment of the present invention may further include a static allocation module 114, configured to allocate corresponding cores evenly to the sub-data structures in the common data structure according to the number of sub-data structures in the common data structure and the core resources on the multi-core processor, or to allocate corresponding cores with weights according to the distribution characteristics of the resources consumed by each sub-data structure in the common data structure. In this case, the distribution module 111 may specifically be configured to distribute each sub-data structure in the common data structure, through the common part in the common data structure, to the corresponding core allocated by the static allocation module.
The apparatus for processing a common data structure provided by this embodiment of the present invention may further include a dynamic allocation module 115, configured to compare the number of a sub-data structure to be processed with a full binary tree to obtain a comparison path, where the nodes of the full binary tree store the numbers of the sub-data structures located at specified boundaries in the numbering space of the sub-data structures in the common data structure, and the number of boundaries is one less than the number of cores on the multi-core processor. In this case, the distribution module 111 may specifically be configured to distribute each sub-data structure in the common data structure, through the common part in the common data structure, to the core whose number is the value of the comparison path. The underlying processing module 113 may include a resource allocation sub-module 116 and a lock protection sub-module 117, where the resource allocation sub-module 116 is configured to, when a core processing a sub-data structure requests memory resources, allocate memory resources to the corresponding core on the multi-core processor according to the memory management part in the common data structure, and the lock protection sub-module 117 is configured to apply lock protection to the memory resources requested by the core processing a sub-data structure.
Alternatively, as shown in FIG. 12, the underlying processing module 113 may include a recording sub-module 121 and a management sub-module 122. The recording sub-module 121 is located on each core and is used to record the amount of memory in use in each memory block of that core and to sort all memory blocks in the core by the amount of memory in use. The management sub-module 122 is located on each core and is used to request memory from the memory blocks in descending order of usage rate and, when memory is released, to move the data in the memory block with the lowest usage rate into the memory block with the highest usage rate that is not yet full.
In the above embodiments, the apparatus for processing a common data structure distributes, through the distribution module, each sub-data structure in the common data structure to a corresponding core on the multi-core processor for separate processing, which avoids load imbalance between cores, enables the cores of the multi-core processor to process the common data structure in parallel, fully exploits the advantages of the multi-core processor, and improves its processing efficiency.
A person of ordinary skill in the art will understand that all or part of the steps of the above method embodiments can be implemented by hardware controlled by program instructions. The foregoing program may be stored in a computer-readable storage medium; when executed, the program performs the steps of the above method embodiments. The foregoing storage medium includes various media that can store program code, such as a ROM, a RAM, a magnetic disk, or an optical disc. Finally, it should be noted that the above embodiments are only intended to illustrate, not to limit, the technical solutions of the present invention. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, and that such modifications or replacements do not make the essence of the corresponding technical solutions depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1、 一种处理公共数据结构的方法, 其特征在于, 包括: A method of processing a common data structure, comprising:
通过公共数据结构中的公共部分将所述公共数据结构中的各个子数据结 构分发给多核处理器上相应的核; 所述公共部分包含有所述各个子数据结构 的数据范围;  Distributing each sub-data structure in the common data structure to a corresponding core on the multi-core processor through a common portion in the common data structure; the common portion includes a data range of the respective sub-data structures;
所述多核处理器上相应的核对分发的子数据结构进行单独处理。  The sub-data structures of the corresponding collated distributions on the multi-core processor are processed separately.
2、 根据权利要求 1所述的处理公共数据结构的方法, 其特征在于, 所述 根据所述公共数据结构中的内存管理部分为所述多核处理器上相应的核 分配内存资源。  2. The method of processing a common data structure according to claim 1, wherein said allocating a memory resource to a corresponding core on said multi-core processor according to a memory management portion in said common data structure.
3、 根据权利要求 1或 2所述的处理公共数据结构的方法, 其特征在于, 还包括:  The method for processing a common data structure according to claim 1 or 2, further comprising:
或者根据所述公共数据结构中各个子数据结构消耗资源的分布特征, 为 所述公共数据结构中的各个子数据结构加权分配相应的核; Or weighting the corresponding cores for each sub-data structure in the common data structure according to the distribution characteristics of the consumption resources of each sub-data structure in the common data structure;
所述通过公共数据结构中的公共部分将所述公共数据结构中的各个子数 据结构分发到多核处理器上相应的核包括:  Distributing each sub-data structure in the common data structure to a corresponding core on the multi-core processor through a common portion in the common data structure includes:
通过公共数据结构中的公共部分将所述公共数据结构中的各个子数据结 构分发到分配的相应的核。  Each sub-data structure in the common data structure is distributed to the assigned corresponding core by a common portion in the common data structure.
4、 根据权利要求 1所述的处理公共数据结构的方法, 其特征在于, 还包 括:  4. The method of processing a common data structure according to claim 1, further comprising:
将待处理的子数据结构的编号与满二叉树进行比较, 得到比较路径; 所 于指定边界的子数据结构的编号, 所述边界的个数比所述多核处理器上的核 的个数少 1 ; Comparing the number of the sub-data structure to be processed with the full binary tree to obtain a comparison path; the number of the sub-data structure at the specified boundary, the number of the boundary being larger than the core on the multi-core processor The number is less than 1;
所述通过公共数据结构中的公共部分将所述公共数据结构中的各个子数 据结构分发到多核处理器上相应的核包括:  Distributing each sub-data structure in the common data structure to a corresponding core on the multi-core processor through a common portion in the common data structure includes:
通过公共数据结构中的公共部分将所子数据结构编号对应的子数据结构 分发到编号为所述比较路径的值的核。  The sub-data structures corresponding to the sub-data structure numbers are distributed to the core numbered as the value of the comparison path by a common part in the common data structure.
5、 根据权利要求 2所述的处理公共数据结构的方法, 其特征在于, 根据 所述公共数据结构中的内存管理部分为所述多核处理器上相应的核分配内存 资源包括:  The method for processing a common data structure according to claim 2, wherein the allocating memory resources for the corresponding cores on the multi-core processor according to the memory management portion of the common data structure comprises:
在处理子数据结构的核申请内存资源的情况下, 根据所述公共数据结构 中的内存管理部分为所述多核处理器上相应的核分配内存资源;  In the case of processing a core application memory resource of the sub-data structure, allocating a memory resource for a corresponding core on the multi-core processor according to a memory management portion in the common data structure;
对处理子数据结构的核申请的内存资源进行锁保护。  Locks the memory resources of the core application that handles the sub-data structure.
6、 根据权利要求 2所述的处理公共数据结构的方法, 其特征在于, 根据 所述公共数据结构中的内存管理部分为所述多核处理器上相应的核分配内存 资源包括:  The method for processing a common data structure according to claim 2, wherein the allocating memory resources for the corresponding cores on the multi-core processor according to the memory management portion of the common data structure comprises:
在预留内存空间的情况下, 根据所述公共数据结构中的内存管理部分为 所述多核处理器上相应的核分配内存资源。  In the case of reserving the memory space, memory resources are allocated to the corresponding cores on the multi-core processor according to the memory management portion in the common data structure.
7、 根据权利要求 2所述的处理公共数据结构的方法, 其特征在于, 根据 所述公共数据结构中的内存管理部分为所述多核处理器上相应的核分配内存 资源包括:  The method for processing a common data structure according to claim 2, wherein the allocating memory resources for the corresponding cores on the multi-core processor according to the memory management portion of the common data structure comprises:
各核记录各自内存分块中已用内存的数量, 并将各自的所有内存分块按 已用内存数量排序;  Each core records the amount of used memory in its respective memory partition, and sorts all of its memory partitions by the amount of used memory;
各核按照内存使用率的高低顺序从内存分块中申请内存, 在译放内存的 情况下, 将使用率最低的内存分块中的数据搬移至内存使用率最高且没有占 满的内存分块。  Each core applies for memory from the memory partition according to the order of memory usage. In the case of deciphering memory, the data in the memory partition with the lowest usage rate is moved to the memory partition with the highest memory usage and no full memory. .
8、 一种处理公共数据结构的装置, 其特征在于, 包括:  8. An apparatus for processing a common data structure, comprising:
分发模块, 用于通过公共数据结构中的公共部分将所述公共数据结构中 的各个子数据结构分发给多核处理器上相应的核; 所述公共部分包含有所述 各个子数据结构的数据范围; a distribution module, configured to pass the common data structure through a common part in a common data structure Each of the sub-data structures is distributed to a corresponding core on the multi-core processor; the common portion includes a data range of the respective sub-data structures;
处理模块, 位于所述多核处理器上相应的核, 用于对分发的子数据结构 进行单独处理。  A processing module, located on a corresponding core of the multi-core processor, for separately processing the distributed sub-data structure.
9、 根据权利要求 8所述的处理公共数据结构的装置, 其特征在于, 还包 括:  9. The apparatus for processing a common data structure according to claim 8, further comprising:
底层处理模块, 用于根据所述公共数据结构中的内存管理部分为所述多 核处理器上相应的核分配内存资源, 以用于处理子数据结构。  And an underlying processing module, configured to allocate, according to a memory management part in the common data structure, a memory resource for a corresponding core on the multi-core processor, to process the sub-data structure.
10. The apparatus for processing a common data structure according to claim 8, further comprising:
a static allocation module, configured to evenly allocate a corresponding core to each sub-data structure in the common data structure according to the number of sub-data structures in the common data structure and the core resources on the multi-core processor, or to allocate a corresponding core to each sub-data structure in the common data structure in a weighted manner according to distribution characteristics of the sub-data structures;
wherein the distribution module is specifically configured to distribute each sub-data structure in the common data structure, through the common portion in the common data structure, to the corresponding core allocated by the static allocation module.
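To make the two static policies of claim 10 concrete, a hedged sketch: even allocation derived only from the sub-data structure count and the core count, and weighted allocation derived from per-structure distribution characteristics, here modeled as a single weight per sub-data structure. The weight values and function names are invented for the example.

#include <stdio.h>

/* Even allocation: sub-data structure i simply goes to core i mod ncores. */
static int core_for_even(unsigned sub_id, int ncores)
{
    return (int)(sub_id % (unsigned)ncores);
}

/* Weighted allocation: sub-data structures are assigned so that every core
 * receives roughly the same total weight; weights[] holds one weight
 * (for example, expected load) per sub-data structure. */
static void core_for_weighted(const double *weights, int nsubs,
                              int ncores, int *out_core)
{
    double total = 0.0, acc = 0.0;

    for (int i = 0; i < nsubs; i++)
        total += weights[i];
    for (int i = 0; i < nsubs; i++) {
        int core = total > 0.0 ? (int)(acc / total * ncores) : 0;
        if (core >= ncores)
            core = ncores - 1;
        out_core[i] = core;
        acc += weights[i];
    }
}

int main(void)
{
    const double weights[6] = { 1.0, 1.0, 4.0, 1.0, 1.0, 4.0 };
    int cores[6];

    core_for_weighted(weights, 6, 2, cores);
    for (int i = 0; i < 6; i++)
        printf("sub %d: even -> core %d, weighted -> core %d\n",
               i, core_for_even((unsigned)i, 2), cores[i]);
    return 0;
}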
11. The apparatus for processing a common data structure according to claim 8, further comprising:
a dynamic allocation module, configured to compare the number of a sub-data structure to be processed with a full binary tree to obtain a comparison path, the nodes of the full binary tree storing the numbers of the sub-data structures located at specified boundaries in the numbering space of the sub-data structures in the common data structure, the number of boundaries being one less than the number of cores on the multi-core processor;
wherein the distribution module is specifically configured to distribute each sub-data structure in the common data structure, through the common portion in the common data structure, to the core whose number is the value of the comparison path.
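A speculative sketch of the full-binary-tree dispatch of claim 11, assuming the core count is a power of two and the ncores - 1 boundary numbers are stored in breadth-first (heap) order of a balanced search tree; the comparison path, read as a binary number, is then used directly as the core number. None of these layout choices are specified by the claims themselves.

#include <stdio.h>

/* Walk an implicit full binary tree stored as an array in breadth-first
 * (heap) order: node i has children 2*i+1 and 2*i+2.  Each node holds one
 * boundary number; going left appends a 0 to the comparison path, going
 * right appends a 1.  The finished path, read as a binary number, is the
 * core the sub-data structure is sent to. */
static int core_for_sub(const unsigned *boundaries, int ncores, unsigned sub_id)
{
    int node = 0, path = 0, levels = 0;

    for (int n = ncores; n > 1; n >>= 1)   /* log2(ncores) comparisons in total */
        levels++;

    for (int l = 0; l < levels; l++) {
        int right = sub_id > boundaries[node];
        path = (path << 1) | right;
        node = 2 * node + 1 + right;
    }
    return path;
}

int main(void)
{
    /* 16 sub-data structures numbered 0..15, 4 cores, so 3 boundaries.
     * Sorted boundaries {3, 7, 11} stored in heap order of a balanced
     * search tree: root 7, children 3 and 11. */
    const unsigned boundaries[] = { 7, 3, 11 };

    for (unsigned id = 0; id < 16; id++)
        printf("sub-data structure %2u -> core %d\n",
               id, core_for_sub(boundaries, 4, id));
    return 0;
}

With these boundaries the program maps numbers 0-3 to core 0, 4-7 to core 1, 8-11 to core 2 and 12-15 to core 3, using two comparisons per dispatch.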
12. The apparatus for processing a common data structure according to claim 9, wherein the underlying processing module comprises:
a resource allocation submodule, configured to, when a core processing a sub-data structure requests memory resources, allocate memory resources to the corresponding core on the multi-core processor according to the memory management portion in the common data structure; and
a lock protection submodule, configured to apply lock protection to the memory resources requested by the core processing the sub-data structure.
13. The apparatus for processing a common data structure according to claim 9, wherein the underlying processing module is specifically configured to, when memory space is reserved, allocate memory resources to the corresponding core on the multi-core processor according to the memory management portion in the common data structure.
14. The apparatus for processing a common data structure according to claim 9, wherein the underlying processing module comprises:
a recording submodule, located on each core, configured to record the amount of used memory in each memory block of the core and to sort all memory blocks within the core by the amount of used memory; and
a management submodule, located on each core, configured to request memory from the memory blocks in descending order of memory usage and, when memory is released, to move the data in the memory block with the lowest usage to the memory block with the highest usage that is not yet full.

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201080003755.7A CN102362256B (en) 2010-04-13 2010-04-13 Method and device for processing common data structure
PCT/CN2010/071736 WO2011127649A1 (en) 2010-04-13 2010-04-13 Method and device for processing common data structure

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2010/071736 WO2011127649A1 (en) 2010-04-13 2010-04-13 Method and device for processing common data structure

Publications (1)

Publication Number Publication Date
WO2011127649A1 true WO2011127649A1 (en) 2011-10-20

Family

ID=44798257

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2010/071736 WO2011127649A1 (en) 2010-04-13 2010-04-13 Method and device for processing common data structure

Country Status (2)

Country Link
CN (1) CN102362256B (en)
WO (1) WO2011127649A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105893319A (en) * 2014-12-12 2016-08-24 上海芯豪微电子有限公司 Multi-lane/multi-core system and method


Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6832378B1 (en) * 2000-06-20 2004-12-14 International Business Machines Corporation Parallel software processing system

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7171666B2 (en) * 2000-08-04 2007-01-30 International Business Machines Corporation Processor module for a multiprocessor system and task allocation method thereof
JP2003316408A (en) * 2002-04-04 2003-11-07 Hwacheon Machine Tool Co Ltd Controller for machine tool
CN1567187A (en) * 2003-06-11 2005-01-19 华为技术有限公司 Data processing system and method
US20090164399A1 (en) * 2007-12-19 2009-06-25 International Business Machines Corporation Method for Autonomic Workload Distribution on a Multicore Processor

Also Published As

Publication number Publication date
CN102362256B (en) 2014-07-30
CN102362256A (en) 2012-02-22

Similar Documents

Publication Publication Date Title
US8141091B2 (en) Resource allocation in a NUMA architecture based on application specified resource and strength preferences for processor and memory resources
Singh et al. Task scheduling in cloud computing
US8205208B2 (en) Scheduling grid jobs using dynamic grid scheduling policy
Maurya et al. Energy conscious dynamic provisioning of virtual machines using adaptive migration thresholds in cloud data center
CN108108245B (en) Hybrid scheduling method and system for cloud platform wide-node scientific workflow
US8527988B1 (en) Proximity mapping of virtual-machine threads to processors
KR20130100689A (en) Scalable, customizable, and load-balancing physical memory management scheme
Li et al. An energy-aware scheduling algorithm for big data applications in Spark
Ma et al. vLocality: Revisiting data locality for MapReduce in virtualized clouds
Mishra et al. Improving energy consumption in cloud
CN108132834A (en) Method for allocating tasks and system under multi-level sharing cache memory framework
Maqsood et al. Leveraging on deep memory hierarchies to minimize energy consumption and data access latency on single-chip cloud computers
Sontakke et al. Optimization of hadoop mapreduce model in cloud computing environment
Koneru et al. Resource allocation method using scheduling methods for parallel data processing in cloud
WO2011127649A1 (en) Method and device for processing common data structure
Luckow et al. Abstractions for loosely-coupled and ensemble-based simulations on Azure
Huang et al. A general novel parallel framework for SPH-centric algorithms
Hu et al. Optimizing locality-aware memory management of key-value caches
Fan et al. A scheduler for serverless framework base on kubernetes
US7925841B2 (en) Managing shared memory usage within a memory resource group infrastructure
Lin et al. Allocation and scheduling of real-time tasks with volatile/non-volatile hybrid memory systems
Giridas et al. Compatibility of hybrid process scheduler in green it cloud computing environment
Kim et al. Exploration of a PIM design configuration for energy-efficient task offloading
CN109815249A (en) The fast parallel extracting method of the large data files mapped based on memory
Shrimali et al. Performance based energy efficient techniques for VM allocation in cloud environment

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase (Ref document number: 201080003755.7; Country of ref document: CN)
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 10849659; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: pct application non-entry in european phase (Ref document number: 10849659; Country of ref document: EP; Kind code of ref document: A1)