KR101892920B1 - Flow based parallel processing method and apparatus thereof - Google Patents
- Publication number
- KR101892920B1 (application KR1020150159702A)
- Authority
- KR
- South Korea
- Prior art keywords
- queue
- flow
- data
- parallel processing
- new
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
- G06F15/17306—Intercommunication techniques
- G06F15/17318—Parallel communications techniques, e.g. gather, scatter, reduce, broadcast, multicast, all to all
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F13/00—Interconnection of, or transfer of information or other signals between, memories, input/output devices or central processing units
- G06F13/14—Handling requests for interconnection or transfer
- G06F13/16—Handling requests for interconnection or transfer for access to memory bus
- G06F13/1605—Handling requests for interconnection or transfer for access to memory bus based on arbitration
- G06F13/1642—Handling requests for interconnection or transfer for access to memory bus based on arbitration with request queuing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F15/00—Digital computers in general; Data processing equipment in general
- G06F15/16—Combinations of two or more digital computers each having at least an arithmetic unit, a program unit and a register, e.g. for a simultaneous processing of several programs
- G06F15/163—Interprocessor communication
- G06F15/173—Interprocessor communication using an interconnection network, e.g. matrix, shuffle, pyramid, star, snowflake
- G06F15/17306—Intercommunication techniques
- G06F15/17331—Distributed shared memory [DSM], e.g. remote direct memory access [RDMA]
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Computer Hardware Design (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Multi Processors (AREA)
Abstract
Embodiments of the present invention relate to data parallel processing. A flow-based parallel processing apparatus according to an embodiment of the present invention includes: a queue memory for storing one or more queues; a data memory for storing data; a mapper for storing a pointer of the data in a queue mapped with the flow, based on flow information of the data; a plurality of processors for performing a process according to input data; and a distributor for reading the data from the data memory with reference to a pointer stored in the queue and for transferring data corresponding to a single queue, out of the read data, to a single one of the plurality of processors. According to embodiments of the present invention, it is possible to perform parallel processing of ordered data in a multiprocessor or multicore environment.
Description
Embodiments of the present invention relate to data parallel processing.
A data processing system in a multicore environment is a processing technology for speeding up network traffic performance. Such a data processing system must preserve the processing order of the data even when two or more cores process the ordered data concurrently.
Embodiments of the present invention provide a method for allowing sequential data to be processed in parallel in a data processing system in a multi-processor or multi-core environment.
Embodiments of the present invention provide a way to avoid the problem of data re-ordering during parallel processing of ordered data.
Embodiments of the present invention provide a way to enable scaling of a processor or core depending on the context of network traffic.
A flow-based parallel processing apparatus according to an embodiment of the present invention includes: a queue memory for storing one or more queues; a data memory for storing data; a mapper for storing a pointer of the data in a queue mapped with the flow, based on flow information of the data; a plurality of processors for performing a process according to input data; and a distributor for reading the data from the data memory with reference to a pointer stored in the queue and for transferring data corresponding to a single queue, out of the read data, to a single one of the plurality of processors.
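For illustration only, the cooperation of the queue memory, data memory, mapper, and distributor described above can be sketched in a few lines of Python. All class and method names here are illustrative assumptions, not part of the disclosed apparatus or the claims:

```python
from collections import deque

class FlowParallelApparatus:
    """Illustrative sketch (not the claimed implementation): a queue
    memory holding one or more queues, a data memory, and a mapper that
    stores *pointers* (data-memory indices) into the queue mapped with
    each flow, so that one queue is always consumed by one processor."""

    def __init__(self, num_queues=4):
        self.queues = [deque() for _ in range(num_queues)]  # queue memory
        self.data_memory = []                               # data memory
        self.flow_table = {}                                # flow -> queue index

    def enqueue(self, flow_id, payload):
        # Mapper: store the data, then push its pointer into the queue
        # mapped with the flow (the same flow always maps to the same queue).
        ptr = len(self.data_memory)
        self.data_memory.append(payload)
        q = self.flow_table.setdefault(flow_id, hash(flow_id) % len(self.queues))
        self.queues[q].append(ptr)
        return q

    def distribute(self, q):
        # Distributor: read data via the pointers stored in one queue and
        # hand all of it to a single processor, preserving per-flow order.
        return [self.data_memory[p] for p in self.queues[q]]
```

Because all pointers of one flow land in one queue, and one queue feeds one processor, per-flow ordering is preserved without any re-ordering stage.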
In one embodiment, the mapper may map the new flow to a new queue or an existing queue if the data corresponds to a new flow.
In one embodiment, the mapper may map the new flow to a new queue if there is no existing queue whose flow counter information is less than the threshold.
In one embodiment, the apparatus comprises: a distributor manager for allocating a distributor for the new queue; and a processor manager for assigning a single processor to the new queue.
In one embodiment, the mapper may map the new flow to an existing queue if there is an existing queue whose flow counter information is less than the threshold value.
In one embodiment, if there are a plurality of existing queues whose flow counter information is less than the threshold value, the mapper may map the new flow to the queue storing the smallest number of pointers among the existing queues.
In one embodiment, if there are a plurality of existing queues whose flow counter information is less than the threshold value, the mapper may map the new flow to the queue storing the largest number of pointers among the existing queues.
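The queue-selection rule for a new flow described in the embodiments above may be sketched as follows. The table layout, the `prefer_fewest` switch, and the `None` convention for "allocate a new queue" are illustrative assumptions:

```python
def select_queue(queue_table, threshold, prefer_fewest=True):
    """Hedged sketch of the new-flow mapping rule: queue_table maps a
    queue id to (num_pointers, flow_counter).  Returns the id of an
    eligible existing queue, or None to signal that a new queue (with
    its own distributor and processor) must be allocated."""
    # Only queues whose flow counter is below the threshold are eligible.
    candidates = [q for q, (_, flows) in queue_table.items() if flows < threshold]
    if not candidates:
        return None  # no eligible existing queue -> create a new queue
    # Tie-break among eligible queues by the number of stored pointers,
    # choosing either the least-loaded or the most-loaded queue.
    key = lambda q: queue_table[q][0]
    return min(candidates, key=key) if prefer_fewest else max(candidates, key=key)
```

Choosing the least-loaded queue spreads work evenly, while choosing the most-loaded one packs flows together so that idle queues (and their processors) can be deactivated sooner.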
A method for performing data parallel processing in a flow-based parallel processing apparatus according to an embodiment of the present invention includes: storing received data in a data memory; storing a pointer of the data in a queue mapped with the flow, based on the flow information of the data; reading the data from the data memory with reference to a pointer stored in the queue; and transmitting data corresponding to a single queue, out of the read data, to a single processor.
In one embodiment, the method may further comprise, if the data corresponds to a new flow, mapping the new flow to a new queue or an existing queue.
In one embodiment, the method may include mapping the new flow to a new queue if there is no existing queue with flow counter information below the threshold.
In one embodiment, the method further comprises the steps of: allocating a distributor for the new queue; and allocating a single processor for the new queue.
In one embodiment, the method may include mapping the new flow to an existing queue if an existing queue with flow counter information below the threshold is present.
In one embodiment, the method may include mapping the new flow to the queue storing the smallest number of pointers among the existing queues if there are a plurality of existing queues whose flow counter information is less than the threshold.
In one embodiment, the method may include mapping the new flow to the queue storing the largest number of pointers among the existing queues if there are a plurality of existing queues whose flow counter information is less than the threshold.
According to embodiments of the present invention, it is possible to perform parallel processing of data having an order in a multiprocessor or multicore environment.
According to embodiments of the present invention, in performing parallel processing of ordered data, it is possible to perform scaling according to network traffic conditions without causing a re-ordering problem.
FIG. 1 is a block diagram illustrating a flow-based parallel processing apparatus according to an embodiment of the present invention.
FIG. 2 is an exemplary view for explaining a flow table according to an embodiment of the present invention.
FIG. 3 is an exemplary diagram for explaining a queue table according to an embodiment of the present invention.
FIG. 4 is a flowchart illustrating a flow-based parallel processing method according to an embodiment of the present invention.
FIG. 5 is an exemplary diagram showing a path through which input data is transferred to a processor.
In the following description of the embodiments of the present invention, detailed descriptions of known functions and configurations incorporated herein will be omitted where they may obscure the subject matter of the present invention.
Hereinafter, embodiments of the present invention will be described with reference to the accompanying drawings.
FIG. 1 is a block diagram illustrating a flow-based parallel processing apparatus according to an embodiment of the present invention.
Referring to FIG. 1, a flow-based parallel processing apparatus according to an embodiment of the present invention includes a
The
The
FIG. 2 is an exemplary diagram for explaining a flow table according to an embodiment of the present invention.
Referring to FIG. 2, the flow table according to an exemplary embodiment of the present invention may include at least one of flow information, queue information, and flow expiration information.
The flow information is information indicating the flow to which the input data belongs, and can be generated by applying a specific operation to the input data. For example, the flow information may be a value obtained by bit-masking an identification reference value included in the input data, or a value generated by applying a hash function to the identification reference value. Here, the identification reference value may be, for example, a source Internet protocol (SIP) address value, a destination Internet protocol (DIP) address value, a source port (SPORT) value, a PROTOCOL field value, or a value included in the payload information of the input data. According to an embodiment, the flow information may be the address at which the queue information of the queue storing the pointer to the input data is stored.
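As an illustration of the paragraph above, flow information can be derived by hashing the identification reference values into a fixed-size identifier. The choice of SHA-256, the field separator, and the bucket count are assumptions for the sketch; the patent leaves the concrete operation (bit mask vs. hash function) open:

```python
import hashlib

def flow_info(sip, dip, sport, protocol, num_buckets=256):
    """Illustrative flow-information function: hash the identification
    reference values named in the text (SIP, DIP, SPORT, PROTOCOL)
    into one of num_buckets flow identifiers.  Packets of the same
    flow always yield the same value, which is what lets the mapper
    steer them into the same queue."""
    key = f"{sip}|{dip}|{sport}|{protocol}".encode()
    # Use the first 4 bytes of the digest as an integer, then bucket it.
    return int.from_bytes(hashlib.sha256(key).digest()[:4], "big") % num_buckets
```

Any deterministic function of the identification reference values works here; a cryptographic hash additionally makes the queue assignment hard for an outside sender to bias.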
The queue information may be information that can identify one queue from another queue.
The flow expiration information is information serving as a criterion for determining the point in time when the flow is inactivated, and may be composed of one or more bits. The flow expiration information may be incremented by 1 each time input data belonging to the flow is received, and may be decremented by 1 every set period.
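The increment-on-arrival, decrement-per-period behavior of the flow expiration information can be sketched directly; the class name and the zero-means-inactive convention are illustrative assumptions consistent with the description:

```python
class FlowExpiry:
    """Sketch of the flow-expiration rule as described: the counter is
    incremented by 1 each time input data belonging to the flow is
    received, and decremented by 1 every set period.  A flow whose
    counter has drained back to zero can be treated as inactive."""

    def __init__(self):
        self.counter = 0

    def on_data(self):
        self.counter += 1          # input data of this flow received

    def on_period_tick(self):
        if self.counter > 0:       # periodic decay, floored at zero
            self.counter -= 1

    def inactive(self):
        return self.counter == 0   # candidate for flow deactivation
```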
Referring again to FIG. 1, the
<1. When the currently received input data corresponds to a new flow>
If the flow information for the currently received input data is not stored in the flow table, the
In this case, the
The
FIG. 3 is an exemplary diagram illustrating a queue table according to an embodiment of the present invention.
The queue table can map and store the information of the currently active queues, the number of pointers stored in each queue, and the flow counter information.
The flow counter information is information indicating the number of flows activated in each queue, and may be a value composed of 1 bit or more. The flow counter information may be a criterion for determining a point in time when the queue is inactivated, and the initial value may be zero.
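The queue table just described can be sketched as follows; the method names and the dictionary layout are illustrative assumptions, while the rule that a queue whose flow counter returns to its initial value 0 is deactivated follows the text:

```python
class QueueTable:
    """Sketch of the queue table: for each active queue it tracks the
    number of stored pointers and the flow counter (number of active
    flows).  The flow counter starts at 0, and a queue whose counter
    drains back to 0 is a candidate for deactivation."""

    def __init__(self):
        self.entries = {}  # queue id -> {"pointers": int, "flows": int}

    def activate(self, qid):
        self.entries[qid] = {"pointers": 0, "flows": 0}

    def store_pointer(self, qid):
        self.entries[qid]["pointers"] += 1

    def map_flow(self, qid):
        self.entries[qid]["flows"] += 1

    def unmap_flow(self, qid):
        # Returns True when the queue has been deactivated.
        self.entries[qid]["flows"] -= 1
        if self.entries[qid]["flows"] == 0:
            del self.entries[qid]
            return True
        return False
```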
Referring back to FIG. 1, the
<1-1. Storing the pointer of a new flow in a new queue>
When there is no queue whose flow counter information is less than the threshold value among the queues (existing queues) currently active in the
The
When the queue creation completion signal is received from the
After storing the pointer in the new queue, the
The
<1-2. Storing the pointer of a new flow in an existing queue>
If there is a queue whose flow counter information is less than the threshold among the activated queues in the
At this time, the
The
After storing the pointer in the existing queue, the
When the
<2. When the currently received input data does not correspond to a new flow>
If the flow information for the currently received input data and the queue information corresponding to that flow information are already mapped in the flow table, the mapper 100 can determine that the currently received input data does not correspond to a new flow.
In this case, the
The
After storing the pointer in the existing queue, the
The
<3. Distributor and Processor Assignment>
When the
The
<4. Input data distribution>
The
The
The
The
<5. Distributor and Processor Deactivation>
The
The
When the
FIG. 4 is a flowchart illustrating a flow-based parallel processing method according to an embodiment of the present invention. Each of the steps shown in FIG. 4 is performed by a component included in the flow-based parallel processing apparatus, but for brevity and clarity of explanation the subject of each step is referred to simply as the flow-based parallel processing apparatus. Depending on the embodiment, at least one of the steps shown in FIG. 4 may be omitted.
In
In
In
In
In
In
In
In
In
In
At
On the other hand, if the currently received input data corresponds to an existing flow, the flow-based parallel processing device may proceed to step 471 and store the pointer in an existing queue mapped to the existing flow. Then, the flow-based parallel processing apparatus can proceed to step 413 to update the flow table and the queue table.
On the other hand, if the currently received input data corresponds to a new flow and there is a queue whose flow counter information is less than the threshold value, the flow-based parallel processing apparatus proceeds to step 451 and maps the new flow to an existing queue. For example, the flow-based parallel processing apparatus can determine the queue having the largest flow counter information, among the queues whose flow counter information is less than the threshold, as the queue in which the pointer is to be stored. Alternatively, the flow-based parallel processing apparatus may determine the queue having the smallest flow counter information, among the queues whose flow counter information is less than the threshold, as the queue in which the pointer is to be stored.
The flow-based parallel processing apparatus may then proceed to step 471 and store the pointer in the existing queue mapped to the new flow, and then proceed to step 413 to update the flow table and the queue table.
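Since several of the step descriptions above are truncated in this text, the overall decision path of FIG. 4 is reconstructed here only as a hedged sketch: an existing flow reuses its mapped queue, and a new flow goes to an eligible existing queue (flow counter below the threshold) or to a freshly allocated queue. The function name, table layouts, and return convention are assumptions:

```python
def handle_input(flow, flow_table, queue_table, threshold):
    """Hedged reconstruction of the FIG. 4 decision path.
    flow_table: flow id -> queue id; queue_table: queue id -> active
    flow count.  Returns (queue_id, created_new_queue)."""
    if flow in flow_table:                      # known flow: reuse its queue
        return flow_table[flow], False
    # New flow: look for an existing queue whose flow counter is under
    # the threshold; here we pick the one with the smallest counter.
    eligible = [q for q, flows in queue_table.items() if flows < threshold]
    if eligible:
        qid = min(eligible, key=lambda q: queue_table[q])
        created = False
    else:
        # No eligible queue: allocate a new queue (in the apparatus this
        # is where a distributor and a processor would also be assigned).
        qid = max(queue_table, default=-1) + 1
        queue_table[qid] = 0
        created = True
    flow_table[flow] = qid
    queue_table[qid] += 1                       # update the flow counter
    return qid, created
```

With a threshold of 2, the first two flows share queue 0 and the third forces a new queue, mirroring the branch between steps 451 and the new-queue path in the flowchart.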
FIG. 5 is an exemplary diagram showing a path through which input data is transferred to a processor.
In the example referring to FIG. 5, assume that the
In this case, the input data corresponding to the
The embodiments of the invention described above may be implemented in any of a variety of ways. For example, embodiments of the present invention may be implemented using hardware, software, or a combination thereof. When implemented in software, it may be implemented as software running on one or more processors using various operating systems or platforms. Additionally, such software may be written using any of a number of suitable programming languages, and may also be compiled into machine code or intermediate code executable in a framework or virtual machine.
Also, when embodiments of the present invention are implemented on one or more processors, one or more programs for carrying out the methods of the various embodiments of the invention discussed above may be stored on a processor-readable medium (e.g., a memory, a floppy disk, a hard disk, a compact disk, an optical disk, a magnetic tape, or the like).
Claims (14)
A data memory for storing data;
A mapper for storing a pointer to the data in a queue of one or more of the one or more queues mapped to a flow of the data based on information on the flow of the data;
A plurality of processors for performing a process according to input data; And
A distributor for reading the data from the data memory with reference to a pointer stored in the one queue and transmitting the read data to one of the plurality of processors,
Flow-based parallel processing unit.
And if the flow of data corresponds to a new flow, mapping the flow of data to a new queue or a queue of one or more of the queues
Flow-based parallel processing unit.
Maps the flow of the data to the new queue if there is no queue, among the one or more queues, whose number of active flows is less than a threshold value, and maps the flow of the data to a queue whose number of active flows is less than the threshold value if such a queue exists,
Flow-based parallel processing unit.
A distributor manager for assigning one of the one or more distributors to the new queue if there is more than one distributor; And
A processor manager for assigning said one processor to said new queue;
Flow-based parallel processing unit.
If there are a plurality of queues whose number of active flows among the one or more queues is less than a threshold value, a flow of the data and a queue storing the smallest number of pointers among the plurality of queues or a queue storing the largest number of pointers are mapped doing
Flow-based parallel processing unit.
Storing received data in a data memory;
Storing a pointer to the data in a queue mapped with a flow of the data based on the information on the flow of the data;
Reading the data from the data memory with reference to a pointer stored in the one queue; And
Transmitting the read data to one processor
Flow-based parallel processing method.
If the flow of data corresponds to a new flow, mapping the flow of data to a new queue or a queue of one of the one or more queues
Flow-based parallel processing method.
Mapping the flow of the data to the new queue if there is no queue whose number of active flows is less than a threshold, and mapping the flow of the data to a queue whose number of active flows is less than the threshold if such a queue exists among the one or more queues,
Flow-based parallel processing method.
Assigning, if there is more than one distributor, a distributor of one of the one or more distributors to the new queue; And
Assigning the one processor to the new queue
Flow-based parallel processing method.
If there are a plurality of queues whose number of active flows among the one or more queues is less than a threshold value, a flow of the data and a queue storing the smallest number of pointers among the plurality of queues or a queue storing the largest number of pointers are mapped Step
Flow-based parallel processing method.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020150159702A KR101892920B1 (en) | 2015-11-13 | 2015-11-13 | Flow based parallel processing method and apparatus thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
KR20170056773A KR20170056773A (en) | 2017-05-24 |
KR101892920B1 true KR101892920B1 (en) | 2018-08-30 |
Family
ID=59051202
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113301104B (en) * | 2021-02-09 | 2024-04-12 | 阿里巴巴集团控股有限公司 | Data processing system and method |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7039061B2 (en) * | 2001-09-25 | 2006-05-02 | Intel Corporation | Methods and apparatus for retaining packet order in systems utilizing multiple transmit queues |
US7512706B2 (en) * | 2004-12-16 | 2009-03-31 | International Business Machines Corporation | Method, computer program product, and data processing system for data queuing prioritization in a multi-tiered network |
US7765405B2 (en) | 2005-02-25 | 2010-07-27 | Microsoft Corporation | Receive side scaling with cryptographically secure hashing |
KR101350000B1 (en) * | 2009-11-09 | 2014-01-13 | 한국전자통신연구원 | Cross flow parallel processing method and system |
KR101440122B1 (en) * | 2010-11-17 | 2014-09-12 | 한국전자통신연구원 | Apparatus and method for processing multi-layer data |
- 2015-11-13: application KR1020150159702A filed in KR (patent KR101892920B1, status: active IP Right Grant)
Legal Events
Date | Code | Title | Description |
---|---|---|---|
A201 | Request for examination | ||
E902 | Notification of reason for refusal | ||
E701 | Decision to grant or registration of patent right | ||
GRNT | Written decision to grant |