WO2004099983A1 - Threshold on unblocking a processing node that is blocked due to data packet passing
- Publication number
- WO2004099983A1 (PCT/IB2004/001447)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- threshold
- context
- producer
- consumer
- buffer
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/48—Program initiating; Program switching, e.g. by interrupt
- G06F9/4806—Task transfer initiation or dispatching
- G06F9/4843—Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
- G06F9/485—Task life-cycle, e.g. stopping, restarting, resuming execution
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/461—Saving or restoring of program or task context
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/54—Interprogram communication
- G06F9/544—Buffers; Shared memory; Pipes
Definitions
- This invention relates to a threshold for controlling the exchange of data packets between a producer and a consumer residing on the same CPU. More particularly, the invention relates to software streaming based on a graph of processing nodes, and provides a system and method in which a threshold unblocks a processing node that is blocked due to data packet passing between two processing nodes residing on the same CPU.
- Software streaming is based on a graph of processing nodes, where the communication between the nodes is done using discrete packets of data. Nodes actively transport packets from their input edges to their output edges, making the data flow through the graph. Each packet follows a certain route through the graph, starting at a source node and ending at a sink node.
- Two processing nodes running on the same resource, e.g. the same CPU, incur context-switch overhead when passing packets between them:
- A context switch is relatively expensive because all registers have to be saved, and on a CPU with a cache it is far more expensive, since the cache will be thrashed (the switch destroys locality).
- A context switch is required before the consumer can consume this packet.
- Two context switches, back and forth, are required for each packet.
- This option has the disadvantage that both the producer and consumer must be able to handle parameterized packet sizes, which increases the complexity of the nodes.
- Let the producer produce a number of packets before switching to the consumer:
  o Give the producer a higher priority than the consumer. The producer will continue until it can no longer make progress, usually because no empty packets remain to be filled and sent to the consumer. Then the consumer can start. However, as soon as the consumer has consumed one packet and releases it for reuse, the producer is activated immediately, since there is again an empty packet and the producer has the higher priority. So giving the producer a higher priority does not, in most cases, solve the problem.
- Moreover, priorities exist for control responsiveness and should not be used to work around this context-switch problem.
  o Give the producer and consumer equal priority. These will typically be handled in a round-robin fashion, and the context switches will be limited, depending on the time slice used by the operating system. However, giving tasks the same priority has a negative impact on their worst-case behavior (worst-case response). So this option is a trade-off between context-switch overhead and the cost of a worse response.
  o Use a threshold mechanism that prevents the consumer from starting consumption immediately when a packet is produced by the producer, and, in addition, a threshold mechanism such that when the consumer releases a packet for reuse, the producer cannot immediately reuse that packet, i.e., it must wait until the number of packets has reached the threshold.
- The basic threshold mechanism is that a blocked consumer is unblocked only when a certain threshold (of available packets) is reached.
- A threshold of 5 (the consumer is informed that packets are available again only when 5 packets are present) can reduce the number of context switches by about the same factor (5).
- The mechanism is used only when the consumer is blocked, i.e., the consumer previously tried to get a data packet and failed. If the consumer is not blocked on the data packet input when the threshold of 5 is reached, the mechanism is not used.
- The threshold mechanism is independent of the processing nodes, i.e., a processing node does not know whether a threshold is used or what value it has.
- A system integrator can therefore configure the thresholds during integration.
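The basic mechanism described above can be sketched as a bounded buffer whose consumer-side wait is gated on a threshold rather than on a single packet. This is an illustrative sketch, not the patented implementation; the class and method names are invented for the example.

```python
import threading
from collections import deque


class ThresholdBuffer:
    """Bounded packet buffer sketch: a *blocked* consumer is woken only
    once `threshold` packets are available; a consumer that is not
    blocked may take any available packet immediately."""

    def __init__(self, capacity, threshold):
        assert 1 <= threshold <= capacity
        self.capacity = capacity
        self.threshold = threshold
        self.packets = deque()
        self.cond = threading.Condition()
        self.flushed = False  # a flush deactivates the threshold

    def put(self, packet):
        with self.cond:
            while len(self.packets) >= self.capacity:
                self.cond.wait()  # producer blocks on a full buffer
            self.packets.append(packet)
            # Notify waiters only at the threshold (or after a flush),
            # instead of on every single packet.
            if len(self.packets) >= self.threshold or self.flushed:
                self.cond.notify_all()

    def get(self):
        with self.cond:
            if not self.packets:
                # The consumer is about to block: it is woken only once
                # the threshold is reached (a flush lowers the bar to 1).
                while len(self.packets) < (1 if self.flushed else self.threshold):
                    self.cond.wait()
            packet = self.packets.popleft()
            self.cond.notify_all()  # an empty slot is now available
            return packet

    def flush(self):
        with self.cond:
            self.flushed = True
            self.cond.notify_all()
```

Note how the threshold lives entirely inside the buffer: the producer and consumer code is unchanged, which mirrors the point that the mechanism is invisible to the processing nodes.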
- The threshold mechanism has several disadvantages:
- The threshold mechanism of the present invention is an effective means of reducing the number of context switches that result from the data packet passing that occurs on a given processor. A reduction in the number of context switches leads to more effective use of the processor.
- The threshold mechanism of the present invention is invisible to the processing nodes, i.e., no additional complexity inside the nodes is required.
- The present invention addresses the disadvantages noted above by providing thresholds on unblocking a blocked processing node to reduce the context-switch overhead.
- The present invention can be extended with mechanisms that prevent deadlock or data getting stuck in the buffer.
- Example extensions are a timeout on a threshold, which determines the maximum time a threshold can delay the notification to the consumer, or an additional low-priority task in the system that removes all thresholds in case the system becomes 'idle'.
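The timeout extension could be realized as a bounded wait, as in the following sketch (the function name and parameters are assumptions for illustration, not from the patent): the threshold may delay the consumer's notification, but never for longer than a maximum delay, so data cannot get stuck behind an unreached threshold.

```python
import time
import threading


def wait_for_threshold(cond, packets, threshold, max_delay):
    """Wait until `threshold` packets are available in `packets`, but never
    longer than `max_delay` seconds; after the timeout the threshold is
    overridden and any available packet may be taken.
    Illustrative sketch: `cond` is the Condition guarding `packets`."""
    deadline = time.monotonic() + max_delay
    with cond:
        while len(packets) < threshold:
            remaining = deadline - time.monotonic()
            if remaining <= 0:
                break                       # timeout: threshold overridden
            cond.wait(timeout=remaining)
        return len(packets) > 0             # True if the consumer may proceed
```

The low-priority 'idle' task mentioned above would achieve a similar effect globally, by clearing all thresholds whenever no other task can run.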
- FIG. 1 illustrates a low priority producer component 100 connected to a high priority consumer component 110 via buffer component 120.
- FIG. 2a illustrates context switches 200 and buffer filling for the low priority producer component producing buffers for the high priority consumer component of FIG. 1 when no threshold is used.
- FIG. 2b illustrates context switches 210 and buffer filling for the low priority producer component producing buffers for the high priority consumer component of FIG. 1 when a threshold of 5 is used.
- FIG. 3 illustrates a state transition diagram for an embodiment of the present invention having a threshold of two associated with a buffer that can hold three packets.
- The system and method of the present invention reduce the number of context switches that result from passing packets to higher-priority streaming components in a system with a pre-emptive, priority-based scheduler.
- A reduction in the number of context switches in general leads to better performance at the expense of an increase in latency.
- A buffer threshold is a mechanism that postpones signaling a consumer component that is blocked waiting for packets until sufficient packets are available in the buffer, i.e., until the threshold number of packets is present in the buffer.
- When the threshold of a buffer is set to x, the waiting, i.e., blocked, component associated with that buffer is signaled only when x packets are available in the buffer.
- FIG. 1 illustrates a low priority producer component 100 connected to a high priority consumer component 110 via a buffer component 120, all on the same processor.
- FIG. 2a depicts the filling of the buffer component 120 of FIG. 1 when data packets are passed, i.e., produced and consumed right away. That is, when the producer component 100 puts a packet in buffer 120, component 100 is pre-empted immediately by the consumer component 110 (assuming that consumer component 110 is ready to run). When the consumer component 110 is done and again waits for new input, producer component 100 is resumed.
- The number of context switches 200 resulting from this communication depends on the frequency of passing packets.
- FIG. 2b illustrates the same situation, except the buffer has an associated threshold of 5.
- The number of context switches 210 is reduced by a factor that is almost equal to 5. This reduction occurs because consumer component 110 first consumes all 5 available packets in a burst and then waits for new input, whereupon producer component 100 resumes. Producer component 100 can then produce 5 packets before it is pre-empted by consumer component 110.
- The irregularity in both FIGs. 2a and 2b is caused by consumer component 110 not being ready to run, for some reason, when it is signaled that full packets are available in a buffer, e.g. there is no empty packet for the result of consumer component 110. Packets in buffers are not available to the consuming component as long as the threshold associated with that buffer has not been reached. It is possible, e.g.
- Precautions are required so that a buffer threshold that is set too high does not result in deadlock, e.g., when the number of packets produced is insufficient to reach the buffer threshold.
- A basic example: if 4 packets are circulating through a system, setting a threshold of 5 results in a system that locks up, since the threshold is never reached. So a basic rule is that the threshold value should be equal to or lower than the number of packets produced. In more complex systems, where the availability of a packet depends on the availability of other packets, this simple rule may not suffice, and more complex rules that take multiple thresholds into account are required.
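The basic rule can be captured as a configuration-time check. The helper below is hypothetical (not part of the patent), illustrating what a system integrator's tooling might verify before deploying a threshold setting:

```python
def validate_threshold(threshold, packets_in_circulation):
    """Basic deadlock rule: the threshold must not exceed the number of
    packets circulating through the buffer, otherwise it is never reached
    and the consumer blocks forever (hypothetical helper)."""
    if threshold > packets_in_circulation:
        raise ValueError(
            f"threshold {threshold} can never be reached with only "
            f"{packets_in_circulation} packets: the system would lock up")
    return threshold
```

As the text notes, this simple check does not cover systems where one packet's availability depends on another's; those need rules spanning multiple thresholds.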
- A buffer threshold value should minimize the increase in latency of a chain that results from the use of the threshold.
- Some data requires low latency; e.g., audio may only be delayed a couple of milliseconds before the end-user starts noticing it.
- The low-latency requirement may dictate a lower threshold setting, which is sub-optimal with respect to context-switch overhead.
- There are cases in which a buffer threshold is not active: when components are blocked, or an end of file is reached, a special command packet is sent through the chain to flush the packets held by the buffer components.
- The possible states for a buffer that can hold three packets and has its threshold set to two are illustrated in FIG. 3. The following state transitions are possible:
- flush 340: a flush command deactivates the threshold, so the consumer can obtain packets even though the threshold was not reached.
- notify consumer 350: the threshold has been reached, and the consumer is notified that it can start getting packets from the buffer. Context switches are prevented because, after the get_that_fails transition, the consumer is not scheduled until it is notified that there are two packets in the buffer.
- The state diagram can be extended in a straightforward manner to include larger buffers and higher threshold values.
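The FIG. 3 situation (a buffer of capacity three with a threshold of two) can be enumerated as a small transition function. This is a sketch of the described behavior; the state encoding and event names other than `flush` and `get_that_fails` are assumptions made for the example.

```python
# Buffer state: (packets_in_buffer, consumer_blocked, threshold_active).
# Sketch of the FIG. 3 transitions for capacity 3, threshold 2.
CAPACITY, THRESHOLD = 3, 2


def step(state, event):
    packets, blocked, active = state
    if event == "put" and packets < CAPACITY:
        packets += 1
        if blocked and active and packets >= THRESHOLD:
            blocked = False    # notify consumer 350: threshold reached
        if blocked and not active and packets >= 1:
            blocked = False    # no threshold in force: one packet suffices
    elif event == "get_that_fails":
        blocked = True         # consumer found too few packets and blocks
    elif event == "get" and packets >= 1 and not blocked:
        packets -= 1           # unblocked consumer drains the buffer
    elif event == "flush":
        active = False         # flush 340: the threshold is deactivated
        if blocked and packets >= 1:
            blocked = False
    return (packets, blocked, active)
```

Walking the machine through a put after `get_that_fails` shows why context switches are saved: one packet alone does not unblock the consumer, but the second one does.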
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/555,831 US7490178B2 (en) | 2003-05-08 | 2004-04-29 | Threshold on unblocking a processing node that is blocked due data packet passing |
JP2006506592A JP2006525578A (en) | 2003-05-08 | 2004-04-29 | Threshold for unblocking processing nodes that are blocked by the path of the data packet |
EP04730332A EP1625498A1 (en) | 2003-05-08 | 2004-04-29 | Threshold on unblocking a processing node that is blocked due to data packet passing |
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US46906803P | 2003-05-08 | 2003-05-08 | |
US60/469,068 | 2003-05-08 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2004099983A1 true WO2004099983A1 (en) | 2004-11-18 |
Family
ID=33435218
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/IB2004/001447 WO2004099983A1 (en) | 2003-05-08 | 2004-04-29 | Threshold on unblocking a processing node that is blocked due to data packet passing |
Country Status (6)
Country | Link |
---|---|
US (1) | US7490178B2 (en) |
EP (1) | EP1625498A1 (en) |
JP (1) | JP2006525578A (en) |
KR (1) | KR20060009000A (en) |
CN (1) | CN1784658A (en) |
WO (1) | WO2004099983A1 (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN100421070C (en) * | 2005-03-30 | 2008-09-24 | 国际商业机器公司 | Method and system for managing dynamic configuration data |
US20170060771A1 (en) * | 2015-08-31 | 2017-03-02 | Salesforce.Com, Inc. | System and method for generating and storing real-time analytics metric data using an in memory buffer service consumer framework |
Families Citing this family (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US8588253B2 (en) * | 2008-06-26 | 2013-11-19 | Qualcomm Incorporated | Methods and apparatuses to reduce context switching during data transmission and reception in a multi-processor device |
US9436969B2 (en) * | 2009-10-05 | 2016-09-06 | Nvidia Corporation | Time slice processing of tessellation and geometry shaders |
CN102298580A (en) | 2010-06-22 | 2011-12-28 | Sap股份公司 | Multi-core query processing system using asynchronous buffer |
EP2405353B1 (en) * | 2010-07-07 | 2017-11-22 | Sap Se | Multi-core query processing using asynchronous buffers |
US10235220B2 (en) * | 2012-01-23 | 2019-03-19 | Advanced Micro Devices, Inc. | Multithreaded computing |
US9229847B1 (en) * | 2012-04-18 | 2016-01-05 | Open Invention Network, Llc | Memory sharing for buffered macro-pipelined data plane processing in multicore embedded systems |
ES2750216T3 (en) * | 2013-02-13 | 2020-03-25 | Composecure Llc | Durable card |
CN103793207B (en) * | 2014-01-21 | 2016-06-29 | 上海爱数信息技术股份有限公司 | A kind of intelligent dispatching method of single-threaded multipriority system |
US11792307B2 (en) | 2018-03-28 | 2023-10-17 | Apple Inc. | Methods and apparatus for single entity buffer pool management |
US11829303B2 (en) | 2019-09-26 | 2023-11-28 | Apple Inc. | Methods and apparatus for device driver operation in non-kernel space |
US11558348B2 (en) | 2019-09-26 | 2023-01-17 | Apple Inc. | Methods and apparatus for emerging use case support in user space networking |
US11606302B2 (en) | 2020-06-12 | 2023-03-14 | Apple Inc. | Methods and apparatus for flow-based batching and processing |
US11775359B2 (en) | 2020-09-11 | 2023-10-03 | Apple Inc. | Methods and apparatuses for cross-layer processing |
US11954540B2 (en) | 2020-09-14 | 2024-04-09 | Apple Inc. | Methods and apparatus for thread-level execution in non-kernel space |
US11799986B2 (en) * | 2020-09-22 | 2023-10-24 | Apple Inc. | Methods and apparatus for thread level execution in non-kernel space |
US11882051B2 (en) | 2021-07-26 | 2024-01-23 | Apple Inc. | Systems and methods for managing transmission control protocol (TCP) acknowledgements |
US11876719B2 (en) | 2021-07-26 | 2024-01-16 | Apple Inc. | Systems and methods for managing transmission control protocol (TCP) acknowledgements |
-
2004
- 2004-04-29 KR KR1020057021123A patent/KR20060009000A/en not_active Application Discontinuation
- 2004-04-29 JP JP2006506592A patent/JP2006525578A/en not_active Withdrawn
- 2004-04-29 WO PCT/IB2004/001447 patent/WO2004099983A1/en not_active Application Discontinuation
- 2004-04-29 US US10/555,831 patent/US7490178B2/en not_active Expired - Fee Related
- 2004-04-29 EP EP04730332A patent/EP1625498A1/en not_active Withdrawn
- 2004-04-29 CN CNA2004800122450A patent/CN1784658A/en active Pending
Non-Patent Citations (3)
Title |
---|
CARTER J B ET AL: "TECHNIQUES FOR REDUCING CONSISTENCY-RELATED COMMUNICATION IN DISTRIBUTED SHARED-MEMORY SYSTEMS", ACM TRANSACTIONS ON COMPUTER SYSTEMS, ASSOCIATION FOR COMPUTING MACHINERY. NEW YORK, US, vol. 13, no. 3, 1 August 1995 (1995-08-01), pages 205 - 243, XP000558452, ISSN: 0734-2071 * |
K. JEFFAY: "The real-time producer/consumer paradigm: a paradigm for the construction of efficient, predictable real-time systems", PROCEEDINGS OF THE 1993 ACM-SIGAPP SYMPOSIUM ON APPLIED COMPUTING, March 1993 (1993-03-01), INDIANAPOLIS, INDIANA, USA, pages 796 - 804, XP002293835 * |
NEGISHI Y ET AL: "A portable communication system for video-on-demand applications using the existing infrastructure", PROCEEDINGS OF IEEE INFOCOM 1996. CONFERENCE ON COMPUTER COMMUNICATIONS. FIFTEENTH ANNUAL JOINT CONFERENCE OF THE IEEE COMPUTER AND COMMUNICATIONS SOCIETIES. NETWORKING THE NEXT GENERATION. SAN FRANCISCO, MAR. 24 - 28, 1996, PROCEEDINGS OF INFOCOM, L, vol. VOL. 2 CONF. 15, 24 March 1996 (1996-03-24), pages 18 - 26, XP010158050, ISBN: 0-8186-7293-5 * |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN100421070C (en) * | 2005-03-30 | 2008-09-24 | 国际商业机器公司 | Method and system for managing dynamic configuration data |
US20170060771A1 (en) * | 2015-08-31 | 2017-03-02 | Salesforce.Com, Inc. | System and method for generating and storing real-time analytics metric data using an in memory buffer service consumer framework |
US9767040B2 (en) * | 2015-08-31 | 2017-09-19 | Salesforce.Com, Inc. | System and method for generating and storing real-time analytics metric data using an in memory buffer service consumer framework |
Also Published As
Publication number | Publication date |
---|---|
KR20060009000A (en) | 2006-01-27 |
CN1784658A (en) | 2006-06-07 |
JP2006525578A (en) | 2006-11-09 |
US20070008983A1 (en) | 2007-01-11 |
EP1625498A1 (en) | 2006-02-15 |
US7490178B2 (en) | 2009-02-10 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US7490178B2 (en) | Threshold on unblocking a processing node that is blocked due data packet passing | |
US6757897B1 (en) | Apparatus and methods for scheduling and performing tasks | |
CN107077390B (en) | Task processing method and network card | |
Buttazzo | Rate monotonic vs. EDF: Judgment day | |
US7647594B2 (en) | Processor system, task control method on computer system, computer program | |
US8533503B2 (en) | Managing power consumption in a multicore processor | |
JP2006155646A (en) | Register transfer unit for electronic processor | |
US20050278719A1 (en) | Information processing device, process control method, and computer program | |
JPH10301793A (en) | Information processor and scheduling method | |
WO2023246044A1 (en) | Scheduling method and apparatus, chip, electronic device, and storage medium | |
US20060037021A1 (en) | System, apparatus and method of adaptively queueing processes for execution scheduling | |
US6820263B1 (en) | Methods and system for time management in a shared memory parallel processor computing environment | |
KR100617228B1 (en) | method for implementation of transferring event in real-time operating system kernel | |
US6105102A (en) | Mechanism for minimizing overhead usage of a host system by polling for subsequent interrupts after service of a prior interrupt | |
US11588747B2 (en) | Systems and methods for providing lockless bimodal queues for selective packet capture | |
CN115391003A (en) | Queuing delay control method and device for DPDK data packet processing | |
JP4320390B2 (en) | Method and apparatus for changing output rate | |
CN115022227A (en) | Data transmission method and system based on circulation or rerouting in data center network | |
US10949367B2 (en) | Method for handling kernel service request for interrupt routines in multi-core environment and electronic device thereof | |
WO2023144878A1 (en) | Intra-server delay control device, intra-server delay control method, and program | |
US20200379798A1 (en) | Apparatus for transmitting packets using timer interrupt service routine | |
JPH10247161A (en) | Memory management system | |
US7996845B2 (en) | Methods and apparatus to control application execution resource with a variable delay | |
Schmitt et al. | Adaptive receiver notification for non-dedicated workstation clusters | |
Han et al. | Supporting Configurable Real-Time Communication Services |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AK | Designated states |
Kind code of ref document: A1 Designated state(s): AE AG AL AM AT AU AZ BA BB BG BR BW BY BZ CA CH CN CO CR CU CZ DE DK DM DZ EC EE EG ES FI GB GD GE GH GM HR HU ID IL IN IS JP KE KG KP KR KZ LC LK LR LS LT LU LV MA MD MG MK MN MW MX MZ NA NI NO NZ OM PG PH PL PT RO RU SC SD SE SG SK SL SY TJ TM TN TR TT TZ UA UG US UZ VC VN YU ZA ZM ZW |
|
AL | Designated countries for regional patents |
Kind code of ref document: A1 Designated state(s): GM KE LS MW MZ NA SD SL SZ TZ UG ZM ZW AM AZ BY KG KZ MD RU TJ TM AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LU MC NL PL PT RO SE SI SK TR BF BJ CF CG CI CM GA GN GQ GW ML MR NE SN TD TG |
|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | ||
WWE | Wipo information: entry into national phase |
Ref document number: 2004730332 Country of ref document: EP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2006506592 Country of ref document: JP |
|
WWE | Wipo information: entry into national phase |
Ref document number: 2007008983 Country of ref document: US Ref document number: 20048122450 Country of ref document: CN Ref document number: 1020057021123 Country of ref document: KR Ref document number: 10555831 Country of ref document: US |
|
WWP | Wipo information: published in national office |
Ref document number: 1020057021123 Country of ref document: KR |
|
WWP | Wipo information: published in national office |
Ref document number: 2004730332 Country of ref document: EP |
|
WWP | Wipo information: published in national office |
Ref document number: 10555831 Country of ref document: US |
|
WWW | Wipo information: withdrawn in national office |
Ref document number: 2004730332 Country of ref document: EP |