WO2012106943A1 - Synchronization processing method and apparatus based on a multi-core system - Google Patents

Synchronization processing method and apparatus based on a multi-core system

Info

Publication number
WO2012106943A1
WO2012106943A1 PCT/CN2011/078411 CN2011078411W WO2012106943A1 WO 2012106943 A1 WO2012106943 A1 WO 2012106943A1 CN 2011078411 W CN2011078411 W CN 2011078411W WO 2012106943 A1 WO2012106943 A1 WO 2012106943A1
Authority
WO
WIPO (PCT)
Prior art keywords
processing
processing devices
semaphore
group
message
Prior art date
Application number
PCT/CN2011/078411
Other languages
English (en)
French (fr)
Inventor
杜学峰
Original Assignee
华为技术有限公司
Priority date
Filing date
Publication date
Application filed by 华为技术有限公司
Priority to CN2011800014795A priority Critical patent/CN102334104B/zh
Priority to PCT/CN2011/078411 priority patent/WO2012106943A1/zh
Publication of WO2012106943A1 publication Critical patent/WO2012106943A1/zh
Priority to US14/077,421 priority patent/US9424101B2/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46 Multiprogramming arrangements
    • G06F9/52 Program synchronisation; Mutual exclusion, e.g. by means of semaphores
    • G06F9/522 Barrier synchronisation
    • G06F9/50 Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061 Partitioning or combining of resources
    • G06F9/5066 Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs

Definitions

  • The present invention relates to the field of communication network technologies, and in particular, to a synchronization processing method and apparatus based on a multi-core system.
  • Background
  • In the prior art, when event synchronization between multiple cores is processed, the multiple processors that form the multi-core system and the semaphore processing unit are located on an SoC (System on a Chip), and a mutually exclusive cache is available to only one processor at a time.
  • The semaphore processing unit sends the identifier of the obtained cache resource to one of the processors, processor A, through an interrupt. After processor A responds to the interrupt, completes the event processing, and clears the interrupt, the semaphore processing unit again sends the identifier of the obtained cache resource through an interrupt.
  • In the prior art, when message synchronization between multiple cores is processed, processor A applies for a message channel X in the IPC (Inter-Processor Communication) processing unit. After processor A writes the message to message channel X, the IPC processing unit notifies, by interrupt, the processors other than processor A that need to obtain the message. Each of those processors responds to the interrupt, reads the message in message channel X, and then clears the interrupt.
  • Embodiments of the present invention provide a synchronization processing method and apparatus based on a multi-core system, which can improve system scheduling efficiency and consume fewer resources.
  • A synchronization processing method based on a multi-core system, comprising:
  • receiving an initialization setting sent by any processing device in a first group of processing devices that synchronously process the same current task, and performing initialization, where the initialization setting includes setting the count semaphore value of the current multi-core synchronization processing unit to M and setting the content of the control message with which the current multi-core synchronization processing unit starts a second group of processing devices, M being the number of processing devices in the first group that synchronously process the current task;
  • receiving a notification message sent by any processing device in the first group of processing devices, and decrementing the count semaphore value by one accordingly, where the notification message indicates that the processing device that sent it has completed the current task; and
  • when the count semaphore value is 0, sending the control message to the second group of processing devices through a message sending interface, according to the control message content that was set, so that the second group of processing devices processes the current task.
  • A synchronization processing apparatus based on a multi-core system, comprising:
  • an initialization module, configured to receive an initialization setting sent by any processing device in a first group of processing devices that synchronously process the same current task, and to perform initialization, where the initialization setting includes setting the count semaphore value of the current multi-core synchronization processing unit to M and setting the content of the control message with which the current multi-core synchronization processing unit starts a second group of processing devices, M being the number of processing devices in the first group that synchronously process the current task;
  • a processing module, configured to receive a notification message sent by any processing device in the first group of processing devices, and to decrement the count semaphore value by one accordingly, where the notification message indicates that the processing device that sent it has completed the current task; and
  • a sending module, configured to send, when the count semaphore value is 0, the control message to the second group of processing devices through a message sending interface, according to the control message content that was set, so that the second group of processing devices processes the current task.
  • In the synchronization processing method and apparatus based on a multi-core system provided by the embodiments of the present invention, an initialization setting sent by any processing device in the first group of processing devices that synchronously process the same current task is received and initialization is performed; each time a notification message sent by any processing device in the first group is received, the count semaphore value is decremented by one; and when the count semaphore value is 0, the control message is sent to the second group of processing devices through the message sending interface, according to the control message content that was set, so that the second group of processing devices processes the current task.
  • In the prior art, synchronization and communication between multiple cores are usually completed by interrupts, so system scheduling efficiency is low and resource consumption is high. By contrast, the solution provided by the embodiments of the present invention handles synchronization and communication between multiple processing devices by sending control messages, which can improve system scheduling efficiency and consume fewer resources.
  • FIG. 1 is a flowchart of a synchronization processing method based on a multi-core system according to Embodiment 1 of the present invention;
  • FIG. 2 is a block diagram of a synchronization processing apparatus based on a multi-core system according to Embodiment 1 of the present invention;
  • FIG. 3 is a flowchart of a synchronization processing method based on a multi-core system according to Embodiment 2 of the present invention;
  • FIG. 4 is a schematic diagram of independent processing by each synchronization processing unit according to Embodiment 2 of the present invention;
  • FIG. 5 is a schematic diagram of virtual address mapping according to Embodiment 2 of the present invention;
  • FIG. 6 is a schematic diagram of task processing by the processing devices according to Embodiment 2 of the present invention;
  • FIG. 7 is a block diagram of a synchronization processing apparatus based on a multi-core system according to Embodiment 2 of the present invention.
  • Detailed description
  • Embodiment 1
  • This embodiment of the present invention provides a synchronization processing method based on a multi-core system. In the embodiments of the present invention, a multi-core system is a system that contains multiple processing devices. As shown in FIG. 1, the method includes:
  • Step 101: Receive an initialization setting sent by any processing device in a first group of processing devices that synchronously process the same current task, and perform initialization, where the initialization setting includes setting the count semaphore value of the current multi-core synchronization processing unit to M and setting the content of the control message with which the current multi-core synchronization processing unit starts a second group of processing devices, M being the number of processing devices in the first group that synchronously process the current task.
  • A processing device may be either a processor or an accelerator. An accelerator can process tasks independently, or assist a processor with arithmetic processing so that the processor completes tasks faster.
  • Specifically, the initialization setting is received through any channel of the internal cache, the internal cache corresponding to that channel is read, its content is updated to the initialization setting, and the initialization setting is saved.
  • Step 102: Receive a notification message sent by any processing device in the first group of processing devices, and decrement the count semaphore value by one accordingly, where the notification message indicates that the processing device that sent it has completed the current task.
  • Specifically, the notification message is received through any channel of the internal cache, the internal cache corresponding to that channel is read to obtain the current count semaphore value, the value is decremented by one, and the result is saved back to the internal cache.
  • Step 103: When the count semaphore value is 0, send the control message to the second group of processing devices through a message sending interface, according to the control message content that was set, so that the second group of processing devices processes the current task.
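  • Steps 101 to 103 can be sketched in software as follows. This is a minimal illustrative model, not the hardware unit described here: the class name `SyncUnit`, its method names, and the `send` callback standing in for the message sending interface are all assumptions introduced for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class SyncUnit:
    """Hypothetical software model of one multi-core synchronization processing unit."""
    send: Callable[[str, str], None]          # stand-in for the message sending interface
    count: int = 0                            # count semaphore value
    control_message: str = ""                 # control message content set at initialization
    second_group: List[str] = field(default_factory=list)

    def initialize(self, m: int, control_message: str, second_group: List[str]) -> None:
        # Step 101: save M and the control message content.
        if m <= 0:
            raise ValueError("M must be a positive integer")
        self.count = m
        self.control_message = control_message
        self.second_group = list(second_group)

    def notify_done(self, device: str) -> bool:
        # Step 102: one first-group device reports completion; decrement by one.
        if self.count == 0:
            raise RuntimeError(f"unexpected notification from {device}")
        self.count -= 1
        # Step 103: at 0, send the control message to every second-group device.
        if self.count == 0:
            for target in self.second_group:
                self.send(target, self.control_message)
            return True
        return False


sent = []
unit = SyncUnit(send=lambda target, msg: sent.append((target, msg)))
unit.initialize(m=3, control_message="start task", second_group=["accelerator 2"])
results = [unit.notify_done(d) for d in ("processor 1", "accelerator 1", "processor 2")]
```

  • In this sketch only the third notification triggers the control message, which mirrors the rule that the second group starts only after every device in the first group has reported completion.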
  • In the synchronization processing method based on a multi-core system provided by this embodiment, an initialization setting sent by any processing device in the first group of processing devices that synchronously process the same current task is received and initialization is performed; each time a notification message sent by any processing device in the first group is received, the count semaphore value is decremented by one; and when the count semaphore value is 0, the control message is sent to the second group of processing devices through the message sending interface, according to the control message content that was set, so that the second group of processing devices processes the current task.
  • In the prior art, synchronization and communication between multiple cores are usually completed by interrupts, so system scheduling efficiency is low and resource consumption is high. By contrast, the solution provided by this embodiment handles synchronization and communication between multiple processing devices by sending control messages, which can improve system scheduling efficiency and consume fewer resources.
  • An embodiment of the present invention further provides a synchronization processing apparatus based on a multi-core system. The apparatus is a multi-core synchronization processing unit. As shown in FIG. 2, the apparatus includes an initialization module 201, a processing module 202, and a sending module 203.
  • The initialization module 201 is configured to receive an initialization setting sent by any processing device in a first group of processing devices that synchronously process the same current task, and to perform initialization, where the initialization setting includes setting the count semaphore value of the current multi-core synchronization processing unit to M and setting the content of the control message with which the current multi-core synchronization processing unit starts a second group of processing devices, M being the number of processing devices in the first group that synchronously process the current task.
  • The initialization module 201 includes a first receiving submodule and an update saving submodule.
  • The first receiving submodule is configured to receive, through any channel of the internal cache, the initialization setting sent by any processing device in the first group of processing devices that synchronously process the same current task.
  • The update saving submodule is configured to read the internal cache corresponding to the channel, update the content in the internal cache to the initialization setting, and save the initialization setting.
  • The processing module 202 is configured to receive a notification message sent by any processing device in the first group of processing devices, and to decrement the count semaphore value by one accordingly, where the notification message indicates that the processing device that sent it has completed the current task.
  • The processing module 202 includes a second receiving submodule, an obtaining submodule, and an update submodule.
  • The second receiving submodule is configured to receive, through any channel of the internal cache, the notification message sent by any processing device in the first group of processing devices.
  • The obtaining submodule is configured to read the internal cache corresponding to the channel and obtain the current count semaphore value in the internal cache.
  • The update submodule is configured to decrement the current count semaphore value by one and save the result to the internal cache.
  • The sending module 203 is configured to send, when the count semaphore value is 0, the control message to the second group of processing devices through a message sending interface, according to the control message content that was set, so that the second group of processing devices processes the current task.
  • In the synchronization processing apparatus based on a multi-core system provided by this embodiment, each time a processing device that synchronously processes the same current task completes the current task, the processing module decrements the count semaphore value by one; when the count semaphore value is 0, the sending module sends the corresponding control message to the second group of processing devices so that the second group of processing devices processes the current task.
  • In the prior art, synchronization and communication between multiple cores are usually completed by interrupts, so system scheduling efficiency is low and resource consumption is high. By contrast, the solution provided by this embodiment handles synchronization and communication between multiple processing devices by sending the corresponding control message, which can improve system scheduling efficiency and consume fewer resources.
  • Embodiment 2
  • This embodiment of the present invention provides a synchronization processing method based on a multi-core system. The application scenario is as follows: each processing device in a first group of processing devices that synchronously process the same current task needs to process the current task, and the second group of processing devices is then notified, by sending a control message, to process the current task.
  • A processing device may be a processor or an accelerator; the current task may be completed either by a processor or by an accelerator.
  • The method is described below using the processing of the current task as an example. As shown in FIG. 3, the method includes:
  • Step 301: The multi-core synchronization processing unit receives, through any channel of the internal cache, an initialization setting sent by any processing device in the first group of processing devices that synchronously process the same current task, where the initialization setting includes setting the count semaphore value of the current multi-core synchronization processing unit to M and setting the content of the control message with which the current multi-core synchronization processing unit starts the second group of processing devices, M being the number of processing devices in the first group that synchronously process the current task.
  • A semaphore is a non-negative integer count that is used to coordinate access to resources and is initialized to the number of available resources. That is, when a count semaphore with value M is applied for, M is the number of available resources. In the present invention, each processing device in the first group that synchronously processes the same current task needs to occupy one resource when processing the current task; M processing devices in the first group synchronously process the same current task, so a total of M resources are required.
  • The initialization setting is carried in an application message, and the application message may be sent through multiple channels.
  • The application message further includes an application identifier, which identifies the resource in the internal cache being applied for, so that the same resource is not applied for repeatedly.
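  • One way to picture the role of the application identifier is the sketch below. The text does not specify how the identifier is encoded or compared, so the dictionary keyed by identifier, the class name `InternalCache`, and the method `apply` are hypothetical names chosen only for illustration.

```python
class InternalCache:
    """Hypothetical model: one cache entry per application identifier."""

    def __init__(self) -> None:
        # application identifier -> saved initialization setting
        self.entries = {}

    def apply(self, app_id: int, m: int, control_message: str) -> bool:
        # Reject a repeated application for a resource that is already held.
        if app_id in self.entries:
            return False
        self.entries[app_id] = {"count": m, "control_message": control_message}
        return True


cache = InternalCache()
first = cache.apply(app_id=7, m=3, control_message="start task")
repeat = cache.apply(app_id=7, m=5, control_message="other")
```

  • Here the second application with the same identifier is refused and the saved setting is left unchanged, which is the effect of avoiding repeated requests for the same resource.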
  • Step 302: The multi-core synchronization processing unit reads the internal cache corresponding to the channel, updates the content in the internal cache to the initialization setting, and saves the initialization setting.
  • Receiving the initialization setting sent by any processing device in the first group of processing devices that process the same current task, and performing initialization, are specifically carried out by the counting domain value semaphore unit in the multi-core synchronization processing unit. The counting domain value semaphore unit receives the application message carrying the initialization setting through a configuration interface, which is provided on the multi-core synchronization processing unit.
  • The counting domain value semaphore unit is integrated on the multi-core synchronization processing unit, which further includes a direct acquisition semaphore unit, an indirect acquisition semaphore unit, and a sending message unit. The direct acquisition semaphore unit is configured to directly send an acquisition identifier to the processing device that acquires a semaphore; the indirect acquisition semaphore unit is configured to send an acquisition identifier to the processing device that acquires a semaphore by sending a message or an interrupt; the sending message unit is configured to send a control message or a notification message to a processing device through the message sending interface; and the counting domain value semaphore unit is configured to control the count semaphore acquired by processing devices that access the multi-core synchronization processing unit at the same time.
  • The processing flows of the direct acquisition semaphore unit, the indirect acquisition semaphore unit, and the sending message unit are similar to that of the counting domain value semaphore unit: after receiving a new initialization setting, each unit reads the internal cache, checks it for a repeated or illegal application, updates the content of the internal cache to the initialization setting, and saves the initialization setting.
  • The indirect acquisition semaphore unit, the counting domain value semaphore unit, and the sending message unit additionally need to send a message or an interrupt to notify the multi-core synchronization processing unit to complete the task.
  • The bus configuration interface controls bus access traffic. As shown in FIG. 4, until a bus access ends, the bus interface keeps the bus operation continuous, and each bus operation corresponds to one read, process, and update operation on the internal cache. In this way, a scheduling mechanism for synchronization conflicts among multiple synchronization processing units can be realized.
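  • The effect of treating each bus operation as one uninterrupted read, process, and update on the internal cache can be illustrated with a software analogy. The sketch below is an assumption-laden model rather than the bus mechanism itself: a `threading.Lock` stands in for holding the bus until the access ends, and `ChannelCache` and `release` are hypothetical names.

```python
import threading


class ChannelCache:
    """Hypothetical model of one internal-cache channel behind the bus interface."""

    def __init__(self, count: int) -> None:
        self.count = count
        self._bus = threading.Lock()   # stands in for holding the bus until the access ends

    def release(self) -> int:
        # Read, process, and update happen as one uninterrupted operation.
        with self._bus:
            value = self.count      # read
            value -= 1              # process
            self.count = value      # update
            return value


channel = ChannelCache(count=8)
seen = []
seen_lock = threading.Lock()


def worker() -> None:
    value = channel.release()
    with seen_lock:
        seen.append(value)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

  • Because no two accesses interleave, every concurrent caller observes a distinct value and the counter ends at exactly 0, which is the property a conflict scheduling mechanism needs to guarantee.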
  • Once the application has been made, the multi-core synchronization processing unit is initialized, and each processing device can process its corresponding task.
  • Step 303: The multi-core synchronization processing unit receives a notification message sent by any processing device in the first group of processing devices, and decrements the count semaphore value by one accordingly, where the notification message indicates that the processing device that sent it has completed the current task.
  • The processing devices use the resources in the semaphore to process the task. Each time one of the processing devices in the first group completes the current task, the multi-core synchronization processing unit performs a semaphore release operation for the current task, that is, the count semaphore value is decremented by one.
  • Specifically, the notification message is received through any channel of the internal cache, the internal cache corresponding to that channel is read, the current count semaphore value in the internal cache is obtained, and a check for abnormalities is performed; once the current value has been obtained, it is decremented by one and the result is saved to the internal cache.
  • The internal cache can be read through multiple channels: the direct acquisition semaphore unit, the indirect acquisition semaphore unit, the sending message unit, and the counting domain value semaphore unit on the multi-core synchronization processing unit share these channels to access the internal cache, as shown in the virtual address mapping diagram of FIG. 5. This differs from the prior art, in which the semaphore processing unit and the IPC (Inter-Processor Communication) unit cannot effectively share channels.
  • The solution provided by this embodiment can maximize sharing of the internal cache and save internal cache resources. Further, the internal cache can replace registers while using fewer resources, so the number of channels can be increased as much as possible, which effectively reduces the chance of contention between multiple cores.
  • For example, assume that the shared channel cache has 10 channels. Some of the 10 channels, for example 4 channels, can be set by software to implement functions A, B, C, and D.
  • With the solution provided by this embodiment, functions A, B, C, and D can be implemented on all 10 channels, so each channel can be allocated more flexibly to the function that needs it.
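  • The flexibility of letting any of the 10 channels serve any of functions A, B, C, and D can be sketched as a shared pool. This is only an illustration under assumptions: the first-free allocation policy, the class `SharedChannels`, and its methods are hypothetical and are not described in the text.

```python
from typing import Dict, Optional

FUNCTIONS = ("A", "B", "C", "D")


class SharedChannels:
    """Hypothetical pool in which any channel may be configured for any function."""

    def __init__(self, total: int = 10) -> None:
        self.owner: Dict[int, Optional[str]] = {ch: None for ch in range(total)}

    def allocate(self, function: str) -> Optional[int]:
        if function not in FUNCTIONS:
            raise ValueError(f"unknown function {function!r}")
        for ch, used_by in self.owner.items():
            if used_by is None:          # take the first free channel, whatever the function
                self.owner[ch] = function
                return ch
        return None                      # every channel is busy

    def free(self, ch: int) -> None:
        self.owner[ch] = None


pool = SharedChannels(total=10)
granted = [pool.allocate("A") for _ in range(10)]   # one function may occupy all 10 channels
overflow = pool.allocate("B")
pool.free(3)
reused = pool.allocate("D")
```

  • In this model a single function can use every channel when the others are idle, and a freed channel is immediately reusable by a different function, which is what sharing the channel cache is meant to allow.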
  • Step 304: Determine whether the count semaphore value is 0.
  • Step 305: When the count semaphore value is not 0, the first group of processing devices continues to process the current task.
  • That is, step 303 is performed again: each time a processing device in the first group completes the current task, the count semaphore value is decremented by one, and it is then determined whether the count semaphore value is 0.
  • Step 306: When the count semaphore value is 0, send the control message to the second group of processing devices through the message sending interface, according to the control message content that was set, so that the second group of processing devices processes the current task.
  • The content of the control message is set during initialization, in the application message sent in step 301 by any processing device in the first group of processing devices that synchronously process the same current task.
  • When the count semaphore value is 0, every processing device in the first group has completed the part of the current task assigned to it. The first group of processing devices can then start processing a second task, and the processing flow for the second task is the same as that for the current task.
  • For example, the first group of processing devices includes processor 1, processor 2, and accelerator 1, and the second group of processing devices includes accelerator 2. Processor 1 processes part A of the current task, processor 2 processes part B, accelerator 1 processes part C, and accelerator 2 processes part D.
  • Processor 1 takes the shortest time, to complete part A, and processor 2 takes the longest time, to complete part B. By the time processor 2 completes part B, processor 1 and accelerator 1 have both completed their parts, so the multi-core synchronization processing unit sends a control message to accelerator 2 so that accelerator 2 processes part D of the task.
  • After processor 1, processor 2, and accelerator 1 complete their parts, they go on to perform other tasks.
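  • The example can be simulated with threads as a rough illustration. The sleep durations are arbitrary placeholders chosen so that processor 1 finishes first and processor 2 last, a `threading.Event` stands in for the control message, and all function and variable names are assumptions made for this sketch.

```python
import threading
import time

log = []
lock = threading.Lock()
state = {"count": 3}                     # M: processor 1, processor 2, accelerator 1
start_accelerator_2 = threading.Event()  # stands in for the control message


def first_group(name: str, part: str, seconds: float) -> None:
    time.sleep(seconds)                  # placeholder for processing one part of the task
    with lock:
        log.append(f"{name} finished part {part}")
        state["count"] -= 1              # notification message: decrement by one
        if state["count"] == 0:
            start_accelerator_2.set()    # count reached 0: start the second group


def second_group() -> None:
    start_accelerator_2.wait()
    with lock:
        log.append("accelerator 2 starts part D")


workers = [
    threading.Thread(target=first_group, args=("processor 1", "A", 0.01)),
    threading.Thread(target=first_group, args=("processor 2", "B", 0.20)),
    threading.Thread(target=first_group, args=("accelerator 1", "C", 0.10)),
    threading.Thread(target=second_group),
]
for w in workers:
    w.start()
for w in workers:
    w.join()
```

  • Whatever order the first-group devices finish in, the entry for accelerator 2 is always the last one in `log`, because the event is set only after the third completion has been recorded.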
  • As another example, multiple channel types are carried in one frequency band. After the channels are separated, different units are required for subsequent processing, so the multi-core synchronization processing unit distributes control messages to different addresses, for example to start accelerator B, start accelerator C, and notify processor D to process the current task.
  • In the interrupt-based approach, processor D is notified by an interrupt, and processor D then starts accelerator B through an interrupt.
  • The solution provided by this embodiment can effectively reduce system overhead, in particular when the synchronized processing device is an accelerator, because no interrupt needs to be responded to.
  • When the processing device in the second group is an accelerator, the multi-core synchronization processing unit sends the control message to the accelerator in step 306 and does not need to send an interrupt to it.
  • When the processing device in the second group is a processor, the multi-core synchronization processing unit sends the control message to the processor in step 306 and may or may not also send an interrupt to the processor.
  • An interrupt output interface is provided on the multi-core synchronization processing unit and is used for merged notification of multiple control messages, which reduces the number of notification interrupts sent and thus reduces resource consumption.
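  • Merged notification can be pictured as interrupt coalescing. The text does not describe the merging policy, so the sketch below assumes one simple possibility: an interrupt is raised only when a processor has no message pending, and the processor drains all pending control messages while servicing that one interrupt. `InterruptOutput`, `post`, and `service` are hypothetical names.

```python
from collections import defaultdict
from typing import Dict, List


class InterruptOutput:
    """Hypothetical model of an interrupt output interface that merges notifications."""

    def __init__(self) -> None:
        self.pending: Dict[str, List[str]] = defaultdict(list)
        self.interrupts_raised = 0

    def post(self, processor: str, control_message: str) -> None:
        # Raise an interrupt only if this processor has nothing pending yet.
        if not self.pending[processor]:
            self.interrupts_raised += 1
        self.pending[processor].append(control_message)

    def service(self, processor: str) -> List[str]:
        # The processor reads every pending message while handling one interrupt.
        messages, self.pending[processor] = self.pending[processor], []
        return messages


irq = InterruptOutput()
for msg in ("task 1 ready", "task 2 ready", "task 3 ready"):
    irq.post("processor D", msg)
handled = irq.service("processor D")
```

  • Under this assumed policy three control messages cost a single interrupt, which illustrates how merging can reduce the number of notification interrupts.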
  • The second group of processing devices receives the control message sent by the multi-core synchronization processing unit and processes the current task.
  • When a scheduling processor is used to process synchronization, its scheduling speed is limited and scheduling efficiency is low, so the requirements of high-speed service processing cannot be met. For example, the service period of wireless communication is on the order of milliseconds, while the processing period of the scheduling processor is on the order of microseconds, and processing a single service requires the scheduling processor to perform hundreds of scheduling operations.
  • By using a multi-core synchronization processing unit to process synchronization, the solution provided by this embodiment achieves fast synchronous scheduling of multiple processing devices, which improves scheduling efficiency and saves processing time.
  • In the synchronization processing method based on a multi-core system provided by this embodiment, when every processing device that synchronously processes the same current task has completed the current task, the count semaphore value has been decremented to 0, and the multi-core synchronization processing unit then sends the corresponding control message to the second group of processing devices to notify them to process the current task. In the prior art, synchronization and communication between multiple cores are usually completed by interrupts, so system scheduling efficiency is low and resource consumption is high. By contrast, the solution provided by this embodiment handles synchronization and communication between multiple processing devices by sending the corresponding control message, which improves system scheduling efficiency, reduces system overhead, and consumes fewer resources.
  • the device for synchronizing processing based on the multi-core system may be a multi-core synchronous processing unit, and the multi-core synchronous processing unit is located on a S0C (System On a Chi p, system on chip).
  • the apparatus includes: an initialization module 701, a first receiving submodule 702, an update saving submodule 703, a processing module 704, a second receiving submodule 705, an obtaining submodule 706, an updating submodule 707, and a sending module. 708.
  • The initialization module 701 is configured to receive an initialization setting sent by any processing device in the first group of processing devices that synchronously process the same current task, and to perform initialization, where the initialization setting includes setting the counting semaphore value of the current multi-core synchronization processing unit to M and setting the content of the control message with which the current multi-core synchronization processing unit starts the second group of processing devices, M being the number of times the group of processing devices synchronously processes the current task;
  • A processing device is a processor or an accelerator; the current task may be completed by a processor or by an accelerator.
  • M in the applied-for counting semaphore of value M is the number of available resources.
  • Each processing device that synchronously processes the same current task needs to occupy one resource when processing the current task.
  • The group of processing devices synchronously processes the same current task M times, so M resources are required in total.
  • The first receiving submodule 702 in the initialization module 701 is configured to receive, through any channel of the internal cache, the initialization setting sent by any processing device in the first group of processing devices that synchronously process the same current task;
  • The initialization setting is carried in an application message, and the application message carrying the initialization setting may be sent through multiple channels.
  • The application message further includes an application identifier, which identifies the resource in the internal cache obtained by this application, so that the resource is not applied for repeatedly.
  • The update-and-save submodule 703 is configured to read the internal cache corresponding to the channel, update the content of the internal cache to the initialization setting, and save the initialization setting;
  • The processing module 704 is configured to receive a notification message sent by any processing device in the first group of processing devices and to decrement the counting semaphore value by 1 accordingly, where the content of the notification message is that the processing device sending the notification message has completed the current task;
  • The second receiving submodule 705 in the processing module 704 is configured to receive, through any channel of the internal cache, the notification message sent by any processing device in the first group of processing devices;
  • The notification message further includes an identifier for releasing the counting semaphore, that is, the application identifier carried in the application message is released so that the released resource can be used again.
  • The obtaining submodule 706 is configured to read the internal cache corresponding to the channel and obtain the current value of the counting semaphore in the internal cache;
  • The counting field value semaphore unit can read the internal cache through multiple channels, and the addresses of the direct acquisition semaphore unit, the indirect acquisition semaphore unit,
  • the message sending unit, and the counting field value semaphore unit on the multi-core synchronization processing unit are mapped to the same internal cache, which maximizes sharing of the internal cache and saves internal cache resources.
  • The internal cache can be replaced by registers, which use fewer resources, and the number of channels can be made as large as possible, which effectively reduces the probability of contention between the cores.
  • The direct acquisition semaphore unit is configured to send an acquisition identifier directly to the processing device that acquires the semaphore;
  • The indirect acquisition semaphore unit is configured to send the acquisition identifier to the processing device that acquires the semaphore by sending a message or an interrupt;
  • The message sending unit is configured to send a control message or a notification message to a processing device through the message sending interface;
  • The counting field value semaphore unit is configured to control the counting semaphore obtained by the processing devices that simultaneously access the multi-core synchronization processing unit.
  • After a new initialization setting is received, the internal cache is read and checked for anomalies such as duplicate or illegal applications; the content of the internal cache is then updated to the content of the initialization setting, and the initialization setting is saved.
  • The updating submodule 707 is configured to decrement the current value of the counting semaphore by 1 and save the value of the counting semaphore in the current internal cache;
  • The sending module 708 is configured to send the control message to the second group of processing devices through the message sending interface, according to the set content of the control message with which the current multi-core synchronization processing unit starts the second group of processing devices,
  • so that the second group of processing devices processes the current task.
  • A sending module is added, and the synchronization events of the synchronization unit are converted into message sending.
  • In this way, multiple processors and accelerators can synchronize and communicate with each other.
  • If the sending module sends the control message to an accelerator, no interrupt needs to be sent to the second group of processing devices; if the sending module sends the control message to a processor, an interrupt may or may not be sent to the second group of processing devices.
  • The interrupt output interface is provided on the multi-core synchronization processing unit and is used to merge the notifications of multiple control messages, which reduces the number of notification interrupts sent and lowers resource consumption.
  • In wireless communication, one frequency band carries multiple channel types, and after the channels are separated, different units are needed for subsequent processing. For example, after accelerator A finishes processing the current task, the multi-core synchronization processing unit distributes control messages to different addresses in order to start accelerator B, start accelerator C, and notify processor D to process the current task. In the prior art, after accelerator A finishes the current task it notifies processor D by interrupt, and processor D then starts accelerator B by interrupt.
  • The solution provided by the embodiment of the present invention can effectively reduce system overhead; in particular, when the processing unit started after synchronization is an accelerator, the solution does not need to respond to any interrupt.
  • In the apparatus for synchronous processing based on the multi-core system provided by the embodiment of the present invention, each time a processing device that synchronously processes the same current task completes it, the processing module decrements the value of the counting semaphore by 1; when the value of the counting semaphore is 0, the sending module sends a control message to notify the second group of processing devices to process the current task.
  • In the prior art, synchronization and communication between multiple cores are usually completed through interrupts, so system scheduling efficiency is low and resource consumption is large.
  • By comparison, the solution provided by the embodiment of the present invention handles synchronization and communication between multiple cores by sending response messages, which can improve system scheduling efficiency, reduce system overhead, and consume fewer resources.

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Multi Processors (AREA)

Description

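Synchronous processing method and apparatus based on a multi-core system
Technical Field
The present invention relates to the field of communication network technologies, and in particular to a synchronous processing method and apparatus based on a multi-core system.
Background
The development of multi-core technology has made systems increasingly complex, with more and more interaction events between cores. In the prior art, synchronization and communication between multiple cores are usually completed through interrupts.
Consider event synchronization between multiple cores in the prior art, where multiple processors simultaneously apply for the same mutually exclusive cache in a semaphore processing unit. Here the multiple processors are the multiple cores; the processors and the semaphore processing unit are located on an SOC (System on a Chip); and a mutually exclusive cache is a cache resource that can be provided to only one processor at any given moment. Typically the semaphore processing unit sends the identifier for acquiring the cache resource to one of the processors, processor A, by interrupt. Processor A responds to the interrupt, completes the event processing, and clears the interrupt; the semaphore processing unit then sends the identifier for acquiring the cache resource to the other processors by interrupt.
Consider message synchronization between multiple cores in the prior art, where processor A applies for a message channel X in an IPC (Inter-Processor Communication) processing unit. After processor A writes a message to message channel X, the IPC processing unit notifies, by interrupt, the processors other than processor A that need to obtain the message. These processors respond to the interrupt, read the message in message channel X, and then clear the interrupt.
However, when handling synchronization and communication between multiple cores, the prior art usually has to complete the synchronization operation through interrupts, which results in low system scheduling efficiency and high resource consumption.
Summary of the Invention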
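Embodiments of the present invention provide a synchronous processing method and apparatus based on a multi-core system, which can improve system scheduling efficiency with lower resource consumption.
To achieve the foregoing objective, the embodiments of the present invention adopt the following technical solutions:
A synchronous processing method based on a multi-core system, including:
receiving an initialization setting sent by any processing device in a first group of processing devices that synchronously process the same current task, and performing initialization, where the initialization setting includes setting the counting semaphore value of the current multi-core synchronization processing unit to M and setting the content of the control message with which the current multi-core synchronization processing unit starts a second group of processing devices, M being the number of times the group of processing devices synchronously processes the current task;
receiving a notification message sent by any processing device in the first group of processing devices, and decrementing the counting semaphore value by 1 accordingly, where the content of the notification message is that the processing device sending the notification message has completed the current task;
when the counting semaphore value is 0, sending the control message to the second group of processing devices through a message sending interface, according to the set content of the control message with which the current multi-core synchronization processing unit starts the second group of processing devices, so that the second group of processing devices processes the current task.
A synchronous processing apparatus based on a multi-core system, including:
an initialization module, configured to receive an initialization setting sent by any processing device in a first group of processing devices that synchronously process the same current task, and to perform initialization, where the initialization setting includes setting the counting semaphore value of the current multi-core synchronization processing unit to M and setting the content of the control message with which the current multi-core synchronization processing unit starts a second group of processing devices, M being the number of times the group of processing devices synchronously processes the current task;
a processing module, configured to receive a notification message sent by any processing device in the first group of processing devices, and to decrement the counting semaphore value by 1 accordingly, where the content of the notification message is that the processing device sending the notification message has completed the current task;
a sending module, configured to, when the counting semaphore value is 0, send the control message to the second group of processing devices through a message sending interface, according to the set content of the control message with which the current multi-core synchronization processing unit starts the second group of processing devices, so that the second group of processing devices processes the current task.
In the synchronous processing method and apparatus based on a multi-core system provided by the embodiments of the present invention, an initialization setting sent by any processing device in the first group of processing devices that synchronously process the same current task is received and initialization is performed; a notification message sent by any processing device in the first group is then received and the counting semaphore value is decremented by 1 accordingly; and when the counting semaphore value is 0, the control message is sent to the second group of processing devices through the message sending interface, according to the set content of the control message with which the current multi-core synchronization processing unit starts the second group, so that the second group of processing devices processes the current task. In the prior art, synchronization and communication between multiple cores usually have to be completed through interrupts, which results in low system scheduling efficiency and high resource consumption. By comparison, the solution provided by the embodiments of the present invention handles synchronization and communication between multiple processing devices by sending control messages, which can improve system scheduling efficiency with lower resource consumption.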
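Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the following briefly introduces the accompanying drawings needed for describing the embodiments or the prior art. Evidently, the drawings described below show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a flowchart of a synchronous processing method based on a multi-core system according to Embodiment 1 of the present invention;
FIG. 2 is a block diagram of a synchronous processing apparatus based on a multi-core system according to Embodiment 1 of the present invention;
FIG. 3 is a flowchart of a synchronous processing method based on a multi-core system according to Embodiment 2 of the present invention;
FIG. 4 is a schematic diagram of the independent processing of each synchronization processing unit according to Embodiment 2 of the present invention;
FIG. 5 is a schematic diagram of virtual address mapping according to Embodiment 2 of the present invention;
FIG. 6 is a schematic diagram of processing devices processing tasks according to Embodiment 2 of the present invention;
FIG. 7 is a block diagram of a synchronous processing apparatus based on a multi-core system according to Embodiment 2 of the present invention.
Detailed Description
The following clearly and completely describes the technical solutions in the embodiments of the present invention with reference to the accompanying drawings. Evidently, the described embodiments are only some rather than all of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.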
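Embodiment 1
This embodiment of the present invention provides a synchronous processing method based on a multi-core system. It should be noted that, in the embodiments of the present invention, a multi-core system is a system in which multiple processing devices exist. As shown in FIG. 1, the method includes:
Step 101: receive an initialization setting sent by any processing device in a first group of processing devices that synchronously process the same current task, and perform initialization, where the initialization setting includes setting the counting semaphore value of the current multi-core synchronization processing unit to M and setting the content of the control message with which the current multi-core synchronization processing unit starts a second group of processing devices, M being the number of times the group of processing devices synchronously processes the current task;
A processing device may be a processor or an accelerator. An accelerator can process tasks independently, or it can assist a processor with computation so that the processor handles tasks faster.
Specifically, the initialization setting sent by any processing device in the first group of processing devices that synchronously process the same current task is received through any channel of the internal cache;
the internal cache corresponding to the channel is read, the content of the internal cache is updated to the initialization setting, and the initialization setting is saved.
Step 102: receive a notification message sent by any processing device in the first group of processing devices, and decrement the counting semaphore value by 1 accordingly, where the content of the notification message is that the processing device sending the notification message has completed the current task;
Further, the notification message sent by any processing device in the first group of processing devices is received through any channel of the internal cache;
the internal cache corresponding to the channel is read, and the current value of the counting semaphore in the internal cache is obtained;
the current value of the counting semaphore is decremented by 1, and the value of the counting semaphore in the current internal cache is saved.
Step 103: when the counting semaphore value is 0, send the control message to the second group of processing devices through the message sending interface, according to the set content of the control message with which the current multi-core synchronization processing unit starts the second group of processing devices, so that the second group of processing devices processes the current task.
In the synchronous processing method based on a multi-core system provided by this embodiment, an initialization setting sent by any processing device in the first group of processing devices that synchronously process the same current task is received and initialization is performed; a notification message sent by any processing device in the first group is then received and the counting semaphore value is decremented by 1 accordingly; and when the counting semaphore value is 0, the control message is sent to the second group of processing devices through the message sending interface, according to the set content of the control message with which the current multi-core synchronization processing unit starts the second group, so that the second group processes the current task. In the prior art, synchronization and communication between multiple cores usually have to be completed through interrupts, which results in low system scheduling efficiency and high resource consumption. By comparison, the solution provided by this embodiment handles synchronization and communication between multiple processing devices by sending control messages, which can improve system scheduling efficiency with lower resource consumption.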
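Steps 101 to 103 can be illustrated with a minimal software sketch. It is only a toy model under stated assumptions: the patent describes a hardware unit, and the class name `MultiCoreSyncUnit`, the `send` callback standing in for the message sending interface, and the device names are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class MultiCoreSyncUnit:
    """Toy model of the multi-core synchronization processing unit (hypothetical API)."""
    send: Callable[[str, str], None]           # stand-in for the message sending interface
    count: int = 0                             # counting semaphore value
    control_message: str = ""                  # content preset during initialization
    targets: List[str] = field(default_factory=list)

    def initialize(self, m: int, control_message: str, targets: List[str]) -> None:
        # Step 101: save M and the control message that starts the second group.
        if m <= 0:
            raise ValueError("M must be positive")
        self.count = m
        self.control_message = control_message
        self.targets = list(targets)

    def notify_done(self, device: str) -> None:
        # Step 102: one first-group device reports that it finished the current task.
        if self.count == 0:
            raise RuntimeError("notification received with the counter already at 0")
        self.count -= 1
        # Step 103: at zero, start the second group by message rather than by interrupt.
        if self.count == 0:
            for target in self.targets:
                self.send(target, self.control_message)


sent = []
unit = MultiCoreSyncUnit(send=lambda dst, msg: sent.append((dst, msg)))
unit.initialize(3, "start", ["accelerator2"])
for dev in ("processor1", "accelerator1", "processor2"):
    unit.notify_done(dev)
# sent is now [("accelerator2", "start")]
```

The control message is emitted exactly once, by the notification that brings the counter to 0; no first-group device has to know which notification that will be.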
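This embodiment of the present invention provides a synchronous processing apparatus based on a multi-core system. Specifically, the apparatus is a multi-core synchronization processing unit. As shown in FIG. 2, the apparatus includes an initialization module 201, a processing module 202, and a sending module 203.
The initialization module 201 is configured to receive an initialization setting sent by any processing device in a first group of processing devices that synchronously process the same current task, and to perform initialization, where the initialization setting includes setting the counting semaphore value of the current multi-core synchronization processing unit to M and setting the content of the control message with which the current multi-core synchronization processing unit starts a second group of processing devices, M being the number of times the group of processing devices synchronously processes the current task;
The initialization module 201 includes a first receiving submodule and an update-and-save submodule.
The first receiving submodule is configured to receive, through any channel of the internal cache, the initialization setting sent by any processing device in the first group of processing devices that synchronously process the same current task;
The update-and-save submodule is configured to read the internal cache corresponding to the channel, update the content of the internal cache to the initialization setting, and save the initialization setting.
The processing module 202 is configured to receive a notification message sent by any processing device in the first group of processing devices, and to decrement the counting semaphore value by 1 accordingly, where the content of the notification message is that the processing device sending the notification message has completed the current task;
The processing module 202 includes a second receiving submodule, an obtaining submodule, and an updating submodule.
The second receiving submodule is configured to receive, through any channel of the internal cache, the notification message sent by any processing device in the first group of processing devices;
The obtaining submodule is configured to read the internal cache corresponding to the channel and obtain the current value of the counting semaphore in the internal cache;
The updating submodule is configured to decrement the current value of the counting semaphore by 1 and save the value of the counting semaphore in the current internal cache.
The sending module 203 is configured to, when the counting semaphore value is 0, send the control message to the second group of processing devices through the message sending interface, according to the set content of the control message with which the current multi-core synchronization processing unit starts the second group of processing devices, so that the second group of processing devices processes the current task.
In the synchronous processing apparatus based on a multi-core system provided by this embodiment, each time a processing device that synchronously processes the same current task completes it, the processing module decrements the value of the counting semaphore by 1 accordingly; when the value of the counting semaphore is 0, the sending module sends a response message to the second group of processing devices so that the second group processes the current task. In the prior art, synchronization and communication between multiple cores usually have to be completed through interrupts, which results in low system scheduling efficiency and high resource consumption. By comparison, the solution provided by this embodiment handles synchronization and communication between multiple processing devices by sending response control messages, which can improve system scheduling efficiency with lower resource consumption.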
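The module split of FIG. 2 can be sketched as three small classes sharing a per-channel internal cache. This is an illustrative assumption about how the modules might cooperate, not the patented circuit; `InternalCache`, the dictionary layout of an entry, and the `outbox` list standing in for the message sending interface are all hypothetical.

```python
class InternalCache:
    """One entry per channel; an entry holds the saved initialization setting."""

    def __init__(self, channels: int) -> None:
        self.entries = [None] * channels


class SendingModule:  # module 203
    def __init__(self) -> None:
        self.outbox = []  # stand-in for the message sending interface

    def send(self, control_message: str) -> None:
        self.outbox.append(control_message)


class InitializationModule:  # module 201
    def __init__(self, cache: InternalCache) -> None:
        self.cache = cache

    def receive(self, channel: int, m: int, control_message: str) -> None:
        # First receiving submodule: the setting arrives on `channel`.
        # Update-and-save submodule: overwrite the cache entry for that channel.
        self.cache.entries[channel] = {"count": m, "control_message": control_message}


class ProcessingModule:  # module 202
    def __init__(self, cache: InternalCache, sender: SendingModule) -> None:
        self.cache = cache
        self.sender = sender

    def receive(self, channel: int) -> None:
        entry = self.cache.entries[channel]   # obtaining submodule: read the current value
        entry["count"] -= 1                   # updating submodule: decrement and save
        if entry["count"] == 0:               # the sending module acts only at zero
            self.sender.send(entry["control_message"])


cache = InternalCache(channels=4)
sender = SendingModule()
InitializationModule(cache).receive(channel=2, m=2, control_message="start second group")
processing = ProcessingModule(cache, sender)
processing.receive(channel=2)
processing.receive(channel=2)
# sender.outbox is now ["start second group"]
```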
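Embodiment 2
It should be noted that this embodiment of the present invention provides a synchronous processing method based on a multi-core system. Its application scenario is as follows: each processing device in a first group of processing devices that synchronously process the same current task needs to process a part of the current task, and when the current task is completed, a second group of processing devices is notified by a control message to process the current task. A processing device is a processor or an accelerator; the current task may be completed by a processor or by an accelerator. The solution provided by this embodiment describes the method in terms of processing the current task. As shown in FIG. 3, the method includes:
Step 301: the multi-core synchronization processing unit receives, through any channel of the internal cache, an initialization setting sent by any processing device in the first group of processing devices that synchronously process the same current task, where the initialization setting includes setting the counting semaphore value of the current multi-core synchronization processing unit to M and setting the content of the control message with which the current multi-core synchronization processing unit starts the second group of processing devices, M being the number of times the group of processing devices synchronously processes the current task;
Here, a semaphore is a non-negative integer count used to coordinate access to resources, and it is initialized to the number of available resources. That is, M in the applied-for counting semaphore of value M is the number of available resources. In the present invention, each processor in the first group of processors that synchronously process the same current task needs to occupy one resource when processing the current task; the first group of processors synchronously processes the same current task M times, so M resources are occupied in total.
Specifically, the initialization setting is carried in an application message, and the application message carrying the initialization setting may be sent through multiple channels. In addition, the application message includes an application identifier, which identifies the resource in the internal cache obtained by this application, so that the resource is not applied for repeatedly.
Step 302: the multi-core synchronization processing unit reads the internal cache corresponding to the channel, updates the content of the internal cache to the initialization setting, and saves the initialization setting;
It should be noted that, after the internal cache corresponding to the channel is read, it may also be checked for anomalies, specifically whether the initialization setting is a duplicate and whether an illegal application is attempting initialization.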
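The anomaly check in step 302 could look like the sketch below. The source does not define what makes an application duplicate or illegal, so the rules here are assumptions made only for illustration: a second application for an occupied channel with the same application identifier is treated as a duplicate, one with a different identifier as illegal, and a non-positive M as illegal. `InitError` and `apply_initialization` are hypothetical names.

```python
class InitError(Exception):
    """Raised when an initialization application is rejected."""


def apply_initialization(cache: dict, channel: int, app_id: int,
                         m: int, control_message: str) -> None:
    entry = cache.get(channel)                 # read the internal cache for this channel
    if entry is not None:
        # Assumption: same identifier means duplicate, different identifier means illegal.
        kind = "duplicate" if entry["app_id"] == app_id else "illegal"
        raise InitError(f"{kind} application on channel {channel}")
    if m <= 0:
        raise InitError("illegal application: M must be positive")
    # Update the entry to the initialization setting and save it.
    cache[channel] = {"app_id": app_id, "count": m, "control_message": control_message}


cache = {}
errors = []
apply_initialization(cache, channel=0, app_id=7, m=3, control_message="start")
for app_id in (7, 8):
    try:
        apply_initialization(cache, channel=0, app_id=app_id, m=3, control_message="start")
    except InitError as exc:
        errors.append(str(exc))
# errors == ["duplicate application on channel 0", "illegal application on channel 0"]
```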
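It should be noted that steps 301 and 302 are the procedure in which the multi-core synchronization processing unit receives the initialization setting sent by any processing device in the first group of processing devices that synchronously process the same current task, and performs initialization. The procedure is performed specifically by the counting field value semaphore unit in the multi-core synchronization processing unit, which receives the application message carrying the initialization setting through a configuration interface provided on the multi-core synchronization processing unit.
In addition, the counting field value semaphore unit is integrated on the multi-core synchronization processing unit, which also includes a direct acquisition semaphore unit, an indirect acquisition semaphore unit, and a message sending unit. The direct acquisition semaphore unit is configured to send an acquisition identifier directly to the processing device that acquires the semaphore; the indirect acquisition semaphore unit is configured to send the acquisition identifier to the processing device that acquires the semaphore by sending a message or an interrupt; the message sending unit is configured to send a control message or a notification message to a processing device through the message sending interface; and the counting field value semaphore unit is configured to control the counting semaphore obtained by the processing devices that simultaneously access the multi-core synchronization processing unit.
As shown in the schematic diagram of FIG. 4, the processing flows of the direct acquisition semaphore unit, the indirect acquisition semaphore unit, and the message sending unit are similar to that of the counting field value semaphore unit. After receiving a new initialization setting, the direct acquisition semaphore unit, the indirect acquisition semaphore unit, and the message sending unit read the internal cache and check it for anomalies such as duplicate or illegal applications, then update the content of the internal cache to the content of the initialization setting and save it. It should be noted that the indirect acquisition semaphore unit, the counting field value semaphore unit, and the message sending unit also need to send a message or an interrupt to notify the multi-core synchronization processing unit that the task has been completed.
The prior art cannot pass messages between accelerators and processors and has to rely on interrupts. By comparison, the solution provided by this embodiment adds the counting field value semaphore unit and the message sending unit, so that messages can be passed effectively in both directions between accelerators and processors with lower resource consumption.
As shown in FIG. 4, the access traffic of the bus is controlled through the bus configuration interface. Specifically, before one bus access ends, the bus interface controls whether bus operations continue, and each bus operation corresponds to one read, process, and update operation on the internal cache. In this way, a fast and conflict-free synchronous scheduling mechanism can be implemented for multiple synchronization processing units.
Once the resources have been applied for and the multi-core synchronization processing unit has been initialized, each processing device can process its corresponding task.
Step 303: the multi-core synchronization processing unit receives a notification message sent by any processing device in the first group of processing devices, and decrements the counting semaphore value by 1 accordingly, where the content of the notification message is that the processing device sending the notification message has completed the current task;
The processing devices of the first group occupy resources in the counting semaphore while processing the task. Each time a processing device in the first group completes the current task, the multi-core synchronization processing unit performs one release operation on the semaphore corresponding to the current task, that is, the counting semaphore value is decremented by 1 accordingly.
Specifically, the internal cache is read through any channel of the internal cache, the current value of the counting semaphore in the internal cache is obtained, and anomalies are checked for; once the current value of the counting semaphore in the internal cache has been obtained, it is decremented by 1, and the value of the counting semaphore in the current internal cache is saved.
The internal cache can be read through multiple channels, and the direct acquisition semaphore unit, the indirect acquisition semaphore unit, the message sending unit, and the counting field value semaphore unit on the multi-core synchronization processing unit all access the internal cache through multiple channels. As shown in the schematic diagram of virtual address mapping in FIG. 5, the addresses of these units are mapped to the same internal cache. In the prior art, the semaphore processing unit and the IPC (Inter-Processor Communication) unit are two independent units and cannot effectively share the channel cache. By comparison, the solution provided by this embodiment maximizes sharing of the internal cache and saves internal cache resources. Further, the internal cache can be replaced by registers, which use fewer resources, and the number of channels can be made as large as possible, which effectively reduces the probability of contention between the cores.
In addition, regarding the shared channel cache, suppose there are 10 channels. In the prior art, software can configure some of these 10 channels to implement functions A, B, C, and D respectively, for example 4 channels for function A, 1 channel for function B, 1 channel for function C, and 2 channels for function D. With the solution provided by this embodiment, all 10 channels can implement any of functions A, B, C, and D, so that channels can be allocated to functions more flexibly.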
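The difference between the fixed split and the shared pool in the 10-channel example can be made concrete with a short sketch. The split 4/1/1/2 comes from the text above; the burst of requests and the function names `grant_fixed` and `grant_shared` are hypothetical and chosen only to show the effect.

```python
FIXED = {"A": 4, "B": 1, "C": 1, "D": 2}   # prior-art style split from the example
TOTAL_CHANNELS = 10


def grant_fixed(requests):
    """Each function can only use the channels reserved for it."""
    free = dict(FIXED)
    granted = []
    for func in requests:
        if free.get(func, 0) > 0:
            free[func] -= 1
            granted.append(func)
    return granted


def grant_shared(requests):
    """Any channel can serve any function, up to the total channel count."""
    return list(requests[:TOTAL_CHANNELS])


burst = ["B", "B", "B", "C", "C"]          # hypothetical burst of simultaneous requests
fixed_result = grant_fixed(burst)          # ["B", "C"]
shared_result = grant_shared(burst)        # ["B", "B", "B", "C", "C"]
```

With the fixed split, three of the five requests wait even though most channels are idle; with the shared pool, all five are served at once.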
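Step 304: determine whether the value of the counting semaphore is 0;
Specifically, determine whether the value of the counting semaphore in the current internal cache is 0.
Step 305: when the value of the counting semaphore is not 0, the first group of processing devices continues to process the current task;
If the value of the counting semaphore is not 0, the task that currently needs processing has not been completed. The processing devices in the first group that have not finished the current task continue to occupy resources of the counting semaphore and process the current task, and step 303 is then performed: when a processing device in the first group completes the current task, the counting semaphore value is decremented by 1 accordingly, and it is determined whether the value of the counting semaphore is now 0.
Step 306: when the value of the counting semaphore is 0, send the control message to the second group of processing devices through the message sending interface, according to the set content of the control message with which the current multi-core synchronization processing unit starts the second group of processing devices, so that the second group of processing devices processes the current task.
It should be noted that the control message was already set during initialization, in the application message sent in step 301 by any processing device in the first group of processing devices that synchronously process the same current task. Here, a counting semaphore value of 0 means that the first group of processing devices has completed the parts of the current task assigned to each of its processing devices; the first group then starts processing a second task, following the same procedure as for the current task.
FIG. 6 is a schematic diagram of processing the tasks. The first group of processing devices includes processor 1, processor 2, and accelerator 1, and the second group includes accelerator 2. Processor 1 processes part A of the current task, processor 2 processes part B, accelerator 1 processes part C, and accelerator 2 processes part D. Processor 1 takes the shortest time to complete part A, and processor 2 takes the longest time to complete part B. When processor 2 completes part B, processor 1 and accelerator 1 have already completed their parts; at this point the counting semaphore value of the multi-core synchronization processing unit has been reduced to 0, and the multi-core synchronization processing unit sends a control message to accelerator 2 so that accelerator 2 processes part D of the task. After processor 1, processor 2, and accelerator 1 complete their respective parts, they go on to execute other tasks.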
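The FIG. 6 scenario can be replayed as a small event-ordered simulation. The completion times (2, 5, and 9 time units) are invented for illustration; the source only states that processor 1 finishes first and processor 2 finishes last.

```python
# Hypothetical completion times for the first group (time units are arbitrary).
finish_times = {"processor1/A": 2, "accelerator1/C": 5, "processor2/B": 9}

count = len(finish_times)   # M = 3, one resource per first-group device
log = []
for when, who in sorted((t, d) for d, t in finish_times.items()):
    count -= 1              # step 303: one notification, one decrement
    log.append((when, who, count))
    if count == 0:          # steps 304 and 306: counter at zero, start the second group
        log.append((when, "control message -> accelerator2/D", count))

# log[-1] == (9, "control message -> accelerator2/D", 0)
```

Accelerator 2 is started at the moment the slowest first-group device finishes, without any processor having to poll or be interrupted.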
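For example, in wireless communication one frequency band carries multiple channel types, and after the channels are separated, different units are needed for subsequent processing. For instance, after accelerator A finishes processing the current task, the multi-core synchronization processing unit distributes control messages to different addresses to start accelerator B, start accelerator C, and notify processor D to process the current task. In the prior art, after accelerator A finishes the current task it notifies processor D by interrupt, and processor D then starts accelerator B and accelerator C by interrupt, so that processing N tasks requires responding to at least 3N interrupts. By comparison, the solution provided by this embodiment can effectively reduce system overhead; in particular, when the processing units started after synchronization are accelerators, the solution does not need to respond to any interrupt.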
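The interrupt counts in this example reduce to simple arithmetic, sketched below. The prior-art figure uses the lower bound of 3 interrupts per task stated above (A to D, then D to B and D to C); for the message-based scheme the sketch assumes accelerators are started without interrupts and processor D receives at most one optional interrupt per task. The function names are hypothetical.

```python
def prior_art_interrupts(n_tasks: int) -> int:
    # Per task: accelerator A interrupts processor D, then D starts B and C by interrupt.
    return 3 * n_tasks


def message_based_interrupts(n_tasks: int, interrupt_processor: bool = False) -> int:
    # Accelerators B and C are started by control messages only.
    # Assumption: processor D gets at most one optional interrupt per task.
    return n_tasks if interrupt_processor else 0


n = 100
before = prior_art_interrupts(n)                   # 300
after_accel_only = message_based_interrupts(n)     # 0
after_with_cpu = message_based_interrupts(n, True) # 100
```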
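It should be noted that, if a processing device in the second group is an accelerator, the multi-core synchronization processing unit in step 306 sends the control message to that accelerator and does not need to send it an interrupt; if a processing device in the second group is a processor, the multi-core synchronization processing unit in step 306 sends the control message to that processor and may or may not also send it an interrupt.
Specifically, an interrupt sent to a processor in the second group is sent through an interrupt output interface provided on the multi-core synchronization processing unit. The interrupt output interface is used to merge the notifications of multiple control messages, which reduces the number of notification interrupts sent and lowers resource consumption.
Further, the second group of processing devices receives the response message sent by the multi-core synchronization processing unit and processes the current task.
It should be noted that, if an interrupt is also received, the interrupt needs to be handled. When the processing devices need to process N tasks in total, the time each processing device takes per task varies. If a processor in the second group processes tasks slowly, it needs to respond to only one interrupt while all the tasks are processed; if a processor in the second group processes tasks quickly, it responds to at most N interrupts while all the tasks are processed. This effectively reduces the overhead of responding to interrupts.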
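The bounds of 1 and N interrupts follow from merging notifications while an interrupt is still pending. The sketch below models this with a single pending flag; the event names `"msg"` (a control message arrives) and `"ack"` (the processor responds to and clears the interrupt) are hypothetical, and the real interrupt output interface may merge notifications differently.

```python
def count_interrupts(events):
    """Count interrupts raised when notifications are merged while one is pending."""
    pending = False
    raised = 0
    for ev in events:
        if ev == "msg":
            if not pending:      # a new interrupt is raised only if none is outstanding
                pending = True
                raised += 1
        elif ev == "ack":        # the processor responds and clears the interrupt
            pending = False
    return raised


n = 5
slow = count_interrupts(["msg"] * n + ["ack"])   # processor responds after all N messages: 1
fast = count_interrupts(["msg", "ack"] * n)      # processor responds after every message: 5
```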
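In addition, when a scheduling processor is used to handle synchronization in the prior art, the scheduling speed of the scheduling processor is limited and the scheduling efficiency is very low, which cannot meet the requirements of high-speed service processing. For example, the service period in wireless communication is on the order of milliseconds while the processing period of the scheduling processor is on the order of microseconds, so processing one such service requires invoking the scheduling processor hundreds of times. By using the multi-core synchronization processing unit to handle synchronization, the solution provided by this embodiment can schedule multiple processing devices synchronously and quickly, which improves scheduling efficiency and saves processing time.
In the synchronous processing method based on a multi-core system provided by this embodiment, once every processing device that synchronously processes the same current task has completed it, the value of the counting semaphore has been reduced to 0, and the multi-core synchronization processing unit then sends a response message to the processing devices of the second group to notify them to process the current task. In the prior art, synchronization and communication between multiple cores usually have to be completed through interrupts, which results in low system scheduling efficiency and high resource consumption. By comparison, the solution provided by this embodiment handles synchronization and communication between multiple processing devices by sending response control messages, which can improve system scheduling efficiency, reduce system overhead, and lower resource consumption.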
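This embodiment of the present invention provides a synchronous processing apparatus based on a multi-core system. The apparatus may be a multi-core synchronization processing unit, which is located on an SOC (System on a Chip). As shown in FIG. 7, the apparatus includes: an initialization module 701, a first receiving submodule 702, an update-and-save submodule 703, a processing module 704, a second receiving submodule 705, an obtaining submodule 706, an updating submodule 707, and a sending module 708.
The initialization module 701 is configured to receive an initialization setting sent by any processing device in a first group of processing devices that synchronously process the same current task, and to perform initialization, where the initialization setting includes setting the counting semaphore value of the current multi-core synchronization processing unit to M and setting the content of the control message with which the current multi-core synchronization processing unit starts a second group of processing devices, M being the number of times the group of processing devices synchronously processes the current task;
A processing device is a processor or an accelerator; the current task may be completed by a processor or by an accelerator.
M in the applied-for counting semaphore of value M is the number of available resources. In the present invention, each processing device in the group of processing devices that synchronously process the same current task needs to occupy one resource when processing the current task; the first group of processing devices synchronously processes the same current task M times, so M resources are occupied in total.
Specifically, the first receiving submodule 702 in the initialization module 701 is configured to receive, through any channel of the internal cache, the initialization setting sent by any processing device in the first group of processing devices that synchronously process the same current task;
The initialization setting is carried in an application message, and the application message carrying the initialization setting may be sent through multiple channels. In addition, the application message includes an application identifier, which identifies the resource in the internal cache obtained by this application, so that the resource is not applied for repeatedly.
The update-and-save submodule 703 is configured to read the internal cache corresponding to the channel, update the content of the internal cache to the initialization setting, and save the initialization setting;
It should be noted that, after the internal cache corresponding to the channel is read, it also needs to be checked for anomalies, specifically whether the initialization setting is a duplicate and whether an illegal application is attempting initialization.
The processing module 704 is configured to receive a notification message sent by any processing device in the first group of processing devices, and to decrement the counting semaphore value by 1 accordingly, where the content of the notification message is that the processing device sending the notification message has completed the current task;
Specifically, the second receiving submodule 705 in the processing module 704 is configured to receive, through any channel of the internal cache, the notification message sent by any processing device in the first group of processing devices;
The notification message further includes an identifier for releasing the counting semaphore, that is, the application identifier carried in the application message is released so that the released resource can be used again.
The obtaining submodule 706 is configured to read the internal cache corresponding to the channel and obtain the current value of the counting semaphore in the internal cache;
It should be noted that, when the current value of the counting semaphore in the internal cache is obtained, the counting field value semaphore unit can read the internal cache through multiple channels, and the addresses of the direct acquisition semaphore unit, the indirect acquisition semaphore unit, the message sending unit, and the counting field value semaphore unit on the multi-core synchronization processing unit are mapped to the same internal cache. This maximizes sharing of the internal cache and saves internal cache resources. Further, the internal cache can be replaced by registers, which use fewer resources, and the number of channels can be made as large as possible, which effectively reduces the probability of contention between the cores.
In addition, the direct acquisition semaphore unit is configured to send an acquisition identifier directly to the processing device that acquires the semaphore; the indirect acquisition semaphore unit is configured to send the acquisition identifier to the processing device that acquires the semaphore by sending a message or an interrupt; the message sending unit is configured to send a control message or a notification message to a processing device through the message sending interface; and the counting field value semaphore unit is configured to control the counting semaphore obtained by the processing devices that simultaneously access the multi-core synchronization processing unit. Specifically, after receiving a new initialization setting, the direct acquisition semaphore unit, the indirect acquisition semaphore unit, and the message sending unit read the internal cache and check it for anomalies such as duplicate or illegal applications, then update the content of the internal cache to the content of the initialization setting and save it.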
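The updating submodule 707 is configured to decrement the current value of the counting semaphore by 1 and save the value of the counting semaphore in the current internal cache;
At this point, it is determined whether the value of the counting semaphore is 0; if it is not 0, the processing devices in the first group continue to process the current task;
When the counting semaphore value is 0, the sending module 708 is configured to send the control message to the second group of processing devices through the message sending interface, according to the set content of the control message with which the current multi-core synchronization processing unit starts the second group of processing devices, so that the second group of processing devices processes the current task.
It should be noted that the prior art has no sending module, that is, it cannot send response messages, and synchronization units can communicate only through interrupts. The solution provided by this embodiment adds a sending module, and the synchronization events of the synchronization unit are converted into message sending, so that multiple processors and accelerators can synchronize and communicate with each other.
The control message is sent to the second group of processing devices through the message sending interface, which is provided on the multi-core synchronization processing unit.
It should be noted that, if the sending module sends the control message to an accelerator, no interrupt needs to be sent to the second group of processing devices; if the sending module sends the control message to a processor, an interrupt may or may not be sent to the second group of processing devices.
Specifically, an interrupt sent to a processor in the second group is sent through an interrupt output interface provided on the multi-core synchronization processing unit. The interrupt output interface is used to merge the notifications of multiple control messages, which reduces the number of notification interrupts sent and lowers resource consumption.
In addition, when the processing devices need to process N tasks in total, the time each processing device takes per task varies. If a processor in the second group processes tasks slowly, it needs to respond to only one interrupt while all the tasks are processed; if a processor in the second group processes tasks quickly, it responds to at most N interrupts while all the tasks are processed. This effectively reduces the overhead of responding to interrupts.
For example, in wireless communication one frequency band carries multiple channel types, and after the channels are separated, different units are needed for subsequent processing. For instance, after accelerator A finishes processing the current task, the multi-core synchronization processing unit distributes control messages to different addresses to start accelerator B, start accelerator C, and notify processor D to process the current task. In the prior art, after accelerator A finishes the current task it notifies processor D by interrupt, and processor D then starts accelerator B and accelerator C by interrupt, so that processing N tasks requires responding to at least 3N interrupts. By comparison, the solution provided by this embodiment can effectively reduce system overhead; in particular, when the processing units started after synchronization are accelerators, the solution does not need to respond to any interrupt.
In the synchronous processing apparatus based on a multi-core system provided by this embodiment, each time a processing device that synchronously processes the same current task completes it, the processing module decrements the value of the counting semaphore by 1; when the value of the counting semaphore is 0, the sending module of the multi-core synchronization processing unit sends a control message to notify the second group of processing devices to process the current task. In the prior art, synchronization and communication between multiple cores usually have to be completed through interrupts, which results in low system scheduling efficiency and high resource consumption. By comparison, the solution provided by this embodiment handles synchronization and communication between multiple cores by sending response messages, which can improve system scheduling efficiency, reduce system overhead, and lower resource consumption.
The foregoing descriptions are merely specific implementations of the present invention, but the protection scope of the present invention is not limited thereto. Any variation or replacement readily figured out by a person skilled in the art within the technical scope disclosed in the present invention shall fall within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.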

Claims

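1. A synchronous processing method based on a multi-core system, characterized by including:
receiving an initialization setting sent by any processing device in a first group of processing devices that synchronously process the same current task, and performing initialization, where the initialization setting includes setting the counting semaphore value of the current multi-core synchronization processing unit to M and setting the content of the control message with which the current multi-core synchronization processing unit starts a second group of processing devices, M being the number of times the first group of processing devices synchronously processes the current task;
receiving a notification message sent by any processing device in the first group of processing devices, and decrementing the counting semaphore value by 1 accordingly, where the content of the notification message is that the processing device sending the notification message has completed the current task;
when the counting semaphore value is 0, sending the control message to the second group of processing devices through a message sending interface, according to the set content of the control message with which the current multi-core synchronization processing unit starts the second group of processing devices, so that the second group of processing devices processes the current task.
2. The synchronous processing method based on a multi-core system according to claim 1, characterized in that the receiving of an initialization setting sent by any processing device in a first group of processing devices that synchronously process the same current task, and the performing of initialization, include:
receiving, through any channel of an internal cache, the initialization setting sent by any processing device in the first group of processing devices that synchronously process the same current task;
reading the internal cache corresponding to the channel, updating the content of the internal cache to the initialization setting, and saving the initialization setting.
3. The synchronous processing method based on a multi-core system according to claim 2, characterized in that the receiving of a notification message sent by any processing device in the first group of processing devices, and the decrementing of the counting semaphore value by 1 accordingly, include:
receiving, through any channel of the internal cache, the notification message sent by any processing device in the first group of processing devices;
reading the internal cache corresponding to the channel, and obtaining the current value of the counting semaphore in the internal cache;
decrementing the current value of the counting semaphore by 1, and saving the value of the counting semaphore in the current internal cache.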
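4. A synchronous processing apparatus based on a multi-core system, characterized by including:
an initialization module, configured to receive an initialization setting sent by any processing device in a first group of processing devices that synchronously process the same current task, and to perform initialization, where the initialization setting includes setting the counting semaphore value of the current multi-core synchronization processing unit to M and setting the content of the control message with which the current multi-core synchronization processing unit starts a second group of processing devices, M being the number of times the first group of processing devices synchronously processes the current task;
a processing module, configured to receive a notification message sent by any processing device in the first group of processing devices, and to decrement the counting semaphore value by 1 accordingly, where the content of the notification message is that the processing device sending the notification message has completed the current task;
a sending module, configured to, when the counting semaphore value is 0, send the control message to the second group of processing devices through a message sending interface, according to the set content of the control message with which the current multi-core synchronization processing unit starts the second group of processing devices, so that the second group of processing devices processes the current task.
5. The synchronous processing apparatus based on a multi-core system according to claim 4, characterized in that the initialization module includes:
a first receiving submodule, configured to receive, through any channel of an internal cache, the initialization setting sent by any processing device in the first group of processing devices that synchronously process the same current task;
an update-and-save submodule, configured to read the internal cache corresponding to the channel, update the content of the internal cache to the initialization setting, and save the initialization setting.
6. The synchronous processing apparatus based on a multi-core system according to claim 5, characterized in that the processing module includes:
a second receiving submodule, configured to receive, through any channel of the internal cache, the notification message sent by any processing device in the first group of processing devices;
an obtaining submodule, configured to read the internal cache corresponding to the channel and obtain the current value of the counting semaphore in the internal cache;
an updating submodule, configured to decrement the current value of the counting semaphore by 1 and save the value of the counting semaphore in the current internal cache.
7. The synchronous processing apparatus based on a multi-core system according to claim 6, characterized in that the multi-core synchronization processing unit includes a direct acquisition semaphore unit, an indirect acquisition semaphore unit, a message sending unit, and a counting field value semaphore unit;
the direct acquisition semaphore unit is configured to send an acquisition identifier directly to the processing device that acquires the semaphore;
the indirect acquisition semaphore unit is configured to send the acquisition identifier to the processing device that acquires the semaphore by sending a message or an interrupt;
the message sending unit is configured to send a control message or a notification message to a processing device through the message sending interface;
the counting field value semaphore unit is configured to control the counting semaphore obtained by the processing devices that simultaneously access the multi-core synchronization processing unit.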
PCT/CN2011/078411 2011-08-15 2011-08-15 Synchronous processing method and apparatus based on a multi-core system WO2012106943A1 (zh)

Priority Applications (3)

Application Number Priority Date Filing Date Title
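CN2011800014795A CN102334104B (zh) 2011-08-15 2011-08-15 Synchronous processing method and apparatus based on a multi-core system
PCT/CN2011/078411 WO2012106943A1 (zh) 2011-08-15 2011-08-15 Synchronous processing method and apparatus based on a multi-core system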
US14/077,421 US9424101B2 (en) 2011-08-15 2013-11-12 Method and apparatus for synchronous processing based on multi-core system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
PCT/CN2011/078411 WO2012106943A1 (zh) 2011-08-15 2011-08-15 Synchronous processing method and apparatus based on a multi-core system

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US14/077,421 Continuation US9424101B2 (en) 2011-08-15 2013-11-12 Method and apparatus for synchronous processing based on multi-core system

Publications (1)

Publication Number Publication Date
WO2012106943A1 true WO2012106943A1 (zh) 2012-08-16

Family

ID=45484998

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2011/078411 WO2012106943A1 (zh) 2011-08-15 2011-08-15 Synchronous processing method and apparatus based on a multi-core system

Country Status (3)

Country Link
US (1) US9424101B2 (zh)
CN (1) CN102334104B (zh)
WO (1) WO2012106943A1 (zh)

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
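CN104281558B (zh) * 2013-07-01 2017-11-17 华为技术有限公司 Online upgrade method and chip
CN103559095B (zh) * 2013-10-30 2016-08-31 武汉烽火富华电气有限责任公司 Data synchronization method for a dual-core multi-processor architecture in the relay protection field
CN104462006B (zh) * 2015-01-05 2017-09-19 华为技术有限公司 Method and device for configuration synchronization among multiple processor cores in a system-on-chip
US9778951B2 (en) * 2015-10-16 2017-10-03 Qualcomm Incorporated Task signaling off a critical path of execution
DE102017100655A1 (de) * 2017-01-13 2018-07-19 Beckhoff Automation Gmbh Steuerung eines technischen Prozesses auf einer Mehr-Rechenkern-Anlage
CN113821469A (zh) * 2021-09-23 2021-12-21 深圳市元征科技股份有限公司 Multiprocessor synchronization method and apparatus, terminal device, and storage medium
CN114546669B (zh) * 2022-02-25 2024-06-25 苏州浪潮智能科技有限公司 Bucket-sharding-based data synchronization processing method, system, and terminal
CN117194308A (zh) * 2022-05-30 2023-12-08 华为技术有限公司 Multi-core processor and related inter-core communication method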

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20040221246A1 (en) * 2003-04-30 2004-11-04 Lsi Logic Corporation Method for use of hardware semaphores for resource release notification
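CN101114272A (zh) * 2007-01-22 2008-01-30 北京中星微电子有限公司 Chip capable of communication between multiple cores within the chip, and communication method
CN101546277A (zh) * 2009-04-27 2009-09-30 华为技术有限公司 Multi-core processor platform and multi-core processor synchronization method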

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101561766B (zh) * 2009-05-26 2011-06-15 北京理工大学 Low-overhead block synchronization method supporting multi-core helper threads
US9342379B2 (en) * 2011-01-21 2016-05-17 Wind River Systems, Inc. Lock free acquisition and release of a semaphore in a multi-core processor environment

Also Published As

Publication number Publication date
US20140075449A1 (en) 2014-03-13
CN102334104B (zh) 2013-09-11
CN102334104A (zh) 2012-01-25
US9424101B2 (en) 2016-08-23

Legal Events

Date Code Title Description
WWE Wipo information: entry into national phase

Ref document number: 201180001479.5

Country of ref document: CN

121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 11858097

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 11858097

Country of ref document: EP

Kind code of ref document: A1