CN109840151B - Load balancing method and device for multi-core processor

Info

Publication number: CN109840151B
Application number: CN201711229966.2A
Authority: CN
Inventors: 张鹏飞; 吴乐; 冯园园
Original assignee: Datang Mobile Communications Equipment Co Ltd
Current assignee: Datang Mobile Communications Equipment Co Ltd
Priority date: 2017-11-29
Filing date: 2017-11-29
Publication date: 2021-08-27
Anticipated expiration: 2037-11-29
Also published as: CN109840151A

Abstract

Embodiments of the invention relate to the technical field of computers, and in particular to a load balancing method and device for a multi-core processor. The method comprises: for at least one piece of shared data, acquiring a first processor identifier carried by a first process that has accessed the shared data, and a second process that needs to access the shared data; judging whether a second processor identifier carried by the second process is the same as the first processor identifier and, if not, modifying the processor identifier carried by the second process into the first processor identifier; and migrating the second process to the first processor in a load balancing period. Multiple processes sharing data can thus be migrated to the same processor, and the shared data is placed in the cache of that one processor rather than in several separate caches, so that the cache utilization rate of the operating system can be improved, and the execution efficiency of the processes can be improved.

Description

Load balancing method and device for multi-core processor

Technical Field

The embodiment of the invention relates to the technical field of computers, in particular to a load balancing method and device for a multi-core processor.

Background

At present, multi-core processors have become mainstream and are widely used. The process scheduler of the Linux operating system kernel performs well under an SMP (Symmetric Multi-Processing) architecture and is also widely adopted. However, the cache utilization rate of the Linux operating system under the SMP architecture is insufficient, which greatly reduces the execution efficiency of processes. For example, when a plurality of processes on a multi-core processor share data, each process has to fetch the shared data from memory and place it into its own cache when using it, which lowers the cache utilization rate of the operating system and also reduces the execution efficiency of the processes.

Disclosure of Invention

The embodiment of the invention provides a load balancing method and device for a multi-core processor, which are used for improving the cache utilization rate of an operating system so as to improve the execution efficiency of a process.

The embodiment of the invention provides a load balancing method for a multi-core processor, which comprises the following steps:

for at least one piece of shared data, acquiring a first processor identifier carried by a first process that has accessed the shared data, and a second process that needs to access the shared data;

judging whether a second processor identifier carried by the second process is the same as the first processor identifier or not, and if not, modifying the processor identifier carried by the second process into the first processor identifier;

in a load balancing period, for a second process under any processor, judging, by a scheduler, whether the processor identifier carried by the second process is the same as the identifier of that processor, and if not, migrating the second process to the first processor.

Preferably, the acquiring the first processor identifier carried by the first process having accessed the shared data and the second process needing to access the shared data includes:

determining a plurality of processes accessing the shared data;

after detecting the first process among the plurality of processes to use the shared data, acquiring the first processor identifier carried by that process;

modifying the processor identifier carried by the second process into the first processor identifier, including:

and for each process except the first process in the plurality of processes, modifying the processor identifier carried by the process into the first processor identifier.

Preferably, before acquiring the first processor identifier carried by the first process having accessed the shared data and the second process needing to access the shared data, the method further includes:

during initialization, for any process, assigning to the process the identifier of the processor to which the process belongs, so that the process carries that processor identifier.

Preferably, after modifying the processor identifier carried by the second process into the first processor identifier, the method further includes:

counting the number of processes whose processor identifier has been modified into the first processor identifier;

determining the process migration amount according to the number of processes;

before the scheduler judges whether the processor identifier carried by the process is the same as the identifier of the processor, the method further includes:

judging that the process migration amount is not zero.

An embodiment of the present invention further provides a load balancing apparatus for a multi-core processor, including:

the acquisition module is used for acquiring, for at least one piece of shared data, a first processor identifier carried by a first process that has accessed the shared data, and a second process that needs to access the shared data;

the feedback module is used for judging whether a second processor identifier carried by the second process is the same as the first processor identifier or not, and if not, modifying the processor identifier carried by the second process into the first processor identifier;

and the scheduling module is used for judging, in a load balancing period and for a second process under any processor, whether the processor identifier carried by the second process is the same as the identifier of that processor, and if not, migrating the second process to the first processor.

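Preferably, the obtaining module is specifically configured to:

determine a plurality of processes accessing the shared data;

after detecting the first process among the plurality of processes to use the shared data, acquire the first processor identifier carried by that process;

the feedback module is specifically configured to:

for each process other than the first process among the plurality of processes, modify the processor identifier carried by the process into the first processor identifier.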

Preferably, the apparatus further comprises an initialization module;

the initialization module is configured to assign, during initialization and for any process, the identifier of the processor to which the process belongs to that process.

Preferably, the feedback module is further configured to:

after the processor identifier carried by the second process is modified into the first processor identifier, count the number of processes whose processor identifier has been modified into the first processor identifier;

determine the process migration amount according to the number of processes;

and, before the scheduling module judges whether the processor identifier carried by the process is the same as the identifier of the processor, judge that the process migration amount is not zero.

Another embodiment of the present invention provides a computing device, which includes a memory for storing program instructions and a processor for calling the program instructions stored in the memory to execute any one of the above methods according to the obtained program.

Another embodiment of the present invention provides a computer storage medium having stored thereon computer-executable instructions for causing a computer to perform any one of the methods described above.

The load balancing method and device for a multi-core processor provided by the above embodiments include: for at least one piece of shared data, acquiring a first processor identifier carried by a first process that has accessed the shared data, and a second process that needs to access the shared data; judging whether a second processor identifier carried by the second process is the same as the first processor identifier and, if not, modifying the processor identifier carried by the second process into the first processor identifier; and, in a load balancing period, for a second process under any processor, judging by a scheduler whether the processor identifier carried by the second process is the same as the identifier of that processor and, if not, migrating the second process to the first processor. Thus, when shared data exists in a multi-core operating system, every other process sharing the data whose processor identifier differs from the first processor identifier can be migrated to the first processor, so that the processes sharing the data are gathered on the same processor. When these processes reuse the shared data, it no longer has to be placed in each of their respective caches but is placed in the cache of the first processor, so that the cache utilization rate of the operating system can be improved, and the execution efficiency of the processes can be improved.

Drawings

In order to more clearly illustrate the technical solutions in the embodiments of the present invention, the drawings that are required to be used in the description of the embodiments will be briefly described below.

Fig. 1 is a schematic flowchart of a load balancing method for a multi-core processor according to an embodiment of the present invention;

Fig. 2 is a schematic diagram of a feedback process according to an embodiment of the present invention;

Fig. 3 is a schematic flowchart of the scheduler's judgment according to an embodiment of the present invention;

Fig. 4 is a schematic structural diagram of a load balancing apparatus for a multi-core processor according to an embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more clearly apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.

It should be noted that the load balancing method and apparatus provided by the embodiments of the present invention may be applied to an operating system running on a homogeneous multi-core processor, for example an operating system under an SMP (Symmetric Multi-Processing) architecture.

Fig. 1 schematically illustrates a flowchart of a load balancing method for a multi-core processor according to an embodiment of the present invention. As shown in Fig. 1, the method may include:

S101, for at least one piece of shared data, acquiring a first processor identifier carried by a first process that has accessed the shared data, and a second process that needs to access the shared data.

S102, judging whether a second processor identifier carried by the second process is the same as the first processor identifier; if not, going to step S103, otherwise ending the flow.

S103, modifying the processor identifier carried by the second process into the first processor identifier.

S104, in a load balancing period, for a second process under any processor, the scheduler judges whether the processor identifier carried by the second process is the same as the identifier of that processor, and if not, migrates the second process to the first processor.

Specifically, in a load balancing cycle, for a process under any processor in the multi-core processor, the scheduler judges whether the processor identifier carried by the process is the same as the identifier of that processor. If so, the process is not migrated; if not, the process is migrated from that processor to the processor corresponding to the processor identifier carried by the process.
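Steps S101 to S104 can be sketched in user space as follows. This is only an illustrative model under stated assumptions: `struct proc`, `retag_to_first` and `balance` are hypothetical names that do not appear in the embodiment, and a real implementation would operate on kernel run queues rather than on a plain array.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-space model of a process; not the kernel's task_struct. */
struct proc {
    int pid;      /* process identifier */
    int run_cpu;  /* processor whose run queue currently holds the process */
    int cpu_num;  /* processor identifier carried by the process */
};

/* S101-S103: compare the identifier carried by `second` with the one carried
 * by `first` and overwrite it if they differ.
 * Returns 1 if the identifier was modified, 0 if it already matched. */
static int retag_to_first(const struct proc *first, struct proc *second)
{
    assert(first != NULL && second != NULL);
    if (second->cpu_num == first->cpu_num)
        return 0;
    second->cpu_num = first->cpu_num;
    return 1;
}

/* S104: during a load balancing period, migrate every process whose carried
 * identifier differs from the processor it currently sits on.
 * Returns the number of processes migrated. */
static size_t balance(struct proc *procs, size_t n)
{
    size_t moved = 0;
    assert(procs != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (procs[i].run_cpu != procs[i].cpu_num) {
            procs[i].run_cpu = procs[i].cpu_num; /* modelled migration */
            moved++;
        }
    }
    return moved;
}
```

In this model the retagging and the migration are deliberately separate calls, mirroring the description: the identifier is changed when sharing is detected, and the process only moves at the next load balancing period.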

First, before step S101, during initialization, for any process under the operating system running on the multi-core processor, the identifier of the processor to which the process belongs may be assigned to the process, so that the process carries the identifier of the processor to which it belongs.

Specifically, the processor identifier is added by modifying the structure with which the operating system defines a process. For example, a "CPU number" may be added to that structure, so that the process carries the number of the CPU to which it belongs.

In an alternative embodiment, the following program code modifies the structure with which the operating system defines a process and adds the processor identifier, so that the identifier of the processor to which the process belongs is assigned to the process.

struct task_struct {
    int prio, static_prio, normal_prio;      // process priority
    unsigned int rt_priority;
    const struct sched_class *sched_class;   // scheduler class to which the process belongs
    unsigned int policy;
    cpumask_t cpus_allowed;                  // restricts the CPUs on which the process may run
    struct sched_entity se;
    struct sched_rt_entity rt;
    int cpu_num;                             // saves the number of the CPU to which the process needs to be migrated
    …………………………
};

It should be noted that the above program code is only an example and does not limit the implementation of the embodiment of the present invention; that is, the structure with which the operating system defines a process may also be modified, and the processor identifier added, in other ways, so as to assign to the process the identifier of the processor to which it belongs.
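The initialization step can be illustrated with a small user-space stand-in. The structure `struct task` and the function `task_init` below are hypothetical, and passing the creating CPU as an argument is an assumption made so that the sketch does not depend on any kernel interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the process structure. */
struct task {
    int pid;      /* process identifier */
    int cpu_num;  /* identifier of the processor to which the process belongs */
};

/* At initialization, give the process the identifier of the processor it
 * belongs to, so that it carries that identifier from the start. */
static void task_init(struct task *t, int pid, int current_cpu)
{
    assert(t != NULL);
    t->pid = pid;
    t->cpu_num = current_cpu;
}
```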

After step S103 is executed, that is, after the processor identifier carried by the second process is modified into the first processor identifier, the number of processes modified into the first processor identifier may be counted, and the process migration amount is determined according to that number of processes.

To improve the execution efficiency of the scheduler, the scheduler may first judge whether the process migration amount is zero before migrating processes. If it is zero, the scheduler neither has to judge, for each processor, whether the processor identifiers carried by the processes under it are the same as the identifier of that processor, nor has to migrate the second process to the first processor, which improves the execution efficiency of the scheduler.
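The counting and the zero check can be sketched as below. The names are hypothetical, and the sketch assumes the simplest possible rule, namely that the migration amount equals the number of modified processes; the description only states that the amount is determined from that number.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bookkeeping kept by the feedback step. */
struct migrate_stats {
    unsigned int modified; /* processes retagged to the first processor identifier */
};

/* Called once each time a processor identifier is modified (step S103). */
static void record_modification(struct migrate_stats *s)
{
    assert(s != NULL);
    s->modified++;
}

/* Assumption: the migration amount is simply the number of modified processes. */
static unsigned int migration_amount(const struct migrate_stats *s)
{
    assert(s != NULL);
    return s->modified;
}

/* Gate used before the scheduler scans any run queue: returns non-zero only
 * when there is something to migrate. */
static int should_scan(const struct migrate_stats *s)
{
    return migration_amount(s) != 0;
}
```

The gate keeps the common case cheap: when no identifier has been modified since the last period, the scheduler skips the per-process comparison entirely.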

In an operating system, a lock mechanism is usually adopted when shared data exists. On Linux, for example, Futex (fast userspace mutex) is the basic tool for implementing locks and higher-level mutual exclusion. Futex is a synchronization mechanism that mixes user mode and kernel mode and requires the two parts to cooperate; Linux provides a corresponding system call to support synchronization when processes contend. Processes that need to synchronize share a memory region, and the Futex is located in that region. In addition, owing to the good performance of Futex itself, higher-level locks can be built on it, such as the common POSIX (Portable Operating System Interface of UNIX) mutex.

Based on the lock mechanism of the operating system, step S101 may be executed by a shared data acquisition module preset in the operating system, which acquires information about each process sharing the data. To speed up modification, after the shared data acquisition module acquires the first processor identifier carried by the first process that has accessed the shared data, a shared data feedback module may judge, one by one, whether the processor identifier carried by each of the remaining processes sharing the data is the same as the first processor identifier carried by the first process and, if not, modify the processor identifier carried by that process into the first processor identifier.

Based on the lock mechanism of the operating system, step S101 and step S102 may also be implemented according to the feedback process shown in Fig. 2.

In Fig. 2, after the shared data acquisition module acquires the first processor identifier carried by the first process using the shared data, it sends the first processor identifier to the shared data feedback module. When any one of the other processes is scheduled, the shared data feedback module judges whether the processor identifier carried by that process is the same as the first processor identifier and, if not, modifies the processor identifier carried by the process into the first processor identifier. For example, the judgment may be made when the process wakes up: if the processor identifier carried by the process differs from the first processor identifier, it is modified into the first processor identifier.

In an alternative embodiment, the operations of step S101 and step S102 may be implemented by the following program code.

It should be noted that the above program code is only an example and does not limit the implementation of the embodiment of the present invention; that is, the operations of step S101 and step S102 may also be performed in other ways.
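Separately from the listing referred to above, the feedback process of Fig. 2 can be sketched in user space as follows. All names (`struct shared_data`, `on_first_use`, `on_wakeup`) are hypothetical, and the sketch assumes that the acquisition module records the identifier once per piece of shared data and that the feedback check runs when a sharing process wakes up.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record kept for one piece of shared data. */
struct shared_data {
    int has_first;  /* non-zero once a first user has been observed */
    int first_cpu;  /* processor identifier carried by that first user */
};

struct proc {
    int pid;
    int cpu_num;    /* processor identifier carried by the process */
};

/* Acquisition: record the identifier carried by the first process that uses
 * the data. Later callers leave the record unchanged. */
static void on_first_use(struct shared_data *sd, const struct proc *p)
{
    assert(sd != NULL && p != NULL);
    if (!sd->has_first) {
        sd->first_cpu = p->cpu_num;
        sd->has_first = 1;
    }
}

/* Feedback: called when a sharing process wakes up.
 * Returns 1 if the carried identifier was modified, 0 otherwise. */
static int on_wakeup(const struct shared_data *sd, struct proc *p)
{
    assert(sd != NULL && p != NULL);
    if (!sd->has_first || p->cpu_num == sd->first_cpu)
        return 0;
    p->cpu_num = sd->first_cpu;
    return 1;
}
```

Doing the comparison at wake-up spreads the cost over the sharers instead of walking the whole list at once, which is the trade-off between this variant and the one-by-one modification described earlier.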

Based on the lock mechanism in the operating system, a plurality of processes accessing the shared data can be determined. When the first process among them to use the shared data is detected, the first processor identifier carried by that process is acquired; then, for each process other than the first process among the plurality of processes, the processor identifier carried by the process is modified into the first processor identifier.

For example, based on a Futex (fast userspace mutex) lock in the Linux operating system, assume it is determined that three processes, process 1, process 2 and process 3, need to share "shared data A". Assume that the processor identifier carried by process 1 is "CPU 1", that carried by process 2 is "CPU 2", and that carried by process 3 is "CPU 3", and further assume that process 1 is the first of the three to use "shared data A". The processor identifier carried by process 2 may then be modified from "CPU 2" to "CPU 1", and the processor identifier carried by process 3 from "CPU 3" to "CPU 1".
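The three-process example can be reproduced with a short sketch. `retag_sharers` is a hypothetical helper, and representing "CPU 1" to "CPU 3" as the integers 1 to 3 is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct proc {
    int pid;
    int cpu_num; /* processor identifier carried by the process */
};

/* Give every sharer other than sharers[first] the identifier carried by
 * sharers[first]. Returns the number of identifiers actually modified. */
static size_t retag_sharers(struct proc *sharers, size_t n, size_t first)
{
    size_t modified = 0;
    assert(sharers != NULL && first < n);
    int target = sharers[first].cpu_num;
    for (size_t i = 0; i < n; i++) {
        if (i != first && sharers[i].cpu_num != target) {
            sharers[i].cpu_num = target;
            modified++;
        }
    }
    return modified;
}
```

Called with processes 1, 2 and 3 carrying identifiers 1, 2 and 3 and with process 1 as the first user, the helper modifies two identifiers and leaves all three processes carrying identifier 1.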

When it is determined, based on the lock mechanism in the operating system, that multiple pieces of shared data exist, the order in which the processes are discovered may be determined first. Then, after the first process among the plurality of processes to use a piece of shared data is detected, the first processor identifier carried by that process is acquired, and for each process other than the first process among the plurality of processes, the processor identifier carried by the process is modified into the first processor identifier.

For example, based on Futex lock 1 in the Linux operating system, it may be determined that two processes, process 4 and process 5, need to share "shared data B"; based on Futex lock 2, it may be determined that two processes, process 6 and process 7, need to share "shared data C". Assume that the processor identifier carried by process 4 is "CPU 4", that carried by process 5 is "CPU 5", that carried by process 6 is "CPU 6", and that carried by process 7 is "CPU 7", and further assume that the processes are discovered in the order process 4, process 5, process 6, process 7. If process 4 is the first process to use "shared data B", the processor identifier carried by process 5 may be modified from "CPU 5" to "CPU 4". When it is further discovered that process 6 and process 7 share "shared data C" and that process 6 is the first process to use "shared data C", the processor identifier carried by process 7 may be modified from "CPU 7" to "CPU 4".

When it is determined, based on the lock mechanism in the operating system, that multiple pieces of shared data exist and that one process needs to share more than one of them, the order in which the processes are discovered is determined first. Then, after the first process among the plurality of processes to use a piece of shared data is detected, the first processor identifier carried by that process is acquired, and for each process other than the first process among the plurality of processes, the processor identifier carried by the process is modified into the first processor identifier.

For example, based on Futex lock 3 in the Linux operating system, it may be determined that two processes, process 8 and process 9, need to share "shared data E"; based on Futex lock 2, it may be determined that two processes, process 9 and process 10, need to share "shared data F". Assume that the processor identifier carried by process 8 is "CPU 8", that carried by process 9 is "CPU 9", and that carried by process 10 is "CPU 10", and that the processes are discovered in the order process 8, process 9, process 10. If process 8 is the first process to use "shared data E", the processor identifier carried by process 9 may be modified from "CPU 9" to "CPU 8". When it is then found that process 9 and process 10 share "shared data F", the processor identifier carried by process 10 is also modified, from "CPU 10" to "CPU 8".
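The chained case, in which process 9 shares two pieces of data, can be sketched as follows. `struct share_group` and `propagate` are hypothetical, and the sketch assumes that the groups are handled in discovery order and that the first entry of each group is its first user (process 9 for "shared data F", which the example does not state explicitly).

```c
#include <assert.h>
#include <stddef.h>

#define MAX_SHARERS 4

struct proc {
    int pid;
    int cpu_num; /* processor identifier carried by the process */
};

/* Hypothetical record of the processes that share one lock-protected piece
 * of data; sharers[0] is assumed to be the first process to use it. */
struct share_group {
    struct proc *sharers[MAX_SHARERS];
    size_t count;
};

/* Handle the groups in discovery order. Because an earlier group may already
 * have modified the identifier of a later group's first user, the identifier
 * propagates along the chain. */
static void propagate(struct share_group *groups, size_t n_groups)
{
    assert(groups != NULL || n_groups == 0);
    for (size_t g = 0; g < n_groups; g++) {
        if (groups[g].count == 0)
            continue;
        int target = groups[g].sharers[0]->cpu_num;
        for (size_t i = 1; i < groups[g].count; i++)
            groups[g].sharers[i]->cpu_num = target;
    }
}
```

With the groups {process 8, process 9} and {process 9, process 10}, process 9 first takes identifier 8, and process 10 then inherits 8 from process 9, matching the example.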

In a specific implementation, to speed up the modification of the processor identifier for each process other than the first process among the plurality of processes, the processor identifier carried by the first process is acquired once the first process to use the shared data has been detected. The remaining processes sharing the data are then checked one by one: if the processor identifier carried by a process differs from the one carried by the first process, it is modified into the processor identifier carried by the first process.

Alternatively, after the first process to use the shared data has been detected and the processor identifier it carries has been acquired, the check may be performed for each of the other processes sharing the data when that process is woken up. If the processor identifier carried by the process differs from the one carried by the first process, the first processor identifier carried by the first process is sent to the process, and the processor identifier carried by the process is modified into the first processor identifier through a predefined structure.

In the load balancing cycle, for a process under any processor in the multi-core processor, the specific flow of the scheduler's judgment is shown in Fig. 3.

S301, judging whether the migration amount is zero; if not, going to step S302, otherwise ending the flow.

S302, judging whether the processor identifier carried by a process in the queue of the current processor is the same as the identifier of the current processor; if so, going to step S303, otherwise going to step S304.

S303, performing no process migration.

S304, migrating the process to the processor corresponding to the processor identifier carried by the process.

Specifically, the scheduler may migrate the process to the processor corresponding to the carried processor identifier by calling the pull_task() function.

S305, judging whether the queue of the current processor contains other processes that have not yet been judged; if so, going to step S302, otherwise ending the flow.
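The flow of Fig. 3 can be modelled in user space as below. The fixed-size run queues, `move_task` and `balance_cpu` are hypothetical; `move_task` merely stands in for the migration that the description attributes to pull_task(), and decrementing the migration amount after each move is an assumption of this sketch.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS 4
#define QUEUE_CAP 8

struct proc {
    int pid;
    int cpu_num; /* processor identifier carried by the process */
};

struct run_queue {
    struct proc *tasks[QUEUE_CAP];
    size_t len;
};

/* Hypothetical stand-in for the migration done via pull_task(): move the
 * task at rqs[src].tasks[idx] to the queue named by its cpu_num.
 * Returns 1 on success, 0 if the target is invalid or its queue is full. */
static int move_task(struct run_queue *rqs, int src, size_t idx)
{
    struct proc *p = rqs[src].tasks[idx];
    if (p->cpu_num < 0 || p->cpu_num >= NR_CPUS)
        return 0;
    struct run_queue *dst = &rqs[p->cpu_num];
    if (dst->len == QUEUE_CAP)
        return 0;
    dst->tasks[dst->len++] = p;
    rqs[src].len--;                                   /* swap-remove from source */
    rqs[src].tasks[idx] = rqs[src].tasks[rqs[src].len];
    return 1;
}

/* One pass of the Fig. 3 flow over the queue of processor `cpu`. */
static void balance_cpu(struct run_queue *rqs, int cpu, unsigned int *migrate_count)
{
    assert(rqs != NULL && migrate_count != NULL);
    assert(cpu >= 0 && cpu < NR_CPUS);
    if (*migrate_count == 0)               /* S301: nothing to migrate */
        return;
    size_t i = 0;
    while (i < rqs[cpu].len) {             /* S305: any process left to judge? */
        struct proc *p = rqs[cpu].tasks[i];
        if (p->cpu_num == cpu) {           /* S302: identifiers match */
            i++;                           /* S303: no migration */
        } else if (move_task(rqs, cpu, i)) { /* S304: migrate */
            if (*migrate_count > 0)
                (*migrate_count)--;        /* slot i now holds an unjudged task */
        } else {
            i++;                           /* could not migrate; skip for now */
        }
    }
}
```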

As described above, when shared data exists in a multi-core operating system, a first processor identifier carried by a first process that has accessed the shared data, and a second process that needs to access the shared data, are first acquired. When the second processor identifier carried by the second process differs from the first processor identifier carried by the first process, the processor identifier carried by the second process is modified into the first processor identifier. Finally, in a load balancing cycle, for a second process under any processor, the scheduler judges whether the processor identifier carried by the second process is the same as the identifier of that processor and, if not, migrates the second process to the first processor. In this way, every other process sharing the data whose processor identifier differs from the first processor identifier can be migrated to the first processor. When the processes reuse the shared data, it does not have to be placed in each of their respective caches but is placed in the cache of the first processor, so that the cache utilization rate of the operating system can be improved, and the execution efficiency of the processes can be improved. In addition, before migrating processes, the scheduler may first judge whether the process migration amount is zero. If it is zero, the scheduler neither has to judge, for each processor, whether the processor identifiers carried by the processes under it are the same as the identifier of that processor, nor has to migrate the second process to the first processor, so that the execution efficiency of the scheduler can also be improved.

Based on the same technical concept, an embodiment of the present invention further provides a load balancing apparatus for a multi-core processor. As shown in Fig. 4, the apparatus may include:

an obtaining module 402, configured to obtain, for at least one piece of shared data, a first processor identifier carried by a first process that has accessed the shared data and a second process that needs to access the shared data;

a feedback module 403, configured to determine whether a second processor identifier carried by the second process is the same as the first processor identifier, and if not, modify the processor identifier carried by the second process into the first processor identifier;

a scheduling module 404, configured to determine, for a second process under any processor in a load balancing cycle, whether a processor identifier carried by the second process is the same as an identifier of the processor, and if not, migrate the second process to the first processor.

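Preferably, the obtaining module 402 is specifically configured to:

determine a plurality of processes accessing the shared data;

after detecting the first process among the plurality of processes to use the shared data, acquire the first processor identifier carried by that process;

preferably, the feedback module 403 is specifically configured to:

for each process other than the first process among the plurality of processes, modify the processor identifier carried by the process into the first processor identifier.

Preferably, the apparatus further includes an initialization module 401;

the initialization module 401 is configured to assign, during initialization and for any process, the identifier of the processor to which the process belongs to that process.

Preferably, the feedback module 403 is further configured to:

after the processor identifier carried by the second process is modified into the first processor identifier, count the number of processes whose processor identifier has been modified into the first processor identifier;

determine the process migration amount according to the number of processes;

and, before the scheduling module 404 judges whether the processor identifier carried by the process is the same as the identifier of the processor, judge that the process migration amount is not zero.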

An embodiment of the present invention further provides a computing device, which may specifically be a desktop computer, a portable computer, a smart phone, a tablet computer, a Personal Digital Assistant (PDA), or the like. The computing device may include a Central Processing Unit (CPU), a memory, input/output devices, and so on. The input devices may include a keyboard, a mouse, a touch screen, and the like, and the output devices may include a display device such as a Liquid Crystal Display (LCD) or a Cathode Ray Tube (CRT).

The memory may include Read Only Memory (ROM) and Random Access Memory (RAM), and provides the processor with program instructions and data stored in the memory. In an embodiment of the present invention, the memory may be used to store a program for a load balancing method of a multicore processor.

The processor is configured to perform any of the above methods in accordance with the obtained program instructions by calling the program instructions stored in the memory.

An embodiment of the present invention further provides a computer storage medium, configured to store computer program instructions for the computing device, where the computer program instructions include a program for executing the load balancing method for a multi-core processor.

The computer storage media may be any available media or data storage device that can be accessed by a computer, including, but not limited to, magnetic memory (e.g., floppy disks, hard disks, magnetic tape, magneto-optical disks (MOs), etc.), optical memory (e.g., CDs, DVDs, BDs, HVDs, etc.), and semiconductor memory (e.g., ROMs, EPROMs, EEPROMs, non-volatile memory (NAND FLASH), Solid State Disks (SSDs)), etc.

In summary, when shared data exists in a multi-core operating system, a first processor identifier carried by a first process that has accessed the shared data, and a second process that needs to access the shared data, are first acquired. When the second processor identifier carried by the second process differs from the first processor identifier carried by the first process, the processor identifier carried by the second process is modified into the first processor identifier. Finally, in a load balancing cycle, for a second process under any processor, the scheduler judges whether the processor identifier carried by the second process is the same as the identifier of that processor and, if not, migrates the second process to the first processor. In this way, every other process sharing the data whose processor identifier differs from the first processor identifier can be migrated to the first processor, so that the processes sharing the data are gathered on the same processor. When the processes reuse the shared data, it does not have to be placed in each of their respective caches but is placed in the cache of the first processor, so that the cache utilization rate of the operating system can be improved, and the execution efficiency of the processes can be improved. In addition, before migrating processes, the scheduler may first judge whether the process migration amount is zero. If it is zero, the scheduler neither has to judge, for each processor, whether the processor identifiers carried by the processes under it are the same as the identifier of that processor, nor has to migrate the second process to the first processor, so that the execution efficiency of the scheduler can also be improved.

It should be apparent to those skilled in the art that embodiments of the present invention may be provided as a method or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.

It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims

1. A method for load balancing of a multi-core processor, comprising:

for at least one piece of shared data, determining a plurality of processes accessing the shared data; after monitoring a first process, among the plurality of processes, that uses the shared data, acquiring a first processor identifier carried by the first process;

judging whether a second processor identifier carried by each process except the first process in the plurality of processes is the same as the first processor identifier or not, and if not, modifying the processor identifier carried by the process into the first processor identifier;

in a load balancing period, for each process other than the first process in the plurality of processes, judging, by a scheduler, whether the processor identifier carried by the process is the same as the identifier of the processor under which the process currently resides, and if not, migrating the process to the first processor.

2. The method of claim 1, wherein after monitoring a first process of the plurality of processes that uses the shared data, and before obtaining a first processor identification carried by the first process, further comprising:

3. The method of claim 1, wherein after modifying the processor identification carried by the process to the first processor identification, further comprising:

determining the process migration volume according to the process quantity;

and determining that the process migration volume is not zero.

4. A load balancing apparatus for a multi-core processor, comprising:

an obtaining module, configured to determine, for at least one piece of shared data, a plurality of processes accessing the shared data, and, after monitoring a first process, among the plurality of processes, that uses the shared data, to acquire a first processor identifier carried by the first process;

a feedback module, configured to determine whether a second processor identifier carried by each of the multiple processes, except for the first process, is the same as the first processor identifier, and if not, modify the processor identifier carried by the process into the first processor identifier;

and a scheduling module, configured to, in a load balancing period and for each process other than the first process in the plurality of processes, judge, by a scheduler, whether the processor identifier carried by the process is the same as the identifier of the processor under which the process currently resides, and if not, migrate the process to the first processor.

5. The apparatus of claim 4, further comprising: an initialization module;

6. The apparatus of claim 4, wherein the feedback module is further configured to:

after the processor identifier carried by the process is modified into the first processor identifier, counting the number of the processes modified into the first processor identifier;

determining the process migration volume according to the process quantity;

7. A computing device, comprising:

a memory for storing program instructions;

a processor for calling program instructions stored in said memory to execute the method of any one of claims 1 to 3 in accordance with the obtained program.

8. A computer storage medium having stored thereon computer-executable instructions for causing a computer to perform the method of any one of claims 1 to 3.