WO2014188561A1 - Multi-CPU system and scaling method for multi-CPU system - Google Patents
Multi-CPU system and scaling method for multi-CPU system
- Publication number
- WO2014188561A1 (PCT/JP2013/064370, JP2013064370W)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- cpu
- data processing
- cpus
- performance
- environment
- Prior art date
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for program control, e.g. control units
- G06F9/06—Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
- G06F9/46—Multiprogramming arrangements
- G06F9/50—Allocation of resources, e.g. of the central processing unit [CPU]
- G06F9/5094—Allocation of resources, e.g. of the central processing unit [CPU] where the allocation takes into account power or heat criteria
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Definitions
- the present invention relates to an asymmetric multi-CPU system, and further to a scaling method therefor, for example, a technique effective for extending battery driving time by application to a portable information terminal device.
- Patent Document 1 describes a technique for extending battery driving time while guaranteeing high data processing performance.
- this technique provides a second CPU that has lower peak performance and higher power efficiency than the first CPU, and the load is monitored by the second CPU.
- when the load is high, the first CPU executes the process; when the load is low, the second CPU executes the process instead of the first CPU.
- in this way, the leakage power during system operation is reduced in accordance with the load state and temperature change.
- Patent Document 2 describes, for a multiprocessor system, a management unit that manages the power consumption information required when a task is executed by each processor, and this power consumption information is used when selecting a processor to execute the task.
- a technique is described in which the processor with the maximum amount of processing per unit of power consumption is selected and the task is assigned to that processor. This makes it possible to implement a multiprocessor system that can execute a larger amount of processing with a limited amount of power in an environment, such as a mobile terminal, where the usable amount of power is limited.
- the present inventor has examined scaling of CPU processing in an asymmetric multi-CPU system in which processing is assigned asymmetrically to a plurality of CPUs. That is, the assignment of CPUs that execute tasks is variable between CPUs with high data processing performance and CPUs with low power consumption. As one technique for this, it has been proposed to exclusively switch the group of CPUs to be used between a group of CPUs with high data processing performance and a group of CPUs with low power consumption according to the system load. In another, the CPUs in the high-data-processing-performance group and the CPUs in the low-power-consumption group are made to correspond one-to-one, and the system load is handled by DVFS processing or the like between the corresponding CPUs.
- in contrast, definition information is prepared that holds multiple combination forms of CPU type and number, such that the overall maximum data processing performance and maximum power consumption differ in stages, and one form is selected from the definition information according to the data processing environment.
- the type and number of CPUs assigned to data processing are thus controlled according to the selected form.
- FIG. 1 is an explanatory diagram that exemplarily shows a hierarchical configuration of processor hardware and software.
- FIG. 2 is a system configuration diagram showing a system configuration example of an asymmetric multi-CPU system.
- FIG. 3 is an explanatory view exemplifying a combination of CPUs (BigCPU) 8a to 8d and CPUs (LittleCPU) 9a to 9d.
- FIG. 4A is an explanatory diagram showing, as a comparative example, a method of exclusively switching a group of CPUs to be used according to a system load or the like.
- FIG. 4B is an explanatory diagram showing, as a comparative example, a method in which a plurality of CPUs included in a group of CPUs with high data processing performance and a plurality of CPUs included in a group of CPUs with low power consumption correspond one-to-one, and the CPU to be used is switched exclusively between the corresponding CPUs according to the system load.
- FIG. 5A is an explanatory view exemplifying a combination form relating to the type and number of CPUs in the case of FIG. 4A.
- FIG. 5B is an explanatory view exemplifying a combination form relating to the type and number of CPUs in the case of FIG. 4B.
- FIG. 6 is a flowchart illustrating the control flow of the virtual processor assignment process.
- FIG. 7 is an explanatory diagram showing an example of how the governor obtains information on the heat generation status, the remaining battery capacity, and the processing load.
- FIG. 8 is an explanatory diagram showing a processing example of how the governor determines how to switch virtual processors and how to select an appropriate virtual processor according to the result.
- FIG. 9 is a flowchart exemplifying virtual processor allocation control in a case where skip update as well as stepwise update of the virtual processor can be adopted.
- FIG. 10 is an explanatory diagram that hierarchically illustrates the hardware and software configurations of the processor when the power control of the CPU is also taken into consideration.
- FIG. 11 is a flowchart illustrating the processing flow of the initialization part when the configuration of FIG. 10 is adopted.
- FIG. 12 is an explanatory diagram showing an example of a table that defines criteria used for determination when unnecessary CPUs are reduced in steps 18-4 and 18-5.
- FIG. 13 is a flowchart illustrating a processing flow in a case where the power saving effect is further enhanced by controlling the CPU hot plug from the governor of FIG. 10 while the system is operating.
- FIG. 14 shows a hierarchical structure of the processor hardware and software that, by further combining DVFS control with the preceding example, enables control that continuously improves performance while minimizing power consumption.
- FIG. 15 is a flowchart illustrating a control flow in which DVFS processing is added to FIG.
- FIG. 16 is an explanatory diagram illustrating the contents of the DVFS process in step 25.
- FIG. 17 is an explanatory diagram showing a modification in which the CPU synchronous clock frequency is scaled within the processing performance range of the selected virtual processor, without straying into the range of the adjacent performance level.
- allocation of CPUs for data processing is based on the definition information, so control is possible in which only high-load tasks are allocated to CPUs with high data processing performance and only low-load tasks are allocated to low-power-consumption CPUs. Therefore, it is not necessary to rely entirely on the OS task scheduler or task dispatcher, and it is not necessary to optimize the OS task scheduler or task dispatcher for such processing.
- since all CPUs can be used to the maximum, the operating efficiency of CPU resources is good, and the performance that can be realized by assigning data processing to CPUs is not limited to high or low data processing performance.
- since intermediate data processing performance can also be realized, there is almost no possibility of wasting power. Therefore, it is possible to suppress unnecessary power consumption according to the load of the data processing environment and easily realize the necessary data processing performance.
- the plurality of forms of the combination of the CPU type and the number possessed by the definition information are forms in which the CPU type and the number are combined in the direction of increasing the data processing performance and the power consumption in stages (10 , 13).
- the data processing environment is a first environment grasped from the magnitude of the data processing load (1b); a second environment grasped from the data processing load (1b) and the heat generation status (1c); a third environment grasped from the data processing load (1b), the heat generation status (1c), and the remaining battery level (1d);
- or a fourth environment grasped from user settings (1e, 1f), the data processing load (1b), the heat generation status (1c), and the remaining battery level (1d).
- with the first environment, it is possible to grasp the data processing environment in terms of the margin of data processing capacity.
- with the second environment, it is possible to grasp the data processing environment in terms of the margin of data processing capacity in consideration of the heat generation status.
- with the third environment, it is possible to grasp the data processing environment in terms of the margin of data processing capacity in consideration of the heat generation status and the power supply limit.
- with the fourth environment, it is possible to grasp the data processing environment in terms of the margin of data processing capacity in consideration of the heat generation status and the power supply limit, and the user setting can also be reflected in the data processing environment.
- the heat generation state is a heat generation state of CPUs included in a group of CPUs having relatively large data processing performance and power consumption.
- the type of the CPU is a plurality of groups of CPUs classified according to the data processing performance and power consumption of the CPU.
- the CPU types are a group (8) of CPUs with large data processing performance and power consumption, and a group (9) of CPUs with low data processing performance and power consumption.
- the number of combinations may be the same as or less than the number of CPUs mounted.
- control of the CPU type and number assigned to the data processing is a process of notifying the kernel of the CPU type and number that can be used for data processing by the user space control program (1).
- CPU allocation control can be easily performed based on the definition information.
- [9] <Kernel function called control group>
- the process notified by the control program is realized by a kernel function for controlling a kernel scheduler from user space.
- the existing function of the kernel can be effectively utilized.
- the process notified by the control program may be realized in the kernel space.
- the CPU executing the control program is a predetermined CPU in a group (9) of CPUs having relatively small data processing performance and power consumption.
- the predetermined CPU selects one of the forms from the definition information according to at least a user setting (1f) as the data processing environment, and, according to the selected form, activates the CPUs used for data processing and deactivates the CPUs that are not used (18-5).
- wasteful power consumption can be suppressed from the start by deactivating CPUs that the user setting designates as unused. If an unused CPU were not deactivated at system boot, it would stand by in a powered state; deactivating it therefore realizes even lower power consumption.
- when updating the form, either a stepwise update (21), in which the selected form is advanced one step at a time, or a skip update (21b), in which the selected form jumps a plurality of steps at a time, is performed.
- the stepwise update is performed when the change in the data processing environment is within a predetermined range, and the skip update is performed when it exceeds that range.
- Item 15 includes DVFS definition information (FIG. 16) that defines, for the CPU subject to DVFS processing, the degree of increase in the power supply voltage and the synchronous clock frequency according to the degree of the performance requirement.
- the DVFS process is executed with reference to this definition information.
- the plurality of forms of the combination of the CPU type and the number possessed by the definition information is a form in which the CPU type and the number are combined in a direction to increase the data processing performance and the power consumption step by step (10, 13).
- in Item 17, from among the forms (10, 13) in which the CPU type and number are combined in a direction that satisfies in stages the processing performance required according to the data processing environment,
- the minimum-performance form that satisfies the required processing performance is selected.
- with the first environment, it is possible to grasp the data processing environment in terms of the margin of data processing capacity.
- with the second environment, it is possible to grasp the data processing environment in terms of the margin of data processing capacity in consideration of the heat generation status.
- with the third environment, it is possible to grasp the data processing environment in terms of the margin of data processing capacity in consideration of the heat generation status and the power supply limit.
- with the fourth environment, it is possible to grasp the data processing environment in terms of the margin of data processing capacity in consideration of the heat generation status and the power supply limit, and the user setting can also be reflected in the data processing environment.
- the heat generation state is a heat generation state of CPUs included in a group of CPUs having relatively large data processing performance and power consumption.
- the CPU type is a plurality of groups of CPUs classified according to the data processing performance and power consumption of the CPU.
- the CPU types are a group (8) of CPUs with large data processing performance and power consumption, and a group (9) of CPUs with low data processing performance and power consumption.
- the number of combinations may be the same as or less than the number of CPUs mounted.
- the process for controlling the type and number of CPUs assigned to data processing is a process of notifying the kernel, by the user space control program (1), of the types and number of CPUs that can be used for data processing.
- the form referenced in the definition information can be easily bridged to the CPU allocation control.
- [25] <Kernel function called control group>
- the process notified by the control program is realized by a kernel function for controlling a kernel scheduler from user space.
- the CPU that executes the control program is a predetermined CPU in a group (9) of CPUs having relatively small data processing performance and power consumption.
- the predetermined CPU selects one of the forms from the definition information according to at least a user setting (1f) as the data processing environment, and selects the CPUs to be used for data processing according to the selected form.
- the CPUs to be used are activated, and the CPUs removed from use are deactivated (18-5).
- wasteful power consumption can be suppressed from the start by deactivating CPUs that the user setting designates as unused.
- if an unused CPU were not deactivated, it would stand by in a powered state; deactivating it realizes lower power consumption.
- <Stepwise update and skip update in form assignment update> Item 27: when updating the allocation of the type and number of CPUs allocated to data processing, either a stepwise update (21), in which the selected form is advanced one step at a time, or a skip update (21b), in which the selected form jumps a plurality of steps at a time, is performed.
- the stepwise update is performed when the change in the data processing environment is within a predetermined range, and the skip update is performed when the change in the data processing environment exceeds the predetermined range.
- Item 31 has DVFS definition information (FIG. 16) that defines, for the CPU subject to DVFS processing, the degree of increase in the power supply voltage and the synchronous clock frequency according to the degree of the performance requirement.
- the DVFS process is executed with reference to this definition information.
- FIG. 2 shows a system configuration example of an asymmetric multi-CPU system. Although not particularly limited, this figure illustrates a system configuration in which the processor 100 and the peripheral device 101 are connected via a bus (or network) 102.
- the processor 100 may be configured with a single chip or may be configured with multiple chips.
- the peripheral device 101 may be configured by various devices and apparatuses. For example, assuming a portable information communication terminal device as a multi-CPU system, the processor 100 performs communication protocol processing and application processing, and the peripheral device 101 includes a liquid crystal display, a touch panel, a battery, and the like.
- the processor 100 is configured as an asymmetric multiprocessor in which a plurality of types of CPUs having different data processing performance and power consumption are mounted for each type.
- the processor 100 has a first group 8 of CPUs with high data processing performance and high power consumption (BigCPU) and a second group 9 of CPUs with low power consumption and low data processing performance (LittleCPU).
- the CPUs (BigCPU) of the first group 8 are, although not particularly limited, four CPUs (CPU_B#0 to CPU_B#3) indicated by reference numerals 8a to 8d, and the CPUs (LittleCPU) of the second group 9 are, likewise not particularly limited, four CPUs (CPU_L#0 to CPU_L#3) indicated by reference numerals 9a to 9d.
- the CPUs 8a to 8d of the first group 8 and the CPUs 9a to 9d of the second group 9 have the same architecture. For example, when the CPUs 9a to 9d of the second group 9 have different cache memory configurations from the CPUs 8a to 8d of the first group 8, they are virtually realized by software emulation so as to have exactly the same architecture.
- the CPUs 8a to 8d of the first group 8 and the CPUs 9a to 9d of the second group 9 are connected to the memory 111, the input/output interface circuit 112, and the peripheral module 113 via the bus 110.
- the peripheral module 113 includes an interrupt controller, a DMA (Direct Memory Access) controller, a communication controller, and the like.
- the input / output interface circuit 112 is connected to the peripheral device 101.
- FIG. 1 hierarchically illustrates the hardware and software configurations of the processor 100.
- the hardware layer (HW) 120, the firmware layer (Firmware) 121, the kernel layer (Kernel) 122, and the user space layer (Userspace) 123 are configured in four layers.
- the number of CPUs in the first group (BigCPU) 8 and the number of CPUs in the second group (LittleCPU) 9 may be arbitrary, but here, to keep the explanation easy to follow, it is assumed as described above that there are four CPUs (BigCPU) 8a to 8d that prioritize high performance at the cost of large current consumption, and four CPUs (LittleCPU) 9a to 9d that offer moderate performance while suppressing current consumption.
- the firmware layer (Firmware) 121 is a lower software group such as a boot code (Boot) 7 and is stored in the ROM of the memory 111, for example.
- the kernel layer (Kernel) 122 is an operating system (OS) such as Linux (registered trademark); FIG. 1 shows its typical components, namely the scheduler (Scheduler) 4, the device driver (Device Driver) 5, and the power management (Power Management) 6 functions.
- a scheduler (Scheduler) 4 is a function used for task management, and performs scheduling or dispatch for allocating processes constituting data processing to an operable CPU according to priority order.
- a device driver (Device Driver) 5 performs device management for inputting / outputting information to / from a hardware device such as a video card or a LAN card.
- the power management 6 performs power supply management such as suspend / resume and dynamic control (DVFS) of power supply voltage and frequency according to system load, temperature, and the like.
- this application software is divided into two groups: a slow process group (Slow Process Group) 3, whose required processing performance remains low, and a dynamic process group (Dynamic Process Group) 2, whose required processing performance changes with the situation.
- in FIG. 1, software located on the left side is executed by the CPUs (BigCPU) 8a to 8d,
- and software located on the right side is executed by the CPUs (LittleCPU) 9a to 9d.
- the dynamic process group 2 is executed by the CPUs (BigCPU) 8a to 8d and/or the CPUs (LittleCPU) 9a to 9d depending on the required processing performance, and is therefore arranged so as to straddle the left and right.
- the combination of the CPUs (BigCPU) 8a to 8d and the CPUs (LittleCPU) 9a to 9d executing the dynamic process group 2 is switched by the control signal 1a from the governor 1 according to the required processing performance. This switching is performed by the governor 1 acting on the scheduler 4 of the kernel 122.
- This operation is performed using a kernel function (a control mechanism supported by the kernel) called a control group (cgroup), supported by the Linux (registered trademark) OS.
- the governor 1 can use a control program supported by, for example, an Android (registered trademark) OS. Of course, the governor 1 may be in the kernel.
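The cgroup-based notification described above can be pictured with a short sketch. On Linux, the cpuset cgroup controller exposes a `cpuset.cpus` file that restricts which CPUs a group's tasks may run on; the CPU id mapping (BigCPU = 0-3, LittleCPU = 4-7) and the cgroup path below are illustrative assumptions, not details from the patent.

```python
# Hypothetical sketch: a user-space governor restricting the dynamic
# process group to the CPUs of the selected virtual processor via the
# Linux cpuset cgroup controller. CPU numbering (Big = 0-3, Little = 4-7)
# and the cgroup path are assumptions.

CPUSET_PATH = "/sys/fs/cgroup/cpuset/dynamic_group/cpuset.cpus"

def cpus_for(big: int, little: int) -> str:
    """Build the cpuset 'cpus' string for a combination of `big`
    BigCPUs (ids 0-3) and `little` LittleCPUs (ids 4-7)."""
    ids = list(range(big)) + list(range(4, 4 + little))
    return ",".join(str(i) for i in ids)

def apply_virtual_processor(big: int, little: int, path: str = CPUSET_PATH) -> None:
    # Writing to cpuset.cpus reassigns every task in the group;
    # this requires root and a mounted cpuset hierarchy.
    with open(path, "w") as f:
        f.write(cpus_for(big, little))

# e.g. two BigCPUs plus four LittleCPUs:
# cpus_for(2, 4) -> "0,1,4,5,6,7"
```

Because the scheduler then only dispatches the group's tasks onto the listed CPUs, the governor never has to touch the scheduler's internal task-placement logic.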
- FIG. 3 illustrates a combination form of CPUs (BigCPU) 8a to 8d and CPUs (LittleCPU) 9a to 9d.
- CPUs (BigCPU) 8a to 8d are illustrated as B1 to B4, and CPUs (LittleCPU) 9a to 9d are illustrated as L1 to L4.
- the mapped state is illustrated as the "possible combinations" indicated at 13.
- the performance of each of the CPUs (LittleCPU) 9a to 9d is set to one unit (1), and the performance of each of the CPUs (BigCPU) 8a to 8d is set to twice that. Therefore, the subscript i of the virtual processor Vi indicates its performance.
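Under these performance values the definition information of FIG. 3 can be sketched as follows. The rule used here for choosing among equal-performance combinations (prefer LittleCPUs first, as the lower-power option) is my assumption for illustration; the patent's own table fixes its concrete mapping.

```python
# Sketch of the FIG. 3 definition information: each LittleCPU contributes
# 1 unit of performance, each BigCPU 2 units, so virtual processor Vi
# has performance i. Tie-breaking among combinations is an assumption.

BIG_PERF, LITTLE_PERF = 2, 1
N_BIG, N_LITTLE = 4, 4

def combinations_for(i):
    """All (big, little) pairs realizing performance i."""
    return [(b, l)
            for b in range(N_BIG + 1)
            for l in range(N_LITTLE + 1)
            if BIG_PERF * b + LITTLE_PERF * l == i]

def virtual_processor(i):
    """Pick one combination for Vi: fewest BigCPUs (lowest power) first."""
    return min(combinations_for(i), key=lambda bl: bl[0])

# Since combinations may mix both groups freely, every performance level
# V1..V12 is realizable, up to the maximum 2*4 + 1*4 = 12.
```

For example `virtual_processor(6)` yields one BigCPU plus four LittleCPUs, an intermediate level that neither group alone would provide at this power.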
- FIGS. 4A and 4B show comparative examples of CPU combinations.
- FIG. 4A illustrates a method of exclusively switching the CPU groups 8 and 9 to be used according to the system load, etc.
- FIG. 4B illustrates a method in which the CPUs 8a to 8d included in the high-data-processing-performance CPU group 8 and the CPUs 9a to 9d included in the low-power-consumption CPU group 9 correspond one-to-one,
- and the CPU to be used is exclusively switched between the corresponding CPUs according to the system load.
- when the ratio of the processing performance and power consumption of the CPUs 9a to 9d to those of the CPUs 8a to 8d is 1:2, as in FIG. 3, the combinations of CPU type and number in the cases of FIGS. 4A and 4B are as shown in FIG. 5A and FIG. 5B, respectively.
- V9 to V12, having a performance value of 9 or more, cannot be realized, as is clear from the example of "possible combinations" 13a.
- performance of V9 or higher can be obtained by raising the CPU frequency, but the power supply voltage must be raised at the same time; since power consumption is proportional to the first power of the frequency and the second power of the voltage, considerably more power is consumed.
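The cost of that frequency/voltage route follows directly from the stated relation P ∝ f·V². The voltage actually needed at a given frequency is device-specific, so the ratios below are illustrative only.

```python
# Dynamic power is proportional to the first power of the clock
# frequency and the square of the supply voltage (P ∝ f·V²), as the
# text states. The example ratios are illustrative, not device data.

def relative_power(f_ratio: float, v_ratio: float) -> float:
    """Power relative to baseline when frequency is scaled by f_ratio
    and supply voltage by v_ratio."""
    return f_ratio * v_ratio ** 2

# Raising frequency 1.5x while raising voltage 1.2x:
# power becomes 1.5 * 1.2**2 = 2.16x. Performance grew only 1.5x but
# power more than doubled, which is why adding another CPU can beat
# overclocking past the combination table's ceiling.
```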
- FIG. 6 illustrates a control flowchart of the virtual processor assignment process.
- the boot code 7 is executed by a predetermined one of the low-power CPUs (LittleCPU) 9a to 9d, and the OS kernel 122 (including the scheduler 4, the device driver 5, and the power management 6 code) is activated by the predetermined CPU that executed the boot code 7.
- the processing in step 18 in FIG. 6 includes processing up to the execution of the program included in the slow process group 3 in the user space 123.
- in step 19, the governor 1 checks, as the data processing environment, the heat generation status (Temperature) 1c, the remaining battery capacity (Battery Level) 1d, and, as the data processing load, the processing load (CPU Load) 1b, in this order. It is determined whether the virtual processor Vi needs to be switched according to the check result (step 20). If it is determined that a change is necessary, an appropriate virtual processor Vi is selected and assigned to the dynamic process group 2 (step 21).
- how the governor 1 obtains information on the heat generation status 1c, the remaining battery capacity 1d, and the processing load 1b is illustrated in FIG. 7.
- the heat generation status 1c and the remaining battery capacity 1d are obtained from the thermal sensor 25 and the battery sensor 26 of the hardware layer 120, respectively, via the device driver 5.
- the processing load (CPU Load) 1b is obtained from the scheduler 4 of the kernel layer 122.
- the processing load 1b is grasped by, for example, the CPU occupation rate.
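One common way to derive such a CPU occupancy rate on Linux is from two snapshots of the aggregate `cpu` line in `/proc/stat`; whether the patent's governor uses exactly this interface is not stated, so treat this as one plausible realization.

```python
# Sketch of computing the CPU occupancy rate used as processing load 1b.
# Works on two snapshots of the /proc/stat aggregate "cpu" line
# (user, nice, system, idle, ... tick counters); field index 3 is idle.

def cpu_load(prev: list, curr: list) -> float:
    """Occupancy in percent between two /proc/stat 'cpu' samples."""
    d_total = sum(curr) - sum(prev)
    d_idle = curr[3] - prev[3]
    if d_total == 0:
        return 0.0
    return 100.0 * (d_total - d_idle) / d_total

# Example: between the two samples 400 ticks elapsed, 100 of them idle:
# cpu_load([1000, 0, 500, 3000], [1100, 0, 700, 3100]) -> 75.0
```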
- the firmware layer 121 is not shown.
- FIG. 8 summarizes the operation (Operation) for selecting an appropriate virtual processor Vi according to the above three inputs, namely the heat generation status (Temperature) 1c, the remaining battery capacity (Battery Level) 1d, and the processing load (CPU Load) 1b.
- the specific numerical values such as thresholds and operations (Choose V1 etc.) in FIG. 8 are merely examples for explaining the mechanism, and it goes without saying that they can be changed according to the actual system.
- when the temperature is greater than 70 degrees Celsius, it is determined that an abnormal situation has occurred, and V1, with the lowest current consumption, is selected as the virtual processor regardless of the remaining battery capacity and processing load.
- FIG. 8 shows an example of determination conditions for determining whether or not the virtual processor needs to be changed in step 20.
- NOP (no change) applies under two conditions: when the temperature is 70 degrees Celsius or less, the remaining battery capacity is 50% or more, and the processing load is between 30% and 70%; and when the remaining battery capacity is 50% or less and the processing load is 30% or more.
- the conditions for selecting an appropriate virtual processor Vi in step 21 of FIG. 6 are the remaining three conditions in FIG. 8. That is, when the temperature is 70 degrees Celsius or less, the remaining battery capacity is 50% or more, and the processing load exceeds 70%, a virtual processor with higher throughput is allocated (Vi → Vi+1).
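The decision table of FIG. 8 can be sketched as a small function. The 70 °C, 50 % and 30 %/70 % thresholds come from the example in the text; the step-down rule for loads below 30 % is my assumption, since the text names the remaining conditions without spelling each one out.

```python
# Sketch of the step 20/21 decision logic of FIG. 8. Thresholds follow
# the text's example; the step-down for light load is an assumption.

V_MIN, V_MAX = 1, 12

def next_vp(temp_c, battery_pct, load_pct, current_i):
    if temp_c > 70:                       # abnormal heat: force lowest power
        return V_MIN
    if battery_pct >= 50:
        if load_pct > 70:                 # headroom and high load: step up
            return min(current_i + 1, V_MAX)
        if load_pct >= 30:                # moderate load: no change (NOP)
            return current_i
        return max(current_i - 1, V_MIN)  # assumed: light load steps down
    # battery below 50 %:
    if load_pct >= 30:
        return current_i                  # NOP per the text
    return max(current_i - 1, V_MIN)      # assumed step-down

# e.g. next_vp(80, 90, 90, 7) -> 1 (over-temperature overrides the rest)
```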
- the process waits for an event (step 22). While waiting for an event, the process of the dynamic process group 2 in FIG. 1 is executed by the virtual processor Vi allocated in step 21.
- the event awaited at step 22 is an event requesting the virtual processor allocation process again, such as a timer interrupt activated at a certain time interval or a thermal sensor interrupt raised when the temperature rises above a predetermined threshold.
- the confirmation and handling of the heat generation state described here is an example and is not an essential condition.
- the processes in steps 19 to 22 described above are repeated until the end of a series of program processes is determined in step 23.
- the governor 1 in FIG. 1 that controls the allocation processing of the virtual processor Vi is operated by one of the CPUs (Little CPUs) 9a to 9d, for example. Although it is in the user space layer 123 in the figure, it may be in the kernel layer 122.
- the software of the user space layer 123 that requires high-performance processing is placed in the dynamic process group 2 that can dynamically switch the virtual processor Vi.
- the CPU allocation to the data processing of the dynamic process group 2, that is, the virtual processor Vi may be allocated according to the definition information of FIG. 3 based on the rules illustrated in FIG.
- data processing can thus be executed with an optimal combination of the CPUs (BigCPU) 8a to 8d and the CPUs (LittleCPU) 9a to 9d used as the CPU set. Therefore, control that assigns only heavy-load tasks to CPUs with high data processing performance and only light-load tasks to low-power-consumption CPUs does not have to depend on the OS scheduler 4 (or task dispatcher), and the scheduler 4 (or task dispatcher) need not be optimized for such processing.
- the operating efficiency of CPU resources is high, and the performance realizable by assigning data processing to the CPUs is not limited to high or low data processing performance; intermediate performance can also be realized, so there is almost no possibility of consuming unnecessary power. Therefore, wasteful power consumption can be suppressed and the necessary data processing performance easily realized according to a data processing environment such as the data processing load.
- FIG. 8 shown as an example of the allocation rule of the virtual processor Vi is a gradual update in which the virtual processor Vi is updated step by step when the allocation of the virtual processor Vi is updated.
- the present invention is not limited to this, and it is possible to employ a skip update in which the update destination virtual processor Vi is updated by skipping multiple stages at a time (skipping).
- FIG. 9 illustrates a virtual processor allocation flow in the case where it is possible to adopt skip update as well as stepwise update of the virtual processor Vi.
- an example is shown in which there is an instruction from the user to rapidly increase or decrease virtual processor selection. When a user launches an application with a heavy load, the virtual processor selection suddenly increases.
- when it is determined in step 20 of FIG. 9 that the virtual processor needs to be changed, whether the cause is an instruction for a sudden rise or a sudden drop is determined in the next step 20b.
- if so, the virtual processor Vi is selected according to the instruction in step 21b. In this example, the maximum performance V12 is selected in the case of a sudden rise and the minimum performance V1 in the case of a sudden drop, although any other selection according to individual requirements is also possible.
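The skip-update rule just described can be sketched as follows, assuming (as in the text's example) that a rise instruction jumps to V12, a drop instruction to V1, and any other change falls back to ordinary stepwise control.

```python
# Sketch of the FIG. 9 skip update: user instructions jump multiple
# levels at once; otherwise the ordinary stepwise path applies.

V_MIN, V_MAX = 1, 12

def update_vp(current_i, user_hint=None, stepwise=None):
    """user_hint: 'rise', 'drop', or None. `stepwise` is the fallback
    function used when no skip instruction is pending."""
    if user_hint == "rise":       # e.g. a heavy application was launched
        return V_MAX
    if user_hint == "drop":
        return V_MIN
    return stepwise(current_i) if stepwise else current_i

# update_vp(3, "rise") -> 12; update_vp(9, "drop") -> 1
```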
- FIG. 10 hierarchically illustrates the hardware and software configurations of the processor 100 in consideration of CPU power control. The portions not directly related to the description here are not shown.
- the important components added in FIG. 10 relative to FIG. 1 are a CPU hot plug 6a and power management hardware 14 (Power Management HW).
- the CPU hot plug 6a is a function of the power management 6 of the Linux (registered trademark) kernel (Linux Kernel), and is implemented by using the power management hardware 14 of the hardware layer 120.
- with it, the CPUs 8a to 8d and 9a to 9d can be turned on/off during operation. An equivalent function can be used even when another OS is used.
- the CPU hot plug 6a has an interface to the user space layer 123. In the example of FIG. 10, by controlling this interface from the governor 1, for example, the power saving effect described above can be further strengthened.
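On Linux, the CPU hotplug interface is exposed to user space through sysfs, so a user-space governor can take CPUs on- and offline by writing to per-CPU `online` files. The sysfs path below is the standard Linux interface; the governor-side helper functions are an illustrative assumption.

```python
# Sketch of driving the Linux CPU-hotplug interface from user space,
# as the governor 1 might do via the CPU hot plug 6a.

def hotplug_path(cpu):
    # Standard Linux sysfs node for taking a CPU on/offline.
    return f"/sys/devices/system/cpu/cpu{cpu}/online"

def set_cpu_online(cpu, online):
    # Writing "1" activates the CPU, "0" powers it down (requires root).
    with open(hotplug_path(cpu), "w") as f:
        f.write("1" if online else "0")
```

A governor applying a new virtual processor Vi would call `set_cpu_online` for each CPU entering or leaving the selected combination.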
- FIG. 11 illustrates the processing flow of the initialization part when the configuration of FIG. 10 is adopted.
- the initialization part shown in FIG. 11 can correspond to the boot process (Boot) 18 of FIG. 4.
- in step 18-1 in FIG. 11, various settings are made. For example, the power of all on-chip modules to be used is turned on, the clock frequency is set, and the interrupt vector table is set.
- the kernel layer (Kernel) 122 program is then started (step 18-2). After starting the program of the kernel layer 122, the user setting mode, the temperature, and the remaining battery level are checked as part of the initialization process (step 18-3), and it is determined based on the check result whether the number of CPUs to be operated needs to be reduced (step 18-4). When reduction is necessary, unnecessary CPUs are excluded from use by using the function of the CPU hot plug 6a (step 18-5).
- FIG. 12 shows an example of a table that defines criteria used for determination when reducing unnecessary CPUs in steps 18-4 and 18-5. Processing for reducing the CPU (Operation) is performed according to the heat generation status (Temperature) 1c, the remaining battery capacity (Battery Level) 1d, and the user setting (User Setting) 1f. It goes without saying that the content of the operation (Operation) is merely an example for explaining the mechanism, and can be changed according to the actual system.
- when the temperature is greater than 70 degrees Celsius, it is determined that an abnormal situation has occurred, and only one of the CPUs (Little CPUs) 9a to 9d is turned on regardless of the remaining battery capacity and user settings. When the temperature is lower than 70 degrees Celsius, a selection is made according to the user setting and the battery level as to whether processing power or low power consumption is prioritized.
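A decision table of this kind can be sketched as a small function: temperature overrides everything, and otherwise the user setting and battery level choose between performance and power saving. The exact actions below 70 °C (battery threshold, returned operation names) are illustrative assumptions, not values from FIG. 12.

```python
# Sketch of a FIG. 12-style decision table for reducing CPUs at boot.
def select_operation(temp_c, battery_pct, user_setting):
    if temp_c > 70:
        # Abnormal heat: keep only one Little CPU, regardless of
        # battery level and user settings.
        return "one_little_cpu_only"
    if user_setting == "power_save" or battery_pct < 20:
        # Power saving prioritized: restrict to the Little CPU group.
        return "little_cpus_only"
    # Processing power prioritized: all CPUs remain available.
    return "all_cpus_available"
```

The boot flow (steps 18-3 to 18-5) would evaluate this once and then deactivate the excluded CPUs via the CPU hot plug function.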
- step 21-2 is added to the flowchart.
- after the governor 1 selects an appropriate virtual processor Vi and assigns it to the dynamic process group 2,
- the power of CPUs that are no longer used is dynamically turned off by the CPU hot plug 6a.
- in particular, the CPUs (BigCPU) 8a to 8d, which have large leakage current and are important for performance, can be powered off when not in use.
- DVFS control: In a state where the CPU allocation to the dynamic process group 2, in which the allocation of the virtual processor Vi is dynamically switched, has reached all eight of the CPUs (BigCPU) 8a to 8d and CPUs (LittleCPU) 9a to 9d, even if a further performance improvement request arrives, the selection forms shown in FIG. 3 alone remain at that state and no further improvement is obtained. If power consumption may be increased, further performance improvement can be achieved by raising both the voltage and the frequency of the CPUs. This technique is known by the name of DVFS (Dynamic Voltage Frequency Scaling). However, since power consumption is proportional to the square of the voltage and to the first power of the frequency, extremely fine-grained control is required to suppress the increase in power consumption to the necessary minimum. No efficient implementation method in an asymmetric multiprocessor system is known.
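The cost relation stated above, power proportional to the square of the voltage and the first power of the frequency (P ∝ V²·f), is worth making concrete: raising both voltage and frequency by the 1.2× factor used later in the DVFS process costs roughly 1.73× the power. A minimal sketch:

```python
# Relative power cost of a DVFS boost, using P ∝ V^2 * f.
def relative_power(v_ratio, f_ratio):
    """Power relative to the nominal operating point."""
    return v_ratio ** 2 * f_ratio

boost = relative_power(1.2, 1.2)   # 1.2^2 * 1.2 ≈ 1.728
```

This is why the text stresses fine-grained control: a modest 20% boost of both knobs nearly doubles power, so boosting only the minimum set of CPUs matters.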
- FIG. 14 hierarchically illustrates the hardware and software configurations of the processor 100 to which DVFS control is further added. The portions not directly related to the description here are not shown.
- the important components adopted in FIG. 14 are the DVFS 6b, the power management hardware (Power Management HW) 14, and the clock control hardware (Clock Control HW) 16.
- the DVFS 6b is a function of the power management 6 of the Linux (registered trademark) kernel, and uses the power management hardware 14 and the clock control hardware 16 of the hardware layer 120.
- the power supply voltages 14a and 14b and the synchronous clocks 16a and 16b of all the CPUs 8a to 8d and 9a to 9d are dynamically controlled in conjunction with each other.
- the DVFS 6b starts control in response to a boost request 15 given from the governor 1.
- FIG. 15 illustrates a control flow to which DVFS processing has been added.
- when the dynamic process group 2, in which the allocation of the virtual processor Vi is dynamically switched, uses all eight of the CPUs (BigCPU) 8a to 8d and CPUs (LittleCPU) 9a to 9d,
- that is, when the maximum-performance virtual processor (V12) is selected (see step 20c),
- DVFS processing (step 25) is executed when a further performance improvement request is received.
- FIG. 16 illustrates the contents of the DVFS process in step 25.
- the processing performance of the CPUs (LittleCPU) 9a to 9d is assumed to be 1, and the processing performance of the CPUs (BigCPU) 8a to 8d is assumed to be 2.
- the power supply voltage initially supplied to the CPUs (BigCPU) 8a to 8d and CPUs (LittleCPU) 9a to 9d is taken as the reference 1, and in the DVFS process of step 25 the power supply voltage is raised by a factor of 1.2 together with the synchronous clock frequency.
- FIG. 16 exemplifies a plurality of modes of performance improvement based on the highest-performance state (total performance 12) in the virtual processor V12 before the DVFS process is applied. Based on this, for example, when realizing a total performance of 13, the DVFS process is applied to two of the CPUs (LittleCPU) 9a to 9d. Since the performance of two of the CPUs (Little CPUs) 9a to 9d is improved from 1 to 1.5, the total performance becomes 13, as described in the performance breakdown. The same applies when the total performance is 14 or more. In the "Big CPU voltage" and "Little CPU voltage" columns, the increase in voltage necessary for the performance improvement is described so as to rise with the performance improvement.
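The performance breakdown above can be sketched arithmetically. The base state is V12: four Big CPUs at performance 2 and four Little CPUs at performance 1, totaling 12; a boosted Little CPU goes from 1 to 1.5. The assumption that a boosted Big CPU goes from 2 to 3 (the same 1.5× factor) is illustrative and not stated numerically in the text.

```python
# Sketch of the FIG. 16-style performance breakdown, starting from V12.
def total_performance(boosted_little=0, boosted_big=0):
    """Total performance with the given numbers of DVFS-boosted CPUs."""
    little = (4 - boosted_little) * 1.0 + boosted_little * 1.5
    big = (4 - boosted_big) * 2.0 + boosted_big * 3.0
    return little + big
```

With two boosted Little CPUs this reproduces the total performance of 13 given in the text.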
- the DVFS process can also be applied to two of the CPUs (LittleCPU) 9a to 9d and one of the CPUs (BigCPU) 8a to 8d.
- in that case, according to the definition of FIG. 16, a mode in which only one of the power supply voltages is raised to 1.2 is selected from the definition information. This is because raising the power supply voltage only for the low-power CPUs (LittleCPU) is superior in low power consumption to raising the power supply voltages of the CPUs of both groups.
- when the required performance decreases, the frequency and the power supply voltage are similarly lowered according to the definition of FIG. 16.
- in the virtual processor selection or update logic shown in FIG. 8, the operation of selecting a virtual processor Vi for a system load fluctuation that does not amount to a change of the virtual processor is a non-operation (NOP); however, the present invention is not limited to this.
- the synchronous clock frequency of the CPUs may instead be scaled within a range that, starting from the processing performance range of the selected virtual processor, does not enter the performance range of the adjacent virtual processor.
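This finer alternative to the NOP can be sketched as clamping the clock within the band of the selected virtual processor Vi, so that performance never crosses into the neighbouring virtual processor's band. The band edges and the linear mapping below are illustrative assumptions.

```python
# Sketch of FIG. 17-style fine scaling within one virtual processor's band.
def scaled_frequency(f_nominal, load_factor, band_lo=0.5, band_hi=1.0):
    """Clock frequency for a load_factor in [0, 1], clamped to the band
    [band_lo, band_hi] of the currently selected virtual processor."""
    clamped = max(0.0, min(1.0, load_factor))
    ratio = band_lo + (band_hi - band_lo) * clamped
    return f_nominal * ratio
```

Small load fluctuations thus still get a response (a clock adjustment) without triggering a virtual processor change.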
- the types of CPUs are not limited to the two types of a CPU group with high data processing performance and a low-power-consumption CPU group; three or more types are also possible by adding a group of CPUs having intermediate data processing capability. Further, the number of CPUs belonging to one group is not limited to four, and the number of CPUs belonging to each group is arbitrary. In this specification, CPU is synonymous with processor core. Therefore, it goes without saying that a CPU may include not only an arithmetic unit, an instruction control unit, and a data fetch unit, but may also include accelerator hardware such as a cache memory, an address translation buffer, a RAM, and an FPU, or may have a function of emulating them with software.
- the plurality of forms of combinations of CPU types and numbers defined by the definition information, so that the overall data processing performance and the maximum power consumption differ stepwise, are as illustrated in FIG. 3.
- however, the present invention is not limited to the plurality of stages V1 to V12; the combination contents and the number of stages can be changed as appropriate.
- the process of allocating data processing to the CPUs specified by the form selected from the definition information according to the data processing environment is not limited to the technique in which a specific CPU acts on the scheduler 4 of the kernel layer 122 using the control program of the governor or cgroups. It can also be realized by using other functions of the kernel layer 122.
- the factors grasped as the data processing environment are not limited to the user settings (1e, 1f), the data processing load (1b), the heat generation status (1c), and the remaining battery level (1d) described above. Moreover, the environment may be grasped by the magnitude of the data processing load (1b) alone, by the magnitude of the data processing load (1b) and the heat generation status (1c), or by the magnitude of the data processing load (1b), the heat generation status (1c), and the remaining battery level (1d).
- the multi-CPU system can be applied to an electronic device (data processing system) using so-called big.LITTLE CPU cores mounted on an SoC (System on a Chip) or a microcomputer.
- the present invention relates to an asymmetric multi-CPU system in which a plurality of types of CPUs differing in data processing performance and power consumption are mounted for each type, and further to a scaling method for such a system in which the combination of the types and numbers of CPUs used is scaled according to the data processing environment.
- the present invention can be widely applied to scaling methods for multi-CPU systems, and in particular can be applied to battery-driven portable information terminal devices represented by smartphones.
Landscapes
- Engineering & Computer Science (AREA)
- Software Systems (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Power Sources (AREA)
Abstract
Description
First, an overview of representative embodiments of the invention disclosed in the present application will be given. In the overview of the representative embodiments, the reference numerals in the drawings referred to in parentheses merely exemplify what is included in the concept of the components to which they are attached.
An asymmetric multi-CPU system in which a plurality of CPUs (8a to 8d, 9a to 9d) of a plurality of types differing in data processing performance and power consumption are mounted for each type has definition information (10, 13) that defines a plurality of forms of combinations of CPU type and number so as to give variation to the overall data processing performance and the maximum power consumption (so that the overall data processing performance and the maximum power consumption differ stepwise), and allocates data processing to the CPUs specified by a form selected from the definition information according to the data processing environment (21).
In Item 1, the plurality of forms of combinations of CPU type and number held by the definition information are forms in which the types and numbers of CPUs are combined in a direction that increases data processing performance and power consumption stepwise (10, 13).
In Item 1, the plurality of forms of combinations of CPU type and number held by the definition information are forms in which the types and numbers of CPUs are combined in a direction that satisfies, stepwise, the processing performance required according to the data processing environment (10, 13), and the minimum-performance form that satisfies the required processing performance is selected.
In Item 2 or 3, the data processing environment is a first environment grasped by the magnitude of the data processing load (1b), a second environment grasped by the magnitude of the data processing load (1b) and the heat generation status (1c), a third environment grasped by the magnitude of the data processing load (1b), the heat generation status (1c), and the remaining battery level (1d), or a fourth environment grasped by user settings (1e, 1f), the magnitude of the data processing load (1b), the heat generation status (1c), and the remaining battery level (1d).
In Item 4, the heat generation status is the heat generation status of CPUs included in the group of CPUs with relatively high data processing performance and power consumption.
In Item 1, the CPU types are a plurality of groups of CPUs classified according to the magnitude of the CPUs' data processing performance and power consumption. For example, the CPU types are a group of CPUs with high data processing performance and power consumption (8) and a group of CPUs with low data processing performance and power consumption (9).
In Item 1, the number of forms of combinations of CPU type and number is larger than the number of mounted CPUs.
In Item 1, control of the types and numbers of CPUs allocated to data processing is processing in which a user-space control program (1) notifies the kernel of the types and numbers of CPUs that can be used for data processing.
In Item 8, the processing of notification by the control program is realized by a kernel function that controls the kernel scheduler from user space.
In Item 9, the CPU that executes the control program is a predetermined CPU in the group of CPUs (9) with relatively low data processing performance and power consumption.
In Item 3, at boot processing the predetermined CPU selects one of the forms from the definition information according to at least the user setting (1f) as the data processing environment, activates the CPUs used for data processing according to the selected form, and deactivates the CPUs not used (18-5).
In Item 11, when the allocation of the types and numbers of CPUs used for data processing is updated, an inactive CPU is activated when it is allocated for use, and conversely an active CPU is deactivated when it is excluded from use (21-2).
In Item 11, when the allocation of the types and numbers of CPUs allocated to data processing is updated, the selected form is updated either by a stepwise update (21), which updates the form one step at a time, or by a skip update (21b), which updates the selected form by skipping a plurality of steps at once; the stepwise update is used when the change in the data processing environment is within a predetermined range, and the skip update is used when the change exceeds the predetermined range.
In Item 11, deactivation of a CPU is stopping the supply of the synchronous clock and/or the power supply to the CPU (21-2), and activation is starting the supply of the synchronous clock and/or the power supply to the CPU.
In Item 1, when there is a performance request exceeding the maximum performance defined by the definition information as the data processing environment, DVFS (Dynamic Voltage/Frequency Scaling) processing (25) is executed to raise either or both of the power supply voltage and the synchronous clock frequency of predetermined CPUs according to the degree of the request.
In Item 15, DVFS definition information (FIG. 16) that defines, according to the degree of the performance request, the CPUs subject to DVFS processing and the degree of increase of the power supply voltage and synchronous clock frequency for those CPUs is provided, and the DVFS processing is executed with reference to that definition information.
In an asymmetric multi-CPU system in which a plurality of CPUs (8a to 8d, 9a to 9d) of a plurality of types differing in data processing performance and power consumption are mounted for each type, a scaling method for the multi-CPU system, which scales the combination of types and numbers of CPUs used according to the data processing environment, includes:
(a) determination processing (19) that determines the data processing environment;
(b) selection processing (21) that, based on the determination result, selects one form from definition information (10, 13) defining a plurality of forms of combinations of CPU type and number so that the overall data processing performance and the maximum power consumption differ stepwise; and
(c) control processing (21) that allocates data processing to the CPUs specified by the selected form.
In Item 17, the plurality of forms of combinations of CPU type and number held by the definition information are forms in which the types and numbers of CPUs are combined in a direction that increases data processing performance and power consumption stepwise (10, 13).
In Item 17, the plurality of forms of combinations of CPU type and number held by the definition information are forms in which the types and numbers of CPUs are combined in a direction that satisfies, stepwise, the processing performance required according to the data processing environment (10, 13), and the minimum-performance form that satisfies the required processing performance is selected.
In Claim 18 or 19, the data processing environment is a first environment grasped by the magnitude of the data processing load (1b), a second environment grasped by the magnitude of the data processing load and the heat generation status (1c), a third environment grasped by the magnitude of the data processing load, the heat generation status, and the remaining battery level (1d), or a fourth environment grasped by user settings (1e, 1f), the magnitude of the data processing load, the heat generation status, and the remaining battery level: a scaling method of a multi-CPU system.
In Item 20, the heat generation status is the heat generation status of CPUs included in the group of CPUs with relatively high data processing performance and power consumption.
In Item 17, the CPU types are a plurality of groups of CPUs classified according to the magnitude of the CPUs' data processing performance and power consumption. For example, the CPU types are a group of CPUs with high data processing performance and power consumption (8) and a group of CPUs with low data processing performance and power consumption (9).
In Item 17, it is desirable that the number of forms of combinations of CPU type and number be, for example, larger than the number of mounted CPUs.
In Item 17, the processing that controls the types and numbers of CPUs allocated to data processing is processing in which a user-space control program (1) notifies the kernel of the types and numbers of CPUs that can be used for data processing.
In Item 17, the processing of notification by the control program is realized by a kernel function that controls the kernel scheduler from user space.
In Item 25, the CPU that executes the control program is a predetermined CPU in the group of CPUs (9) with relatively low data processing performance and power consumption.
In Item 26, at boot processing the predetermined CPU selects one of the forms from the definition information according to at least the user setting (1f) as the data processing environment, activates the CPUs used for data processing according to the selected form, and deactivates the CPUs removed from use (18-5).
In Item 27, when the allocation of the types and numbers of CPUs used for data processing is updated, an inactive CPU is activated when it is allocated for use, and conversely an active CPU is deactivated when it is removed from use (21-2).
In Item 27, when the allocation of the types and numbers of CPUs allocated to data processing is updated, the selected form is updated either by a stepwise update (21), which updates the form one step at a time, or by a skip update (21b), which updates the selected form by skipping a plurality of steps at once; the stepwise update is used when the change in the data processing environment is within a predetermined range, and the skip update is used when the change exceeds the predetermined range.
In Item 27, deactivation of a CPU is stopping the supply of the synchronous clock and the power supply to the CPU (21-2), and activation is starting the supply of the synchronous clock and the power supply to the CPU.
In Item 17, when there is a performance request exceeding the maximum performance defined by the definition information as the data processing environment, DVFS (Dynamic Voltage/Frequency Scaling) processing (25) is executed to raise either or both of the power supply voltage and the synchronous clock frequency of predetermined CPUs according to the degree of the request.
In Item 31, DVFS definition information (FIG. 16) that defines, according to the degree of the performance request, the CPUs subject to DVFS processing and the degree of increase of the power supply voltage and synchronous clock frequency for those CPUs is provided, and the DVFS processing is executed with reference to that definition information.
The embodiments will now be described in further detail. In all drawings for describing the modes for carrying out the invention, elements having the same function are given the same reference numerals, and repeated description thereof is omitted.
FIG. 2 shows a system configuration example of an asymmetric multi-CPU system. Although not particularly limited, the figure illustrates a system configuration in which a processor 100 and peripheral devices 101 are connected via a bus (or network) 102. The processor 100 may be configured as a single chip or as multiple chips. The peripheral devices 101 may be composed of various devices and apparatuses. For example, assuming a portable information communication terminal device as the multi-CPU system, the processor 100 performs communication protocol processing and application processing, and the peripheral devices 101 include a liquid crystal display, a touch panel, a battery, and the like.
FIG. 1 hierarchically illustrates the hardware and software configurations of the processor 100. Here, it is composed of four layers: a hardware layer (HW) 120, a firmware layer (Firmware) 121, a kernel layer (Kernel) 122, and a user space layer (User space) 123.
FIG. 3 illustrates forms of combinations of the CPUs (BigCPU) 8a to 8d and the CPUs (LittleCPU) 9a to 9d. In the figure, the CPUs (BigCPU) 8a to 8d are shown as B1 to B4, and the CPUs (LittleCPU) 9a to 9d as L1 to L4. Here, the concept of a virtual processor Vi (i = 1 to 12), representing a group of combinations of the CPUs (BigCPU) 8a to 8d and the CPUs (LittleCPU) 9a to 9d, is introduced. The matrix expression of FIG. 3 maps a vector 12 composed of the four CPUs (BigCPU) 8a to 8d and the four CPUs (LittleCPU) 9a to 9d onto twelve virtual processors Vi (i = 1 to 12). The mapped state is illustrated as the "possible combinations" indicated by 13. In FIG. 3, with the performance of each of the CPUs (LittleCPU) 9a to 9d taken as the unit (1), the performance of the CPUs (BigCPU) 8a to 8d is set to twice that, and their current consumption is likewise set to twice. Hence the subscript i of the virtual processor Vi indicates performance. The matrix 11 is a conversion matrix that maps the four CPUs (BigCPU) 8a to 8d and the four CPUs (LittleCPU) 9a to 9d onto the twelve virtual processors Vi (i = 1 to 12) according to the combinations indicated by 13.
FIG. 8, shown as an example of the allocation rule of the virtual processor Vi, uses a stepwise update in which the virtual processor Vi is updated one step at a time when its allocation is updated. The invention is not limited to this; it is also possible to adopt a skip update, in which the update-destination virtual processor Vi is updated by skipping a plurality of steps at once. FIG. 9 illustrates the virtual processor allocation flow in the case where skip update can be adopted in addition to the stepwise update of the virtual processor Vi. Here, an example is shown in which the user instructs a rapid rise or rapid fall in virtual processor selection. When the user launches an application with a heavy load, a rapid rise in virtual processor selection is required; after execution of the heavy application ends, a prompt rapid fall is required to reduce wasteful consumption of battery capacity. If there is a rapid-rise or rapid-fall instruction via the user instruction (User Instruction) 1e of FIG. 7, it is determined that the virtual processor needs to be changed (step 20 in FIG. 9), and whether the cause is a rapid-rise or rapid-fall instruction is determined in the next step 20b. If there is such an instruction, the virtual processor Vi is selected according to the instruction in step 21b. In this example, the maximum-performance V12 is selected in the case of a rapid rise and the minimum-performance V1 in the case of a rapid fall, but any selection according to individual requirements is not precluded.
When the user setting, or the policy for when the remaining battery level becomes low, gives priority to power saving, further reduction of power consumption becomes possible by additionally adopting a mechanism that dynamically turns off the power of unused CPUs. Also, when the temperature inside the processor 100 rises abnormally, the temperature can be lowered by turning off all power supplies related to the CPUs (Big CPU) 8a to 8d. A specific example further applying this will be described. FIG. 10 hierarchically illustrates the hardware and software configurations of the processor 100 when CPU power supply control is also taken into consideration. Portions not directly related to the description here are omitted from the drawing.
Various settings are made. For example, the power of all on-chip modules to be used is turned on, the clock frequency is set, and the interrupt vector table is set. After the various settings, the program of the kernel layer (Kernel) 122 is started (step 18-2). After starting the program of the kernel layer 122, as part of the initialization processing, the user setting mode, the temperature, and the remaining battery level are checked (step 18-3), and it is determined based on the check result whether the number of CPUs to be operated needs to be reduced (step 18-4). When reduction is necessary, unnecessary CPUs are excluded from use by using the function of the CPU hot plug 6a (step 18-5).
In a state where the CPU allocation to the dynamic process group 2, in which the allocation of the virtual processor Vi is dynamically switched, has reached all eight of the CPUs (BigCPU) 8a to 8d and the CPUs (LittleCPU) 9a to 9d, even if a further performance improvement request arrives, the selection forms of FIG. 3 alone remain at that state and no performance improvement is obtained. If power consumption may be increased, further performance improvement is possible by raising both the voltage and the frequency of the CPUs. This technique is known by the name DVFS (Dynamic Voltage Frequency Scaling). However, since power consumption is proportional to the square of the voltage and to the first power of the frequency, extremely fine-grained control is needed to keep the increase in power consumption to the necessary minimum. No efficient implementation method in an asymmetric multiprocessor system is known. By further combining DVFS control with the example of FIG. 10 described above, control that achieves continuous performance improvement while keeping power consumption to the necessary minimum becomes possible. FIG. 14 hierarchically illustrates the hardware and software configurations of the processor 100 to which DVFS control is further added. Portions not directly related to the description here are omitted from the drawing.
In the virtual processor selection or update logic shown in FIG. 8, the operation (Operation) of virtual processor Vi selection for a system load fluctuation that does not amount to a change of the virtual processor was a non-operation (NOP), but the invention is not limited to this. As in FIG. 17, the synchronous clock frequency of the CPUs may be scaled within a range that, starting from the processing performance range of the selected virtual processor, does not enter the performance range of the adjacent virtual processor.
1a control signal corresponding to the required processing performance
1b processing load (CPU Load)
1c heat generation status (Temperature)
1d remaining battery capacity (Battery Level)
1e user instruction (User Instruction)
1f user setting (User Setting)
2 dynamic process group (Dynamic Process Group)
3 slow process group (Slow Process Group)
4 scheduler (Scheduler)
5 device driver (Device Driver)
6 power management (Power Management)
6a CPU hot plug (CPU Hotplug)
6b DVFS
7 boot code (Boot)
8 first group of CPUs with high data processing performance and high power consumption (BigCPUs)
8a to 8d CPUs of the first group 8 (CPU_B#0 to CPU_B#3)
9 second group of CPUs with low power consumption and low data processing performance (LittleCPUs)
9a to 9d CPUs of the second group 9 (CPU_L#0 to CPU_L#3)
Vi virtual processor
14 power management hardware (Power Management HW)
14a, 14b power supply voltage
15 boost request (Boost Request)
16 clock control hardware (Clock Control HW)
16a, 16b synchronous clock
100 processor
101 peripheral devices
102 bus (or network)
110 bus
111 memory
112 input/output interface circuit
113 peripheral module
120 hardware layer (HW)
121 firmware layer (Firmware)
122 kernel layer (Kernel)
123 user space layer (User space)
Claims (20)
- An asymmetric multi-CPU system in which a plurality of CPUs of a plurality of types differing in data processing performance and power consumption are mounted for each type,
the system having definition information that defines a plurality of forms of combinations of CPU type and number so that the overall data processing performance and the maximum power consumption differ stepwise, and allocating data processing to the CPUs specified by a form selected from the definition information according to the data processing environment. - The multi-CPU system according to claim 1, wherein the plurality of forms of combinations of CPU type and number held by the definition information are forms in which the types and numbers of CPUs are combined in a direction that increases data processing performance and power consumption stepwise.
- The multi-CPU system according to claim 1, wherein the plurality of forms of combinations of CPU type and number held by the definition information are forms in which the types and numbers of CPUs are combined in a direction that satisfies, stepwise, the processing performance required according to the data processing environment, and the minimum-performance form that satisfies the required processing performance is selected.
- The multi-CPU system according to claim 2 or 3, wherein the data processing environment is a first environment grasped by the magnitude of the data processing load, a second environment grasped by the magnitude of the data processing load and the heat generation status, a third environment grasped by the magnitude of the data processing load, the heat generation status, and the remaining battery level, or a fourth environment grasped by user settings, the magnitude of the data processing load, the heat generation status, and the remaining battery level.
- The multi-CPU system according to claim 4, wherein the heat generation status is the heat generation status of CPUs included in the group of CPUs with relatively high data processing performance and power consumption.
- The multi-CPU system according to claim 1, wherein the CPU types are a plurality of groups of CPUs classified according to the magnitude of the CPUs' data processing performance and power consumption.
- The multi-CPU system according to claim 1, wherein the number of forms of combinations of CPU type and number is larger than the number of mounted CPUs.
- The multi-CPU system according to claim 1, wherein control of the types and numbers of CPUs allocated to data processing is processing in which a user-space control program notifies the kernel of the types and numbers of CPUs that can be used for data processing.
- The multi-CPU system according to claim 8, wherein the processing of notification by the control program is realized by a kernel function that controls the scheduler from the user space.
- The multi-CPU system according to claim 9, wherein the CPU that executes the control program is a predetermined CPU in a group of CPUs with relatively low data processing performance and power consumption.
- The multi-CPU system according to claim 3, wherein at boot processing the predetermined CPU selects one of the forms from the definition information according to at least a user setting as the data processing environment, activates the CPUs used for data processing according to the selected form, and deactivates the CPUs not used.
- The multi-CPU system according to claim 11, wherein, when the allocation of the types and numbers of CPUs used for data processing is updated, an inactive CPU is activated when it is allocated for use, and conversely an active CPU is deactivated when it is excluded from use.
- The multi-CPU system according to claim 11, wherein, when the allocation of the types and numbers of CPUs allocated to data processing is updated, the selected form is updated either by a stepwise update that updates the form one step at a time or by a skip update that updates the selected form by skipping a plurality of steps at once, the stepwise update being used when the change in the data processing environment is within a predetermined range and the skip update being used when the change exceeds the predetermined range.
- The multi-CPU system according to claim 11, wherein deactivation of a CPU is stopping the supply of a synchronous clock and/or the power supply to the CPU, and activation is starting the supply of the synchronous clock and/or the power supply to the CPU.
- The multi-CPU system according to claim 1, wherein, when there is a performance request exceeding the maximum performance defined by the definition information as the data processing environment, DVFS (Dynamic Voltage/Frequency Scaling) processing is executed to raise either or both of the power supply voltage and the synchronous clock frequency of predetermined CPUs according to the degree of the request.
- The multi-CPU system according to claim 15, comprising DVFS definition information that defines, according to the degree of the performance request, the CPUs subject to DVFS processing and the degree of increase of the power supply voltage and synchronous clock frequency for those CPUs, the DVFS processing being executed with reference to the definition information.
- A scaling method for a multi-CPU system that, in an asymmetric multi-CPU system in which a plurality of CPUs of a plurality of types differing in data processing performance and power consumption are mounted for each type, scales the combination of the types and numbers of CPUs used according to the data processing environment, the method comprising:
determination processing that determines the data processing environment;
selection processing that, based on the determination result, selects one form from definition information that defines a plurality of forms of combinations of CPU type and number so that the overall data processing performance and the maximum power consumption differ stepwise; and
control processing that allocates data processing to the CPUs specified by the selected form. - The multi-CPU system scaling method according to claim 17, wherein the plurality of forms of combinations of CPU type and number held by the definition information are forms in which the types and numbers of CPUs are combined in a direction that increases data processing performance and power consumption stepwise.
- The multi-CPU system scaling method according to claim 17, wherein the plurality of forms of combinations of CPU type and number held by the definition information are forms in which the types and numbers of CPUs are combined in a direction that satisfies, stepwise, the processing performance required according to the data processing environment, and the minimum-performance form that satisfies the required processing performance is selected.
- The multi-CPU system scaling method according to claim 18 or 19, wherein the data processing environment is a first environment grasped by the magnitude of the data processing load, a second environment grasped by the magnitude of the data processing load and the heat generation status, a third environment grasped by the magnitude of the data processing load, the heat generation status, and the remaining battery level, or a fourth environment grasped by user settings, the magnitude of the data processing load, the heat generation status, and the remaining battery level.
Priority Applications (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201380076775.0A CN105247486B (zh) | 2013-05-23 | 2013-05-23 | 多cpu系统及多cpu系统的调整方法 |
PCT/JP2013/064370 WO2014188561A1 (ja) | 2013-05-23 | 2013-05-23 | マルチcpuシステム及びマルチcpuシステムのスケーリング方法 |
JP2015517999A JP6483609B2 (ja) | 2013-05-23 | 2013-05-23 | マルチcpuシステム |
US14/785,617 US9996400B2 (en) | 2013-05-23 | 2013-05-23 | Multi-CPU system and multi-CPU system scaling method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
PCT/JP2013/064370 WO2014188561A1 (ja) | 2013-05-23 | 2013-05-23 | マルチcpuシステム及びマルチcpuシステムのスケーリング方法 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2014188561A1 true WO2014188561A1 (ja) | 2014-11-27 |
Family
ID=51933144
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/JP2013/064370 WO2014188561A1 (ja) | 2013-05-23 | 2013-05-23 | マルチcpuシステム及びマルチcpuシステムのスケーリング方法 |
Country Status (4)
Country | Link |
---|---|
US (1) | US9996400B2 (ja) |
JP (1) | JP6483609B2 (ja) |
CN (1) | CN105247486B (ja) |
WO (1) | WO2014188561A1 (ja) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2016139271A (ja) * | 2015-01-27 | 2016-08-04 | 富士通株式会社 | 演算処理システムおよび演算処理システムの制御方法 |
JP2018169696A (ja) * | 2017-03-29 | 2018-11-01 | 富士通株式会社 | 情報処理装置,試験プログラムおよび試験方法 |
JP2019121185A (ja) * | 2018-01-05 | 2019-07-22 | コニカミノルタ株式会社 | Gpu割当プログラム、gpu割当方法、コンピュータ読取可能な記録媒体、および、gpu割当装置 |
Families Citing this family (37)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102205836B1 (ko) * | 2014-01-29 | 2021-01-21 | 삼성전자 주식회사 | 태스크 스케줄링 방법 및 장치 |
US9785481B2 (en) * | 2014-07-24 | 2017-10-10 | Qualcomm Innovation Center, Inc. | Power aware task scheduling on multi-processor systems |
US10031574B2 (en) * | 2015-05-20 | 2018-07-24 | Mediatek Inc. | Apparatus and method for controlling multi-core processor of computing system |
CN106095567B (zh) * | 2016-05-31 | 2019-08-30 | Oppo广东移动通信有限公司 | 一种安装任务的分配方法及移动终端 |
US11620336B1 (en) | 2016-09-26 | 2023-04-04 | Splunk Inc. | Managing and storing buckets to a remote shared storage system based on a collective bucket size |
US10353965B2 (en) | 2016-09-26 | 2019-07-16 | Splunk Inc. | Data fabric service system architecture |
US11593377B2 (en) | 2016-09-26 | 2023-02-28 | Splunk Inc. | Assigning processing tasks in a data intake and query system |
US11860940B1 (en) | 2016-09-26 | 2024-01-02 | Splunk Inc. | Identifying buckets for query execution using a catalog of buckets |
US11281706B2 (en) * | 2016-09-26 | 2022-03-22 | Splunk Inc. | Multi-layer partition allocation for query execution |
US11550847B1 (en) | 2016-09-26 | 2023-01-10 | Splunk Inc. | Hashing bucket identifiers to identify search nodes for efficient query execution |
US11567993B1 (en) | 2016-09-26 | 2023-01-31 | Splunk Inc. | Copying buckets from a remote shared storage system to memory associated with a search node for query execution |
US11604795B2 (en) | 2016-09-26 | 2023-03-14 | Splunk Inc. | Distributing partial results from an external data system between worker nodes |
US11599541B2 (en) | 2016-09-26 | 2023-03-07 | Splunk Inc. | Determining records generated by a processing task of a query |
US11461334B2 (en) | 2016-09-26 | 2022-10-04 | Splunk Inc. | Data conditioning for dataset destination |
US11663227B2 (en) | 2016-09-26 | 2023-05-30 | Splunk Inc. | Generating a subquery for a distinct data intake and query system |
US11874691B1 (en) | 2016-09-26 | 2024-01-16 | Splunk Inc. | Managing efficient query execution including mapping of buckets to search nodes |
US11442935B2 (en) | 2016-09-26 | 2022-09-13 | Splunk Inc. | Determining a record generation estimate of a processing task |
US20180089324A1 (en) | 2016-09-26 | 2018-03-29 | Splunk Inc. | Dynamic resource allocation for real-time search |
US11416528B2 (en) | 2016-09-26 | 2022-08-16 | Splunk Inc. | Query acceleration data store |
US11615104B2 (en) | 2016-09-26 | 2023-03-28 | Splunk Inc. | Subquery generation based on a data ingest estimate of an external data system |
US11586627B2 (en) | 2016-09-26 | 2023-02-21 | Splunk Inc. | Partitioning and reducing records at ingest of a worker node |
US11580107B2 (en) | 2016-09-26 | 2023-02-14 | Splunk Inc. | Bucket data distribution for exporting data to worker nodes |
US11562023B1 (en) | 2016-09-26 | 2023-01-24 | Splunk Inc. | Merging buckets in a data intake and query system |
US10956415B2 (en) | 2016-09-26 | 2021-03-23 | Splunk Inc. | Generating a subquery for an external data system using a configuration file |
US10282226B2 (en) | 2016-12-20 | 2019-05-07 | Vmware, Inc. | Optimizing host CPU usage based on virtual machine guest OS power and performance management |
CN108958449B (zh) * | 2017-05-26 | 2023-07-07 | 中兴通讯股份有限公司 | 一种cpu功耗调整方法与装置 |
US10417054B2 (en) * | 2017-06-04 | 2019-09-17 | Apple Inc. | Scheduler for AMP architecture with closed loop performance controller |
US11989194B2 (en) | 2017-07-31 | 2024-05-21 | Splunk Inc. | Addressing memory limits for partition tracking among worker nodes |
US11921672B2 (en) | 2017-07-31 | 2024-03-05 | Splunk Inc. | Query execution at a remote heterogeneous data store of a data fabric service |
US10896182B2 (en) | 2017-09-25 | 2021-01-19 | Splunk Inc. | Multi-partitioning determination for combination operations |
US11334543B1 (en) | 2018-04-30 | 2022-05-17 | Splunk Inc. | Scalable bucket merging for a data intake and query system |
CN109960395B (zh) * | 2018-10-15 | 2021-06-08 | 华为技术有限公司 | 资源调度方法和计算机设备 |
WO2020220216A1 (en) | 2019-04-29 | 2020-11-05 | Splunk Inc. | Search time estimate in data intake and query system |
US11715051B1 (en) | 2019-04-30 | 2023-08-01 | Splunk Inc. | Service provider instance recommendations using machine-learned classifications and reconciliation |
US11494380B2 (en) | 2019-10-18 | 2022-11-08 | Splunk Inc. | Management of distributed computing framework components in a data fabric service system |
US11922222B1 (en) | 2020-01-30 | 2024-03-05 | Splunk Inc. | Generating a modified component for a data intake and query system using an isolated execution environment image |
US11704313B1 (en) | 2020-10-19 | 2023-07-18 | Splunk Inc. | Parallel branch operation using intermediary nodes |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH09138716A (ja) * | 1995-11-14 | 1997-05-27 | Toshiba Corp | 電子計算機 |
JP2005085164A (ja) * | 2003-09-10 | 2005-03-31 | Sharp Corp | マルチプロセッサシステムの制御方法およびマルチプロセッサシステム |
JP2010231329A (ja) * | 2009-03-26 | 2010-10-14 | Fuji Xerox Co Ltd | 情報処理装置、画像処理装置、画像出力装置、画像出力システム、プログラム |
JP2012256306A (ja) * | 2011-06-08 | 2012-12-27 | Shijin Kogyo Sakushinkai | 環境に配慮した演算処理異種計算機システム |
Family Cites Families (13)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH0659906A (ja) | 1992-08-10 | 1994-03-04 | Hitachi Ltd | 並列計算機の実行制御方法 |
JP2004280378A (ja) | 2003-03-14 | 2004-10-07 | Handotai Rikougaku Kenkyu Center:Kk | 半導体装置 |
JP2007213286A (ja) * | 2006-02-09 | 2007-08-23 | Hitachi Ltd | システム装置 |
WO2008073597A1 (en) * | 2006-12-14 | 2008-06-19 | Intel Corporation | Method and apparatus of power management of processor |
JP5079342B2 (ja) * | 2007-01-22 | 2012-11-21 | ルネサスエレクトロニクス株式会社 | マルチプロセッサ装置 |
JP2009230220A (ja) * | 2008-03-19 | 2009-10-08 | Fuji Xerox Co Ltd | 情報処理装置、及び画像処理装置 |
JP2011209846A (ja) | 2010-03-29 | 2011-10-20 | Panasonic Corp | マルチプロセッサシステムとそのタスク割り当て方法 |
US8707314B2 (en) * | 2011-12-16 | 2014-04-22 | Advanced Micro Devices, Inc. | Scheduling compute kernel workgroups to heterogeneous processors based on historical processor execution times and utilizations |
US10185566B2 (en) * | 2012-04-27 | 2019-01-22 | Intel Corporation | Migrating tasks between asymmetric computing elements of a multi-core processor |
US8984200B2 (en) * | 2012-08-21 | 2015-03-17 | Lenovo (Singapore) Pte. Ltd. | Task scheduling in big and little cores |
US10073779B2 (en) * | 2012-12-28 | 2018-09-11 | Intel Corporation | Processors having virtually clustered cores and cache slices |
KR102082859B1 (ko) * | 2013-01-07 | 2020-02-28 | 삼성전자주식회사 | 복수의 이종 코어들을 포함하는 시스템 온 칩 및 그 동작 방법 |
US9218198B2 (en) * | 2013-03-13 | 2015-12-22 | Oracle America, Inc. | Method and system for specifying the layout of computer system resources |
-
2013
- 2013-05-23 CN CN201380076775.0A patent/CN105247486B/zh active Active
- 2013-05-23 JP JP2015517999A patent/JP6483609B2/ja active Active
- 2013-05-23 WO PCT/JP2013/064370 patent/WO2014188561A1/ja active Application Filing
- 2013-05-23 US US14/785,617 patent/US9996400B2/en active Active
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH09138716A (ja) * | 1995-11-14 | 1997-05-27 | Toshiba Corp | 電子計算機 |
JP2005085164A (ja) * | 2003-09-10 | 2005-03-31 | Sharp Corp | マルチプロセッサシステムの制御方法およびマルチプロセッサシステム |
JP2010231329A (ja) * | 2009-03-26 | 2010-10-14 | Fuji Xerox Co Ltd | 情報処理装置、画像処理装置、画像出力装置、画像出力システム、プログラム |
JP2012256306A (ja) * | 2011-06-08 | 2012-12-27 | Shijin Kogyo Sakushinkai | 環境に配慮した演算処理異種計算機システム |
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2016139271A (ja) * | 2015-01-27 | 2016-08-04 | 富士通株式会社 | 演算処理システムおよび演算処理システムの制御方法 |
JP2018169696A (ja) * | 2017-03-29 | 2018-11-01 | 富士通株式会社 | 情報処理装置,試験プログラムおよび試験方法 |
JP2019121185A (ja) * | 2018-01-05 | 2019-07-22 | コニカミノルタ株式会社 | Gpu割当プログラム、gpu割当方法、コンピュータ読取可能な記録媒体、および、gpu割当装置 |
JP6992515B2 (ja) | 2018-01-05 | 2022-01-13 | コニカミノルタ株式会社 | Gpu割当プログラム、gpu割当方法、コンピュータ読取可能な記録媒体、および、gpu割当装置 |
Also Published As
Publication number | Publication date |
---|---|
US20160085596A1 (en) | 2016-03-24 |
US9996400B2 (en) | 2018-06-12 |
JP6483609B2 (ja) | 2019-03-13 |
CN105247486B (zh) | 2019-05-21 |
CN105247486A (zh) | 2016-01-13 |
JPWO2014188561A1 (ja) | 2017-02-23 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
JP6483609B2 (ja) | マルチcpuシステム | |
JP5075274B2 (ja) | 電力認識スレッドスケジューリングおよびプロセッサーの動的使用 | |
JP5433837B2 (ja) | 仮想計算機システム、仮想計算機の制御方法及びプログラム | |
US10031574B2 (en) | Apparatus and method for controlling multi-core processor of computing system | |
TWI569202B (zh) | 用於基於網路負載來調整處理器電力使用之設備及方法 | |
US9201490B2 (en) | Power management for a computer system | |
CN107743608B (zh) | 至硬件加速器的动态功率路由 | |
JP2006344162A (ja) | 並列計算装置 | |
US10768684B2 (en) | Reducing power by vacating subsets of CPUs and memory | |
JP2010160565A (ja) | タスクスケジューリング装置、タスクスケジューリング制御方法、及びタスクスケジューリング制御プログラム | |
US20160170474A1 (en) | Power-saving control system, control device, control method, and control program for server equipped with non-volatile memory | |
JP2014167780A (ja) | 情報処理装置、動作状態制御方法及びプログラム | |
US20180275742A1 (en) | Apparatus and method for controlling governor based on heterogeneous multicore system | |
KR102333391B1 (ko) | 전자 장치 및 이의 전력 제어 방법 | |
US20190391846A1 (en) | Semiconductor integrated circuit, cpu allocation method, and program | |
KR101433030B1 (ko) | 가상환경 내 중앙처리장치의 전력 스케줄링 방법 및 시스템 | |
JP5768586B2 (ja) | 計算装置、計算装置の制御方法、及びプログラム | |
CN117271058A (zh) | 容器资源调度方法、装置及存储介质 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application |
Ref document number: 13885324 Country of ref document: EP Kind code of ref document: A1 |
|
ENP | Entry into the national phase |
Ref document number: 2015517999 Country of ref document: JP Kind code of ref document: A |
|
WWE | Wipo information: entry into national phase |
Ref document number: 14785617 Country of ref document: US |
|
NENP | Non-entry into the national phase |
Ref country code: DE |
|
122 | Ep: pct application non-entry in european phase |
Ref document number: 13885324 Country of ref document: EP Kind code of ref document: A1 |