CN112416052A

CN112416052A - Method for realizing over-frequency of kernel time slice

Info

Publication number: CN112416052A
Application number: CN202011433645.6A
Authority: CN
Inventors: 王志平
Original assignee: Individual
Current assignee: Individual
Priority date: 2020-12-10
Filing date: 2020-12-10
Publication date: 2021-02-26

Abstract

For applications, multitasking is a real need. Programmability is the theoretical basis of processor technology, and a problem that no processor technology can escape is how to satisfy multitasking requirements on that programmed basis; hence the concept of the "time slice". The definition of the "time slice" and its related techniques, i.e., time-division multiplexing of kernel hardware resources, is not difficult to understand and is how the prior art works. How long the "time slice" should be is usually constrained by the clock frequency of the processor and the real-time requirements of the application. The technology of the invention addresses, in both theory and practical implementation, the balance between the time slice and actual requirements, and gives applications an overclocking (over-frequency) effect for a given time slice: with the time slice defined as T, a process can obtain an actual execution time of 2T, 3T, or more.

Description

Method for realizing over-frequency of kernel time slice

Technical Field

The invention relates to the field of integrated circuits and computers, in particular to a method for realizing over-frequency of a kernel time slice.

Background

Whether a "time slice" should be long or short depends on the real-time requirements of the processes in the processor and on the number of processes. The real-time requirement of the system means that every process must be executed at least once within a unit of time; that is, the time a process waits in the processor before acquiring its next "time slice" must not exceed the duration required by the real-time nature of the system. Hereinafter, the duration required by the real-time characteristics of the system is abbreviated as SAT (System Aging Time), and the "time slice" is abbreviated as TS (Time Slice).

Assuming that the number of processor cores is N and the number of processes in the processor is X, the following relationship holds: SAT = TS × (X / N). For any given processor, N in this formula is fixed, and for any given application requirement the maximum SAT is constant. Therefore, when X increases, SAT increases if TS remains unchanged, and once SAT reaches the maximum allowed value, the system can only decrease TS.
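The relationship SAT = TS × (X / N) can be checked with a few lines of Python. This is only a minimal sketch: the function names `sat` and `max_ts` and the millisecond figures are illustrative assumptions and do not come from the specification.

```python
def sat(ts: float, processes: int, cores: int) -> float:
    """System Aging Time from the relation SAT = TS * (X / N)."""
    return ts * processes / cores


def max_ts(sat_max: float, processes: int, cores: int) -> float:
    """Largest TS that keeps SAT within sat_max for X processes on N cores."""
    return sat_max * cores / processes


# Illustrative numbers only: N = 4 cores, TS = 10 ms.
print(sat(10.0, 40, 4))      # 100.0
print(sat(10.0, 80, 4))      # 200.0 (X doubled, TS unchanged, SAT doubles)
print(max_ts(100.0, 80, 4))  # 5.0   (TS must shrink to keep SAT at 100 ms)
```

With N fixed, doubling X at a constant TS doubles SAT, so holding SAT at its maximum forces TS to be halved.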

In the prior art, TS is the duration from when a process is loaded into the kernel until it is unloaded from the kernel. Since the time consumed in loading a process into and unloading it from the kernel is constant (i.e., the process context-switch time is constant), reducing TS reduces the proportion of time in which the process actually runs, which means that the actual running efficiency of the kernel drops. Therefore, once TS has been reduced so far that the kernel running efficiency becomes too low, the system may stall (or even crash) if the SAT demanded by the application is not lowered; and if that SAT is lowered, the system may still stall, or even suffer logic errors in the synchronization between processes.
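The effect of a constant context-switch cost on a shrinking TS can be illustrated as follows. This is a hedged sketch, not a model given in the specification: it assumes that load time plus unload time is a single constant, here called `switch_cost`, charged against every TS, and the 0.5 ms figure is purely illustrative.

```python
def run_ratio(ts: float, switch_cost: float) -> float:
    """Fraction of one TS left for real work after a fixed switch cost.

    Assumption (not from the source): load plus unload time is one
    constant, switch_cost, that is spent inside every TS.
    """
    if ts <= switch_cost:
        return 0.0
    return (ts - switch_cost) / ts


# Illustrative numbers only: 0.5 ms switch cost, TS shrinking from 10 ms to 1 ms.
for ts in (10.0, 5.0, 2.0, 1.0):
    print(f"TS={ts:>4} ms  run ratio={run_ratio(ts, 0.5):.2f}")
```

Under this assumption the share of useful run time falls from 0.95 at TS = 10 ms to 0.50 at TS = 1 ms, which is the loss of kernel efficiency described above.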

Therefore, under the prior art, when the application requirements change, i.e. the SAT requirement is unchanged or stricter while the processor must handle more tasks at the same time, the processor manufacturer must increase the number of cores (i.e. increase the value of N). However, increasing N does not relieve the pressure on system bandwidth caused by overly frequent process context switching, nor does it change the fact that the kernel actually runs at low efficiency.

Therefore, the processor kernel needs a technical method that achieves higher actual running efficiency with a smaller TS value.

Disclosure of Invention

In fact, with a fixed number of kernels, meeting the SAT requirement is almost equivalent to meeting a requirement on TS. Processor manufacturers working under the prior art may have taken it for granted that a requirement on TS is equivalent to a requirement on loading processes into and unloading them from the kernel. This is not the case. The SAT requirement is equivalent to a TS requirement only in that a process must be loaded into the kernel as soon as possible; it does not require that the process be unloaded from the kernel within one TS. That is, in theory, when TS is defined as a fixed value, nothing requires a process to be unloaded once it has been in the kernel for one TS. In theory, meeting the SAT requirement only requires that, after a process is unloaded from the kernel, it be reloaded into the kernel to run within a "fixed time", and the TS value is calculated for this purpose. The technical scheme of the invention allows a smaller TS value to be set while each process still has enough execution time (several times the TS duration) and the actual running efficiency of the kernel is preserved. That is, the SAT can be smaller for a fixed number of processes, or the number of processes can be larger for a fixed SAT value.

Before the technical solution of the present invention is explained in more detail, one of its concepts must be explained: the "Process Field", hereinafter abbreviated as "POF" (Process Operating Field). POF is a collective term for the hardware resources in the kernel that are involved in running a process and are used exclusively by that single process while it runs. The POF must therefore contain the following hardware resources:

1. all pipeline hardware resources that can only be used by a single process;

2. a pipeline-related control module that can only be used by a single process;

3. a "code cache" that can only be used by a single process;

4. a "data cache" that can only be used by a single process;

5. a "code-coupled cache" that can only be used by a single process;

6. a "data-coupled cache" that can only be used by a single process;

7. a system interface adaptation module that can only be used by a single process.

In the technical solution of the present invention, a single core contains 2 or more POFs (in theory, the solution places no limit on the number of POFs that can be implemented in a single core). Each POF has a unique index number; hereinafter the POF with index number 0 is abbreviated as POF₀, the POF with index number 1 as POF₁, and so on. Every POF has two states, the "operating state" and the "idle state". The "operating state" of a POF, or a POF in the "operating state", is hereinafter abbreviated as POF_on; the "idle state", or a POF in the "idle state", is abbreviated as POF_off. POF_on indicates that a process is loaded in the POF; POF_off indicates that no process is loaded in the POF. All POFs in a core share one system interface, which serves as the interconnect between the core and the "system bus controller". In each core, in addition to the POFs, there is a necessary module that schedules processes for loading into and unloading from the core, manages the POFs, and controls the system interface; it is hereinafter referred to as the "System Interface Controller", abbreviated as SIC.

According to the TS duration set by the operating system, each time one TS expires the SIC unloads a process from the kernel and returns it to the "system bus controller", then acquires a new process from the "system bus controller" and loads it into the kernel. After the SIC has loaded a process, if the kernel still has a POF_off, the SIC continues to fetch new processes from the "system bus controller" and load them into POF_off until no POF_off remains in the core. When the SIC unloads a process, it follows the first-in first-out principle: among all POF_on, the process that was loaded into the kernel earliest is selected for unloading.
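The SIC behavior described above can be sketched as a small software model. This is a hypothetical illustration only: the specification describes hardware, and the class name `SIC`, the methods `fill` and `on_ts_expired`, and the use of a `deque` to stand in for the "system bus controller" are all assumptions made for the sketch. It further assumes that an unloaded process rejoins the back of the waiting queue.

```python
from collections import deque
from typing import Deque, List, Optional


class SIC:
    """Toy model of a System Interface Controller managing n POFs."""

    def __init__(self, n_pofs: int, ready: Deque[str]) -> None:
        self.pofs: List[Optional[str]] = [None] * n_pofs  # None means POF_off
        self.fifo: Deque[int] = deque()  # POF indices, earliest load first
        self.ready = ready               # stands in for the system bus controller
        self.fill()

    def fill(self) -> None:
        """Load waiting processes into every POF_off, lowest index first."""
        for i, proc in enumerate(self.pofs):
            if proc is None and self.ready:
                self.pofs[i] = self.ready.popleft()
                self.fifo.append(i)

    def on_ts_expired(self) -> Optional[str]:
        """Unload the earliest-loaded process, then refill the freed POF."""
        if not self.fifo:
            return None
        i = self.fifo.popleft()
        unloaded = self.pofs[i]
        self.pofs[i] = None          # POF_i returns to POF_off
        self.ready.append(unloaded)  # assumption: process rejoins the queue
        self.fill()
        return unloaded


ready = deque(f"P{k}" for k in range(6))
sic = SIC(4, ready)
print(sic.pofs)             # ['P0', 'P1', 'P2', 'P3']
print(sic.on_ts_expired())  # P0
print(sic.pofs)             # ['P4', 'P1', 'P2', 'P3']
```

Keeping a separate FIFO of POF indices means the unload decision depends only on load order, not on the index number, which matches the first-in first-out rule stated in the text.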

In summary, assuming that the number of POFs in each kernel is n and that the operating system has set TS, each kernel unloads one process and loads one new process every TS. However, each process, from the time it is loaded into the kernel until it is unloaded, can run continuously in the kernel for n times TS. Conversely, if a process only needs to run continuously for a duration TS_n, the operating system can set the system TS to one n-th of TS_n. The technical scheme of the invention thereby achieves overclocking of the system "time slice".
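The two relations in this summary, a run time of n × TS and a system TS of TS_n / n, can be written out directly. The function names `residency` and `ts_for_target` and the numeric values are illustrative assumptions, not terms from the specification.

```python
def residency(ts: float, n_pofs: int) -> float:
    """Continuous run time a process gets in a kernel with n POFs: n * TS."""
    return n_pofs * ts


def ts_for_target(ts_n: float, n_pofs: int) -> float:
    """TS the operating system may set so a process still runs for TS_n."""
    return ts_n / n_pofs


# Illustrative numbers only: n = 4 POFs per kernel.
print(residency(5.0, 4))       # 20.0
print(ts_for_target(20.0, 4))  # 5.0
```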

Drawings

FIG. 1 is a schematic diagram of the internal structure of a "process field" POF;

FIG. 2 is a schematic diagram of "process field" POFs integrated within a kernel;

FIG. 1 is a schematic diagram of one possible hardware implementation of the technical solution of the present invention; it illustrates a basic scheme and does not mean that the technical solution must be fixed to the illustrated structure. The figure shows a simple POF structure in which 4 pipeline hardware resources are implemented; each pipeline hardware resource can independently access the code-coupled cache and the data-coupled cache inside the POF, and can also independently access the system interface adaptation module;

As shown in FIG. 1, "code-coupled cache" and "data-coupled cache" refer to cache modules for code and data, respectively, that have a direct physical connection to the pipeline hardware resources;

As shown in FIG. 1, "code cache" and "data cache" refer to a cache level for code and data, respectively, that has no direct physical connection to the pipeline hardware resources, but whose operations can be carried out through other modules by executing program instructions;

As shown in FIG. 1, the "pipeline-related control module" manages and controls the use of the pipeline hardware resources by different threads in a process, or by different branches of a binary tree within the same thread;

As shown in FIG. 1, the "system interface adaptation module" interconnects the POF with the SIC module shown in FIG. 2 within the kernel. It is used by a process to access the system bus and to perform operations related to the code and data caches, and by the SIC module to load a process into or unload a process from the POF;

As shown in FIG. 1, after a process is loaded into a POF, it runs in that POF independently of the other POFs in the kernel; while it runs, the other POFs cannot access any of the hardware status information, code, or data of the process running in the current POF.

FIG. 2 is a schematic diagram of one possible hardware implementation of the technical solution of the present invention; it illustrates a basic scheme and does not mean that the technical solution must be fixed to the illustrated structure. The figure shows a processor with two cores, each implementing a simple structure of 4 POFs;

The SIC is the "System Interface Controller" described above. Each core has one and only one SIC, and all POFs in the core share it to interconnect with the "system bus controller". As shown in FIG. 2, in the technical solution of the present invention one kernel can keep a plurality of processes running simultaneously, but from the point of view of the "system bus controller" one kernel occupies only one port of the "system bus controller";

As shown in FIG. 2, POF₀ to POF₃ denote the POFs with index numbers 0 to 3. The SIC manages all POFs in the kernel and schedules processes according to the POF index numbers: following the first-in first-out principle, it takes processes from the "system bus controller" and loads them into the POF with the corresponding index number, and it unloads processes from the POF with the corresponding index number and returns them to the "system bus controller";

As shown in FIG. 2, because the SIC schedules processes on the POF infrastructure according to the first-in first-out principle, each process loaded into the kernel obtains an actual running time of several times TS while the operating system sets a fixed TS value (in the structure shown in FIG. 2, a process obtains a running time of 4 times TS).

Detailed Description

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, specific embodiments of the present invention are described below with reference to the drawings. The drawings in the following description show only some embodiments of the invention, and a person skilled in the art can obtain other drawings and other embodiments from them without inventive effort.

For simplicity, the drawings only schematically show the parts relevant to the present invention and do not represent the actual structure or flow of a product. In addition, to keep the drawings simple and easy to understand, where several components in a drawing have the same structure or function, only one of them is schematically depicted or labeled. In this specification, "a" or "an" means not only "only one" but may also mean "more than one".

Example 1

In one embodiment of the present invention, the kernel loads the first process, including the steps of:

step 1, the system is reset and all POFs are in the POF_off state; the operating system sets the system TS value to TS₀, TS₀ starts counting down, and the SIC sets POF₀ as the candidate POF for loading the next process;

step 2, the SIC obtains a process from the system bus controller, loads it into POF₀ and starts it running, and sets POF₁ as the candidate POF for loading the next process.

Example 2

In an embodiment of the present invention, on the basis of embodiment 1, the kernel loads a new process, including the steps of:

step 1, the SIC obtains a process from the system bus controller, loads it into POF₁ and starts it running, and sets POF₂ as the candidate POF for loading the next process.

Example 3

In an embodiment of the present invention, on the basis of embodiment 2, the kernel loads a new process, including the steps of:

step 1, the SIC obtains a process from the system bus controller, loads it into POF₂ and starts it running, and sets POF₃ as the candidate POF for loading the next process.

Example 4

In an embodiment of the present invention, based on embodiment 3, the kernel loads a new process, including the steps of:

step 1, the SIC obtains a process from the system bus controller, loads it into POF₃ and starts it running;

step 2, all POFs are in the POF_on state, and the SIC waits for the TS₀ countdown to complete.

Example 5

In an embodiment of the present invention, based on embodiment 4, the kernel unloads a process and loads a new process, including the steps of:

step 1, the TS₀ countdown ends and a new round of TS₀ countdown begins; the SIC unloads the process in POF₀ from the kernel, and POF₀ returns to the POF_off state;

step 2, the SIC obtains a process from the system bus controller, loads it into POF₀ and starts it running;

step 3, all POFs are in the POF_on state, and the SIC waits for the TS₀ countdown to complete.

Example 6

In an embodiment of the present invention, based on embodiment 5, the kernel unloads a process and loads a new process, including the steps of:

step 1, the TS₀ countdown ends and a new round of TS₀ countdown begins; the SIC unloads the process in POF₁ from the kernel, and POF₁ returns to the POF_off state;

step 2, the SIC obtains a process from the system bus controller, loads it into POF₁ and starts it running;

Example 7

In an embodiment of the present invention, based on embodiment 6, the kernel unloads a process and loads a new process, including the steps of:

step 1, the TS₀ countdown ends and a new round of TS₀ countdown begins; the SIC unloads the process in POF₂ from the kernel, and POF₂ returns to the POF_off state;

step 2, the SIC obtains a process from the system bus controller, loads it into POF₂ and starts it running;

Example 8

In an embodiment of the present invention, based on embodiment 7, the kernel unloads a process and loads a new process, including the steps of:

step 1, the TS₀ countdown ends and a new round of TS₀ countdown begins; the SIC unloads the process in POF₃ from the kernel, and POF₃ returns to the POF_off state;

step 2, the SIC obtains a process from the system bus controller, loads it into POF₃ and starts it running;
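Examples 5 to 8 swap the process in POF₀, POF₁, POF₂ and POF₃ in turn, one per TS₀ expiry. A minimal sketch of that rotation follows, assuming 4 POFs that were filled in index order as in Examples 1 to 4. The helper name `pof_for_expiry` is hypothetical, and the wrap-around produced by the modulo operation is an assumption that follows from first-in first-out order rather than a statement in the text.

```python
def pof_for_expiry(k: int, n_pofs: int) -> int:
    """Index of the POF swapped at the k-th TS expiry (k counted from 0).

    Assumption: POFs were first filled in index order, so first-in
    first-out unloading visits the indices in the same order.
    """
    return k % n_pofs


# Map the first four TS0 expiries onto Examples 5 to 8.
for k, example in enumerate((5, 6, 7, 8)):
    print(f"Example {example}: swap process in POF{pof_for_expiry(k, 4)}")
```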

Example 9

In an embodiment of the present invention, based on embodiment 8, the kernel unloads a process and loads a new process, including the steps of:

It should be noted that the above embodiments can be freely combined as necessary. The foregoing is only a preferred embodiment of the present invention. For those skilled in the art, various modifications and refinements can be made without departing from the principle of the invention, and these modifications and refinements should also be regarded as falling within the protection scope of the present invention.

Claims

1. A processor core circuit structure, defined as a POF, characterized in that:

the POF necessarily contains all pipeline hardware resources that can only be used by a single process at a time;

the POF necessarily contains a pipeline-related control module that can only be used by a single process at a time;

the POF includes a "code cache" that can only be used by a single process at a time;

the POF includes a "data cache" that can only be used by a single process at a time;

the POF includes a "code-coupled cache" that can only be used by a single process at a time;

the POF includes a "data-coupled cache" that can only be used by a single process at a time;

the POF includes a system interface adaptation module that can only be used by a single process at a time.

2. A core circuit structure according to claim 1, characterized in that:

2 or more POFs are implemented in a single core;

and/or

the hardware resources of any two POFs cannot access each other;

and/or

each POF has a unique index number;

and/or

under the control of the SIC, each POF can have a process loaded into it or unloaded from it.

3. According to claim 2, processes are loaded into the kernel and unloaded from the kernel, characterized in that:

a process loaded into the kernel occupies a POF that is currently in the POF_off state;

and/or

a process unloaded from the kernel is unloaded from the POF with the corresponding index number according to the process first-in first-out principle;

and/or

under the control of the SIC, if a POF_off exists in the core, the SIC continues to load processes into POF_off.

4. The method of claim 3, wherein the kernel time slice overclocking effect is achieved, characterized in that:

on the premise that the operating system sets a fixed TS value, a process loaded into the kernel obtains an actual running time of 2 or more times the TS duration before it has to be unloaded from the kernel.

5. According to claim 3, a kernel that occupies one "system bus controller" port can run 2 or more processes at the same time, characterized in that:

after the system is reset and before the kernel unloads the first process, 2 or more processes can be loaded into the kernel;

and/or

the number of processes that can run simultaneously in the kernel is equal to the number of POFs in the kernel.