CN112416052A - Method for realizing over-frequency of kernel time slice - Google Patents

Info

Publication number
CN112416052A
Authority
CN
China
Prior art keywords
pof
kernel
time
loaded
sic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011433645.6A
Other languages
Chinese (zh)
Inventor
王志平
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Individual
Original Assignee
Individual
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F1/00 Details not covered by groups G06F3/00 - G06F13/00 and G06F21/00
    • G06F1/04 Generating or distributing clock signals or signals derived directly therefrom
    • G06F1/08 Clock generators with changeable or programmable clock frequency

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Stored Programmes (AREA)

Abstract

Multitasking is a real need for applications, and programming is the theoretical basis of processor technology. A problem no processor technology can escape is that multitasking must be realized on top of programmed execution, which is why the concept of the "time slice" was introduced. A "time slice", i.e. time division multiplexing of kernel hardware resources, is not difficult to understand and is already used in the prior art. How long a "time slice" should be is usually limited by the clock frequency of the processor and the real-time requirements of the application. The technology of the invention can solve, in theory and in practical implementation, the problem of balancing the time slice against the actual requirement: with the time slice fixed, the application obtains an overclocking effect (that is, with the time slice defined as T, a process can obtain an actual execution time of 2T, 3T or more).

Description

Method for realizing over-frequency of kernel time slice
Technical Field
The invention relates to the field of integrated circuits and computers, in particular to a method for realizing over-frequency of a kernel time slice.
Background
Whether a "time slice" should be long or short depends on the real-time requirements of the processes in the processor and on the number of processes. The real-time requirement of the system is that every process must be executed at least once per unit of time; that is, the time a process waits in the processor for its next "time slice" must not exceed the duration required by the real-time characteristics of the system. Hereinafter, the duration required by the real-time characteristics of the system is abbreviated as SAT (System Aging Time), and "time slice" is abbreviated as TS (Time Slice).
Assuming that the number of processor cores is N and the number of processes in the processor is X, the following relation holds: SAT = TS × (X / N). For any given processor, N is fixed, and for any given application requirement the maximum SAT is constant. Therefore, when X increases and TS remains unchanged, SAT increases; once SAT reaches the maximum allowed value, the system can only decrease TS.
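The relation SAT = TS × (X / N) can be checked numerically. The following is a minimal Python sketch; the function names `sat` and `max_ts` and all sample values are illustrative assumptions and do not appear in the patent.

```python
def sat(ts: float, x: int, n: int) -> float:
    """Return SAT = TS * (X / N) for X processes on N cores."""
    return ts * x / n


def max_ts(sat_max: float, x: int, n: int) -> float:
    """Return the largest TS for which SAT does not exceed sat_max."""
    return sat_max * n / x


if __name__ == "__main__":
    # Hypothetical values: 4 cores, TS of 10 time units, SAT limit of 40.
    for x in (8, 16, 32):
        print(x, sat(10.0, x, 4), max_ts(40.0, x, 4))
```

With N fixed, doubling X doubles SAT at constant TS, so the largest TS that still meets a fixed SAT limit halves.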
Under the prior art, TS is the duration from the moment a process is loaded into the kernel to the moment it is unloaded from the kernel. Since the time consumed by loading a process into and unloading it from the kernel is constant (i.e. the process context-switch time is constant), reducing TS reduces the proportion of time the process actually runs, which means the actual running efficiency of the kernel drops. When TS is reduced so far that the kernel running efficiency becomes too low, then if the SAT defined by the application requirement is not lowered, the system may stall (or even crash); and if it is lowered, the system may still stall, or logic errors may arise in the synchronization between processes.
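The efficiency loss described above can be illustrated with a small Python sketch. It assumes a simple model that the patent does not state explicitly: every time slice contains a fixed context-switch cost, and the remainder of the slice is useful running time. The name `run_fraction` and the sample numbers are hypothetical.

```python
def run_fraction(ts: float, switch_cost: float) -> float:
    """Share of one time slice spent running, under the assumed model.

    Assumption: each slice of length ts contains a constant switch_cost
    for loading and unloading the process; the rest is useful work.
    """
    if ts <= 0 or switch_cost < 0:
        raise ValueError("ts must be positive and switch_cost non-negative")
    return max(0.0, (ts - switch_cost) / ts)


if __name__ == "__main__":
    # Hypothetical switch cost of 1 time unit.
    for ts in (10.0, 4.0, 2.0, 1.0):
        print(ts, run_fraction(ts, 1.0))
```

Under this model the useful share falls towards zero as TS approaches the switch cost, which is the situation the paragraph above describes.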
Therefore, under the prior art, when the application requirements change, i.e. the SAT requirement is unchanged or stricter while the processor is required to handle more tasks at the same time, the processor manufacturer must increase the number of cores (i.e. increase N). However, increasing N neither relieves the pressure on system bandwidth caused by overly frequent process context switching nor changes the fact that each kernel actually runs inefficiently.
Therefore, the processor kernel needs a technical method that achieves higher actual running efficiency at a smaller TS value.
Disclosure of Invention
In fact, with a fixed number of kernels, meeting the SAT requirement is almost equivalent to meeting a requirement on TS. Processor manufacturers under the prior art, however, have taken it for granted that the requirement on TS is a requirement on both loading a process into the kernel and unloading it from the kernel. This is not the case. The SAT requirement, expressed as a TS requirement, only demands that a process be loaded into the kernel as soon as possible; it does not demand that the process be unloaded from the kernel within one TS. That is, in theory, when TS is defined as a fixed value, nothing requires a process to be unloaded after it has been in the kernel for one TS. To meet the SAT requirement it is only necessary that, after a process is unloaded from the kernel, it is reloaded into the kernel to run within a "fixed time", and the TS value is calculated for this purpose. The technical scheme of the invention allows a smaller TS value to be set while each process still has sufficient execution time (several times the TS duration) and the actual running efficiency of the kernel is maintained. That is, SAT can be smaller for a fixed number of processes, or the number of processes can be larger for a fixed SAT.
Before the technical solution of the present invention is explained in more detail, one of its concepts must be introduced: the "Process Field", hereinafter abbreviated as "POF" (Process Operating Field). POF is the collective term for the hardware resources in the kernel that are involved in running a process and that are used by only that single process while it runs. Therefore, the POF must contain the following necessary hardware resources:
1. all pipeline hardware resources that can only be used by a single process;
2. a pipeline-related control module that can only be used by a single process;
3. a "code cache" that can only be used by a single process;
4. a "data cache" that can only be used by a single process;
5. a "code-coupled cache" that can only be used by a single process;
6. a "data-coupled cache" that can only be used by a single process;
7. a system interface adaptation module that can only be used by a single process.
In the technical solution of the present invention, there are 2 or more POFs in a single core (in theory, the technical solution does not limit the number of POFs that can be implemented in a single core). Each POF has a unique index number; hereinafter, the POF with index number 0 is abbreviated as POF0, the POF with index number 1 as POF1, and so on. Any POF has two states, "operating state" and "idle state". Hereinafter, the "operating state" of a POF, or a POF in the "operating state", is abbreviated as POFon, and the "idle state" of a POF, or a POF in the "idle state", is abbreviated as POFoff. POFon indicates that a process is loaded in the POF; POFoff indicates that no process is loaded in the POF. All POFs in a core share one system interface, which serves as the interconnect between the core and the "system bus controller". In each core, in addition to the POFs, there is a necessary module that schedules processes to be loaded into and unloaded from the core, manages the POFs, and controls the system interface. This module is hereinafter referred to as the "System Interface Controller", abbreviated as SIC.
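The POF states described above can be pictured as a small software data model. The following Python sketch is only an illustration of the bookkeeping; the patent describes hardware, and the class names `Pof` and `Kernel`, the helper `make_kernel`, and the use of strings as process identifiers are all assumptions made here.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Pof:
    """One process field, identified by its unique index number."""
    index: int
    process: Optional[str] = None  # hypothetical process identifier

    @property
    def state(self) -> str:
        # POFon: a process is loaded; POFoff: no process is loaded.
        return "POFon" if self.process is not None else "POFoff"


@dataclass
class Kernel:
    """One core holding several POFs that share a single SIC."""
    pofs: List[Pof]

    def idle_pofs(self) -> List[Pof]:
        """Return every POF currently in the POFoff state."""
        return [p for p in self.pofs if p.state == "POFoff"]


def make_kernel(n_pofs: int) -> Kernel:
    """Build a kernel with n_pofs idle POFs (the text requires 2 or more)."""
    if n_pofs < 2:
        raise ValueError("a kernel needs 2 or more POFs")
    return Kernel([Pof(i) for i in range(n_pofs)])
```

For example, `make_kernel(4)` yields POF0 to POF3, all in the POFoff state, matching the state after a system reset.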
According to the TS duration set by the operating system, each time one TS expires the SIC unloads one process from the kernel, returns it to the system bus controller, and then acquires a new process from the system bus controller and loads it into the kernel. After the SIC has loaded a process, if the kernel still has a POFoff, the SIC continues to fetch new processes from the "system bus controller" and load them into the POFoff until no POFoff remains in the kernel. When the SIC unloads a process, it follows the first-in first-out principle: among all POFon, it selects the process that was loaded into the kernel earliest.
In summary, assuming the number of POFs in each kernel is n and the operating system has set TS, each kernel unloads one process and loads one new process every TS. Yet each process, from being loaded into the kernel until being unloaded from it, can run continuously in the kernel for n times TS. (Conversely, if a process only needs to run continuously for a duration TSn, the operating system may set the system TS to one n-th of TSn.) In this way the technical scheme of the invention accomplishes the overclocking of the system "time slice".
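The first-in first-out behaviour of the SIC can be sketched as a short Python simulation. This is a software model under stated assumptions, not the hardware described in the patent: time is counted in whole TS periods, loading and unloading take no time, and the name `simulate` and the integer process identifiers are hypothetical.

```python
from collections import deque


def simulate(n_pofs: int, ts_periods: int) -> dict:
    """Model one kernel with n_pofs POFs over ts_periods TS expirations.

    Returns {process_id: (load_period, unload_period)} for every
    process that was unloaded during the simulated interval.
    """
    pofs = [None] * n_pofs   # None models POFoff
    order = deque()          # POF indices, oldest loaded process first
    loaded_at = {}
    residency = {}
    next_pid = 0

    def fill(period: int) -> None:
        # The SIC keeps loading new processes until no POFoff remains.
        nonlocal next_pid
        for idx in range(n_pofs):
            if pofs[idx] is None:
                pofs[idx] = next_pid
                loaded_at[next_pid] = period
                order.append(idx)
                next_pid += 1

    fill(0)  # after reset every POF is idle and gets a process
    for period in range(1, ts_periods + 1):
        idx = order.popleft()          # first in, first out
        pid = pofs[idx]
        residency[pid] = (loaded_at[pid], period)
        pofs[idx] = None               # POF returns to POFoff
        fill(period)                   # reload the freed POF
    return residency


if __name__ == "__main__":
    for pid, (start, end) in simulate(4, 12).items():
        print(pid, start, end, end - start)
```

In this model the processes loaded right after reset leave earlier (after 1, 2, 3 and 4 periods), while every process loaded later stays for n periods, i.e. n times TS with n = 4 here.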
Drawings
FIG. 1 is a schematic diagram of the internal structure of a "process field" (POF);
FIG. 2 is a schematic diagram of "process fields" (POFs) integrated within a kernel.
FIG. 1 is a schematic diagram of one possible hardware implementation of the technical solution of the present invention. It shows a basic scheme and does not mean that the technical solution is limited to the illustrated structure. The figure illustrates a simple POF structure: 4 pipeline hardware resources are implemented in the POF, and each can independently access the code-coupled cache and the data-coupled cache inside the POF as well as the system interface adaptation module.
As shown in FIG. 1, the "code-coupled cache" and "data-coupled cache" are the cache modules for code and data, respectively, that have a direct physical connection to the pipeline hardware resources.
As shown in FIG. 1, the "code cache" and "data cache" are cache levels for code and data, respectively, that have no direct physical connection to the pipeline hardware resources but can be reached through other modules by executing program instructions.
As shown in FIG. 1, the "pipeline-related control module" manages and controls the use of pipeline hardware resources by different threads in a process, or by different branches of a binary tree within the same thread.
As shown in FIG. 1, the "system interface adaptation module" interconnects the POF with the SIC module in the kernel (shown in FIG. 2). It is used by a process to access the system bus and to perform the related code and data cache operations, and by the SIC module to load a process into or unload a process from the POF.
As shown in FIG. 1, after a process is loaded into a POF, it runs in that POF independently of the other POFs in the kernel. While it runs, the other POFs cannot access any of the hardware status information, code, or data of the process running in the current POF.
FIG. 2 is a schematic diagram of one possible hardware implementation of the technical solution of the present invention. It shows a basic scheme and does not mean that the technical solution is limited to the illustrated structure. The figure illustrates a processor with two cores, each of which implements a simple structure of 4 POFs.
SIC is the "system interface controller" defined above. Each core has one and only one SIC, and all POFs in the core share it to interconnect with the "system bus controller". As FIG. 2 shows, in the technical solution of the present invention one kernel can keep a plurality of processes running simultaneously, yet it occupies only one port of the "system bus controller".
In FIG. 2, POF0 to POF3 denote the POFs with index numbers 0 to 3. The SIC manages all POFs in the kernel and schedules processes according to the POF index numbers: following the first-in first-out principle, it takes processes from the system bus controller and loads them into the POF with the corresponding index number, and it unloads processes from the POF with the corresponding index number and returns them to the system bus controller.
As FIG. 2 shows, because the SIC, built on the POF infrastructure, schedules processes on the first-in first-out principle, each process loaded into the kernel can obtain an actual running time of several times TS while the operating system sets a fixed TS value (in the structure shown in FIG. 2, a process can obtain a running time of 4 times TS).
Detailed Description
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, specific embodiments of the present invention are described below with reference to the drawings. The drawings in the following description show only some embodiments of the invention; a person skilled in the art can obtain other drawings and other embodiments from them without inventive effort.
For simplicity, the drawings only schematically show the parts relevant to the present invention and do not represent the actual structure or flow of a product. In addition, to keep the drawings simple and easy to understand, where several components in a drawing have the same structure or function, only one of them is schematically depicted or labeled. In the present specification, "a" or "an" means not only "only one" but also "more than one".
Example 1
In one embodiment of the present invention, the kernel loads the first process, including the steps of:
Step 1: the system is reset and all POFs are in the POFoff state. The operating system sets the system TS value to TS0, and TS0 starts counting down. The SIC sets POF0 as the candidate POF for the next process to be loaded;
Step 2: the SIC obtains a process from the system bus controller, loads it into POF0, and starts it running. The SIC sets POF1 as the candidate POF for the next process to be loaded.
Example 2
In an embodiment of the present invention, on the basis of embodiment 1, the kernel loads a new process, including the steps of:
Step 1: the SIC obtains a process from the system bus controller, loads it into POF1, and starts it running. The SIC sets POF2 as the candidate POF for the next process to be loaded.
Example 3
In an embodiment of the present invention, on the basis of embodiment 2, the kernel loads a new process, including the steps of:
Step 1: the SIC obtains a process from the system bus controller, loads it into POF2, and starts it running. The SIC sets POF3 as the candidate POF for the next process to be loaded.
Example 4
In an embodiment of the present invention, based on embodiment 3, the kernel loads a new process, including the steps of:
Step 1: the SIC obtains a process from the system bus controller, loads it into POF3, and starts it running;
Step 2: all POFs are in the POFon state, and the SIC waits for the TS0 countdown to complete.
Example 5
In an embodiment of the present invention, based on embodiment 4, the kernel unloads a process and loads a new process, including the steps of:
Step 1: the TS0 countdown ends and a new round of TS0 countdown begins. The SIC unloads the process in POF0 from the kernel, and POF0 returns to the POFoff state;
Step 2: the SIC obtains a process from the system bus controller, loads it into POF0, and starts it running;
Step 3: all POFs are in the POFon state, and the SIC waits for the TS0 countdown to complete.
Example 6
In an embodiment of the present invention, based on embodiment 5, the kernel unloads a process and loads a new process, including the steps of:
Step 1: the TS0 countdown ends and a new round of TS0 countdown begins. The SIC unloads the process in POF1 from the kernel, and POF1 returns to the POFoff state;
Step 2: the SIC obtains a process from the system bus controller, loads it into POF1, and starts it running;
Step 3: all POFs are in the POFon state, and the SIC waits for the TS0 countdown to complete.
Example 7
In an embodiment of the present invention, based on embodiment 6, the kernel unloads a process and loads a new process, including the steps of:
Step 1: the TS0 countdown ends and a new round of TS0 countdown begins. The SIC unloads the process in POF2 from the kernel, and POF2 returns to the POFoff state;
Step 2: the SIC obtains a process from the system bus controller, loads it into POF2, and starts it running;
Step 3: all POFs are in the POFon state, and the SIC waits for the TS0 countdown to complete.
Example 8
In an embodiment of the present invention, based on embodiment 7, the kernel unloads a process and loads a new process, including the steps of:
Step 1: the TS0 countdown ends and a new round of TS0 countdown begins. The SIC unloads the process in POF3 from the kernel, and POF3 returns to the POFoff state;
Step 2: the SIC obtains a process from the system bus controller, loads it into POF3, and starts it running;
Step 3: all POFs are in the POFon state, and the SIC waits for the TS0 countdown to complete.
Example 9
In an embodiment of the present invention, based on embodiment 8, the kernel unloads a process and loads a new process, including the steps of:
Step 1: the TS0 countdown ends and a new round of TS0 countdown begins. The SIC unloads the process in POF0 from the kernel, and POF0 returns to the POFoff state;
Step 2: the SIC obtains a process from the system bus controller, loads it into POF0, and starts it running;
Step 3: all POFs are in the POFon state, and the SIC waits for the TS0 countdown to complete.
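The sequence of Examples 1 to 9 can be reproduced as an event trace. The following Python sketch is a simplification made for illustration: because the POFs are filled in index order and unloaded first-in first-out, a single wrapping index is enough to pick both the POF to unload and the POF to load. The function name `trace` and the event strings are hypothetical.

```python
from typing import List


def trace(n_pofs: int, ts_expirations: int) -> List[str]:
    """List the load and unload events for one kernel in order."""
    events = []
    candidate = 0  # index of the candidate POF, wraps around

    # Examples 1 to 4: after reset, load a process into every POFoff.
    for _ in range(n_pofs):
        events.append(f"load POF{candidate}")
        candidate = (candidate + 1) % n_pofs

    # Examples 5 to 9: on each TS0 expiry, unload the oldest process
    # and load a new process into the POF that was just freed.
    for _ in range(ts_expirations):
        events.append(f"TS0 expired: unload POF{candidate}")
        events.append(f"load POF{candidate}")
        candidate = (candidate + 1) % n_pofs
    return events


if __name__ == "__main__":
    for event in trace(4, 5):
        print(event)
```

Calling `trace(4, 5)` lists the four initial loads followed by unload and load pairs on POF0, POF1, POF2, POF3 and POF0 again, which is the order the examples describe.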
It should be noted that the above embodiments can be freely combined as necessary. The foregoing is only a preferred embodiment of the present invention. A person skilled in the art can make various modifications and refinements without departing from the principle of the invention, and such modifications and refinements should also be regarded as falling within the protection scope of the present invention.

Claims (5)

1. A processor core circuit structure, defined as a POF, implemented in a circuit structure, characterized in that:
the POF necessarily contains all pipeline hardware resources that can be used by only a single process at a time;
the POF necessarily contains a pipeline-related control module that can be used by only a single process at a time;
the POF includes a "code cache" that can be used by only a single process at a time;
the POF includes a "data cache" that can be used by only a single process at a time;
the POF includes a "code-coupled cache" that can be used by only a single process at a time;
the POF includes a "data-coupled cache" that can be used by only a single process at a time;
the POF includes a system interface adaptation module that can be used by only a single process at a time.
2. A core circuit structure according to claim 1, characterized in that:
2 or more POFs are implemented in a single core;
and/or
the hardware resources of any two POFs cannot access each other;
and/or
each POF has a unique index number;
and/or
under the control of the SIC, a process can be loaded into each POF or unloaded from it.
3. The loading of processes into the kernel and unloading of processes out of the kernel implemented according to claim 2, characterized in that:
a process loaded into the kernel occupies a POF that is currently in the POFoff state;
and/or
a process unloaded from the kernel is unloaded from the POF with the corresponding index number according to the first-in first-out principle;
and/or
under the control of the SIC, if a POFoff exists in the kernel, the SIC continues to load processes into the POFoff.
4. The kernel time slice overclocking effect achieved according to claim 3, characterized in that:
with the operating system setting a fixed TS value, a process loaded into the kernel obtains an actual running time of 2 or more times the TS duration before it needs to be unloaded from the kernel.
5. According to claim 3, a kernel that occupies one "system bus controller" port can run 2 or more processes at the same time, characterized in that:
after the system is reset and before the kernel unloads the first process, 2 or more processes can be loaded into the kernel;
and/or
the number of processes that can run simultaneously in the kernel is equal to the number of POFs in the kernel.
CN202011433645.6A 2020-12-10 2020-12-10 Method for realizing over-frequency of kernel time slice Pending CN112416052A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011433645.6A CN112416052A (en) 2020-12-10 2020-12-10 Method for realizing over-frequency of kernel time slice

Publications (1)

Publication Number Publication Date
CN112416052A true CN112416052A (en) 2021-02-26

Family

ID=74776244

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011433645.6A Pending CN112416052A (en) 2020-12-10 2020-12-10 Method for realizing over-frequency of kernel time slice

Country Status (1)

Country Link
CN (1) CN112416052A (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US6732138B1 (en) * 1995-07-26 2004-05-04 International Business Machines Corporation Method and system for accessing system resources of a data processing system utilizing a kernel-only thread within a user process
CN1825286A (en) * 2006-03-31 2006-08-30 浙江大学 Threading realizing and threading state transition method for embedded SRAM operating system
US20070283138A1 (en) * 2006-05-31 2007-12-06 Andy Miga Method and apparatus for EFI BIOS time-slicing at OS runtime
JP2009025973A (en) * 2007-07-18 2009-02-05 Sharp Corp Behavioral synthesis device, manufacturing method of semiconductor integrated circuit, behavioral synthesis method, behavioral synthesis control program, and readable storage medium
CN102110017A (en) * 2009-12-24 2011-06-29 杨槐 Processor multi-process technology
CN108984267A (en) * 2018-07-09 2018-12-11 北京东土科技股份有限公司 The microkernel architecture control system and industrial service device of industrial service device
US20200159572A1 (en) * 2016-09-27 2020-05-21 Telefonaktiebolaget Lm Ericsson (Publ) Process scheduling
US20200334075A1 (en) * 2016-04-12 2020-10-22 Telefonaktiebolaget Lm Ericsson (Publ) Process scheduling in a processing system having at least one processor and shared hardware resources

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination