CN116483545B - Multitasking execution method, device and equipment - Google Patents

Multitasking execution method, device and equipment

Info

Publication number: CN116483545B
Application number: CN202310723051.6A
Authority: CN (China)
Prior art keywords: type, variable, global variable, global, variables
Legal status: Active
Other languages: Chinese (zh)
Other versions: CN116483545A
Inventor: 张维
Current and original assignee: Alipay Hangzhou Information Technology Co Ltd
Application filed by Alipay Hangzhou Information Technology Co Ltd; priority to CN202310723051.6A
Publication of CN116483545A; application granted; publication of CN116483545B


Classifications

    • G06F 9/4881 — Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • G06F 9/52 — Program synchronisation; mutual exclusion, e.g. by means of semaphores
    • Y02D 10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The embodiments of this specification disclose a multi-task execution method, apparatus, and device that isolate global variables from one another's environments more thoroughly, avoid the problem in which high-frequency-access variables protected by fine-grained mutual-exclusion locks become unreleasable objects, and support efficient execution of both single-core and multi-core multitasking. The scheme comprises the following steps: determining first-type and second-type global variables, divided according to how frequently each global variable is accessed and where its variable-setting stage is located; determining differentiated threading templates constructed for the first-type and second-type global variables respectively; performing environment-isolation processing on the first-type and second-type global variables according to thread-local storage and the threading templates, to obtain corresponding thread-local variables; and performing multi-threaded multi-task execution according to the corresponding thread-local variables.

Description

Multitasking execution method, device and equipment
Technical Field
The present disclosure relates to the field of multithreading technologies, and in particular, to a method, an apparatus, and a device for performing multiple tasks.
Background
In different development environments, global variables are convenient for developers, but they can also complicate multi-task execution. CPython illustrates the problem.
CPython is the C implementation of the Python language and is widely used in model training and model inference. Because global variables are used extensively inside CPython, a single-core multi-tasking mechanism is currently possible only through the Global Interpreter Lock (GIL).
However, the current GIL scheme significantly degrades task execution performance. Different tasks often correspond to different scenarios, and as scenarios diversify and grow more complex, queuing time between tasks grows ever longer.
Based on this, a more efficient multitasking scheme is needed.
Disclosure of Invention
One or more embodiments of this specification provide a multi-task execution method, apparatus, device, and storage medium to address the technical problem identified above: the need for a more efficient multi-task execution scheme.
To solve the above technical problems, one or more embodiments of the present specification are implemented as follows:
one or more embodiments of the present disclosure provide a method for performing multiple tasks, including:
determining first-type and second-type global variables, divided according to the access frequency of each global variable and the position of its variable-setting stage;
determining differentiated threading templates constructed for the first-type and second-type global variables respectively;
performing environment-isolation processing on the first-type and second-type global variables according to thread-local storage and the threading templates, to obtain corresponding thread-local variables;
and performing multi-threaded multi-task execution according to the corresponding thread-local variables.
One or more embodiments of the present specification provide a multitasking apparatus comprising:
a global-variable division determining module, configured to determine first-type and second-type global variables divided according to the access frequency of each global variable and the position of its variable-setting stage;
a threading-template determining module, configured to determine differentiated threading templates constructed for the first-type and second-type global variables respectively;
a global-variable environment-isolation module, configured to perform environment-isolation processing on the first-type and second-type global variables according to thread-local storage and the threading templates, to obtain corresponding thread-local variables;
and a multi-task execution module, configured to perform multi-threaded multi-task execution according to the corresponding thread-local variables.
One or more embodiments of the present specification provide a multitasking apparatus comprising:
at least one processor; the method comprises the steps of,
a memory communicatively coupled to the at least one processor; wherein:
the memory stores instructions executable by the at least one processor to enable the at least one processor to:
determine first-type and second-type global variables divided according to the access frequency of each global variable and the position of its variable-setting stage;
determine differentiated threading templates constructed for the first-type and second-type global variables respectively;
perform environment-isolation processing on the first-type and second-type global variables according to thread-local storage and the threading templates, to obtain corresponding thread-local variables;
and perform multi-threaded multi-task execution according to the corresponding thread-local variables.
One or more embodiments of the present specification provide a non-volatile computer storage medium storing computer-executable instructions configured to:
determine first-type and second-type global variables divided according to the access frequency of each global variable and the position of its variable-setting stage;
determine differentiated threading templates constructed for the first-type and second-type global variables respectively;
perform environment-isolation processing on the first-type and second-type global variables according to thread-local storage and the threading templates, to obtain corresponding thread-local variables;
and perform multi-threaded multi-task execution according to the corresponding thread-local variables.
The at least one technical solution above, adopted by one or more embodiments of this specification, can achieve the following beneficial effects: by purposefully dividing the two types of global variables according to their differing characteristics and constructing correspondingly differentiated threading templates before performing global-variable environment isolation, global variables can be isolated more thoroughly, the problem of high-frequency-access variables becoming unreleasable objects under fine-grained lock mutual exclusion can be avoided, and efficient execution of both single-core and multi-core multitasking can be supported.
Drawings
In order to more clearly illustrate the embodiments of this specification or the technical solutions in the prior art, the drawings required in the embodiments are briefly introduced below. Obviously, the drawings described below are only some of the embodiments described in this specification; a person skilled in the art can obtain other drawings from them without inventive effort.
FIG. 1 is a flow diagram of a method of performing multiple tasks according to one or more embodiments of the present disclosure;
FIG. 2 is a schematic diagram illustrating the execution of single-core and multi-core multitasking provided by one or more embodiments of the present disclosure;
fig. 3 is a schematic diagram of a module in a CPython architecture related to variable transformation in a practical application scenario provided in one or more embodiments of the present disclosure;
FIG. 4 is a schematic diagram of a multi-task execution device according to one or more embodiments of the present disclosure;
fig. 5 is a schematic structural diagram of a multitasking device according to one or more embodiments of the present disclosure.
Detailed Description
The embodiment of the specification provides a multitasking method, a multitasking device, multitasking equipment and a storage medium.
In order to make the technical solutions in the present specification better understood by those skilled in the art, the technical solutions in the embodiments of the present specification will be clearly and completely described below with reference to the drawings in the embodiments of the present specification, and it is obvious that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by one of ordinary skill in the art based on the embodiments herein without making any inventive effort, shall fall within the scope of the present application.
As introduced in the Background, the GIL scheme greatly affects task execution performance. In particular, as terminals become more intelligent and intelligent-terminal scenarios multiply, the GIL's impact on terminal performance becomes especially pronounced; queuing time between scenarios grows ever longer, which in turn degrades the user experience.
To improve multitasking performance in the CPython environment, and in other environments with similar problems, the applicant tried the following two approaches during development, but neither completely solved the problem.
First: keep the GIL and use thread-local storage to isolate some global variables, thereby supporting both single-core and multi-core multi-tasking mechanisms. The drawbacks are that the workload is large, it is difficult to isolate global variables thoroughly, and some type global variables, being accessed at high frequency, must be protected with fine-grained locks for mutually exclusive access. However, those high-frequency-access type global variables can then become unreleasable objects, causing memory-leak problems.
Second: remove the GIL entirely and use thread-local storage to isolate all global variables uniformly. However, this scheme supports only a multi-core multi-tasking mechanism, not a single-core one, and single-core performance may degrade.
To solve the problem thoroughly, this application provides a method for rapidly achieving global-variable environment isolation based on thread-local storage, using threading templates for two classes of global variables (in the CPython scenario, for example, divided into common global variables and type global variables). The method both preserves the single-core multi-tasking mechanism through the GIL and realizes a multi-core multi-tasking mechanism that does not depend on the GIL.
Here, CPython is the C implementation of the Python language. The GIL ensures that CPython interprets and executes only one piece of Python code at a time; execution of any other Python code must wait. Single-core multitasking means multiple tasks executing concurrently on the same CPU core; since there is only one core, the tasks must queue for time slices. Multi-core multitasking means multiple tasks executing concurrently on multiple CPU cores. Thread-local storage (TLS) is memory private to each thread, which can be understood as a private key-value store; its main benefit is avoiding the overhead of lock contention in multi-threaded programming.
The following continues to describe the solution of the present application in detail based on the development ideas introduced above.
Fig. 1 is a flow diagram of a method for performing multiple tasks according to one or more embodiments of this specification. The flow can be executed on a user terminal such as a smartphone or tablet, where the scheme's performance and user-experience benefits are most apparent. Of course, it can also be executed on the equipment of platforms with multi-task execution requirements, such as e-commerce, payment, or navigation platforms: for example, a recommendation-engine device, a payment server, or a route-planning server. Some input parameters or intermediate results in the flow allow manual adjustment to help improve accuracy.
The flow in fig. 1 includes the following steps:
s102: and determining a first type global variable and a second type global variable which are divided according to the access frequency degree of the global variable and the position of the variable setting stage.
In one or more embodiments of this specification, the degree of access frequency may be judged empirically rather than precisely quantified. For example, experience shows that in practice at least some type global variables (such as PyType_Type in CPython) are accessed far more frequently than other global variables (such as mro_str in CPython); this is precisely why the previously attempted scheme used fine-grained locks to restrict thread operations on type global variables. Accordingly, all or some of the type global variables (for example, the portion accessed at high frequency) may be classified as second-type global variables, while global variables other than type global variables may be regarded as common global variables, not accessed frequently enough, and classified as first-type global variables. Combining this with the position of the variable-setting stage allows an even more precise division, so that several representative threading templates can be constructed.
When the access frequency of global variables can be precisely quantified, the division can be more dynamic and fine-grained: for example, only type global variables with sufficiently high-frequency access (e.g., access frequency exceeding a set threshold) are processed with the subsequent scheme, while the remaining type global variables are still controlled with fine-grained locks, and so on.
In one or more embodiments of this specification, taking the CPython environment as an example, to transform global variables comprehensively they may be extracted as completely as possible from the interpreter, the memory management module, the cache pool, the type system, and the C extension modules of the CPython environment, and the extracted global variables divided into first-type and second-type global variables. This avoids missing critical global variables and reduces the risk of memory leaks.
S104: determine the differentiated threading templates constructed for the first-type and second-type global variables respectively.
In one or more embodiments of this specification, a global variable is normally declared and then used. To make global variables better suited to executing multiple tasks through multiple threads, they are given a thread-localization transformation. To make that transformation efficient, differentiated threading templates are provided, each suited to a different class of global variable. Large numbers of global variables, even all of them, can thereby undergo large-scale programmatic transformation and achieve more thorough environment isolation, rather than transforming only the subset that is easy to handle; for developers, this reduces both workload and development difficulty.
Different global variables, or the thread-local variables to be constructed for them, may have differing characteristics, and those differences imply differing transformation difficulty. Global variables can therefore be divided according to these differentiating characteristics, and several threading templates, each suited to a different transformation difficulty, constructed in a targeted manner.
For example, the first-type and second-type global variables may be divided according to where the variable-setting stage is located (for the global variable itself, or for the thread-local variable constructed for it). More precisely still, the division may also take access frequency into account.
Take the division into common global variables and type global variables as an example. A threading template focuses on introducing the definition of the thread-local variable corresponding to a global variable, and on the operations on that thread-local variable, which include an acquisition (get) operation, a setting (set) operation, and so on.
For thread-local variables constructed for common global variables, the definition stage is relatively simple, but because such variables may be modified multiple times to change their values, multiple setting stages may be needed. On this basis, a threading template consisting of a declaration section, a get method, and a set method (more sections can be added if necessary, such as additional operations like remapping or state detection) can be constructed for common global variables as the common-global-variable threading template; the definition is completed in the declaration section, and the setting stage is a relatively independent stage after the definition stage.
For thread-local variables constructed for type global variables, the definition stage is more complex and requires a base-class correction operation; the setting stage is fused into the definition stage, and once set, the variable is essentially never reassigned. On this basis, a threading template consisting of an initialization section and a get method can be constructed for type global variables as the type-global-variable threading template. The definition stage is completed by the initialization section, which specifically includes: determining the corresponding type's size and start address; performing a type copy to generate the thread-local variable (the variable's value is set in this step); and, adapting to the result of the type copy, correcting the corresponding type's base class.
Thus, when constructing the threading templates, differentiated modularization is carried out according to the structural differences between the stages of different global variables, and decoupled template components are built. On the one hand, this prevents mutually interfering logic when the thread-local variables are later used, and enables efficient localization and correction when anomalies occur; on the other hand, when the logic related to a global variable changes structurally, the template components can be reused to correct the threading transformation efficiently, reducing the cost of possible future transformations.
Similarly, for other ways of dividing global variables into different classes, the corresponding differentiated threading templates must be customized precisely according to the differences in their characteristics, so that all global variables can undergo thread-localization transformation.
S106: perform environment-isolation processing on the first-type and second-type global variables according to thread-local storage and the threading templates, to obtain the corresponding thread-local variables.
In one or more embodiments of this specification, the thread-local variables and their operating logic are defined using the threading templates, and thread-local storage is used to store and maintain instances of the thread-local variables. For the same global variable, different threads thus each have their own corresponding thread-local variable, maintained and used in their own thread-local storage, achieving environment isolation of the global variable.
Specifically, the threading template specifies the thread-local-variable operations for the corresponding global variable, and the threading can be performed efficiently by means of the template. This includes: constructing, for a number of different threads, thread-local storage corresponding to the first-type and second-type global variables; and, for each thread, storing and operating on the corresponding thread-local variables in that thread's thread-local storage according to the threading templates, so that the same global variable is isolated across the different threads.
S108: perform multi-threaded multi-task execution according to the corresponding thread-local variables.
With the more comprehensive and thorough thread-localization of global variables, multiple threads can use global variables more flexibly and execute concurrent multitasking efficiently. The GIL is retained in this scheme, but whether to use it is decided according to the execution scenario.
In one or more embodiments of this specification, in the single-core case, single-core multitasking is performed through multiple threads according to the corresponding thread-local variables and the GIL; in the multi-core case, one thread runs on each core according to its corresponding thread-local variables without using the GIL, and multi-core multitasking is performed through the threads running across the cores.
More intuitively, one or more embodiments of the present description provide a schematic diagram of the execution of single-core and multi-core multitasking, see fig. 2.
In fig. 2, when single-core multitasking is executed, the multiple task-executing threads cannot run code simultaneously. To prevent conflicts, the thread that successfully acquires the GIL runs code to execute its task for one time slice, then releases the GIL; the next thread then acquires the GIL and executes its task in the next time slice.
When multi-core multitasking is executed, the interpreter data space is isolated per thread. Each thread, based on its own thread-local storage, uses global variables indirectly through thread-local variables without affecting other threads' use of them, and the task-executing thread on each core does not need to acquire the GIL.
With this multi-task execution mode, online data tests found that single-core execution efficiency is preserved while the waiting time of multi-core task execution is reduced by more than 75%.
By the method of FIG. 1, the two types of global variables with differing characteristics are divided purposefully and differentiated threading templates are constructed correspondingly before global-variable environment isolation is performed. Global variables can thus be isolated more thoroughly, the problem of high-frequency-access variables becoming unreleasable objects under fine-grained lock mutual exclusion can be avoided, and efficient execution of both single-core and multi-core multitasking is supported.
Based on the method of fig. 1, the present specification also provides some specific embodiments and extensions of the method, and the following description will proceed.
For ease of understanding, exemplary partial code and comments using global variables in the CPython environment are presented below.
Taking mro_str as an example, before transformation a common global variable comprises a declaration section and a use section, with code as follows:
// declaration section
static PyObject *mro_str = NULL;
// use section
mro = lookup_method((PyObject *)type, "mro", &mro_str);
mro_str = mro;
For the transformation, the common-global-variable threading template of "declaration section + get method (indicating the acquisition operation) + set method (indicating the setting operation)" is used.
After transformation, the code is as follows:
// declaration section
static PyObject *mro_str = NULL;
// get method: obtains the TLS value (TLS = thread-local storage; the value is that of the corresponding thread-local variable)
PyObject *new_mro_str(void);
// set method: sets the TLS value
void air_set_mro_str(PyObject *mro_str);
// use: becomes get and set calls
mro = lookup_method((PyObject *)type, "mro", new_mro_str());
air_set_mro_str(mro);
As can be seen, operations that originally acted directly on the global variable become, after transformation, operations on the thread-local variable corresponding to that global variable.
In one or more embodiments of this specification, when implementing the get and set methods, further auxiliary logic can be added as needed. For example, a multithreading switch can be preset to decide whether tasks are currently allowed to execute on multiple threads; if not (for example, when the switch is in the off state), the global variable is used directly instead of the thread-local variable. Following this idea, example implementations of the get and set methods are given:
The get method:

PyObject *new_mro_str(void)
{
    if (multithreaded switch on) {
        // obtain the TLS data
        air_tls_pytype *air_pytype = air_get(air_key_pytype);
        return air_pytype->air_mro_str;
    } else {
        // switch off: obtain the previous global variable data
        return mro_str;
    }
}
The set method:

void air_set_mro_str(PyObject *mro_str)
{
    if (multithreaded switch on) {
        // set the TLS data
        air_tls_pytype *air_pytype = air_get(air_key_pytype);
        air_pytype->air_mro_str = mro_str;
    } else {
        // switch off: set the previous global variable data
        // (note: the parameter shadows the global variable of the same name here)
        mro_str = mro_str;
    }
}
As the above implementations show, the threading template contains logic that reads and responds to the state of the multithreading switch: when the switch indicates that multithreading is enabled, the thread-local variable corresponding to the global variable is used for multithreading; when it indicates that multithreading is disabled, the thread-local variable is not used. This gives the scheme a convenient rollback capability and reduces risk.
Taking PyType_Type as an example of a type global variable: before transformation it comprises a declaration section and a use section, with code (abridged) as follows:
// declaration section
PyTypeObject PyType_Type = {
PyVarObject_HEAD_INIT(&PyType_Type, 0)
"type", /* tp_name */
sizeof(PyHeapTypeObject), /* tp_basicsize */
...
offsetof(PyTypeObject, tp_dict), /* tp_dictoffset */
type_init, /* tp_init */
...
PyObject_GC_Del, /* tp_free */
(inquiry)type_is_gc, /* tp_is_gc */
};
// use part
if (!PyArg_ParseTuple(args, "O!|O:super", &PyType_Type, &type, &obj));”。
During the transformation, the type global variable threading template, which consists of an initialization part and a get method, is adopted.
After transformation, the codes are as follows:
"// get method: obtaining the ts value
PyTypeObject *new_PyType_Type(void);
Setting the ts value
void air_tls_PyType_Type(void);
Use/use: become get call
if (!PyArg_ParseTuple(args, "O!|O:super", new_PyType_Type(),&type,&obj));”。
For a specific implementation, the get method may be implemented with reference to the foregoing; here, the code of the initialization part is mainly exemplified, as follows:
“void air_tls_PyType_Type(void)
{
    // get the type size and start address
int pytype_size = sizeof(PyType_Type);
void *pytype_start =&PyType_Type;
if(_PyObject_GC_IS_TRACKED(&PyType_Type)) {
pytype_size += sizeof(PyGC_Head);
pytype_start = _Py_AS_GC(&PyType_Type);
}
    // make a copy of the type
tls->pytype = PyMem_Malloc(pytype_size);
memset(tls->pytype, 0, pytype_size);
memcpy(tls->pytype, pytype_start, pytype_size);
    // correct the type base class
if(_PyObject_GC_IS_TRACKED(&PyType_Type)) {
tls->pytype = (typeof(tls->pytype))_Py_AS_TYPE(tls->pytype);
}
Py_TYPE(tls->pytype) = tls->pytype_TYPE;
}”。
The scheme of the application is particularly suitable for modifying global variables on a large scale; in a CPython scenario, corresponding variables in a plurality of modules are involved. Intuitively, referring to FIG. 3, FIG. 3 is a schematic diagram of the part of the CPython architecture involved in variable modification in a practical application scenario, as provided by one or more embodiments of the present specification.
In the architecture diagram shown in fig. 3, the gray modules are the parts involved in the variable transformation, which are mainly divided into five modules: the type system, the memory management module, garbage collection, the cache pool, and the C extension library (C extension modules).
The type system may involve modules in the data type definition layer.
Memory management and garbage collection may involve an "Eval Stack" module.
The cache pool may involve modules in the data type definition layer, as well as the "importer" and "Eval Stack" modules.
The C extension library may involve the "std C module", "builtin C module" and "threaded C module" modules.
To better support multi-core multitasking, the GIL module may also be eliminated and thread-binding modifications made to code execution, the type system, and the C modules: bytecode execution needs to be modified in terms of module import, interpreter state, and memory allocation; the type system needs its cached or globally defined types modified; and the C modules need their cached or globally defined structures modified.
Based on the same idea, one or more embodiments of the present disclosure further provide an apparatus and a device corresponding to the above method, see fig. 4 and fig. 5. The apparatus and the device are capable of performing the above method and the related alternatives accordingly.
Fig. 4 is a schematic structural diagram of a multitasking apparatus according to one or more embodiments of the present disclosure, where the apparatus includes:
the global variable division determining module 402 determines a first type global variable and a second type global variable which are divided according to the access frequency degree of the global variable and the position of the variable setting stage;
the threaded template determining module 404 determines differentiated threaded templates respectively constructed for the first type global variable and the second type global variable;
the global variable environment isolation module 406 performs environment isolation processing on the first type global variable and the second type global variable according to thread local storage and the threading template to obtain corresponding thread local variables;
The multi-task execution module 408 performs multi-thread multi-task execution according to the corresponding thread local variable.
Optionally, before determining the first type global variables and the second type global variables divided according to the access frequency of the global variables and the position of the variable setting stage, the global variable division determining module 402 may divide, according to the access frequency of the global variables, common global variables as the first type global variables and type global variables as the second type global variables.
Optionally, before the threaded template determining module 404 determines the differentiated threading templates respectively constructed for the first type global variable and the second type global variable, a threading template consisting of a declaration part, an acquisition method and a setting method is constructed for the common global variable as the common global variable threading template; and/or,
and constructing a threading template consisting of an initializing part and an acquiring method for the type global variable to serve as the type global variable threading template.
Optionally, the threaded template indicates a thread local variable operation for a corresponding global variable;
The global variable environment isolation module 406 constructs a thread local storage corresponding to the first type global variable and the second type global variable respectively for a plurality of different threads;
and for the first type global variable and the second type global variable, storing and operating the corresponding thread local variable for the corresponding thread in the thread local storage corresponding to each thread according to the threading template so as to isolate the same global variable for the plurality of different threads.
Optionally, the initializing portion specifically includes:
determining the corresponding type size and start address;
making a type copy to generate a thread-local variable for it;
and correcting the corresponding type base class according to the type copying result.
Optionally, the threaded template includes logic to read and respond to a multithreaded switch state;
the logic includes:
under the condition that the multithread switch state represents multithread enabling, using a thread local variable corresponding to a global variable for multithread;
in the case where the multithreaded switch state indicates multithreading is not enabled, the thread local variable is not used.
Optionally, the multitasking execution module 408 executes single-core multitasking via multithreading according to the corresponding thread local variable and global interpreter lock.
Optionally, the multitasking execution module 408 runs one thread on each of the multiple cores according to the corresponding thread local variables, without the threads using a global interpreter lock, and executes multi-core multitasking through the plurality of threads run by the cores in total.
Optionally, the first type of global variable and the second type of global variable are variables in a CPython environment.
Optionally, the first type global variable and the second type global variable are obtained by dividing global variables comprehensively extracted from an interpreter, a memory management module, a cache pool, a type system and a C expansion module in the CPython environment.
Fig. 5 is a schematic structural diagram of a multitasking device according to one or more embodiments of the present disclosure, where the device includes:
at least one processor; the method comprises the steps of,
a memory communicatively coupled to the at least one processor; wherein:
the memory stores instructions executable by the at least one processor to enable the at least one processor to:
determining a first type global variable and a second type global variable which are divided according to the access frequency degree of the global variable and the position of the variable setting stage;
Determining differentiated threading templates respectively constructed for the first type global variables and the second type global variables;
according to the thread local storage and the threading template, carrying out environment isolation processing on the first type global variable and the second type global variable to obtain corresponding thread local variables;
and performing multi-thread multi-task execution according to the corresponding thread local variable.
Based on the same considerations, one or more embodiments of the present specification further provide a non-volatile computer storage medium storing computer-executable instructions configured to:
determining a first type global variable and a second type global variable which are divided according to the access frequency degree of the global variable and the position of the variable setting stage;
determining differentiated threading templates respectively constructed for the first type global variables and the second type global variables;
according to the thread local storage and the threading template, carrying out environment isolation processing on the first type global variable and the second type global variable to obtain corresponding thread local variables;
and performing multi-thread multi-task execution according to the corresponding thread local variable.
In the 1990s, an improvement to a technology could be clearly distinguished as an improvement in hardware (e.g., an improvement to a circuit structure such as a diode, transistor, or switch) or an improvement in software (an improvement to a method flow). However, with the development of technology, many improvements to method flows today can be regarded as direct improvements to hardware circuit structures. Designers almost always obtain a corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement of a method flow cannot be implemented with a hardware entity module. For example, a programmable logic device (Programmable Logic Device, PLD) (e.g., a field programmable gate array (Field Programmable Gate Array, FPGA)) is an integrated circuit whose logic function is determined by the user's programming of the device. A designer programs to "integrate" a digital system onto a single PLD, without requiring a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, instead of manually fabricating integrated circuit chips, such programming is nowadays mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code to be compiled must likewise be written in a specific programming language, called a hardware description language (Hardware Description Language, HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM and RHDL (Ruby Hardware Description Language), among which VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used.
It will also be apparent to those skilled in the art that a hardware circuit implementing the logic method flow can be readily obtained by merely slightly programming the method flow into an integrated circuit using several of the hardware description languages described above.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer readable medium storing computer readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, or an embedded microcontroller. Examples of such controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20 and Silicon Labs C8051F320; a memory controller may also be implemented as part of the control logic of the memory. Those skilled in the art also know that, in addition to implementing the controller purely in computer readable program code, it is entirely possible to logically program the method steps so that the controller implements the same functions in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers and the like. Such a controller may therefore be regarded as a hardware component, and the means included therein for implementing various functions may also be regarded as structures within the hardware component. Or even the means for implementing various functions may be regarded both as software modules implementing the method and as structures within the hardware component.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being functionally divided into various units, respectively. Of course, the functions of each element may be implemented in one or more software and/or hardware elements when implemented in the present specification.
It will be appreciated by those skilled in the art that the present description may be provided as a method, system, or computer program product. Accordingly, the present specification embodiments may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present description embodiments may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present description is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the specification. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, method, article or apparatus that comprises the element.
The description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. In particular, for apparatus, devices, non-volatile computer storage medium embodiments, the description is relatively simple, as it is substantially similar to method embodiments, with reference to the section of the method embodiments being relevant.
The foregoing describes specific embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims can be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The foregoing is merely one or more embodiments of the present description and is not intended to limit the present description. Various modifications and alterations to one or more embodiments of this description will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, or the like, which is within the spirit and principles of one or more embodiments of the present description, is intended to be included within the scope of the claims of the present description.

Claims (19)

1. A method of multitasking comprising:
dividing, according to the access frequency of the global variables and the position of the variable setting stage, common global variables as first type global variables and type global variables as second type global variables;
determining the first type global variable and the second type global variable which are divided according to the access frequency degree of the global variable and the position of the variable setting stage;
determining differentiated threading templates respectively constructed for the first type global variables and the second type global variables;
according to the thread local storage and the threading template, carrying out environment isolation processing on the first type global variable and the second type global variable to obtain corresponding thread local variables;
And performing multi-thread multi-task execution according to the corresponding thread local variable.
2. The method of claim 1, wherein before determining differentiated threaded templates respectively constructed for the first type of global variable and the second type of global variable, the method further comprises:
constructing, for the common global variable, a threading template consisting of a declaration part, an acquisition method and a setting method, and taking the threading template as the common global variable threading template; and/or,
and constructing a threading template consisting of an initializing part and an acquiring method for the type global variable to serve as the type global variable threading template.
3. The method of claim 2, the threaded template indicating thread-local variable operations for a corresponding global variable;
and performing environment isolation processing on the first type global variable and the second type global variable according to the thread local storage and the threading template to obtain corresponding thread local variables, wherein the method specifically comprises the following steps of:
respectively constructing thread local storage corresponding to the first type global variable and the second type global variable for a plurality of different threads;
and for the first type global variable and the second type global variable, storing and operating the corresponding thread local variable for the corresponding thread in the thread local storage corresponding to each thread according to the threading template so as to isolate the same global variable for the plurality of different threads.
4. The method according to claim 2, the initializing section, in particular, comprising:
determining the corresponding type size and start address;
making a type copy to generate a thread-local variable for it;
and correcting the corresponding type base class according to the type copying result.
5. The method of claim 2, the threaded template comprising logic to read and respond to a multithreaded switch state;
the logic includes:
under the condition that the multithread switch state represents multithread enabling, using a thread local variable corresponding to a global variable for multithread;
in the case where the multithreaded switch state indicates multithreading is not enabled, the thread local variable is not used.
6. The method according to claim 1, wherein said performing multithreading and multitasking according to the corresponding thread local variable comprises:
and executing the single-core multi-task through the multithreading according to the corresponding thread local variable and the global interpreter lock.
7. The method according to claim 1, wherein said performing multithreading and multitasking according to the corresponding thread local variable comprises:
and respectively running a thread on each core in the multi-core according to the corresponding thread local variable, and executing multi-core multi-task through a plurality of threads which are totally run by the multi-core without using a global interpreter lock by the threads.
8. The method according to any one of claims 1-7, wherein the first type of global variable and the second type of global variable are variables in a CPython environment.
9. The method of claim 8, wherein the first type of global variable and the second type of global variable are obtained by dividing global variables comprehensively extracted from an interpreter, a memory management module, a cache pool, a type system and a C extension module in the CPython environment.
10. A multitasking apparatus comprising:
the global variable division determining module is used for dividing, according to the access frequency of the global variables and the positions of the variable setting stages, common global variables as first type global variables and type global variables as second type global variables;
determining the first type global variable and the second type global variable which are divided according to the access frequency degree of the global variable and the position of the variable setting stage;
the threading template determining module is used for determining differentiated threading templates respectively constructed for the first type global variables and the second type global variables;
the global variable environment isolation module is used for carrying out environment isolation processing on the first type global variable and the second type global variable according to thread local storage and the threading template to obtain corresponding thread local variables;
And the multi-task execution module performs multi-thread multi-task execution according to the corresponding thread local variable.
11. The apparatus of claim 10, the threaded template determination module, before the determining of the differentiated threading templates respectively constructed for the first type global variable and the second type global variable, constructs a threading template consisting of a declaration part, an acquisition method and a setting method for the common global variable as the common global variable threading template; and/or,
and constructing a threading template consisting of an initializing part and an acquiring method for the type global variable to serve as the type global variable threading template.
12. The apparatus of claim 11, the threaded template indicating thread-local variable operations for a corresponding global variable;
the global variable environment isolation module is used for respectively constructing thread local storage corresponding to the first type global variable and the second type global variable for a plurality of different threads;
and for the first type global variable and the second type global variable, storing and operating the corresponding thread local variable for the corresponding thread in the thread local storage corresponding to each thread according to the threading template so as to isolate the same global variable for the plurality of different threads.
13. The apparatus of claim 11, the initializing section specifically comprises:
determining the corresponding type size and start address;
making a type copy to generate a thread-local variable for it;
and correcting the corresponding type base class according to the type copying result.
14. The apparatus of claim 11, the threaded template comprising logic to read and respond to a multithreaded switch state;
the logic includes:
under the condition that the multithread switch state represents multithread enabling, using a thread local variable corresponding to a global variable for multithread;
in the case where the multithreaded switch state indicates multithreading is not enabled, the thread local variable is not used.
15. The apparatus of claim 10, the multitasking execution module to execute single-core multitasking via multithreading based on corresponding thread local variables and global interpreter locks.
16. The apparatus of claim 10, the multitasking execution module to execute a multi-core multitasking by running one thread on each of the cores according to corresponding thread local variables, respectively, without having the thread use a global interpreter lock, through a plurality of the threads that the cores are running in total.
17. The apparatus of any one of claims 10 to 16, wherein the first type of global variable and the second type of global variable are variables in a CPython environment.
18. The apparatus of claim 17, wherein the first type of global variable and the second type of global variable are partitioned from global variables comprehensively extracted from an interpreter, a memory management module, a cache pool, a type system and a C-extension module in a CPython environment.
19. A multitasking device comprising:
at least one processor; the method comprises the steps of,
a memory communicatively coupled to the at least one processor; wherein:
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform:
dividing, according to the access frequency of the global variables and the position of the variable setting stage, common global variables as first type global variables and type global variables as second type global variables;
determining the first type global variable and the second type global variable which are divided according to the access frequency degree of the global variable and the position of the variable setting stage;
Determining differentiated threading templates respectively constructed for the first type global variables and the second type global variables;
according to the thread local storage and the threading template, carrying out environment isolation processing on the first type global variable and the second type global variable to obtain corresponding thread local variables;
and performing multi-thread multi-task execution according to the corresponding thread local variable.
CN202310723051.6A 2023-06-19 2023-06-19 Multitasking execution method, device and equipment Active CN116483545B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310723051.6A CN116483545B (en) 2023-06-19 2023-06-19 Multitasking execution method, device and equipment


Publications (2)

Publication Number Publication Date
CN116483545A CN116483545A (en) 2023-07-25
CN116483545B true CN116483545B (en) 2023-09-29

Family

ID=87227170

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310723051.6A Active CN116483545B (en) 2023-06-19 2023-06-19 Multitasking execution method, device and equipment

Country Status (1)

Country Link
CN (1) CN116483545B (en)

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102708090A (en) * 2012-05-16 2012-10-03 中国人民解放军国防科学技术大学 Verification method for shared storage multicore multithreading processor hardware lock
CN106445656A (en) * 2016-09-06 2017-02-22 北京邮电大学 Method and device for realizing thread local storage
CN107231558A (en) * 2017-05-23 2017-10-03 江苏火米互动科技有限公司 A kind of implementation method of the H.264 parallel encoder based on CUDA
CN109240702A (en) * 2018-08-15 2019-01-18 无锡江南计算技术研究所 Quick segmentation addressing configuration and access method under a kind of multithread mode
CN110069243A (en) * 2018-10-31 2019-07-30 上海奥陶网络科技有限公司 A kind of java program threads optimization method
CN114398029A (en) * 2022-01-14 2022-04-26 武汉天喻信息产业股份有限公司 Operating system based on C language virtual machine

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9009726B2 (en) * 2010-12-10 2015-04-14 Microsoft Technology Licensing, Llc Deterministic sharing of data among concurrent tasks using pre-defined deterministic conflict resolution policies
US9513886B2 (en) * 2013-01-28 2016-12-06 Arizona Board Of Regents On Behalf Of Arizona State University Heap data management for limited local memory(LLM) multi-core processors


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Accelerating the performance of Sequence Alignment using High Performance Multicore GPU;Karamjeet Kaur;2019 International Conference on Computational Intelligence and Knowledge Economy (ICCIKE);471-474 *
OpenCL Compilation System for the Domestically Developed Heterogeneous Many-core Processor of Sunway TaihuLight;Wu Mingchuan;Huang Lei;Liu Ying;He Xianbo;Feng Xiaobing;Chinese Journal of Computers(No. 10);64-78 *

Also Published As

Publication number Publication date
CN116483545A (en) 2023-07-25

Similar Documents

Publication Publication Date Title
US10331666B1 (en) Apparatus and method for parallel processing of a query
US7966459B2 (en) System and method for supporting phased transactional memory modes
TWI493452B (en) Binary translation in asymmetric multiprocessor system
US20140157287A1 (en) Optimized Context Switching for Long-Running Processes
US9348594B2 (en) Core switching acceleration in asymmetric multiprocessor system
JP2011070256A (en) Debugger and program
US20040216101A1 (en) Method and logical apparatus for managing resource redistribution in a simultaneous multi-threaded (SMT) processor
CN107111482B (en) Controlling execution of threads in a multithreaded processor
CN107111483B (en) Instruction to control access to shared registers of a multithreaded processor
CN116483545B (en) Multitasking execution method, device and equipment
US9354890B1 (en) Call stack structure for enabling execution of code outside of a subroutine and between call stack frames
US20030074390A1 (en) Hardware to support non-blocking synchronization
CN116107728A (en) Task execution method and device, storage medium and electronic equipment
TWI659361B (en) Method for debugging multiple threads / parallel programs, computer-readable recording medium, and computer program products
US8196123B2 (en) Object model for transactional memory
Li et al. Easyscale: Accuracy-consistent elastic training for deep learning
EP1387266A1 (en) Software pipelining for branching control flow
CN117032999B (en) CPU-GPU cooperative scheduling method and device based on asynchronous running
So et al. Procedure cloning and integration for converting parallelism from coarse to fine grain
CN110764880B (en) Three-state control method based on atomic operation
TWI784049B (en) Transaction nesting depth testing instruction
CN116167437B (en) Chip management system, method, device and storage medium
CN115016948B (en) Resource access method and device, electronic equipment and readable storage medium
US10922128B1 (en) Efficiently managing the interruption of user-level critical sections
CN114297299A (en) Data synchronization method, device, equipment and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant