CN112231102A - Method, device, equipment and product for improving performance of storage system - Google Patents

Method, device, equipment and product for improving performance of storage system

Info

Publication number
CN112231102A
CN112231102A CN202011109038.4A
Authority
CN
China
Prior art keywords
thread
cpu
binding
core
logic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202011109038.4A
Other languages
Chinese (zh)
Inventor
刘伟锋
张在贵
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou Inspur Intelligent Technology Co Ltd
Original Assignee
Suzhou Inspur Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou Inspur Intelligent Technology Co Ltd filed Critical Suzhou Inspur Intelligent Technology Co Ltd
Priority to CN202011109038.4A priority Critical patent/CN112231102A/en
Publication of CN112231102A publication Critical patent/CN112231102A/en
Withdrawn legal-status Critical Current


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5061Partitioning or combining of resources
    • G06F9/5066Algorithms for mapping a plurality of inter-dependent sub-tasks onto a plurality of physical CPUs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/46Multiprogramming arrangements
    • G06F9/50Allocation of resources, e.g. of the central processing unit [CPU]
    • G06F9/5083Techniques for rebalancing the load in a distributed system
    • G06F9/5088Techniques for rebalancing the load in a distributed system involving task migration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00Arrangements for program control, e.g. control units
    • G06F9/06Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/44Arrangements for executing specific programs
    • G06F9/455Emulation; Interpretation; Software simulation, e.g. virtualisation or emulation of application or operating system execution engines
    • G06F9/45533Hypervisors; Virtual machine monitors
    • G06F9/45558Hypervisor-specific management and integration aspects
    • G06F2009/4557Distribution of virtual machine instances; Migration and load balancing
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2209/00Indexing scheme relating to G06F9/00
    • G06F2209/50Indexing scheme relating to G06F9/50
    • G06F2209/5018Thread allocation

Landscapes

  • Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a method, a device, equipment and a product for improving the performance of a storage system. The method comprises the following steps: querying the CPU configuration of a physical server; selecting the logical cores to bind according to the CPU configuration; creating a thread and binding the created thread to the selected CPU logical cores; and, when the thread is scheduled, dispatching it to the bound CPU logical cores. Because each created thread is bound to designated CPU logical cores, the operating system dispatches the thread to those cores according to the binding policy when scheduling it, which reduces the cost of migrating threads among logical cores and improves the performance of the storage system.

Description

Method, device, equipment and product for improving performance of storage system
Technical Field
The invention relates to the technical field of performance improvement of storage systems, in particular to a method, a device, equipment and a product for improving the performance of a storage system.
Background
Server virtualization converts physical resources into logically manageable virtual resources and consolidates multiple logical servers onto a single physical server, so that several virtual environments run simultaneously. This reduces server cost and makes servers easier and safer to manage.
QEMU-KVM is a common solution for server virtualization today, especially in private clouds built with OpenStack. KVM provides CPU and memory virtualization in kernel space, while QEMU virtualizes hardware I/O in user space. When I/O issued by the VM's operating system is intercepted by KVM, it is handed to QEMU for processing. When QEMU is connected to a distributed storage system, it generally accesses the backend storage by directly calling the librbd client.
To improve parallel task processing capability, physical servers generally adopt multi-core CPUs in SMP or NUMA configurations, so that multiple threads can run in parallel on multiple logical cores. The operating system's scheduling algorithm assigns threads to suitable logical cores while balancing the load among cores: if the load on one logical core is too high, threads running on it are migrated to other logical cores, so that the threads of all processes end up roughly evenly distributed across all cores of the CPU. However, if the target logical core and the current logical core are not in the same NUMA node, or the migration spans physical CPUs, the cost of thread migration is high and system performance degrades.
On a physical server, one VM corresponds to one QEMU process, and one QEMU process can create multiple parallel threads to process I/O tasks, i.e., to perform read and write access to the storage backend. At present, after QEMU calls the librbd client to create an I/O thread, scheduling of that thread depends entirely on the scheduling policy of the physical server's operating system. When several QEMU processes or other processes run on the physical server at the same time, the bursty nature of VM I/O causes librbd threads to migrate among the CPU's cores, so the cost of migrating threads between logical cores is high and the I/O performance of the storage system drops.
Disclosure of Invention
The invention provides a method, a device, equipment and a product for improving the performance of a storage system, aimed at the above problem: when several QEMU processes or other processes coexist on a physical server, the bursty I/O of the VMs causes librbd threads to migrate among the CPU's cores, the migration cost is high, and the I/O performance of the storage system drops.
The technical scheme of the invention is as follows:
In a first aspect, the technical solution of the present invention provides a method for improving the performance of a storage system, comprising the following steps:
querying the CPU configuration of a physical server;
selecting the logical cores to bind according to the CPU configuration;
creating a thread, and binding the created thread to the selected CPU logical cores;
and, when the thread is scheduled, dispatching it to the bound CPU logical cores. This reduces the cost of migrating threads among logical cores and improves the performance of the storage system.
Further, the step of creating a thread and binding the created thread to the selected CPU logical cores comprises:
writing the selected logical cores into a configuration file;
creating the thread, parsing the configuration file when the thread is created, and reading the binding switch and binding parameters from the configuration file;
setting the hard affinity of the created thread to the configured logical cores, which realizes the binding. The created thread is thus bound to designated CPU logical cores.
Further, the step of querying the configuration of the physical server's CPU comprises:
acquiring the number of NUMA nodes, the number of CPUs per node (i.e., the number of physical cores), and the number of logical cores per CPU.
Further, the step of selecting the logical cores to bind according to the CPU configuration comprises:
judging according to the CPU configuration;
if there are multiple NUMA nodes, keeping the CPUs hosting the bound logical cores within the same NUMA node;
if there are multiple physical cores, judging whether hyper-threading is enabled: if not, binding adjacent logical cores; if so, choosing bound logical cores distributed on the same physical core. When the operating system schedules the threads, it dispatches them to the designated CPU logical cores according to the binding policy, which reduces thread migration overhead and improves the performance of the storage system.
In a second aspect, the technical solution of the present invention provides a device for improving the performance of a storage system, comprising a query module, a selection module, a binding module and a processing module;
the query module is used for querying the configuration of the CPU of the physical server;
the selection module is used for selecting the bound logic core according to the configuration of the CPU;
the binding module is used for creating a thread and binding the created thread with the selected logic core of the CPU;
and the processing module is used for distributing the threads to the bound CPU logic cores when the threads are dispatched.
Furthermore, the binding module comprises a writing unit, an analyzing and reading unit and a binding unit;
the write-in unit is used for writing the selected logic core into a configuration file;
the analysis reading unit is used for creating a thread, analyzing a configuration file when the thread is created, and reading a binding switch and binding parameters in the configuration file;
and the binding unit is used for setting the hard affinity of the created thread as the configured logic core to realize the binding of the logic core.
Furthermore, the query module comprises a node number acquisition unit, a physical core number acquisition unit and a logic core number acquisition unit;
a node number obtaining unit for obtaining the number of NUMA nodes;
a physical core number obtaining unit, configured to obtain the number of CPUs per node, i.e., the number of physical cores;
and the logic core number acquisition unit is used for acquiring the number of the logic cores of each CPU.
Further, the selection module comprises a judging unit and a selecting unit;
the judging unit is configured to judge according to the CPU configuration, and further, when there are multiple physical cores, to judge whether hyper-threading is enabled;
the selecting unit is configured to keep the CPUs hosting the bound logical cores within the same NUMA node when the judging unit determines that there are multiple NUMA nodes; to select and bind adjacent logical cores when the judging unit determines that hyper-threading is not enabled; and to select bound logical cores distributed on the same physical core when the judging unit determines that hyper-threading is enabled.
In a third aspect, the present invention further provides an electronic device comprising a memory and a processor that communicate with each other via a bus; the memory stores program instructions executable by the processor, and the processor calls the program instructions to perform the method for improving the performance of a storage system according to the first aspect.
In a fourth aspect, the present invention also provides a computer program product, which includes a computer program stored on a non-transitory computer-readable storage medium, where the computer program includes program instructions, and when the program instructions are executed by a computer, the computer executes the method for improving the performance of a storage system according to the first aspect.
According to the technical scheme, the invention has the following advantages: the created thread is bound to the appointed CPU logic core, and the thread is distributed to the appointed CPU logic core according to the binding strategy when the operating system schedules the thread, so that the thread migration overhead is reduced, and the performance of the storage system is improved.
In addition, the invention has a reliable design principle and a simple structure, and has very broad application prospects.
Therefore, compared with the prior art, the invention has prominent substantive features and represents remarkable progress, and the beneficial effects of its implementation are also obvious.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in describing the embodiments or the prior art are briefly introduced below. It is obvious that those skilled in the art can obtain other drawings based on these drawings without creative effort.
FIG. 1 is a schematic flow diagram of a method of one embodiment of the invention.
Fig. 2 is a schematic block diagram of an apparatus of one embodiment of the present invention.
Fig. 3 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
To help those skilled in the art better understand the technical solution of the present invention, the technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art from the given embodiments without creative effort shall fall within the protection scope of the present invention.
As shown in fig. 1, the technical solution of the present invention provides a method for improving the performance of a storage system, comprising the following steps:
S1: querying the CPU configuration of a physical server;
S2: selecting the logical cores to bind according to the CPU configuration;
S3: creating a thread, and binding the created thread to the selected CPU logical cores;
S4: when the thread is scheduled, dispatching it to the bound CPU logical cores.
In some embodiments, the step of creating a thread and binding the created thread to the selected CPU logical cores in step S3 comprises:
S31: writing the selected logical cores into a configuration file;
S32: creating the thread, parsing the configuration file when the thread is created, and reading the binding switch and binding parameters from the configuration file;
S33: setting the hard affinity of the created thread to the configured logical cores, which realizes the binding. The created thread is thus bound to designated CPU logical cores. The logical cores selected in step S2 are written into the configuration file of librbd, specifically as follows:
Bind_mask_flag = true (or false)
Bind_mask0 = 0xF00 (logical cores 0 to 63)
Bind_mask1 = 0x0 (logical cores 64 to 127)
Bind_mask2 = 0x0 (logical cores 128 to 191)
Bind_mask3 = 0x0 (logical cores 192 to 255)
Each mask is a 64-bit unsigned number in which each bit represents one logical core ID. Example: decimal 3840 = hexadecimal 0xF00 = binary 111100000000; bits 8 to 11 are 1, so logical cores 8 to 11 are bound.
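As a sketch of how such a 64-bit bind mask maps to logical core IDs (the helper function below is illustrative and is not librbd's actual parser; only the bit-to-core convention comes from the description):

```python
def mask_to_cores(mask, base=0):
    """Return the logical-core IDs selected by a 64-bit bind mask.

    Each set bit i in `mask` selects logical core base + i, so
    Bind_mask0 covers cores 0-63, Bind_mask1 (base=64) covers 64-127, etc.
    """
    return [base + i for i in range(64) if mask >> i & 1]

# Example from the description: decimal 3840 = 0xF00 = binary 111100000000,
# bits 8-11 are set, so logical cores 8-11 are bound.
print(mask_to_cores(0xF00))         # [8, 9, 10, 11]
print(mask_to_cores(0x1, base=64))  # [64]
```

The same convention extends to Bind_mask2 and Bind_mask3 with base 128 and 192.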
In step S1, the step of querying the configuration of the physical server's CPU comprises:
acquiring the number of NUMA nodes, the number of CPUs per node (i.e., the number of physical cores), and the number of logical cores per CPU.
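On Linux, one way to obtain these counts is to parse the machine-readable output of `lscpu -p=CPU,CORE,SOCKET,NODE`. The sketch below is an illustrative assumption; the patent does not specify how the query is implemented, and the sample topology text is invented for the example:

```python
def parse_topology(lscpu_p):
    """Count (NUMA nodes, physical cores, logical cores) from
    `lscpu -p=CPU,CORE,SOCKET,NODE` style output."""
    nodes, phys_cores, logical = set(), set(), set()
    for line in lscpu_p.splitlines():
        if not line or line.startswith("#"):  # skip comments/blank lines
            continue
        cpu, core, socket, node = (int(x) for x in line.split(","))
        logical.add(cpu)
        phys_cores.add((socket, core))  # core IDs are unique per socket
        nodes.add(node)
    return len(nodes), len(phys_cores), len(logical)

# Illustrative 2-node machine: 4 physical cores, hyper-threading on
# (each physical core presents two logical cores, 8 in total).
SAMPLE = """# CPU,CORE,SOCKET,NODE
0,0,0,0
1,1,0,0
2,2,1,1
3,3,1,1
4,0,0,0
5,1,0,0
6,2,1,1
7,3,1,1"""
print(parse_topology(SAMPLE))  # (2, 4, 8)
```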
In some embodiments, the step of selecting the logical cores to bind according to the CPU configuration in step S2 comprises:
S21: judging according to the CPU configuration;
S22: if there are multiple NUMA nodes, keeping the CPUs hosting the bound logical cores within the same NUMA node;
S23: if there are multiple physical cores, judging whether hyper-threading is enabled; if not, executing step S24, and if so, executing step S25;
S24: binding adjacent logical cores;
S25: choosing bound logical cores distributed on the same physical core. When the operating system schedules the threads, it dispatches them to the designated CPU logical cores according to the binding policy, which reduces thread migration overhead and improves the performance of the storage system. It should be noted that what is selected here are the logical cores that librbd binds. If there are multiple NUMA nodes, the CPUs hosting the bound logical cores are kept within the same NUMA node. If there are multiple physical cores and hyper-threading is not enabled, each physical core is exactly one logical core, and adjacent logical cores are bound. If the physical cores have hyper-threading enabled, the bound logical cores are distributed on the same physical core; note that with hyper-threading enabled, each physical core presents two logical cores.
In step S4, it should be noted that when QEMU calls librbd to create an I/O thread, librbd parses the configuration file, reads the core-binding switch and core-binding parameters, and sets the hard affinity of the created thread to the configured logical cores. This reduces the cost of migrating threads among logical cores and improves the performance of the storage system.
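Setting hard affinity from user space can be sketched with Linux's sched_setaffinity call, exposed in Python as `os.sched_setaffinity`. This is an illustrative equivalent only: librbd itself is C++ and a real implementation would use something like `pthread_setaffinity_np` on the created thread, and the helper below is not part of the patent:

```python
import os

def bind_current_thread(cores):
    """Pin the calling thread to the given logical cores (hard affinity).

    Only cores the process is already permitted to run on can be used;
    pid 0 means "the calling thread" to the Linux kernel.
    """
    allowed = os.sched_getaffinity(0)
    target = set(cores) & allowed
    if not target:
        raise ValueError("none of %r is in the allowed set %r" % (cores, allowed))
    os.sched_setaffinity(0, target)  # Linux-only hard-affinity call
    return os.sched_getaffinity(0)

# Bind to one allowed core, then restore the original mask.
original = os.sched_getaffinity(0)
one = min(original)
print(bind_current_thread([one]) == {one})  # True
os.sched_setaffinity(0, original)
```

Once the affinity mask is set this way, the scheduler will only dispatch the thread to the bound cores, which is exactly the behavior step S4 relies on.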
As shown in fig. 2, the technical solution of the present invention provides a device for improving the performance of a storage system, comprising a query module, a selection module, a binding module and a processing module;
the query module is used for querying the configuration of the CPU of the physical server;
the selection module is used for selecting the bound logic core according to the configuration of the CPU;
the binding module is used for creating a thread and binding the created thread with the selected logic core of the CPU;
and the processing module is used for distributing the threads to the bound CPU logic cores when the threads are dispatched.
In some embodiments, the binding module includes a writing unit, a parsing reading unit, and a binding unit;
the write-in unit is used for writing the selected logic core into a configuration file;
the analysis reading unit is used for creating a thread, analyzing a configuration file when the thread is created, and reading a binding switch and binding parameters in the configuration file;
and the binding unit is used for setting the hard affinity of the created thread as the configured logic core to realize the binding of the logic core.
In some embodiments, the query module includes a node number obtaining unit, a physical core number obtaining unit, and a logical core number obtaining unit;
a node number obtaining unit for obtaining the number of NUMA nodes;
a physical core number obtaining unit configured to obtain the number of CPUs, i.e., the number of physical cores, of each node;
and the logic core number acquisition unit is used for acquiring the number of the logic cores of each CPU.
In some embodiments, the selection module comprises a judging unit and a selecting unit;
the judging unit is configured to judge according to the CPU configuration, and further, when there are multiple physical cores, to judge whether hyper-threading is enabled;
the selecting unit is configured to keep the CPUs hosting the bound logical cores within the same NUMA node when the judging unit determines that there are multiple NUMA nodes; to select and bind adjacent logical cores when the judging unit determines that hyper-threading is not enabled; and to select bound logical cores distributed on the same physical core when the judging unit determines that hyper-threading is enabled.
As shown in fig. 3, an embodiment of the present invention provides an electronic device, which may include a processor, a communication interface, a memory and a bus, where the processor, the communication interface and the memory communicate with each other via the bus. The bus may be used for information transfer between the electronic device and a sensor. The processor may call logic instructions in the memory to perform the following method: S1: querying the CPU configuration of a physical server; S2: selecting the logical cores to bind according to the CPU configuration; S3: creating a thread, and binding the created thread to the selected CPU logical cores; S4: when the thread is scheduled, dispatching it to the bound CPU logical cores.
In addition, the logic instructions in the memory may be implemented in the form of software functional units and may be stored in a computer readable storage medium when sold or used as a stand-alone product. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
Embodiments of the present invention also provide a computer program product comprising a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions that, when executed by a computer, cause the computer to perform the method of the above method embodiments, for example, comprising: s1: inquiring the configuration of a CPU of a physical server; s2: selecting a bound logic core according to the configuration of the CPU; s3: creating a thread, and binding the created thread with a selected logic core of the CPU; s4: and when the thread is dispatched, the thread is distributed to the bound CPU logic core.
Although the present invention has been described in detail with reference to the drawings and in connection with the preferred embodiments, the present invention is not limited thereto. Those skilled in the art can make various equivalent modifications or substitutions to the embodiments of the present invention without departing from its spirit and scope, and such modifications or substitutions, which a person skilled in the art could easily conceive within the technical scope disclosed herein, fall within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the appended claims.

Claims (10)

1. A method for improving the performance of a storage system, comprising the following steps:
querying the CPU configuration of a physical server;
selecting the logical cores to bind according to the CPU configuration;
creating a thread, and binding the created thread to the selected CPU logical cores;
and, when the thread is scheduled, dispatching it to the bound CPU logical cores.
2. The method for improving the performance of a storage system according to claim 1, wherein the step of creating a thread and binding the created thread to the selected CPU logical cores comprises:
writing the selected logical cores into a configuration file;
creating the thread, parsing the configuration file when the thread is created, and reading the binding switch and binding parameters from the configuration file;
setting the hard affinity of the created thread to the configured logical cores, which realizes the binding.
3. The method for improving the performance of a storage system according to claim 1, wherein the step of querying the configuration of the physical server's CPU comprises:
acquiring the number of NUMA nodes, the number of CPUs per node (i.e., the number of physical cores), and the number of logical cores per CPU.
4. The method for improving the performance of a storage system according to claim 3, wherein the step of selecting the logical cores to bind according to the CPU configuration comprises:
judging according to the CPU configuration;
if there are multiple NUMA nodes, keeping the CPUs hosting the bound logical cores within the same NUMA node;
if there are multiple physical cores, judging whether hyper-threading is enabled; if not, binding adjacent logical cores; if so, choosing bound logical cores distributed on the same physical core.
5. A device for improving the performance of a storage system is characterized by comprising an inquiry module, a selection module, a binding module and a processing module;
the query module is used for querying the configuration of the CPU of the physical server;
the selection module is used for selecting the bound logic core according to the configuration of the CPU;
the binding module is used for creating a thread and binding the created thread with the selected logic core of the CPU;
and the processing module is used for distributing the threads to the bound CPU logic cores when the threads are dispatched.
6. The apparatus for improving the performance of a storage system according to claim 5, wherein the binding module includes a writing unit, a parsing reading unit, and a binding unit;
the write-in unit is used for writing the selected logic core into a configuration file;
the analysis reading unit is used for creating a thread, analyzing a configuration file when the thread is created, and reading a binding switch and binding parameters in the configuration file;
and the binding unit is used for setting the hard affinity of the created thread as the configured logic core to realize the binding of the logic core.
7. The apparatus for improving performance of a storage system according to claim 6, wherein the query module includes a node number obtaining unit, a physical core number obtaining unit, and a logical core number obtaining unit;
a node number obtaining unit for obtaining the number of NUMA nodes;
a physical core number obtaining unit configured to obtain the number of CPUs, i.e., the number of physical cores, of each node;
and the logic core number acquisition unit is used for acquiring the number of the logic cores of each CPU.
8. The apparatus for improving the performance of a storage system according to claim 7, wherein the selection module comprises a judging unit and a selecting unit;
the judging unit is configured to judge according to the CPU configuration, and further, when there are multiple physical cores, to judge whether hyper-threading is enabled;
the selecting unit is configured to keep the CPUs hosting the bound logical cores within the same NUMA node when the judging unit determines that there are multiple NUMA nodes; to select and bind adjacent logical cores when the judging unit determines that hyper-threading is not enabled; and to select bound logical cores distributed on the same physical core when the judging unit determines that hyper-threading is enabled.
9. An electronic device is characterized by comprising a memory and a processor, wherein the memory and the processor are communicated with each other through a bus; the memory stores program instructions executable by the processor, the processor calling the program instructions to perform the method of improving performance of a storage system according to any one of claims 1 to 4.
10. A computer program product, characterized in that the computer program product comprises a computer program stored on a non-transitory computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to carry out the method of improving the performance of a storage system according to any one of claims 1 to 4.
CN202011109038.4A 2020-10-16 2020-10-16 Method, device, equipment and product for improving performance of storage system Withdrawn CN112231102A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011109038.4A CN112231102A (en) 2020-10-16 2020-10-16 Method, device, equipment and product for improving performance of storage system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011109038.4A CN112231102A (en) 2020-10-16 2020-10-16 Method, device, equipment and product for improving performance of storage system

Publications (1)

Publication Number Publication Date
CN112231102A true CN112231102A (en) 2021-01-15

Family

ID=74117675

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011109038.4A Withdrawn CN112231102A (en) 2020-10-16 2020-10-16 Method, device, equipment and product for improving performance of storage system

Country Status (1)

Country Link
CN (1) CN112231102A (en)


Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112860530A (en) * 2021-01-27 2021-05-28 中山大学 Method for improving parallelization NumPy calculation performance by utilizing non-uniform memory access architecture characteristics
CN112860530B (en) * 2021-01-27 2022-09-27 中山大学 Method for improving parallelization NumPy calculation performance by utilizing non-uniform memory access architecture characteristics
CN113176950A (en) * 2021-04-09 2021-07-27 杭州迪普科技股份有限公司 Message processing method, device, equipment and computer readable storage medium
CN113176950B (en) * 2021-04-09 2023-10-27 杭州迪普科技股份有限公司 Message processing method, device, equipment and computer readable storage medium
CN113672373A (en) * 2021-08-30 2021-11-19 浙江大华技术股份有限公司 Thread binding method and device and electronic equipment
CN113821174A (en) * 2021-09-26 2021-12-21 迈普通信技术股份有限公司 Storage processing method, device, network card equipment and storage medium
CN113821174B (en) * 2021-09-26 2024-03-22 迈普通信技术股份有限公司 Storage processing method, storage processing device, network card equipment and storage medium
CN117971441A (en) * 2024-04-01 2024-05-03 之江实验室 High-performance thread model implementation method and device for all-in-one machine
CN117971441B (en) * 2024-04-01 2024-06-11 之江实验室 High-performance thread model implementation method and device for all-in-one machine

Similar Documents

Publication Publication Date Title
CN112231102A (en) Method, device, equipment and product for improving performance of storage system
CN113243005B (en) Performance-based hardware emulation in an on-demand network code execution system
US10373284B2 (en) Capacity reservation for virtualized graphics processing
US10416996B1 (en) System and method for translating affliction programming interfaces for cloud platforms
US9026630B2 (en) Managing resources in a distributed system using dynamic clusters
US10649790B1 (en) Multithreaded rendering for virtualized graphics processing
US9807152B2 (en) Distributed processing device and distributed processing system as well as distributed processing method
US10362097B1 (en) Processing an operation with a plurality of processing steps
WO2021022964A1 (en) Task processing method, device, and computer-readable storage medium based on multi-core system
AU2021104528A4 (en) Task scheduling and load balancing in cloud computing using firefly algorithm
CN114385351A (en) Cloud management platform load balancing performance optimization method, device, equipment and medium
Mengistu et al. Scalability in distributed multi-agent based simulations: The JADE case
Markthub et al. Using rCUDA to reduce GPU resource-assignment fragmentation caused by job scheduler
Liu et al. Scheduling parallel jobs using migration and consolidation in the cloud
CN116795492A (en) Resource scheduling method, device and equipment of cloud platform and readable storage medium
CN112912849B (en) Graph data-based calculation operation scheduling method, system, computer readable medium and equipment
CN116302307A (en) Multi-virtual machine migration method, device, equipment and medium
CN113821174B (en) Storage processing method, storage processing device, network card equipment and storage medium
Satoh 5G-enabled edge computing for MapReduce-based data pre-processing
CN114610485A (en) Resource processing system and method
CN114253709A (en) Load scheduling method and system
CN112764837A (en) Data reporting method, device, storage medium and terminal
CN112463748A (en) Storage system file lock identification method, system, terminal and storage medium
KR101542605B1 (en) Parallel processing apparatus and processing apparatus for semantic heterogeneity of ontology matching
CN113282405B (en) Load adjustment optimization method and terminal

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 2021-01-15