CN116954721B - Asynchronous non-blocking splitting method for multi-modal operator of actuator

Publication number
CN116954721B
Authority
CN
China
Legal status
Active
Application number
CN202311211785.2A
Other languages
Chinese (zh)
Other versions
CN116954721A (en)
Inventor
柳婉静
王东江
Current Assignee
Tianjin Nankai University General Data Technologies Co ltd
Original Assignee
Tianjin Nankai University General Data Technologies Co ltd
Application filed by Tianjin Nankai University General Data Technologies Co ltd
Priority to CN202311211785.2A
Publication of CN116954721A
Application granted
Publication of CN116954721B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30003 Arrangements for executing specific machine instructions
    • G06F9/30007 Arrangements for executing specific machine instructions to perform operations on data operands
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/30094 Condition code generation, e.g. Carry, Zero flag
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3836 Instruction issuing, e.g. dynamic instruction scheduling or out of order instruction execution
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The application provides an asynchronous non-blocking splitting method for multi-modal operators of an executor, belonging to the technical field of computers. The method comprises: dynamically adjusting the splitting strategy of operators by an operator splitting method, and splitting any operator whose execution time exceeds a standard value into sub-operators with shorter execution times; using a loading strategy to comprehensively judge the priorities of the operators and of the sub-operators formed by splitting, and loading first those with higher execution priority; and then evaluating the relations between the operators and sub-operators and marking those that can be executed simultaneously, for use in parallel execution. The beneficial effects of the application are: by dynamically splitting the operators of the executor in an asynchronous non-blocking manner, combined with an intelligent loading strategy and multithreading, multi-modal operators are executed simultaneously, performance bottlenecks caused by differing operator execution times are avoided, and the overall execution efficiency of the device is improved. The technology has broad application and market prospects.

Description

Asynchronous non-blocking splitting method for multi-modal operator of actuator
Technical Field
The application belongs to the field of computers, and in particular relates to an asynchronous non-blocking splitting method for multi-modal operators of an executor.
Background
In modern computer systems, an executor is an important component for carrying out various tasks. Within an executor, operators are the common computational units that perform the individual computing operations, and different computing operations are typically implemented by multi-modal operators. However, because operators differ in execution time, a long-running operator can become a performance bottleneck for the executor, which degrades the response speed and efficiency of the system.
Disclosure of Invention
In view of the above, the present application aims to provide an asynchronous non-blocking splitting method for multi-modal operators of an executor, in which the operators of the executor are split in an asynchronous non-blocking manner so that multi-modal operators can be executed simultaneously, thereby improving the execution efficiency of the execution engine.
The asynchronous non-blocking splitting method splits a multi-modal operator into a plurality of sub-operators and executes the sub-operators in parallel, which avoids performance bottlenecks caused by differing operator execution times and thereby improves the execution efficiency of the execution engine. In addition, when a plurality of sub-operators are executed in parallel, an intelligent loading strategy loads the small operators with higher execution priority first, so that loading does not degrade the execution efficiency of the execution engine. The method also uses multithreading to further increase execution speed.
In order to achieve the above purpose, the technical scheme of the application is realized as follows:
An asynchronous non-blocking splitting method for multi-modal operators of an executor, comprising:
adopting an operator splitting method to dynamically adjust the splitting strategy of operators, and splitting any operator whose execution time exceeds a standard value into sub-operators with reasonable execution times, so that no large, long-running operator remains, wherein the standard value can be preset as required;
adopting a loading strategy to comprehensively judge the priorities of the operators and of the sub-operators obtained by splitting, and loading first the operators and sub-operators with higher execution priority; evaluating the relations between the operators and sub-operators, and marking the operators and sub-operators that can be executed simultaneously, for use in parallel execution;
executing the operators and sub-operators in parallel using multithreading, wherein the operators and sub-operators that the loading strategy evaluated as simultaneously executable are executed in parallel, which reduces the overall execution time and improves efficiency;
executing the operators or sub-operators in an asynchronous non-blocking manner, which reduces the time spent blocked while waiting.
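As a minimal sketch of the "marking" step above, the following Python groups operators into waves whose members do not depend on one another and could therefore run simultaneously. The dependency map, the operator names, and the function `mark_parallel_groups` are hypothetical illustrations; the source does not specify how the relations between operators are represented or evaluated.

```python
def mark_parallel_groups(deps):
    """Group operators into waves; operators in one wave can run simultaneously.

    deps maps each operator name to the set of operators it must wait for.
    Assumption: every dependency also appears as a key of deps.
    """
    remaining = {op: set(d) for op, d in deps.items()}
    waves = []
    while remaining:
        # Operators with no unfinished dependency are marked as one parallel wave.
        ready = sorted(op for op, d in remaining.items() if not d)
        if not ready:
            raise ValueError("cyclic or unknown dependency between operators")
        waves.append(ready)
        for op in ready:
            del remaining[op]
        # The finished wave no longer blocks the operators that are left.
        for d in remaining.values():
            d.difference_update(ready)
    return waves
```

For example, two sub-operators split from one filter operator both depend only on the scan that feeds them, so they land in the same wave, while the aggregation that consumes them lands in the next.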
Further, adopting the operator splitting method to dynamically adjust the splitting strategy of operators comprises:
splitting the operator into a plurality of sub-operators at different quantity levels according to the comprehensive condition of the input data, wherein the calculation process of the operator is decomposed into a plurality of independent calculation steps, and each calculation step is treated as one sub-operator that can be executed independently.
How an operator is split is determined by the comprehensive condition of the input data, which includes the quantity, type, and features of the input data. One of the following methods can be adopted:
Quantity-based splitting: the operator is split into a plurality of sub-operators according to the amount of input data. For example, if the amount of input data is large, the data can be divided into sub-data sets and the operator split into multiple sub-operators, each processing one sub-data set in parallel. The number of split levels may be determined from the size of the data or from a preset threshold.
Type-based splitting: the operator is split into a plurality of sub-operators according to the type of input data. Different types of data may require different processing methods or algorithms. For example, if the input data contains both images and text, the operator may be split into two sub-operators that process the image data and the text data separately.
Feature-based splitting: the operator is split into a plurality of sub-operators according to the features of the input data. If certain features affect how the operator processes the data, the operator may be split into sub-operators based on those features, for example by value range, interval, or another attribute of the feature.
Comprehensive splitting: quantity, type, and features are considered together when splitting the operator into a plurality of sub-operators. A splitting strategy may combine several factors, for example splitting into groups by the amount of input data first, and then splitting further within each group by type or feature.
The specific splitting strategy depends on the application scenario and its requirements. A splitting method suited to the practical problem is chosen so that the input data is processed more efficiently and with better performance.
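The splitting methods above can be sketched in Python as follows. This is an illustration under stated assumptions, not the method's actual implementation: the record layout, the `type_of` accessor, and the function names are hypothetical, and each returned group stands for the input of one sub-operator.

```python
from collections import defaultdict


def split_by_quantity(records, chunk_size):
    """Quantity-based split: one sub-operator input per chunk of records."""
    return [records[i:i + chunk_size] for i in range(0, len(records), chunk_size)]


def split_by_type(records, type_of):
    """Type-based split: one sub-operator input per data type."""
    groups = defaultdict(list)
    for record in records:
        groups[type_of(record)].append(record)
    return dict(groups)


def split_comprehensive(records, type_of, chunk_size):
    """Comprehensive split: group by quantity first, then by type within each group."""
    return [split_by_type(chunk, type_of) for chunk in split_by_quantity(records, chunk_size)]
```

Feature-based splitting follows the same shape as `split_by_type`, with `type_of` replaced by a function that maps a record to a value range or interval of the chosen feature.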
Further, executing the operators in parallel using multithreading comprises:
managing a plurality of threads with a thread pool, so that a plurality of operators are executed in parallel and the performance and efficiency of the executor are improved.
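A minimal sketch of thread-pool execution, using Python's standard `concurrent.futures.ThreadPoolExecutor`. The helper name `run_parallel` and the representation of a sub-operator as a zero-argument callable are assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor


def run_parallel(sub_operators, max_workers=4):
    """Run independent sub-operators on a thread pool; results keep input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # submit() returns immediately, so all sub-operators are queued at once.
        futures = [pool.submit(op) for op in sub_operators]
        # result() blocks only here, after everything has been scheduled.
        return [future.result() for future in futures]
```

Note that in standard CPython builds, threads mainly speed up work that waits on I/O or releases the global interpreter lock; the source does not name an implementation language, so this caveat applies to the sketch only.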
Further, executing the operators or sub-operators in an asynchronous non-blocking manner comprises:
implementing the asynchronous non-blocking manner with asynchronous callback functions: when an operator or sub-operator finishes executing, a callback function is called to notify the executor that it has finished, and the executor then executes the next operator or sub-operator.
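The callback mechanism can be sketched with `Future.add_done_callback` from Python's standard library. The function `run_chain` and its return values are hypothetical; the sketch assumes the operators form a simple sequence and stops the chain at the first failure.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def run_chain(operators, pool):
    """Run operators one after another without blocking the caller.

    Each completion callback notifies the executor and submits the next
    operator. Returns (finished_event, results); results fills in as work ends.
    """
    results = []
    finished = threading.Event()
    pending = iter(operators)

    def submit_next():
        op = next(pending, None)
        if op is None:
            finished.set()  # nothing left to execute
            return
        pool.submit(op).add_done_callback(on_done)

    def on_done(future):
        error = future.exception()
        if error is not None:
            results.append(error)
            finished.set()  # stop the chain on failure
            return
        results.append(future.result())
        submit_next()  # the callback triggers the next operator

    submit_next()
    return finished, results
```

The caller gets control back as soon as the first operator is submitted and can wait on `finished` only when it actually needs the results. The pool must stay open until `finished` is set, because the callbacks submit further work to it.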
Furthermore, abnormal conditions (exceptions) that arise during execution of the algorithm are handled promptly, so that the algorithm executes correctly.
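One way to handle abnormal conditions during parallel execution is sketched below: each operator's exception is caught where its result is collected, so a failing operator does not stall the others. The source does not describe its exception-handling mechanism; `run_with_error_handling` and the name-to-callable mapping are assumptions.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_with_error_handling(named_ops, max_workers=4):
    """Run operators in parallel; collect results and failures separately."""
    results, failures = {}, {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        future_to_name = {pool.submit(op): name for name, op in named_ops.items()}
        for future in as_completed(future_to_name):
            name = future_to_name[future]
            try:
                results[name] = future.result()
            except Exception as exc:
                # Record the failure and keep collecting the remaining operators.
                failures[name] = exc
    return results, failures
```

The caller can then decide per operator whether to retry, skip, or abort, instead of losing all results to a single failure.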
Further, the splitting strategy includes: during execution by the executor, dynamically adjusting how operators are split according to the actual requirements of the model, so as to improve the performance and efficiency of the model, comprising the following steps:
initial splitting: when the executor starts to execute, each operator is set as a single layer, and the gradient of each operator is calculated by a back propagation algorithm;
dynamic monitoring: during execution, indicators such as the gradient magnitude and the computation time of each operator are monitored in real time to judge whether operator splitting is needed;
splitting decision: the monitored indicators are used to judge which operators need to be split to improve performance and efficiency; if the gradient of an operator is large or its computation time is long, splitting the operator into a plurality of sub-operators is considered, so as to make better use of hardware resources and accelerate computation;
splitting operation: according to the splitting decision, each operator to be split is split into a plurality of sub-operators and the corresponding parameters are initialized, and the forward propagation and back propagation algorithms of the model are updated to fit the new splitting result;
continuing training: after the splitting operation is finished, the task of the executor continues and the performance and efficiency indicators of the model are monitored in real time; if necessary, the splitting strategy of the operators continues to be adjusted during subsequent training, so as to further improve the performance and efficiency of the model.
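The monitoring and splitting-decision steps can be sketched as follows, using only the computation-time indicator (the gradient indicator is omitted). The names `timed` and `plan_splits`, the `max_parts` cap, and the rule of sizing the split by the ratio of elapsed time to the standard value are assumptions for illustration; the source does not state how the number of sub-operators is chosen.

```python
import math
import time


def timed(op, *args):
    """Run an operator and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = op(*args)
    return result, time.perf_counter() - start


def plan_splits(elapsed_by_op, threshold_s, max_parts=8):
    """Return {operator: number_of_sub_operators} for operators over the threshold."""
    plan = {}
    for name, elapsed in elapsed_by_op.items():
        if elapsed > threshold_s:
            # Aim for sub-operators that each take roughly threshold_s or less.
            plan[name] = min(max_parts, math.ceil(elapsed / threshold_s))
    return plan
```

Re-running `timed` on the resulting sub-operators and calling `plan_splits` again gives the continued adjustment described in the last step.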
Further, the loading strategy includes: automatically selecting the operators to be loaded for execution according to the priorities of the operators and the current system resource condition;
when the loading strategy is implemented, the operators to be loaded first are determined by comprehensively judging operator priority: operator priority is divided into a plurality of levels, each level is assigned a weight, and at loading time the operators to load are selected according to the current system resource condition and the priority weights of the operators.
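A minimal sketch of such a loading strategy, assuming three priority levels, a single abstract resource budget, and a greedy selection rule. The level names, the weights in `LEVEL_WEIGHTS`, the `cost` field, and `select_for_loading` are all hypothetical; the source only states that levels carry weights and that resources are taken into account.

```python
from dataclasses import dataclass

# Hypothetical priority levels and their weights.
LEVEL_WEIGHTS = {"high": 3.0, "normal": 2.0, "low": 1.0}


@dataclass
class PendingOp:
    name: str
    level: str  # one of the keys of LEVEL_WEIGHTS
    cost: int   # resource units the operator needs (hypothetical measure)


def select_for_loading(pending, free_units):
    """Greedily load the highest-weight operators that fit the free resources."""
    chosen = []
    # sorted() is stable, so operators on the same level keep their arrival order.
    for op in sorted(pending, key=lambda o: -LEVEL_WEIGHTS[o.level]):
        if op.cost <= free_units:
            chosen.append(op.name)
            free_units -= op.cost
    return chosen
```

With six free units, a high-priority operator costing three and another costing one are loaded first; a normal-priority operator costing four no longer fits, so a cheaper low-priority one takes the remaining capacity.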
Further, the present solution discloses an electronic device comprising a processor and a memory communicatively connected to the processor, the memory storing instructions executable by the processor, and the processor being configured to execute the above asynchronous non-blocking splitting method for multi-modal operators of an executor.
Further, the present solution discloses a server comprising at least one processor and a memory communicatively connected to the processor, the memory storing instructions executable by the at least one processor, the instructions being executed by the processor to cause the at least one processor to execute the above asynchronous non-blocking splitting method for multi-modal operators of an executor.
Further, the present solution discloses a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above asynchronous non-blocking splitting method for multi-modal operators of an executor.
Compared with the prior art, the asynchronous non-blocking splitting method for multi-modal operators of an executor has the following beneficial effects:
(1) the method dynamically splits the operators of the executor in an asynchronous non-blocking manner and combines this with an intelligent loading strategy and multithreading, so that multi-modal operators are executed simultaneously, performance bottlenecks caused by differing operator execution times are avoided, and the overall execution efficiency of the device is improved. The technology has broad application and market prospects;
(2) the method enables simultaneous execution of multi-modal operators, raises the overall utilization of system resources, improves the overall execution efficiency of the execution engine, and makes the execution engine more intelligent. The technology can be widely applied to various intelligent devices, including but not limited to robots, smart homes, and intelligent manufacturing.
Detailed Description
It should be noted that, without conflict, the embodiments of the present application and features of the embodiments may be combined with each other.
The present application will be described in detail with reference to examples.
An asynchronous non-blocking splitting method for multi-modal operators of an executor, comprising:
adopting an operator splitting method to dynamically adjust the splitting strategy of operators;
adopting a loading strategy to comprehensively judge the priorities of operators, and loading first the operators with higher execution priority;
executing operators in parallel using multithreading;
executing the operators or sub-operators in an asynchronous non-blocking manner.
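Taken together, the steps above can be illustrated end to end with Python's `asyncio` and a thread pool. This is a sketch under stated assumptions rather than the embodiment itself: an operator is modeled as a function over a list of records, splitting is purely quantity-based, and `split`, `execute`, and `combine` are hypothetical names.

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor


def split(records, parts):
    """Quantity-based split of the input into at most `parts` chunks."""
    size = max(1, -(-len(records) // parts))  # ceiling division
    return [records[i:i + size] for i in range(0, len(records), size)]


async def execute(records, op, combine, parts=4):
    """Split the input, run one sub-operator per chunk on threads, merge results."""
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=parts) as pool:
        # run_in_executor returns awaitables, so the event loop is never blocked.
        tasks = [loop.run_in_executor(pool, op, chunk) for chunk in split(records, parts)]
        partials = await asyncio.gather(*tasks)
    return combine(partials)
```

For instance, `asyncio.run(execute(list(range(10)), sum, sum, parts=3))` sums three chunks on separate threads and then sums the partial results.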
Adopting the operator splitting method to dynamically adjust the splitting strategy of operators comprises:
splitting the operator into a plurality of sub-operators at different quantity levels according to the comprehensive condition of the input data, wherein the calculation process of the operator is decomposed into a plurality of independent calculation steps, and each calculation step is treated as one sub-operator that can be executed independently.
Executing operators in parallel using multithreading comprises:
managing a plurality of threads with a thread pool, so that a plurality of operators are executed in parallel and the performance and efficiency of the executor are improved.
Executing the operators or sub-operators in an asynchronous non-blocking manner comprises:
implementing the asynchronous non-blocking manner with asynchronous callback functions: when an operator or sub-operator finishes executing, a callback function is called to notify the executor that it has finished, and the executor then executes the next operator or sub-operator.
Abnormal conditions (exceptions) that arise during execution of the algorithm are handled promptly, so that the algorithm executes correctly.
The splitting strategy comprises: during execution by the executor, dynamically adjusting how operators are split according to the actual requirements of the model, so as to improve the performance and efficiency of the model, comprising the following steps:
initial splitting: when the executor starts to execute, each operator is set as a single layer, and the gradient of each operator is calculated by a back propagation algorithm;
dynamic monitoring: during execution, indicators such as the gradient magnitude and the computation time of each operator are monitored in real time to judge whether operator splitting is needed;
splitting decision: the monitored indicators are used to judge which operators need to be split to improve performance and efficiency; if the gradient of an operator is large or its computation time is long, splitting the operator into a plurality of sub-operators is considered, so as to make better use of hardware resources and accelerate computation;
splitting operation: according to the splitting decision, each operator to be split is split into a plurality of sub-operators and the corresponding parameters are initialized, and the forward propagation and back propagation algorithms of the model are updated to fit the new splitting result;
continuing training: after the splitting operation is finished, the task of the executor continues and the performance and efficiency indicators of the model are monitored in real time; if necessary, the splitting strategy of the operators continues to be adjusted during subsequent training, so as to further improve the performance and efficiency of the model.
The loading strategy comprises: automatically selecting the operators to be loaded for execution according to the priorities of the operators and the current system resource condition;
when the loading strategy is implemented, the operators to be loaded first are determined by comprehensively judging operator priority: operator priority is divided into a plurality of levels, each level is assigned a weight, and at loading time the operators to load are selected according to the current system resource condition and the priority weights of the operators.
The present solution also discloses an electronic device comprising a processor and a memory communicatively connected to the processor, the memory storing instructions executable by the processor, and the processor being configured to execute the above asynchronous non-blocking splitting method for multi-modal operators of an executor.
The present solution also discloses a server comprising at least one processor and a memory communicatively connected to the processor, the memory storing instructions executable by the at least one processor, the instructions being executed by the processor to cause the at least one processor to execute the above asynchronous non-blocking splitting method for multi-modal operators of an executor.
The present solution also discloses a computer-readable storage medium storing a computer program which, when executed by a processor, implements the above asynchronous non-blocking splitting method for multi-modal operators of an executor.
Those of ordinary skill in the art will appreciate that the elements and method steps of each example described in connection with the embodiments disclosed herein may be implemented as electronic hardware, computer software, or combinations of both, and that the elements and steps of each example have been described generally in terms of functionality in the foregoing description to clearly illustrate this interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
In the several embodiments provided in the present application, it should be understood that the disclosed methods and systems may be implemented in other ways. For example, the above-described division of units is merely a logical function division, and there may be another division manner when actually implemented, for example, a plurality of units or components may be combined or may be integrated into another system, or some features may be omitted or not performed. The units may or may not be physically separate, and components shown as units may or may not be physical units, may be located in one place, or may be distributed over a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the embodiment of the present application.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present application, and not for limiting the same; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some or all of the technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit of the application, and are intended to be included within the scope of the appended claims and description.
The foregoing description of the preferred embodiments of the application is not intended to be limiting, but rather is intended to cover all modifications, equivalents, alternatives, and improvements that fall within the spirit and scope of the application.

Claims (10)

1. An asynchronous non-blocking splitting method for multi-modal operators of an executor, comprising:
firstly, dynamically adjusting the splitting strategy of operators by an operator splitting method, and splitting any operator whose execution time exceeds a standard value to form sub-operators with shorter execution times;
secondly, comprehensively judging, by a loading strategy, the priorities of the operators and of the sub-operators formed by splitting, loading first the operators and sub-operators with higher execution priority, evaluating the relations between the operators and sub-operators, and marking the operators and sub-operators that can be executed simultaneously, for use in parallel execution;
and finally, executing the operators and sub-operators in parallel using multithreading, wherein the operators and sub-operators that the loading strategy evaluated as simultaneously executable are executed in parallel, so as to reduce the overall execution time and improve efficiency.
2. The asynchronous non-blocking splitting method for multi-modal operators of an executor according to claim 1, wherein dynamically adjusting the splitting strategy of operators by the operator splitting method comprises:
splitting the operator into a plurality of sub-operators at different quantity levels according to the comprehensive condition of the input data, wherein the calculation process of the operator is decomposed into a plurality of independent calculation steps, and each calculation step is treated as one sub-operator that can be executed independently.
3. The asynchronous non-blocking splitting method for multi-modal operators of an executor according to claim 1, wherein executing the operators in parallel using multithreading comprises:
managing a plurality of threads with a thread pool, so that a plurality of operators are executed in parallel and the performance and efficiency of the executor are improved.
4. The method of claim 1, further comprising executing the operators or sub-operators in an asynchronous non-blocking manner, comprising:
implementing the asynchronous non-blocking manner with asynchronous callback functions, wherein when an operator or sub-operator finishes executing, a callback function is called to notify the executor that it has finished, and the next operator or sub-operator is executed.
5. The asynchronous non-blocking splitting method for multi-modal operators of an executor according to claim 1, wherein: abnormal conditions arising during execution of the algorithm are handled promptly, so that the algorithm executes correctly.
6. The method of claim 1, wherein the splitting strategy comprises: during execution by the executor, dynamically adjusting how operators are split according to the actual requirements of the model, so as to improve the performance and efficiency of the model, comprising the following steps:
initial splitting: when the executor starts to execute, each operator is set as a single layer, and the gradient of each operator is calculated by a back propagation algorithm;
dynamic monitoring: during execution, indicators such as the gradient magnitude and the computation time of each operator are monitored in real time to judge whether operator splitting is needed;
splitting decision: the monitored indicators are used to judge which operators need to be split to improve performance and efficiency; if the gradient of an operator is large or its computation time is long, splitting the operator into a plurality of sub-operators is considered, so as to make better use of hardware resources and accelerate computation;
splitting operation: according to the splitting decision, each operator to be split is split into a plurality of sub-operators and the corresponding parameters are initialized, and the forward propagation and back propagation algorithms of the model are updated to fit the new splitting result;
continuing training: after the splitting operation is finished, the task of the executor continues and the performance and efficiency indicators of the model are monitored in real time; if necessary, the splitting strategy of the operators continues to be adjusted during subsequent training, so as to further improve the performance and efficiency of the model.
7. The asynchronous non-blocking splitting method for multi-modal operators of an executor according to claim 1, wherein the loading strategy comprises: automatically selecting the operators to be loaded for execution according to the priorities of the operators and the current system resource condition;
when the loading strategy is implemented, the operators to be loaded first are determined by comprehensively judging operator priority: operator priority is divided into a plurality of levels, each level is assigned a weight, and at loading time the operators to load are selected according to the current system resource condition and the priority weights of the operators.
8. An electronic device comprising a processor and a memory communicatively coupled to the processor for storing processor-executable instructions, characterized in that: the processor is configured to perform the asynchronous non-blocking splitting method for multi-modal operators of an executor according to any one of claims 1-7.
9. A server, characterized by: comprising at least one processor and a memory communicatively coupled to the processor, the memory storing instructions executable by the at least one processor to cause the at least one processor to perform the asynchronous non-blocking splitting method for multi-modal operators of an executor according to any one of claims 1-7.
10. A computer-readable storage medium storing a computer program, characterized in that: the computer program, when executed by a processor, implements the asynchronous non-blocking splitting method for multi-modal operators of an executor according to any one of claims 1-7.
CN202311211785.2A 2023-09-20 2023-09-20 Asynchronous non-blocking splitting method for multi-modal operator of actuator Active CN116954721B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311211785.2A CN116954721B (en) 2023-09-20 2023-09-20 Asynchronous non-blocking splitting method for multi-modal operator of actuator

Publications (2)

Publication Number Publication Date
CN116954721A 2023-10-27
CN116954721B 2023-12-15

Family

ID=88462427

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311211785.2A Active CN116954721B (en) 2023-09-20 2023-09-20 Asynchronous non-blocking splitting method for multi-modal operator of actuator

Country Status (1)

Country Link
CN (1) CN116954721B (en)

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103955801A (en) * 2014-05-15 2014-07-30 华北电力大学 Electric power system distributed parallel computing management method based on time-space dimension
CN104239555A (en) * 2014-09-25 2014-12-24 天津神舟通用数据技术有限公司 MPP (massively parallel processing)-based parallel data mining framework and MPP-based parallel data mining method
CN106155635A (en) * 2015-04-03 2016-11-23 北京奇虎科技有限公司 Data processing method and device
CN110069527A (en) * 2019-04-22 2019-07-30 电子科技大学 Database-oriented GPU and CPU heterogeneous acceleration method
CN113821208A (en) * 2021-06-18 2021-12-21 清华大学 Compiling optimization method and system for deep learning operator
CN113934410A (en) * 2021-10-19 2022-01-14 北京航空航天大学 Multi-hardware target depth model optimization deployment framework supporting custom operators
CN114217944A (en) * 2021-04-26 2022-03-22 无锡江南计算技术研究所 Dynamic load balancing method for neural network aiming at model parallelism
CN115794874A (en) * 2022-11-16 2023-03-14 华东师范大学 Method for accelerating GPU operator execution in heterogeneous database system and application
CN115827234A (en) * 2022-12-09 2023-03-21 武汉光网信息技术有限公司 Operator scheduling method and device for multi-model training task
CN115904539A (en) * 2022-11-29 2023-04-04 上海燧原科技有限公司 Online generation method, device and equipment of segmentation strategy and storage medium
WO2023071643A1 (en) * 2021-10-29 2023-05-04 华为技术有限公司 Method and apparatus for processing task, electronic device, and medium
CN116089895A (en) * 2021-10-30 2023-05-09 华为技术有限公司 Operator fusion method and device
WO2023093724A1 (en) * 2021-11-24 2023-06-01 华为技术有限公司 Neural network model processing method and device
WO2023134453A1 (en) * 2022-01-17 2023-07-20 华为技术有限公司 Operator processing method and computer device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110633153A (en) * 2019-09-24 2019-12-31 上海寒武纪信息科技有限公司 Method for realizing neural network model splitting by using multi-core processor and related product

Also Published As

Publication number Publication date
CN116954721A (en) 2023-10-27

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant