CN116107728B - Task execution method and device, storage medium and electronic equipment - Google Patents


Info

Publication number: CN116107728B
Application number: CN202310390935.4A
Authority: CN (China)
Prior art keywords: instruction, calculation, executing, target, computing
Legal status: Active
Other languages: Chinese (zh)
Other versions: CN116107728A (en)
Inventors: 王宏升, 陈�光
Current Assignee: Zhejiang Lab
Original Assignee: Zhejiang Lab
Application filed by Zhejiang Lab; priority to CN202310390935.4A; publication of CN116107728A; application granted; publication of CN116107728B.

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F 9/00: Arrangements for program control, e.g. control units
    • G06F 9/06: Arrangements for program control using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F 9/46: Multiprogramming arrangements
    • G06F 9/48: Program initiating; program switching, e.g. by interrupt
    • G06F 9/4806: Task transfer initiation or dispatching
    • G06F 9/4843: Task transfer initiation or dispatching by program, e.g. task dispatcher, supervisor, operating system
    • G06F 9/4881: Scheduling strategies for dispatcher, e.g. round robin, multi-level priority queues
    • Y: GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02: TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D: CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D 10/00: Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The specification discloses a task execution method and device, a storage medium, and an electronic device. The method comprises: obtaining model data of a target model; parsing the model data to determine the instruction types and instruction objects involved in executing a computing task for the target model, and generating each computing instruction based on them; for each computing instruction, determining, among preset computing units, at least one target unit for executing the instruction according to the allocation information of its corresponding instruction object, and generating a deduction instruction corresponding to the instruction; generating each physical instruction based on each computing instruction and its deduction instruction; and sending each physical instruction to the target unit that executes the corresponding computing instruction, so as to carry out the computing task for the target model.

Description

Task execution method and device, storage medium and electronic equipment
Technical Field
The present disclosure relates to the field of computer technologies, and in particular, to a task execution method, a task execution device, a storage medium, and an electronic device.
Background
With the development of technology, the number of parameters in artificial intelligence models has grown rapidly, and so has the demand for computing units such as central processing units (CPUs) and graphics processing units (GPUs). With the advent of distributed computing systems, multiple computing units can be invoked to jointly execute a computing task, thereby meeting this demand.
However, current distributed computing systems execute inefficiently when invoking computing units to perform computing tasks through computing instructions, which severely affects the use and deployment of models.
Therefore, how to improve the execution efficiency of computing tasks is an urgent problem to be solved.
Disclosure of Invention
The present disclosure provides a task execution method, a task execution device, a storage medium, and an electronic device, so as to partially solve the foregoing problems in the prior art.
The technical scheme adopted in the specification is as follows:
the specification provides a task execution method, which comprises the following steps:
obtaining model data of a target model;
parsing the model data, determining the instruction types and instruction objects involved in executing a computing task for the target model, generating each computing instruction based on the instruction types and instruction objects, and determining the allocation information of the instruction object corresponding to each computing instruction;
for each computing instruction, determining, among preset computing units, at least one computing unit for executing the instruction as a target unit according to the allocation information of its corresponding instruction object, and generating a deduction instruction corresponding to the computing instruction, wherein the deduction instruction is used to control the target unit to deduce the memory capacity required by the computation result obtained after the computing instruction is executed;
generating each physical instruction based on each computing instruction and its corresponding deduction instruction, and sending each physical instruction to the target unit executing the corresponding computing instruction, so that the computing task for the target model is executed through those target units.
Optionally, the method further comprises:
adding a preset instruction queue to each calculation instruction;
executing a calculation task aiming at the target model through a target unit executing each calculation instruction, wherein the calculation task specifically comprises the following steps:
and executing the physical instructions corresponding to the calculation instructions in the instruction queue through the target units corresponding to the calculation instructions in the instruction queue according to the dependency relationship among the calculation instructions.
Optionally, according to the dependency relationship between the calculation instructions, executing, by a target unit corresponding to each calculation instruction in the instruction queue, a physical instruction corresponding to each calculation instruction in the instruction queue, including:
if no dependency relationship exists between at least two computing instructions in the instruction queue, executing the physical instructions corresponding to those computing instructions in parallel through their respective target units.
Optionally, generating each computing instruction based on the instruction type and the instruction object specifically includes:
determining a type identifier corresponding to the instruction type and an object identifier corresponding to the instruction object;
and binding the type identifier of the instruction type corresponding to the calculation instruction and the object identifier of the instruction object corresponding to the calculation instruction with each calculation instruction.
Optionally, before determining, for each computing instruction, at least one computing unit for executing the instruction as a target unit according to the allocation information of its corresponding instruction object, the method further comprises:
for each calculation instruction, determining the allocation information of the instruction object corresponding to the calculation instruction according to the parallel descriptor carried by the instruction object corresponding to the calculation instruction;
and determining each target unit corresponding to the calculation instruction according to the allocation information.
Optionally, before determining the allocation information of the instruction object corresponding to the calculation instruction according to the parallel descriptor carried by the instruction object corresponding to the calculation instruction, the method further includes:
determining at least one target unit for executing the computing instruction according to the device information corresponding to each computing unit, and determining the parallel descriptor based on the at least one target unit.
Optionally, the method further comprises:
and sending the instruction object corresponding to the calculation instruction to each target unit corresponding to the calculation instruction, so that each target unit stores the received instruction object locally.
Optionally, executing, by a target unit executing each computing instruction, a computing task for the target model, including:
for each calculation instruction, executing a calculation operation corresponding to the calculation instruction on a locally stored instruction object according to the received physical instruction by a target unit executing the calculation instruction, and deducing a memory space required by a calculation result obtained after calculating the target instruction object as an estimated memory space;
and determining a target storage position for storing the calculation result according to the estimated memory space, and storing the calculation result in the target storage position after obtaining the calculation result.
Optionally, the method further comprises:
and (3) acquiring the type of the newly added computing unit, and rewriting at least one function interface of a preset interface for acquiring the type name of the computing unit, an interface for initializing the context information of the computing unit, an interface for initializing the instruction state, an interface for destroying the instruction state, an interface for inquiring the instruction state, an interface for computing the instruction and an interface for constructing a computing flow descriptor according to the type information of the type of the newly added computing unit.
Optionally, the method further comprises:
the method comprises the steps of obtaining a new instruction type, rewriting a preset interface for calculating instructions, a preset interface for deriving instructions and a preset interface for obtaining the type name of a calculation unit, constructing a function interface of the new instruction type and a calculation flow type of the calculation unit corresponding to the new instruction type, and registering the new instruction type.
Optionally, the method further comprises:
and monitoring the state of each calculation instruction in the instruction queue, releasing the memory occupied by each calculation instruction in the current instruction queue after monitoring that all calculation instructions in the instruction queue are in the state of execution completion, and receiving the next batch of calculation instructions.
The present specification provides a task execution device including:
the acquisition module acquires model data of the target model;
the generation module analyzes the model data, determines the instruction type and the instruction object involved in executing the calculation task aiming at the target model, generates each calculation instruction based on the instruction type and the instruction object, and determines the allocation information of the instruction object corresponding to each calculation instruction;
the determining module is configured to, for each computing instruction, determine, among preset computing units, at least one computing unit for executing the instruction as a target unit according to the allocation information of its corresponding instruction object, and generate a deduction instruction corresponding to the computing instruction, wherein the deduction instruction is used to control the target unit to deduce the memory capacity required by the computation result obtained after the computing instruction is executed;
and the execution module is used for generating each physical instruction based on each calculation instruction and a deduction instruction corresponding to each calculation instruction, and sending each physical instruction to a target unit for executing each calculation instruction so as to execute the calculation task aiming at the target model through the target unit for executing each calculation instruction.
Optionally, the generating module is specifically configured to determine a type identifier corresponding to the instruction type and an object identifier corresponding to the instruction object; and binding the type identifier of the instruction type corresponding to the calculation instruction and the object identifier of the instruction object corresponding to the calculation instruction with each calculation instruction.
Optionally, the execution module is specifically configured to execute, for each calculation instruction, a calculation operation corresponding to the calculation instruction on a locally stored instruction object according to a received physical instruction by using a target unit for executing the calculation instruction, and deduce a memory space required by a calculation result obtained by calculating the target instruction object as an estimated memory space; and determining a target storage position for storing the calculation result according to the estimated memory space, and storing the calculation result in the target storage position after obtaining the calculation result.
The present specification provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements the task execution method described above.
The present specification provides an electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the task execution method described above when executing the program.
The at least one technical scheme adopted in this specification can achieve the following beneficial effects: model data of a target model is obtained and parsed to determine the instruction types and instruction objects involved in executing a computing task for the target model; each computing instruction is generated based on them; for each computing instruction, at least one target unit for executing it is determined among preset computing units according to the allocation information of its corresponding instruction object, and a corresponding deduction instruction is generated; each physical instruction is then generated based on each computing instruction and its deduction instruction and sent to the target unit executing that instruction, so as to carry out the computing task for the target model.
According to the task execution method provided in this specification, the target units executing a computing instruction can be determined from the allocation information of its instruction object, and each computing instruction has a corresponding deduction instruction. A target unit can therefore deduce, while executing a physical instruction, the memory space the computation will require, and determine the storage position from that deduced space. Memory no longer needs to be applied for after the computation result is obtained, which improves the execution efficiency of the computing task.
Drawings
The accompanying drawings, which are included to provide a further understanding of the specification, illustrate and explain the exemplary embodiments of the present specification and their description, are not intended to limit the specification unduly. In the drawings:
FIG. 1 is a schematic flow chart of a task execution method provided in the present specification;
FIG. 2 is a schematic diagram of an instruction execution process provided in the present specification;
FIG. 3 is a schematic diagram of a task performing device provided in the present specification;
Fig. 4 is a schematic view of an electronic device corresponding to fig. 1 provided in the present specification.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the present specification more apparent, the technical solutions of the present specification will be clearly and completely described below with reference to specific embodiments of the present specification and corresponding drawings. It will be apparent that the described embodiments are only some, but not all, of the embodiments of the present specification. All other embodiments, which can be made by one of ordinary skill in the art without undue burden from the present disclosure, are intended to be within the scope of the present disclosure.
The following describes in detail the technical solutions provided by the embodiments of the present specification with reference to the accompanying drawings.
Fig. 1 is a schematic flow chart of a task execution method provided in the present specification, including the following steps:
s101: model data of the target model is acquired.
S102: analyzing the model data, determining the instruction type and the instruction object involved in executing the calculation task aiming at the target model, generating each calculation instruction based on the instruction type and the instruction object, and determining the allocation information of each calculation instruction corresponding to the instruction object.
When a computing task (e.g., model training, model inference, or model prediction) is performed for a target model, a compiler in the model framework (e.g., a deep learning framework) usually needs to compile the model code of the target model input by the user, so as to obtain executable code (e.g., a computational graph) for the target model.
Corresponding computing units can then be invoked to run the executable code. During this process, the model data of the target model needs to be acquired. The model data may be the message structure corresponding to the executable code of the target model; by parsing this message structure, the instruction types and instruction objects involved in executing the computing task for the target model are determined. Of course, the model data may also be the model code of the target model itself, and the instruction objects may be determined by parsing that code.
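The parsing step can be pictured with a small sketch. The dict layout and field names below ("nodes", "op", "inputs") are illustrative assumptions; the specification only says that the message structure corresponding to the executable code is analyzed to obtain instruction types and instruction objects:

```python
from dataclasses import dataclass


@dataclass
class InstructionSpec:
    """Hypothetical record for one computing instruction parsed from model data."""
    instr_type: str   # instruction type, e.g. "matmul" or "add"
    operands: list    # instruction objects (parameters, tensors) the op consumes


def parse_model_data(model_data: dict) -> list:
    """Walk a message-structure-like dict describing the executable graph and
    collect the instruction type and instruction objects of every node."""
    specs = []
    for node in model_data.get("nodes", []):
        specs.append(InstructionSpec(instr_type=node["op"],
                                     operands=list(node["inputs"])))
    return specs


# usage sketch: a two-node computational graph
model_data = {"nodes": [
    {"op": "matmul", "inputs": ["w0", "x"]},
    {"op": "add",    "inputs": ["matmul_out", "b0"]},
]}
specs = parse_model_data(model_data)
```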
In the present specification, the execution body for implementing the task execution method may be a server deployed with a distributed computing system, and of course, may also be a terminal device deployed with a distributed computing system.
In practical applications, the computing instruction of the computer is generally composed of operators and operands, so the server may determine the operators corresponding to the computing instruction according to the instruction types to characterize the computing logic of which computing operation (such as add operation, multiply-divide operation, etc.) the computing instruction corresponds to. The instruction object may include model parameters, operators and other target data required for executing the operation corresponding to the calculation instruction.
When generating the computing instructions, for each computing instruction the server may determine the preset type identifier (such as a number or an ID) of the instruction type corresponding to the instruction and add that identifier to an operand of the instruction, thereby binding the type identifier to the computing instruction.
In this specification, the server may create a name corresponding to each instruction type in advance, for presentation to the user in the user terminal.
Wherein, the type identifier keeps global consistency and is read-only in the execution process of the computing task, and each computing unit can be provided with a corresponding backup.
Further, the server may determine an object identifier of an instruction object corresponding to the computing instruction, and the server may add the object identifier to an operand corresponding to the computing instruction, so as to bind the object identifier and the computing instruction.
In this specification, each instruction object carries a corresponding parallel descriptor, where the parallel descriptor is used to determine allocation information of the instruction object, and the allocation information is used to describe to which computing units the instruction object is allocated. The computing unit may be a CPU, GPU, etc. of different types, and of course, may be other types of computing devices, which are not specifically limited in this specification.
The server may determine, according to the device information (such as the computing capability, the remaining memory space, etc.) corresponding to each computing unit, at least one computing unit that matches each computing instruction as a target unit, and then the server may determine, according to each target unit, a parallel descriptor of an instruction object corresponding to the computing instruction.
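As a rough illustration of matching computing units to an instruction by device information, the greedy rule below (filter by remaining memory, prefer higher compute capability) is an assumption; the text only names computing capability and remaining memory space as the inputs considered:

```python
def select_target_units(units, min_free_mem, k):
    """Pick up to k computing units as target units for one instruction.
    Units without enough remaining memory are filtered out, and the
    highest compute capability is preferred (illustrative policy)."""
    eligible = [u for u in units if u["free_mem"] >= min_free_mem]
    eligible.sort(key=lambda u: u["capability"], reverse=True)
    return [u["name"] for u in eligible[:k]]


# usage sketch: two GPUs qualify, the CPU has too little free memory
units = [
    {"name": "gpu0", "capability": 8.0, "free_mem": 16},
    {"name": "gpu1", "capability": 7.0, "free_mem": 32},
    {"name": "cpu0", "capability": 1.0, "free_mem": 4},
]
targets = select_target_units(units, min_free_mem=8, k=2)
```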
In addition, the server can also determine a preset identification mark corresponding to the parallel descriptor, and add the identification mark to an operand of a calculation instruction corresponding to the identification mark, so that the identification mark is bound with the calculation instruction.
In this way, the server may generate an instruction identification code of the computing instruction by binding the type identification of the computing type, the object identification of the instruction object, and the descriptor identification of the parallel descriptor with the computing instruction, to perform subsequent computing tasks according to the instruction identification code.
After generating the calculation instruction, the server may add the calculation instruction to a preset instruction queue according to the execution sequence.
S103: for each calculation instruction, determining at least one calculation unit for executing the calculation instruction in preset calculation units according to the allocation information of the instruction object corresponding to the calculation instruction, taking the calculation unit as a target unit, and generating a deduction instruction corresponding to the calculation instruction, wherein the deduction instruction is used for controlling the target unit to deduce the memory capacity required by the calculation result obtained after the calculation instruction is executed.
After generating the calculation instruction, the server may determine allocation information of the instruction object corresponding to the calculation instruction according to the parallel descriptor of the instruction object corresponding to the calculation instruction, and further determine at least one calculation unit executing the calculation instruction as a target unit according to the allocation information.
At the same time, for each calculation instruction, the server may send a copy of the instruction object corresponding to the calculation instruction to each target unit corresponding to the calculation instruction, so that each target unit stores the received instruction object locally.
In this specification, each calculation instruction corresponds to a deriving instruction, and the server may obtain allocation information according to the parallel descriptor carried by the instruction object corresponding to each calculation instruction, and further generate the deriving instruction of each calculation instruction on the target unit according to the allocation information.
The deducing instruction is used for controlling each target unit to deduce the memory capacity required by the calculation result obtained after the physical instruction corresponding to the calculation instruction is executed, so as to obtain the estimated memory space of the calculation result corresponding to the calculation instruction.
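A deduction instruction essentially performs shape and size inference before the computation runs. Below is a minimal sketch covering two operation types; the derivation rules and the 4-byte element size are assumptions, not the framework's actual rules:

```python
def derive_result_memory(instr_type, operand_shapes, dtype_size=4):
    """Deduce the memory (in bytes) that the result of an instruction will
    occupy, before the instruction is executed. Only two illustrative
    operation types are modelled."""
    if instr_type == "add":        # elementwise: result shape equals operand shape
        shape = operand_shapes[0]
    elif instr_type == "matmul":   # (m, k) x (k, n) -> (m, n)
        (m, _), (_, n) = operand_shapes
        shape = (m, n)
    else:
        raise ValueError("no derivation rule for " + instr_type)
    n_elems = 1
    for d in shape:
        n_elems *= d
    return n_elems * dtype_size
```

For example, a (2, 3) x (3, 4) matmul yields a (2, 4) result, so 8 elements of 4 bytes each are reserved.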
S104: based on each calculation instruction and a deduction instruction corresponding to each calculation instruction, each physical instruction is generated and sent to a target unit for executing each calculation instruction, so that a calculation task aiming at the target model is executed through the target unit for executing each calculation instruction.
The server may generate, according to each calculation instruction and the derived instruction corresponding to each calculation instruction, a plurality of physical instructions that can be identified and executed by the hardware device, where each physical instruction only accesses the instruction object copy whose corresponding instruction object is stored locally in the target unit.
For example, the server may combine each computing instruction with its corresponding deduction instruction and compile them through the corresponding compiler, thereby generating binary physical instructions that the computing unit can recognize.
After generating the physical instruction, the server may call each target unit corresponding to the physical instruction according to the order of the calculation instruction in the instruction queue, and then each target unit may execute the received physical instruction.
According to the dependency relationships between the computing instructions in the instruction queue, the server may invoke the corresponding target units in turn to execute the physical instructions. For several computing instructions with no dependency relationship between them, the server may invoke their target units to execute the corresponding physical instructions in parallel; for example, within the limits of current memory resources, the server may invoke the target units of three mutually independent computing instructions so that the received physical instructions are executed in parallel.
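The dependency-driven dispatch can be sketched by grouping queue entries into "waves" whose members have no pending dependencies and can therefore run in parallel. The wave abstraction is an illustrative simplification of the scheduling described in the text:

```python
def dispatch_waves(deps):
    """Group instructions into execution waves. `deps` maps each instruction
    name to the collection of instructions it depends on; every instruction
    in a wave has all of its dependencies in earlier waves."""
    remaining = {i: set(d) for i, d in deps.items()}
    waves = []
    while remaining:
        ready = [i for i, d in remaining.items() if not d]
        if not ready:
            raise ValueError("cyclic dependency in instruction queue")
        waves.append(sorted(ready))
        for i in ready:
            del remaining[i]
        for d in remaining.values():
            d.difference_update(ready)
    return waves


# usage sketch: "a" and "b" are independent, "c" must wait for both
waves = dispatch_waves({"a": [], "b": [], "c": ["a", "b"]})
```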
Since the physical instruction is generated based on the calculation instruction and the derivation instruction, the target unit corresponding to the physical instruction is the same as the target unit corresponding to the calculation instruction.
Specifically, in the process of executing the physical instruction, the target unit may derive, as the estimated memory space, the memory space occupied by the calculation result obtained after executing the calculation operation while executing the calculation operation corresponding to the physical instruction on the locally stored instruction object. It should be noted that, the computing operation corresponding to the physical instruction is equivalent to the computing operation corresponding to the computing instruction constructing the physical instruction.
The server may derive a memory space occupied by a calculation result obtained after the calculation operation is performed according to at least one of the number of operations involved in performing the calculation operation on the instruction object, the size of the instruction object, and the instruction type.
After determining the estimated memory space, the computing unit can determine the target storage position meeting the estimated memory space and apply for the corresponding memory, so that the calculated result is directly stored in the target storage position after being obtained, the application of the memory after the calculated result is obtained is avoided, and the time for executing the calculation task process is greatly reduced.
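Reserving the target storage position from the estimated memory space before the result exists can be pictured with a toy bump allocator. The arena model is an assumption; the text only says a position satisfying the estimated space is determined and the memory is applied for in advance:

```python
class Arena:
    """Toy memory arena: the target unit reserves the estimated space before
    the computation result exists, so no allocation happens afterwards."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.offset = 0

    def reserve(self, nbytes):
        """Return the target storage position for a result of `nbytes` bytes."""
        if self.offset + nbytes > self.capacity:
            raise MemoryError("arena exhausted")
        start = self.offset
        self.offset += nbytes
        return start


# usage sketch: two results reserved back to back
arena = Arena(capacity=64)
pos_a = arena.reserve(32)   # first result stored at offset 0
pos_b = arena.reserve(16)   # second result stored right after it
```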
Further, the server may abstract each computing unit into a computing stream and send physical instructions to the corresponding streams according to their data dependencies. That is, the server does not simply transmit the physical instructions in queue order; instructions without dependencies between them can be issued in parallel.
In addition, the server can construct a single polling thread (the number of polling threads is one) and use it to poll the state of each instruction in the instruction queue, so as to monitor each computing instruction and trigger downstream instructions when an instruction finishes executing.
The polling thread monitors all state changes of the instructions, while the actual computation is performed by worker threads. After the physical instruction corresponding to a computing instruction finishes executing, the state of that computing instruction may be changed to the execution-complete state.
Compared with the computation itself, an instruction state change is inexpensive, so handling all state changes in a single polling thread saves locking overhead. Since the polling thread does not take on the actual computation, memory allocation and data synchronization between machines are reduced, which keeps the single thread performant.
While monitoring the computing instructions in the queue, for each computing instruction, once it reaches the execution-complete state the server can execute the next computing instruction in the queue, or the next instruction that depends on it. When all computing instructions in the queue are in the execution-complete state, the server can empty the queue, release the memory occupied by the instructions in the current queue, and receive the next batch of computing instructions, thereby completing the update of the instruction queue.
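A single polling pass over the instruction queue might look like the sketch below. The state names and the `launch` callback are assumptions; as the text notes, the actual computation is done by worker threads, and the polling thread only reacts to state changes:

```python
def poll(queue, states, deps, launch):
    """One pass of the single polling thread: hand off instructions whose
    dependencies have finished, and clear the queue once every instruction
    in it is done so the next batch can be received."""
    for instr in list(queue):
        if states[instr] == "idle" and \
                all(states[d] == "done" for d in deps.get(instr, ())):
            states[instr] = "running"
            launch(instr)  # a worker thread would do the real computation
    if queue and all(states[i] == "done" for i in queue):
        queue.clear()


# usage sketch: worker completion is simulated by flipping states to "done"
queue = ["a", "b"]
states = {"a": "idle", "b": "idle"}
deps = {"b": ["a"]}
launched = []
poll(queue, states, deps, launched.append)   # "a" starts; "b" must wait
states["a"] = "done"
poll(queue, states, deps, launched.append)   # now "b" starts
states["b"] = "done"
poll(queue, states, deps, launched.append)   # all done: queue is drained
```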
In this specification, the server may also perform computing tasks by using a distributed computing system deployed therein, and for ease of understanding, this specification provides a schematic diagram of an instruction execution process, as shown in fig. 2.
Fig. 2 is a schematic diagram of an instruction execution process provided in the present specification.
The server can generate calculation instructions and deduction instructions through the business-logic layer of the distributed computing system and send them to the execution layer. The execution layer generates physical instructions based on the deduction instructions and the calculation instructions, calls the target unit corresponding to each physical instruction through a dispatcher, and transmits the physical instruction to the computing stream of that target unit, so that the target unit executes it. During execution, the state of each calculation instruction is monitored through the polling thread until all calculation instructions are in the execution-completed state, at which point the computing task is complete.
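The layered flow of fig. 2 — the business-logic layer emitting paired calculation and deduction instructions, the execution layer fusing them into a physical instruction, and a dispatcher routing it onto the target unit's computing stream — might be sketched roughly as follows. Every name here is a hypothetical stand-in for illustration.

```python
def business_logic_layer(op, operand_shapes):
    """Emit a calculation instruction plus its paired deduction instruction."""
    calc = {"op": op, "shapes": operand_shapes}
    derive = {"op": op, "shapes": operand_shapes, "role": "derive-memory"}
    return calc, derive

def execution_layer(calc, derive):
    """Fuse a (calculation, deduction) pair into one physical instruction."""
    return {"calc": calc, "derive": derive}

class Dispatcher:
    """Routes each physical instruction onto the target unit's stream."""
    def __init__(self):
        self.streams = {}                    # target unit -> computing stream

    def dispatch(self, physical, target_unit):
        self.streams.setdefault(target_unit, []).append(physical)

calc, derive = business_logic_layer("matmul", [(2, 3), (3, 4)])
phys = execution_layer(calc, derive)
d = Dispatcher()
d.dispatch(phys, target_unit="gpu:0")
print(len(d.streams["gpu:0"]))                   # 1
print(d.streams["gpu:0"][0]["derive"]["role"])   # derive-memory
```

The sketch only mirrors the data flow between the layers; the real system would, per the description, also run the polling thread over the instruction states while the streams drain.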
In practical applications, when a new type of computing unit is required in the distributed computing system, accessing that new computing-unit type in the server also produces a new computing-stream type. The server can therefore acquire the newly added computing-unit type and, according to its type information, rewrite at least one of the original basic function interfaces: the interface for obtaining the computing-unit type name, the interface for initializing the device context, the interface for initializing the instruction state, the interface for destroying the instruction state, the interface for the calculation instruction, and the interface for constructing the computing-stream descriptor. In this way the new computing-unit type is accessed in the server without constructing any new function interface.
In addition, when a new instruction type needs to be added to the server, the new instruction type can inherit from the server's original basic instruction type, and the server can rewrite the function interfaces of the basic instruction type as well as define any newly added function interfaces.
Specifically, the server may rewrite at least one of the interface of the calculation instruction, the interface of the deduction instruction, and the interface for obtaining the computing-unit type name, construct the newly added function interfaces and the computing-stream type of the hardware device corresponding to the new instruction type, and then register the new instruction type. That is, the instruction type corresponding to the newly added hardware computing-unit type is registered into the server's distributed computing system.
According to the task execution method provided by the present invention, the target units for executing each calculation instruction can be determined from the allocation information of the instruction object corresponding to that calculation instruction, and each calculation instruction has a corresponding deduction instruction, so that a target unit can, while executing a physical instruction, deduce the memory space required for executing the calculation instruction. The storage location can therefore be determined from the deduced memory space during execution, rather than memory being applied for only after the calculation result is obtained, thereby improving the execution efficiency of the computing task.
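The benefit described above — deducing the result's memory footprint before the computation runs so storage can be reserved up front — can be illustrated with a minimal sketch. The function names and the 2x3 by 3x4 matrix multiplication are illustrative assumptions, not taken from the patent.

```python
FLOAT32 = 4  # bytes per element, assuming float32 operands

def derive_result_bytes(shape_a, shape_b, itemsize=FLOAT32):
    """Deduction step: infer the memory the matmul result will occupy from
    the operand shapes alone, before any computation runs."""
    (m, k), (k2, n) = shape_a, shape_b
    assert k == k2, "inner dimensions must match"
    return m * n * itemsize

def matmul_into(a, b, out):
    """Execution step: compute a @ b directly into the pre-reserved
    storage `out`, so no allocation happens after the result is known."""
    m, k, n = len(a), len(b), len(b[0])
    for i in range(m):
        for j in range(n):
            out[i][j] = sum(a[i][p] * b[p][j] for p in range(k))
    return out

a = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]          # 2x3 operand
b = [[1.0] * 4 for _ in range(3)]               # 3x4 operand
nbytes = derive_result_bytes((2, 3), (3, 4))    # reserve before computing
out = [[0.0] * 4 for _ in range(2)]             # the pre-reserved storage
matmul_into(a, b, out)
print(nbytes)       # 32
print(out[0][0])    # 6.0
```

Because the output shape (and hence its size) is fully determined by the operand shapes, the storage location can be fixed before execution, which is the efficiency gain the passage describes.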
In addition, with this scheme users only need to submit the computing task; they do not need to consider issues such as the high availability of the distributed scheduling policy, recovery from faults, or parallel configuration, and can concentrate on the business logic alone. A flexible extension interface is also provided for newly added computing units, so that calculation instructions for new hardware can be extended based on this interface and computing tasks can then be executed on the new hardware.
The above is the task execution method provided by one or more embodiments of the present specification. Based on the same idea, the present specification further provides a corresponding task execution device, as shown in fig. 3.
Fig. 3 is a schematic diagram of a task execution device provided in the present specification, including:
an obtaining module 301, configured to obtain model data of a target model;
the generating module 302 is configured to parse the model data, determine an instruction type and an instruction object related to executing a computing task for the target model, generate each computing instruction based on the instruction type and the instruction object, and determine allocation information of each computing instruction corresponding to the instruction object;
the determining module 303 is configured to determine, for each calculation instruction, at least one calculation unit that executes the calculation instruction from preset calculation units according to allocation information of an instruction object corresponding to the calculation instruction, as a target unit, and generate a deduction instruction corresponding to the calculation instruction, where the deduction instruction is used to control the target unit to deduct a memory capacity required by a calculation result obtained after the calculation instruction is executed;
The execution module 304 is configured to generate each physical instruction based on each calculation instruction and a derived instruction corresponding to each calculation instruction, and send each physical instruction to a target unit executing each calculation instruction, so as to execute a calculation task for the target model by executing the target unit of each calculation instruction.
Optionally, the generating module 302 is further configured to add each calculation instruction to a preset instruction queue;
the execution module 304 is specifically configured to execute, according to a dependency relationship between each computing instruction, a physical instruction corresponding to each computing instruction in the instruction queue through a target unit corresponding to each computing instruction in the instruction queue.
Optionally, the execution module 304 is specifically configured to execute, in parallel, the physical instructions corresponding to the at least two computing instructions through the target units corresponding to the at least two computing instructions if there is no dependency relationship between the at least two computing instructions in the instruction queue.
Optionally, the generating module 302 is specifically configured to determine a type identifier corresponding to the instruction type and an object identifier corresponding to the instruction object; and binding the type identifier of the instruction type corresponding to the calculation instruction and the object identifier of the instruction object corresponding to the calculation instruction with each calculation instruction.
Optionally, before determining, according to the allocation information of the instruction object corresponding to the calculation instruction, at least one computing unit for executing the calculation instruction as a target unit, the determining module 303 is further configured to, for each calculation instruction, determine the allocation information of the instruction object corresponding to the calculation instruction according to the parallel descriptor carried by that instruction object, and determine each target unit corresponding to the calculation instruction according to the allocation information.
Optionally, before determining the allocation information of the instruction object corresponding to the calculation instruction according to the parallel descriptor carried by the instruction object corresponding to the calculation instruction, the determining module 303 is further configured to determine at least one target unit for executing the calculation instruction according to the device information corresponding to each calculation unit, and determine the parallel descriptor based on the at least one target unit.
Optionally, the determining module 303 is further configured to send an instruction object corresponding to the calculation instruction to each target unit corresponding to the calculation instruction, so that each target unit stores the received instruction object locally.
Optionally, the execution module 304 is specifically configured to execute, for each calculation instruction, a calculation operation corresponding to the calculation instruction on a locally stored instruction object according to a received physical instruction by using a target unit that executes the calculation instruction, and deduce a memory space required by a calculation result obtained by calculating the target instruction object as an estimated memory space; and determining a target storage position for storing the calculation result according to the estimated memory space, and storing the calculation result in the target storage position after obtaining the calculation result.
Optionally, the apparatus further comprises: the new adding module 305 is configured to obtain a new type of computing unit, and rewrite at least one of a preset interface for obtaining a type name of the computing unit, an interface for initializing context information of the computing unit, an interface for initializing an instruction state, an interface for destroying the instruction state, an interface for querying the instruction state, an interface for computing an instruction, and an interface for constructing a computing flow descriptor according to type information of the new type of computing unit.
Optionally, the adding module 305 is specifically configured to obtain a new instruction type, rewrite a preset interface of a calculation instruction, a preset interface of a deriving instruction, and a preset interface of a calculating unit type name, construct a function interface of the new instruction type and a calculating flow type of a calculating unit corresponding to the new instruction type, and register the new instruction type.
Optionally, the execution module 304 is specifically configured to monitor a status of each calculation instruction in the instruction queue, and release a memory occupied by each calculation instruction in the instruction queue and receive a next batch of calculation instructions after it is monitored that all calculation instructions in the instruction queue are in a status of completed execution.
The present specification also provides a computer readable storage medium storing a computer program operable to perform a task execution method as provided in fig. 1 above.
The present specification also provides a schematic structural diagram, shown in fig. 4, of an electronic device corresponding to fig. 1. At the hardware level, as shown in fig. 4, the electronic device includes a processor, an internal bus, a network interface, a memory, and a non-volatile storage, and may of course also include hardware required by other services. The processor reads the corresponding computer program from the non-volatile storage into the memory and then runs it to implement the task execution method described above with respect to fig. 1. Of course, besides software implementations, the present specification does not exclude other implementations, such as logic devices or combinations of hardware and software; that is, the execution subject of the following processing flows is not limited to logic units and may also be hardware or logic devices.
In the past, an improvement of a technology could be clearly distinguished as an improvement in hardware (e.g., an improvement to a circuit structure such as a diode, a transistor, or a switch) or an improvement in software (an improvement of a method flow). However, as technology develops, many improvements of method flows today can be regarded as direct improvements of hardware circuit structures. Designers almost always obtain a corresponding hardware circuit structure by programming the improved method flow into a hardware circuit. Therefore, it cannot be said that an improvement of a method flow cannot be realized by a hardware entity module. For example, a programmable logic device (Programmable Logic Device, PLD) (e.g., a field programmable gate array (Field Programmable Gate Array, FPGA)) is an integrated circuit whose logic function is determined by the user's programming of the device. A designer programs a digital system to "integrate" it onto a PLD, without asking a chip manufacturer to design and fabricate an application-specific integrated circuit chip. Moreover, nowadays, instead of manually manufacturing integrated circuit chips, such programming is mostly implemented with "logic compiler" software, which is similar to the software compiler used in program development; the source code before compiling also has to be written in a specific programming language, called a hardware description language (Hardware Description Language, HDL). There is not just one HDL but many, such as ABEL (Advanced Boolean Expression Language), AHDL (Altera Hardware Description Language), Confluence, CUPL (Cornell University Programming Language), HDCal, JHDL (Java Hardware Description Language), Lava, Lola, MyHDL, PALASM, and RHDL (Ruby Hardware Description Language), among which VHDL (Very-High-Speed Integrated Circuit Hardware Description Language) and Verilog are currently the most commonly used.
It will also be apparent to those skilled in the art that a hardware circuit implementing the logic method flow can be readily obtained by merely slightly programming the method flow into an integrated circuit using several of the hardware description languages described above.
The controller may be implemented in any suitable manner. For example, the controller may take the form of a microprocessor or processor together with a computer readable medium storing computer readable program code (e.g., software or firmware) executable by the (micro)processor, logic gates, switches, an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a programmable logic controller, or an embedded microcontroller; examples of controllers include, but are not limited to, the following microcontrollers: ARC 625D, Atmel AT91SAM, Microchip PIC18F26K20, and Silicon Labs C8051F320. A memory controller may also be implemented as part of the control logic of a memory. Those skilled in the art also know that, in addition to implementing the controller purely as computer readable program code, it is entirely possible to logically program the method steps so that the controller implements the same functions in the form of logic gates, switches, application specific integrated circuits, programmable logic controllers, embedded microcontrollers, and the like. Such a controller may therefore be regarded as a hardware component, and the means included therein for performing various functions may also be regarded as structures within the hardware component. Or even the means for performing the various functions may be regarded both as software modules implementing the method and as structures within the hardware component.
The system, apparatus, module or unit set forth in the above embodiments may be implemented in particular by a computer chip or entity, or by a product having a certain function. One typical implementation is a computer. In particular, the computer may be, for example, a personal computer, a laptop computer, a cellular telephone, a camera phone, a smart phone, a personal digital assistant, a media player, a navigation device, an email device, a game console, a tablet computer, a wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being functionally divided into various units, respectively. Of course, the functions of each element may be implemented in one or more software and/or hardware elements when implemented in the present specification.
It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The present description is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the specification. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
In one typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.
The memory may include volatile memory in a computer-readable medium, random access memory (RAM) and/or non-volatile memory, such as read-only memory (ROM) or flash memory (flash RAM). Memory is an example of a computer-readable medium.
Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase-change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information accessible to a computing device. As defined herein, computer-readable media do not include transitory computer-readable media (transitory media), such as modulated data signals and carrier waves.
It should also be noted that the terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements includes not only those elements but may also include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.
It will be appreciated by those skilled in the art that embodiments of the present description may be provided as a method, system, or computer program product. Accordingly, the present specification may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present description can take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
The description may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The specification may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
In this specification, each embodiment is described in a progressive manner, and identical and similar parts of each embodiment are all referred to each other, and each embodiment mainly describes differences from other embodiments. In particular, for system embodiments, since they are substantially similar to method embodiments, the description is relatively simple, as relevant to see a section of the description of method embodiments.
The foregoing is merely exemplary of the present disclosure and is not intended to limit the disclosure. Various modifications and alterations to this specification will become apparent to those skilled in the art. Any modifications, equivalent substitutions, improvements, or the like, which are within the spirit and principles of the present description, are intended to be included within the scope of the claims of the present description.

Claims (16)

1. A method of performing a task, comprising:
obtaining model data of a target model;
analyzing the model data, determining an instruction type and an instruction object involved in executing a computing task aiming at the target model, generating each computing instruction based on the instruction type and the instruction object, and determining allocation information of each computing instruction corresponding to the instruction object, wherein the instruction object comprises: the allocation information is used for describing a calculation unit to which the instruction object is allocated;
for each calculation instruction, determining, from preset calculation units, at least one calculation unit for executing the calculation instruction as a target unit according to the allocation information of the instruction object corresponding to the calculation instruction, and generating a deduction instruction corresponding to the calculation instruction, wherein the deduction instruction is used for controlling the target unit to deduce the memory capacity required by a calculation result obtained after the calculation instruction is executed;
generating each physical instruction based on each calculation instruction and a deduction instruction corresponding to each calculation instruction, and sending each physical instruction to a target unit executing each calculation instruction so as to execute a calculation task aiming at the target model through the target unit executing each calculation instruction, wherein the target unit deduces the memory space occupied by a calculation result obtained after executing the calculation operation while executing the calculation operation corresponding to the physical instruction on a locally stored instruction object in the process of executing the physical instruction.
2. The method of claim 1, wherein the method further comprises:
adding each calculation instruction to a preset instruction queue;
executing a calculation task aiming at the target model through a target unit executing each calculation instruction, wherein the calculation task specifically comprises the following steps:
and executing the physical instructions corresponding to the calculation instructions in the instruction queue through the target units corresponding to the calculation instructions in the instruction queue according to the dependency relationship among the calculation instructions.
3. The method of claim 2, wherein executing, by the target unit corresponding to each calculation instruction in the instruction queue, the physical instruction corresponding to each calculation instruction in the instruction queue according to the dependency relationship between each calculation instruction, specifically comprises:
and if the dependency relationship does not exist between the at least two computing instructions in the instruction queue, executing the physical instructions corresponding to the at least two computing instructions in parallel through the target units corresponding to the at least two computing instructions.
4. The method of claim 1, wherein generating each computing instruction based on the instruction type and the instruction object, comprises:
determining a type identifier corresponding to the instruction type and an object identifier corresponding to the instruction object;
And binding the type identifier of the instruction type corresponding to the calculation instruction and the object identifier of the instruction object corresponding to the calculation instruction with each calculation instruction.
5. The method of claim 1, wherein for each computing instruction, determining at least one computing unit executing the computing instruction as a target unit based on allocation information of an instruction object corresponding to the computing instruction, the method further comprising:
for each calculation instruction, determining the allocation information of the instruction object corresponding to the calculation instruction according to the parallel descriptor carried by the instruction object corresponding to the calculation instruction;
and determining each target unit corresponding to the calculation instruction according to the allocation information.
6. The method of claim 5, wherein before determining the allocation information of the instruction object corresponding to the calculation instruction based on the parallel descriptor carried by the instruction object corresponding to the calculation instruction, the method further comprises:
and determining at least one target unit for executing the calculation instruction according to the equipment information corresponding to each calculation unit, and determining the parallel descriptor based on the at least one target unit.
7. The method of claim 1, wherein the method further comprises:
and sending the instruction object corresponding to the calculation instruction to each target unit corresponding to the calculation instruction, so that each target unit stores the received instruction object locally.
8. The method of claim 7, wherein the computing task for the target model is performed by a target unit executing each computing instruction, comprising:
for each calculation instruction, executing a calculation operation corresponding to the calculation instruction on a locally stored instruction object according to the received physical instruction by a target unit executing the calculation instruction, and deducing a memory space required by a calculation result obtained after calculating the target instruction object as an estimated memory space;
and determining a target storage position for storing the calculation result according to the estimated memory space, and storing the calculation result in the target storage position after obtaining the calculation result.
9. The method of claim 1, wherein the method further comprises:
and (3) acquiring the type of the newly added computing unit, and rewriting at least one function interface of a preset interface for acquiring the type name of the computing unit, an interface for initializing the context information of the computing unit, an interface for initializing the instruction state, an interface for destroying the instruction state, an interface for inquiring the instruction state, an interface for computing the instruction and an interface for constructing a computing flow descriptor according to the type information of the type of the newly added computing unit.
10. The method of claim 1, wherein the method further comprises:
the method comprises the steps of obtaining a new instruction type, rewriting a preset interface for calculating instructions, a preset interface for deriving instructions and a preset interface for obtaining the type name of a calculation unit, constructing a function interface of the new instruction type and a calculation flow type of the calculation unit corresponding to the new instruction type, and registering the new instruction type.
11. The method of claim 2, wherein the method further comprises:
and monitoring the state of each calculation instruction in the instruction queue, releasing the memory occupied by each calculation instruction in the current instruction queue after monitoring that all calculation instructions in the instruction queue are in the state of execution completion, and receiving the next batch of calculation instructions.
12. A task execution device, characterized by comprising:
the acquisition module acquires model data of the target model;
the generation module analyzes the model data, determines the instruction type and the instruction object involved in executing the calculation task aiming at the target model, generates each calculation instruction based on the instruction type and the instruction object, and determines the allocation information of each calculation instruction corresponding to the instruction object, wherein the instruction object comprises: the allocation information is used for describing a calculation unit to which the instruction object is allocated;
a determining module, configured to determine, for each calculation instruction, at least one calculation unit for executing the calculation instruction from preset calculation units according to the allocation information of the instruction object corresponding to the calculation instruction, as a target unit, and generate a deduction instruction corresponding to the calculation instruction, wherein the deduction instruction is used for controlling the target unit to deduce the memory capacity required by a calculation result obtained after the calculation instruction is executed;
and an execution module, configured to generate physical instructions based on each calculation instruction and its corresponding derivation instruction, and to send each physical instruction to the target unit executing that calculation instruction, so that the calculation task for the target model is executed by the target units; in executing a physical instruction, the target unit performs the calculation operation corresponding to the physical instruction on the locally stored instruction object while deriving the memory space that the resulting calculation result will occupy.
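The four modules of claim 12 form a pipeline: acquire model data, generate calculation instructions, attach derivation instructions, then execute while estimating result memory. A toy end-to-end sketch is below; all function names, dictionary keys, and the per-element size estimate are assumptions made for illustration only.

```python
# Toy pipeline mirroring claim 12's four modules (names are illustrative).
def acquire_model_data(model):
    # acquisition module: pull out the model's operations
    return model["ops"]

def generate_instructions(ops):
    # generation module: one calculation instruction per op, bound to its
    # instruction object and the unit named in its allocation information
    return [{"type": op["type"], "object": op["object"], "unit": op["unit"]}
            for op in ops]

def determine_units(instructions):
    # determination module: attach a derivation instruction that estimates
    # the result's memory need without running the calculation
    for instr in instructions:
        instr["derive"] = lambda obj=instr["object"]: len(obj)  # toy size estimate
    return instructions

def execute(instructions):
    # execution module: each target unit derives result memory while computing
    results = {}
    for instr in instructions:
        estimated = instr["derive"]()                      # memory the result needs
        results[instr["unit"]] = {"estimated_bytes": estimated,
                                  "result": sum(instr["object"])}  # toy "calculation"
    return results
```

Running the derivation alongside (not before) the calculation is the point of the claim: the unit never has to stall waiting for a separate memory-planning pass.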
13. The apparatus of claim 12, wherein the generation module is specifically configured to determine a type identifier corresponding to each instruction type and an object identifier corresponding to each instruction object, and to bind to each calculation instruction the type identifier of its instruction type and the object identifier of its instruction object.
14. The apparatus of claim 12, wherein the execution module is specifically configured to, for each calculation instruction, cause the target unit executing the calculation instruction to perform, according to the received physical instruction, the calculation operation corresponding to the calculation instruction on the locally stored instruction object, and to derive, as an estimated memory space, the memory space required by the calculation result obtained by calculating the target instruction object; and to determine, according to the estimated memory space, a target storage location for the calculation result, and store the calculation result in the target storage location once it is obtained.
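Claim 14's estimate-then-place logic can be sketched as: derive the result's memory need first, pick a storage location with enough free space, then compute and store. The pool names, sizes, and the 8-bytes-per-element estimate below are invented for illustration; the patent does not disclose them.

```python
# Sketch of claim 14: derive an estimated memory space, choose a target
# storage location that fits it, then run the calculation and store there.
STORAGE_POOLS = {"on_chip": 1024, "host": 1 << 20}  # free bytes per location (toy numbers)

def choose_storage(estimated_bytes):
    """Prefer the fastest pool that has room for the estimated result."""
    for name in ("on_chip", "host"):
        if STORAGE_POOLS[name] >= estimated_bytes:
            return name
    raise MemoryError("no pool can hold the estimated result")

def execute_and_store(values):
    estimated = 8 * len(values)         # derive: assume 8 bytes per element, pre-compute
    target = choose_storage(estimated)  # target storage location from the estimate
    result = [v * v for v in values]    # the actual calculation operation
    STORAGE_POOLS[target] -= estimated  # reserve the estimated space in that pool
    return target, result
```

Because the estimate is available before the result exists, the storage decision never blocks on the calculation itself.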
15. A computer-readable storage medium, characterized in that the storage medium stores a computer program which, when executed by a processor, implements the method of any one of claims 1 to 11.
16. An electronic device comprising a memory, a processor, and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, implements the method of any one of claims 1 to 11.
CN202310390935.4A 2023-04-06 2023-04-06 Task execution method and device, storage medium and electronic equipment Active CN116107728B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310390935.4A CN116107728B (en) 2023-04-06 2023-04-06 Task execution method and device, storage medium and electronic equipment


Publications (2)

Publication Number Publication Date
CN116107728A CN116107728A (en) 2023-05-12
CN116107728B true CN116107728B (en) 2023-08-18

Family

ID=86265915

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310390935.4A Active CN116107728B (en) 2023-04-06 2023-04-06 Task execution method and device, storage medium and electronic equipment

Country Status (1)

Country Link
CN (1) CN116107728B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116306855B (en) * 2023-05-17 2023-09-01 之江实验室 Data processing method and device based on memory and calculation integrated system

Citations (7)

Publication number Priority date Publication date Assignee Title
EP3667494A1 (en) * 2018-12-14 2020-06-17 Lendinvest Limited Instruction allocation and processing system and method
CN112148509A (en) * 2020-10-16 2020-12-29 腾讯科技(深圳)有限公司 Data processing method, device, server and computer readable storage medium
CN113590199A (en) * 2021-01-28 2021-11-02 腾讯科技(深圳)有限公司 Instruction scheduling method, artificial intelligence chip, computer device and storage medium
CN114138444A (en) * 2021-12-07 2022-03-04 中国建设银行股份有限公司 Task scheduling method, device, equipment, storage medium and program product
CN115617899A (en) * 2022-10-25 2023-01-17 平安科技(深圳)有限公司 Data visualization processing method, device, equipment and storage medium
CN115640107A (en) * 2022-10-24 2023-01-24 百度在线网络技术(北京)有限公司 Operation maintenance method, device, equipment and medium
CN115713104A (en) * 2022-11-28 2023-02-24 北京算能科技有限公司 Data processing circuit for neural network, neural network circuit and processor


Non-Patent Citations (1)

Title
Maryam Akbari-Moghaddam; "SEH: Size Estimate Hedging Scheduling of Queues" (Just Accepted); ACM Transactions on Modeling and Computer Simulation; pp. 1-17.



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant